This article describes the implementation of software that provides a visual demonstration of a neural network as it learns to emulate different boolean functions, which can be useful to understand the learning process. The software that is built in this article works like this:
- The user sets a boolean function to be emulated by the network by inputting a truth table;
- The model of the neural network is built;
- The neural network starts learning, and the user can observe how its parameters change throughout the learning process.
This article will describe:
- The neural network model choice,
- The main software elements, written in JavaScript,
- The neural network model implementation, allowing the user to set the model parameters and to watch the learning process.
1. The neural network model choice.
The main problem when designing any neural network model is that there is little theoretical guidance on how many layers, and how many neurons per layer, a given problem requires, though larger networks are generally capable of learning more complex functions. In practice, therefore, it’s common to experiment with several different architectures, trying each in turn until good performance is achieved.
The neural network described and implemented in this article was meant to emulate complex boolean functions, so it was decided to design the network architecture with these points in mind:
- Every boolean function can be represented in Disjunctive Normal Form (DNF), i.e. as an OR of AND terms over the inputs and their negations;
- The neural network models of the boolean operators NOT, AND, and OR can each be represented as a single neuron.
From these points, it is hypothesised that the neural network model should contain three layers:
- The first layer should perform the NOT operator;
- The second layer should perform the AND operator;
- The third layer should perform the OR operator.
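The NOT, AND, OR ordering can be illustrated without any neurons at all. The sketch below reads a truth table as an OR of AND terms over possibly negated inputs; the function name `evaluateAsDnf` and the table layout are assumptions made for this illustration only.

```javascript
// Hypothetical helper: evaluates a truth table as an OR of AND terms.
// `table` is an array of rows: { inputs: [0|1, ...], output: 0|1 }.
function evaluateAsDnf(table, inputs) {
  // Each row whose output is 1 contributes one AND term.
  const terms = table.filter(row => row.output === 1);
  // OR stage: the result is 1 if any AND term holds.
  const fired = terms.some(term =>
    // AND stage: every literal of the term must hold.
    term.inputs.every((bit, i) =>
      // NOT stage: a 0 in the row means the negated input is used.
      bit === 1 ? inputs[i] === 1 : inputs[i] === 0
    )
  );
  return fired ? 1 : 0;
}

// XOR is used here only as a sample function.
const xorTable = [
  { inputs: [0, 0], output: 0 },
  { inputs: [0, 1], output: 1 },
  { inputs: [1, 0], output: 1 },
  { inputs: [1, 1], output: 0 },
];
for (const row of xorTable) {
  console.log(row.inputs.join(''), '->', evaluateAsDnf(xorTable, row.inputs));
}
```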
The sigmoid function was chosen as the activation function. It has the formula S(x) = 1 / (1 + e^(-x)), meaning that:
- It will always output values between 0 and 1;
- S(x) < 0.1 when x < -2.2;
- S(x) > 0.9 when x > 2.2;
- S(x) + S(-x) = 1 for any value of x, therefore S(x) and S(-x) are equidistant from 1 and 0 respectively.
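These properties are easy to spot-check numerically. The following is a minimal sketch; the sample inputs are arbitrary.

```javascript
// Sigmoid activation: S(x) = 1 / (1 + e^(-x)).
function sigmoid(x) {
  return 1 / (1 + Math.exp(-x));
}

// Spot-check the listed properties for a few sample inputs.
const samples = [-5, -2.3, 0, 2.3, 5];
for (const x of samples) {
  const s = sigmoid(x);
  // S(x) + S(-x) should equal 1 up to floating-point error.
  const symmetric = Math.abs(s + sigmoid(-x) - 1) < 1e-12;
  console.log(`x = ${x}: S(x) = ${s.toFixed(4)}, symmetric: ${symmetric}`);
}
```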
The neural network learns using the backpropagation method. The output values from each neuron are calculated during the forward pass, and the weights of each connection are updated as the error is propagated backwards through the network during the backwards pass.
The structure of the neural network will have the following features:
- The network has as many inputs as the boolean function that it is to emulate has arguments (this number is denoted as N);
- The first layer will have 2N neurons;
- The second layer will have 2N neurons;
- The third layer will have 1 neuron;
- The network has 1 output.
Implementing the NOT operator with a single neuron. The NOT operator inverts the input, i.e. f(0) = 1 and f(1) = 0. It can be implemented using a single neuron, as shown in figure 1. It has one input with the value x and weight ω, and one input with the value 1 and weight ω0.
To implement the NOT operator, a few conditions must be satisfied:
- When x = 0, f(x) = S(ω0) > 0.9. Solving the equation of the neuron and the sigmoid activation gives that ω0 > 2.2;
- When x = 1, f(x) = S(ω + ω0) < 0.1. Solving the neuron and sigmoid activation equation again gives that ω + ω0 < -2.2.
It was assumed that when x = 0 and when x = 1, the results of f(x) are equidistant from 1 and 0 respectively. Therefore ω + ω0 = -ω0, or ω = -2ω0. Thus, for a NOT neuron with equidistant values, the weights can be chosen to satisfy ω0 > 2.2 and ω = -2ω0.
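For example, ω0 = 3 and ω = -6 meet both conditions. The sketch below checks this numerically; the function name `notNeuron` and the default bias weight of 3 are illustrative choices, not values taken from the software.

```javascript
function sigmoid(x) {
  return 1 / (1 + Math.exp(-x));
}

// Single neuron with one data input x (weight w) and a constant-1 input (weight w0).
function notNeuron(x, w0 = 3) {
  const w = -2 * w0; // equidistance assumption: w = -2 * w0
  return sigmoid(w * x + w0);
}

console.log(notNeuron(0).toFixed(3)); // above 0.9, read as 1
console.log(notNeuron(1).toFixed(3)); // below 0.1, read as 0
```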
Implementing the OR operator with a single neuron. The OR operator can be implemented as a single neuron, as shown in figure 2. It has N inputs with values xi and weights ω, and one input with the value 1 and weight ω0.
To implement an OR operator, these conditions must be met:
- When all values xi = 0, f(x) = S(ω0) < 0.1. Solving the equation of the neuron and the sigmoid activation gives that ω0 < -2.2;
- When any one of the values xi = 1, f(x) = S(ω + ω0) > 0.9. Solving the equation of the neuron and the sigmoid activation gives that ω + ω0 > 2.2.
It was assumed that when only one of the xi = 1 and when all xi = 0, the results of f(x) are equidistant from 1 and 0 respectively. Therefore ω + ω0 = -ω0, or ω = -2ω0. Thus, for an OR neuron with equidistant values, the weights can be chosen to satisfy ω0 < -2.2 and ω = -2ω0.
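For example, ω0 = -3 and ω = 6 meet both conditions for any number of inputs. A minimal sketch follows, with the name `orNeuron` and the default bias weight chosen only for illustration.

```javascript
function sigmoid(x) {
  return 1 / (1 + Math.exp(-x));
}

// OR neuron: N data inputs sharing weight w, plus a constant-1 input with weight w0.
function orNeuron(inputs, w0 = -3) {
  const w = -2 * w0; // equidistance assumption: w = -2 * w0
  const sum = inputs.reduce((acc, x) => acc + w * x, w0);
  return sigmoid(sum);
}

console.log(orNeuron([0, 0, 0]).toFixed(3)); // below 0.1
console.log(orNeuron([0, 1, 0]).toFixed(3)); // above 0.9
console.log(orNeuron([1, 1, 1]).toFixed(3)); // above 0.9
```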
Implementing the AND operator with a single neuron. The AND operator can be implemented as a single neuron, as shown in figure 3. It has N inputs with values xi and weights ω, and one input with the value 1 and weight ω0.
To implement the AND operator, these conditions must be met:
- When one of the values xi = 0, f(x) = S(ω(N - 1) + ω0) < 0.1. Solving the equation of the neuron and the sigmoid activation gives that ω(N - 1) + ω0 < -2.2;
- When all values xi = 1, f(x) = S(ωN + ω0) > 0.9. Solving the equation of the neuron and the sigmoid activation gives that ωN + ω0 > 2.2.
It was assumed that when only one of the xi = 0 and when all xi = 1, the results of f(x) are equidistant from 0 and 1 respectively. Therefore ωN - ω + ω0 = -ωN - ω0, or ω0 = 0.5ω(1 - 2N). Substituting this into the second condition gives ωN + ω0 = 0.5ω > 2.2. Thus, for an AND neuron with equidistant values, the weights can be chosen to satisfy ω > 4.4 and ω0 = 0.5ω(1 - 2N).
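For example, with N = 3 the choice ω = 6 gives ω0 = 0.5 · 6 · (1 - 6) = -15, so the weighted sum is 3 when all inputs are 1 and -3 when exactly one input is 0. The sketch below checks this; the name `andNeuron` and the default weight are illustrative.

```javascript
function sigmoid(x) {
  return 1 / (1 + Math.exp(-x));
}

// AND neuron: N data inputs sharing weight w, plus a constant-1 input with weight w0.
function andNeuron(inputs, w = 6) {
  const n = inputs.length;
  const w0 = 0.5 * w * (1 - 2 * n); // equidistance assumption
  const sum = inputs.reduce((acc, x) => acc + w * x, w0);
  return sigmoid(sum);
}

console.log(andNeuron([1, 1, 1]).toFixed(3)); // above 0.9
console.log(andNeuron([1, 0, 1]).toFixed(3)); // below 0.1
console.log(andNeuron([0, 0, 0]).toFixed(3)); // below 0.1
```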
2. Main software elements written in JavaScript.
An application was built that implements a neural network to emulate a boolean expression, and that allows the user to set parameters and watch the learning process. The software has two main parts:
- A module that implements the neural network itself and carries out its learning;
- A module that displays the state of elements of the neural network model to the user.
The module that implements the neural network contains five types of components:
- Connection - implements connections between neurons;
- CommonNode - implements the neuron itself (the weighted sum function);
- InputNode - implements an input to the network;
- OutputNode - implements an output from the network;
- OneNode - implements offset neurons (neurons that always output a value of 1).
Connection. The Connection object is shown in figure 4.
It has these fields and methods:
- Field weight - contains the value of the weight coefficient;
- Field fromNode - points to the neuron from which the connection goes;
- Field toNode - points to the neuron to which the connection goes;
- Method adjustWeight - calculates the weight coefficient at the end of the learning iteration with the use of the fromNode.nodeResult and toNode.delta fields.
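A Connection along these lines might look like the sketch below. The article does not give the update formula, so the gradient-descent step, the `learningRate` parameter, and the sign convention (delta taken as actual minus expected) are all assumptions.

```javascript
// Sketch of a Connection; learningRate and the sign of the update are assumptions.
class Connection {
  constructor(fromNode, toNode, weight = Math.random()) {
    this.fromNode = fromNode; // neuron the connection starts from
    this.toNode = toNode;     // neuron the connection leads to
    this.weight = weight;     // weight coefficient
    fromNode.outputConnections.push(this);
    toNode.inputConnections.push(this);
  }

  // Moves the weight against the error, scaled by the upstream output.
  adjustWeight(learningRate = 0.5) {
    this.weight -= learningRate * this.fromNode.nodeResult * this.toNode.delta;
  }
}

// Plain objects stand in for neurons here.
const source = { nodeResult: 1, inputConnections: [], outputConnections: [] };
const target = { delta: 0.2, inputConnections: [], outputConnections: [] };
const link = new Connection(source, target, 0.7);
link.adjustWeight(0.5);
console.log(link.weight.toFixed(2));
```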
CommonNode. The CommonNode object is shown in figure 5.
It has these fields and methods:
- Field inputConnections - the list of all input connections;
- Field outputConnections - the list of all output connections;
- Field nodeResult - contains the neuron output value;
- Field delta - contains the neuron’s error value;
- Method predictResult - calculates the neuron’s output. The weight and fromNode fields are taken from each input connection in the inputConnections list. Each weight is then multiplied by its corresponding fromNode.nodeResult, and the results of these multiplications are summed together. This sum is then run through the activation function, and the final result is written to the field nodeResult.
- Method calculateDelta - calculates the neuron’s error value. First, the weight and toNode fields are taken from each output connection in the outputConnections list. Each weight is then multiplied by the corresponding toNode.delta, and the results are summed together. The final sum is multiplied by nodeResult * (1 - nodeResult), which is the derivative of the sigmoid expressed through the neuron’s output, and written to the field delta.
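The two methods can be sketched as follows. This is a minimal illustration of the description above rather than the software's actual source; plain objects stand in for Connection instances.

```javascript
function sigmoid(x) {
  return 1 / (1 + Math.exp(-x));
}

class CommonNode {
  constructor() {
    this.inputConnections = [];
    this.outputConnections = [];
    this.nodeResult = 0;
    this.delta = 0;
  }

  // Weighted sum of upstream outputs, passed through the activation function.
  predictResult() {
    const sum = this.inputConnections.reduce(
      (acc, c) => acc + c.weight * c.fromNode.nodeResult, 0);
    this.nodeResult = sigmoid(sum);
  }

  // Weighted sum of downstream errors, scaled by the sigmoid derivative.
  calculateDelta() {
    const sum = this.outputConnections.reduce(
      (acc, c) => acc + c.weight * c.toNode.delta, 0);
    this.delta = sum * this.nodeResult * (1 - this.nodeResult);
  }
}

const node = new CommonNode();
node.inputConnections.push({ weight: -6, fromNode: { nodeResult: 1 } });
node.inputConnections.push({ weight: 3, fromNode: { nodeResult: 1 } });
node.outputConnections.push({ weight: 2, toNode: { delta: 0.1 } });
node.predictResult();
node.calculateDelta();
console.log(node.nodeResult.toFixed(4), node.delta.toFixed(4));
```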
InputNode. The InputNode object is inherited from CommonNode; it is identical to CommonNode except for the method:
- Method predictResult - sets the field nodeResult to the value of the input to the node.
OutputNode. The OutputNode object is inherited from CommonNode; it is identical to CommonNode except for the method:
- Method calculateDelta - calculates the difference between the field nodeResult and the expected output of the network. This difference is written to the field delta.
OneNode. The OneNode object is inherited from CommonNode; it is identical to CommonNode except for these methods:
- Method predictResult - sets the value of the nodeResult field to 1.
- Method calculateDelta - empty.
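The three specialised nodes can be sketched as subclasses that override one or two methods. The base class here is a pared-down stand-in holding only the fields, and the names `inputValue` and `expectedResult`, as well as the sign of the output error, are assumptions made for this illustration.

```javascript
// Pared-down stand-in for CommonNode: only the shared fields.
class CommonNode {
  constructor() {
    this.inputConnections = [];
    this.outputConnections = [];
    this.nodeResult = 0;
    this.delta = 0;
  }
}

class InputNode extends CommonNode {
  constructor() {
    super();
    this.inputValue = 0; // hypothetical field holding the current input
  }
  predictResult() {
    this.nodeResult = this.inputValue;
  }
}

class OutputNode extends CommonNode {
  constructor() {
    super();
    this.expectedResult = 0; // hypothetical field holding the target value
  }
  // Assumed convention: error is actual minus expected.
  calculateDelta() {
    this.delta = this.nodeResult - this.expectedResult;
  }
}

class OneNode extends CommonNode {
  predictResult() {
    this.nodeResult = 1;
  }
  calculateDelta() {} // a constant has no error to propagate
}

const input = new InputNode();
input.inputValue = 1;
input.predictResult();

const bias = new OneNode();
bias.predictResult();
bias.calculateDelta();

const output = new OutputNode();
output.nodeResult = 0.8;
output.expectedResult = 1;
output.calculateDelta();

console.log(input.nodeResult, bias.nodeResult, output.delta.toFixed(2));
```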
The module that models the neural network works in three stages:
- The building stage (building the neural network from parameters);
- The neural network learning stage;
- The end-of-learning stage.
The building stage is performed in seven steps and is shown in figure 6:
- The empty Nodes list is created.
- The empty Connections list is created.
- N objects of the InputNode type are created and added to the Nodes list one by one.
- To build the first hidden layer, 2N OneNode and 2N CommonNode objects are created and added to the Nodes list one by one. Connection objects, which describe all connections between the input and the first layers, are created and added to the Connections list.
- To build the second hidden layer, 2N OneNode and 2N CommonNode objects are created and added to the Nodes list one by one. Connection objects, which describe all connections between the first and the second layers, are created and added to the Connections list.
- To build the third hidden layer, one OneNode object and one CommonNode object are created and added to the Nodes list. Connection objects, which describe all connections between the second and the third layers, are created and added to the Connections list.
- An OutputNode object is created and added to the Nodes list. A Connection object, which describes the connection between the third layer and the output, is created and added to the Connections list.
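The seven steps can be sketched as below. Plain objects tagged with a `type` string stand in for the real node classes, and the helper names `makeNode`, `connect`, `addLayer`, and `buildNetwork` are hypothetical. The sketch assumes each layer is fully connected to the previous one and that each neuron gets its own OneNode.

```javascript
// Hypothetical factory: a plain object standing in for a node class.
function makeNode(type) {
  return { type, inputConnections: [], outputConnections: [], nodeResult: 0, delta: 0 };
}

// Creates a connection with a random initial weight and registers it everywhere.
function connect(fromNode, toNode, connections) {
  const connection = { fromNode, toNode, weight: Math.random() };
  fromNode.outputConnections.push(connection);
  toNode.inputConnections.push(connection);
  connections.push(connection);
}

// Adds `size` neurons, each with its own OneNode, fully connected to prevLayer.
function addLayer(prevLayer, size, nodes, connections) {
  const layer = [];
  for (let i = 0; i < size; i++) {
    const one = makeNode('OneNode');
    const neuron = makeNode('CommonNode');
    nodes.push(one, neuron); // OneNode first, so it is evaluated before its neuron
    connect(one, neuron, connections);
    for (const prev of prevLayer) connect(prev, neuron, connections);
    layer.push(neuron);
  }
  return layer;
}

function buildNetwork(n) {
  const nodes = [];       // step 1
  const connections = []; // step 2
  const inputs = [];
  for (let i = 0; i < n; i++) { // step 3
    const inputNode = makeNode('InputNode');
    nodes.push(inputNode);
    inputs.push(inputNode);
  }
  const first = addLayer(inputs, 2 * n, nodes, connections);  // step 4
  const second = addLayer(first, 2 * n, nodes, connections);  // step 5
  const third = addLayer(second, 1, nodes, connections);      // step 6
  const output = makeNode('OutputNode');                      // step 7
  nodes.push(output);
  connect(third[0], output, connections);
  return { nodes, connections };
}

const net = buildNetwork(2);
console.log(net.nodes.length, net.connections.length);
```

Because every node is appended after the nodes it reads from, walking the list in ascending order is enough to evaluate the network.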
The neural network learning stage runs as a series of iterations scheduled with the setTimeout function. Every iteration contains three steps:
- The data is prepared: the expected result of the boolean function being emulated is calculated directly from the input data, so that the output of the network can be compared against it.
- The predictResult method is called for each element in the Nodes list from the first element in ascending order. The predictResult method will always use already calculated data (i.e. predictResult always operates on the output of the previous node’s predictResult, so that the input propagates forwards through the network from input to output).
- The calculateDelta method is called for each element in the Nodes list from the last element in descending order. The adjustWeight method is called for every input Connection object before moving to the next element of the Nodes list (i.e. weight adjustments propagate through the network backwards, from output to input).
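The ordering of the forward and backward passes can be sketched as below. The function names `runIteration` and `learn` and the `isDone` callback are hypothetical, and stub nodes that only log their calls are used so that the call order is visible.

```javascript
// One learning iteration over an ordered Nodes list (sketch).
function runIteration(nodes) {
  // Forward pass: first to last, so every node reads already-computed inputs.
  for (let i = 0; i < nodes.length; i++) {
    nodes[i].predictResult();
  }
  // Backward pass: last to first; incoming weights are adjusted before moving on.
  for (let i = nodes.length - 1; i >= 0; i--) {
    nodes[i].calculateDelta();
    for (const connection of nodes[i].inputConnections) {
      connection.adjustWeight();
    }
  }
}

// Repeats iterations via setTimeout until isDone() reports success.
function learn(nodes, isDone) {
  runIteration(nodes);
  if (!isDone()) setTimeout(() => learn(nodes, isDone), 0);
}

// Stub nodes that record the order in which their methods are called.
const log = [];
function makeStubNode(name) {
  return {
    inputConnections: [{ adjustWeight: () => log.push(`adjust ${name}`) }],
    predictResult: () => log.push(`predict ${name}`),
    calculateDelta: () => log.push(`delta ${name}`),
  };
}

const stubNodes = ['a', 'b'].map(makeStubNode);
runIteration(stubNodes);
console.log(log.join(', '));
// predict a, predict b, delta b, adjust b, delta a, adjust a
```

Scheduling each iteration with setTimeout, rather than looping, gives the browser a chance to redraw between iterations.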
The end-of-learning stage occurs when the error value of the neural network output is less than 0.1 for every combination of input values.
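A stopping check of this kind might be sketched as follows, assuming the criterion must hold for every row of the truth table. The function name `hasFinishedLearning`, the table layout, and the `predict` callback are hypothetical.

```javascript
// Hypothetical check: `predict` maps an array of inputs to the network output.
function hasFinishedLearning(truthTable, predict, tolerance = 0.1) {
  return truthTable.every(
    row => Math.abs(predict(row.inputs) - row.output) < tolerance);
}

const orTable = [
  { inputs: [0, 0], output: 0 },
  { inputs: [0, 1], output: 1 },
  { inputs: [1, 0], output: 1 },
  { inputs: [1, 1], output: 1 },
];

// Stand-ins for a trained and an untrained network.
const trainedPredict = inputs => (inputs.includes(1) ? 0.95 : 0.04);
const untrainedPredict = () => 0.5;

console.log(hasFinishedLearning(orTable, trainedPredict));   // true
console.log(hasFinishedLearning(orTable, untrainedPredict)); // false
```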
The module that displays the state of each element of the neural network model performs the following actions:
- During the neural network building stage, it determines the number of frames needed to fully animate one forward and one backward pass;
- The state of the neural network model is fixed while the first frame is prepared and drawn;
- The setInterval function draws every frame;
- When the last frame is reached, the display module returns to the first frame again.
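The wrap-around from the last frame to the first can be sketched with a small counter. The helper name `createFrameCycler` is hypothetical, and drawing is replaced by collecting frame indices.

```javascript
// Hypothetical helper: yields frame indices 0..frameCount-1 in an endless loop.
function createFrameCycler(frameCount) {
  let current = 0;
  return function nextFrame() {
    const frame = current;
    current = (current + 1) % frameCount; // wrap back to the first frame
    return frame;
  };
}

const nextFrame = createFrameCycler(3);
const shown = [];
for (let i = 0; i < 7; i++) {
  shown.push(nextFrame());
}
console.log(shown.join(' ')); // 0 1 2 0 1 2 0
```

In the browser, a function like `nextFrame` would be called from the setInterval callback, which would then draw the returned frame.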
3. Visualising the neural network and the learning process, and allowing the user to set parameters.
This section discusses the implementation of the visualisation of the neural network model.
Elements:
- Input neurons are situated on the left. Every input neuron displays the value that it will feed to the network;
- Hidden layer neurons are represented as rectangles. Each of these neurons displays its output value, which is calculated by summing the outputs of the neurons in the previous layer, each multiplied by the weight of its connection, and passing the sum through the activation function;
- The “1” neurons are neurons that always have a value of 1. The output of each “1” neuron is linked to the corresponding neuron of the hidden layer;
- The output neuron is shown on the right of the visualisation. It displays the value calculated by the neural network for the given input values;
- Connections are situated between pairs of neurons and link the output of one neuron to the input of another. Every connection has a weight, and will display with a different color depending on the value of this weight. The colors range from red (negative) to green (positive), with yellow being used to represent a value of zero.
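One possible mapping from weight to color is sketched below. The article does not state the actual formula, so the linear blend and the saturation point `maxAbs` are assumptions.

```javascript
// Hypothetical mapping; maxAbs is the weight magnitude at which the color saturates.
function weightToColor(weight, maxAbs = 5) {
  // Clamp the normalised weight to [-1, 1].
  const t = Math.max(-1, Math.min(1, weight / maxAbs));
  // Yellow is rgb(255, 255, 0): fade green out for negative weights,
  // fade red out for positive weights.
  const red = t > 0 ? Math.round(255 * (1 - t)) : 255;
  const green = t < 0 ? Math.round(255 * (1 + t)) : 255;
  return `rgb(${red}, ${green}, 0)`;
}

console.log(weightToColor(-5)); // rgb(255, 0, 0)
console.log(weightToColor(0));  // rgb(255, 255, 0)
console.log(weightToColor(5));  // rgb(0, 255, 0)
```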
Additional elements:
- The visualisation of the neural network includes a selector with a set of input values. Changing these allows the user to see the visualisation of the network change in response to these input values. This choice has no influence on the learning process. If the neural network is learning then the change will happen at the end of the currently displayed iteration. If the neural network has finished learning then the change will happen immediately;
- The color scale is situated in the bottom-left corner of the display. It shows how the color of a connection changes with the weight of that connection;
- The iteration counter is shown in the bottom-right corner of the presentation. It shows the number of completed learning iterations. Usually, not all iterations of learning are displayed in the visualisation: several hundred iterations will happen for each iteration that is displayed.
Presentation:
- All weights have random values from 0 to 1 initially, making the color of all connections yellow to yellow-green;
- The animation displays the calculation order in the neural network model;
- The animation of the forward propagation is displayed from left to right: the output value of each neuron is calculated from the values of the neurons in the previous layer. The calculations in each layer happen sequentially from top to bottom;
- The neuron displays/changes its value after the calculation animation;
- The animation of the backward propagation is displayed from right to left. The weights will appear to change sequentially, and the connection colors will change after the animation.
To initially test the finished software, it was confirmed that the software can be used to build a network that can emulate simple boolean functions: NOT, OR, and AND.
The visualisation of a single neuron learning to emulate the NOT function is shown below. The weights of the neuron connections are also displayed. Press “Start” to launch the learning process.
It can be seen that the values of the connection weights satisfy the requirements described earlier: ω0 > 2.2, ω + ω0 < -2.2.
The visualisation of a single neuron learning to emulate the OR function is shown below. The weights of the neuron connections are also displayed. Press “Start” to launch the learning process.
The software shows that the values of the connection weights satisfy the requirements described earlier: ω0 < -2.2, ω + ω0 > 2.2.
The visualisation of a single neuron learning to emulate the AND function is shown below. The weights of the neuron connections are also displayed. Press “Start” to launch the learning process.
As with the other examples, it may be seen that the values of the connection weights end up satisfying the requirements described earlier: ω + ω0 < -2.2, 2ω + ω0 > 2.2.
Something more complex was then attempted. The user can define a three-input boolean function, and visualise a neural network as it learns to mimic this function. The weights of the neuron connections will not be displayed this time, but each neuron in the hidden layers will display its calculated error value. The error value can be seen to update during the backward propagation animation.
To define the boolean function that the network will emulate, enter the desired output for each input in table 1. Click on an item in the “Result” column to change its value to the opposite. Press “Start” to launch the learning process. The outputs from the neural network will be shown in table 2, along with the expected output based on the values you entered in table 1.
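A truth table of this kind can be represented as a list of rows whose result is flipped on click. The sketch below is an illustration only; the helper names `makeTruthTable` and `toggleResult` and the row layout are hypothetical.

```javascript
// Hypothetical helper: builds all 2^n input rows with every result initially 0.
function makeTruthTable(n) {
  const rows = [];
  for (let i = 0; i < 2 ** n; i++) {
    const inputs = [];
    // Read the bits of i from most to least significant.
    for (let bit = n - 1; bit >= 0; bit--) {
      inputs.push((i >> bit) & 1);
    }
    rows.push({ inputs, result: 0 });
  }
  return rows;
}

// Flips the result of one row, as a click on the "Result" column would.
function toggleResult(table, rowIndex) {
  table[rowIndex].result = 1 - table[rowIndex].result;
}

const table = makeTruthTable(3);
toggleResult(table, 5);
for (const row of table) {
  console.log(row.inputs.join(' '), '|', row.result);
}
```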
Summary. This article has discussed the process of implementing software to visualise the learning of a neural network. The neural network model choice is described and justified, and the main elements of the software are detailed. The final piece of software is then demonstrated, first with the simple OR, AND, and NOT functions, and then with a more complicated three-input boolean function described using a truth table.