Sunday, 1 July 2018

Coding matters

Or back to basics, having been prompted by something or other to think about them.

It is often said that one can do everything one might want to do in the way of computation with an appropriate combination of the logical operations AND, OR and NOT. Purists will point to the more tricky operations of joint denial (NOR) and alternative denial (NAND), each of which is sufficient alone, but we shall not dwell on them here.
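
By way of illustration of the purists’ point, here is a minimal sketch in Python – the function names are ours, for illustration only – showing that alternative denial (NAND) on its own is enough to build NOT, AND and OR:

    # Alternative denial (NAND) on its own, working with zeros and ones
    def nand(a, b):
        return 0 if a == 1 and b == 1 else 1

    def not_(a):
        return nand(a, a)               # NOT a    =  a NAND a

    def and_(a, b):
        return not_(nand(a, b))         # a AND b  =  NOT (a NAND b)

    def or_(a, b):
        return nand(not_(a), not_(b))   # a OR b   =  (NOT a) NAND (NOT b)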

Turning to neurons, we then have the question of whether they are sufficient in the same way. Can neurons do everything that one might want to do in the way of computation?

Stepping systems

We first look at a simple class of systems which, in addition to stepping, are feed forward and binary.

Figure 1: a gate
Such a system is composed of a finite number of gates, each gate having one or more inputs and exactly one output, the output being a simple, replicable function of the inputs, defined by its truth table, that is to say a table giving the output for every possible combination of inputs, the sort of thing that could easily be written down in an Excel worksheet. Same answer every time. One such gate is illustrated above, green inputs left, blue gate in the middle and green output right. Sample truth table, far right, with the fourth row being shown, in the green, left.
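
By way of example, here is a sketch in Python of what such a gate amounts to, with the truth table held as a dictionary from input combinations to outputs. The particular table is made up for the purpose; the gate of Figure 1 will have had some such table of its own:

    # A made-up two-input gate, defined by its truth table: the output
    # for every possible combination of inputs, same answer every time
    TRUTH_TABLE = {
        (0, 0): 0,
        (0, 1): 1,
        (1, 0): 1,
        (1, 1): 0,
    }

    def gate(inputs):
        return TRUTH_TABLE[tuple(inputs)]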

While the logic of the system is expressed in the composition, in the arrangement of all the gates, the data lies in the inputs and outputs. This data is all binary, that is to say each input or output takes the value zero or one. We allow the output of any gate to be an input for any number of other gates, and we allow any gate to have any number of inputs, but this composition is done in such a way that there are no loops: one cannot get from the output of any gate back to an input for that same gate. Original inputs are inputs which are not the output of any gate. Final outputs are outputs which are not the input of any gate.

We start by setting the original inputs to whatever their values are to be and all other inputs and outputs to zero. By convention, the original inputs hold their values; they are given those same values at each step which follows.

At each step we compute the output of each gate from the then current values of its inputs. We keep going for just as long as something changes, just as long as at least one output is set to a new value.

Since there are no loops, such a system will always terminate after a finite number of steps – at most the number of gates on the longest path from an original input to a final output – with the final outputs being what the system has calculated for the original inputs.
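
Here is a sketch in Python of the stepping procedure just described, with each gate given as its function together with named input and output wires – the names and the layout are ours, for illustration only:

    # A sketch of the stepping procedure. Each gate is a triple of its
    # function, its named input wires and its one named output wire;
    # each wire holds a zero or a one.
    def run(gates, original):
        wires = {w: 0 for _, ins, out in gates for w in list(ins) + [out]}
        wires.update(original)            # the original inputs hold their values
        while True:
            step = {out: fn([wires[w] for w in ins]) for fn, ins, out in gates}
            changed = any(wires[w] != v for w, v in step.items())
            wires.update(step)
            wires.update(original)
            if not changed:               # keep going while something changes
                return wires              # the final outputs can be read off here

    # For example, a system made of the single made-up gate from above:
    # run([(gate, ['a', 'b'], 'c')], {'a': 1, 'b': 0}) gives {'a': 1, 'b': 0, 'c': 1}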

Figure 2: a simple layered system
In practice, many such systems are also layered, with the gates arranged in layers, typically quite a small number of layers, say less than ten, and with computation moving through one layer at each step.

A small such system, with two layers of blue gates and three layers of green data is shown above, with the steps moving from left to right. Original inputs left, final outputs right.

We note in passing that the ‘deep’ bit of deep learning refers to the number of layers, with the big breakthrough being learning how to train the input weights (see below) when one has lots of layers. With one of the more successful workers here being one Demis Hassabis of DeepMind, now part of Google (reference 2).

We can apply the further restriction that we have just three sorts of gate, logical AND, logical OR and logical NOT. It might, in addition, be pictorially convenient to allow a fourth gate, the identity gate which just passes its one input across to its one output.

It can be proved that a great deal of computation can be done with such a system, in fact just the same computations as could have been done by any more elaborate collection of gates – with the catch that the network of these basic gates can get very complicated. One gets a more easily understood and more easily tested system if one does have such an elaborate collection of gates, doing something a bit more complicated than the basic logical operations, perhaps something like the machine code of ICL computers of the 1970s. Put another way, moderately complicated times moderately complicated is easier to understand than very complicated times simple.
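
By way of a small example of building up from the basic gates, here is the exclusive or put together from them, reusing the little and_, or_ and not_ functions from the earlier sketch, with a one-bit adder built on top of that:

    # Exclusive or from the three basic gates alone:
    #   a XOR b  =  (a OR b) AND NOT (a AND b)
    def xor(a, b):
        return and_(or_(a, b), not_(and_(a, b)))

    # A one-bit half adder: two outputs, the sum bit and the carry bit
    def half_adder(a, b):
        return xor(a, b), and_(a, b)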

A great deal more can be done if we do allow loops – with the catch being that the computation might go on for a lot longer and may not terminate at all. It may get into a loop or it may just wander on forever, perhaps calculating the successive binary digits of a transcendental number.

Neurons

For our purposes, we use a very simple model of a neuron, with many inputs and one output. Inputs and outputs take the values zero and one. Such neurons can be assembled into the stepping systems just described, with the neurons as the gates, with the wrinkle that each input to each neuron has been assigned a weight, a real number which might be positive or negative.

We define the gate function of a neuron by saying that the output of a neuron is one if the sum of the weighted input values is greater than or equal to one, zero otherwise. From which definition we can, if so desired, write out the truth table for the neuron in question.
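
A sketch in Python of this gate function, together with a helper which writes out the truth table, should that be desired – again, the names are ours, for illustration only:

    # The model neuron: it fires (outputs one) if the weighted sum of
    # its inputs is greater than or equal to one
    def neuron(weights, inputs):
        total = sum(w * x for w, x in zip(weights, inputs))
        return 1 if total >= 1 else 0

    # The truth table of a neuron with the given weights
    from itertools import product

    def truth_table(weights, n_inputs):
        return {ins: neuron(weights, ins) for ins in product((0, 1), repeat=n_inputs)}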

It may be easy enough to show whether or not any truth table, of the sort illustrated above, can be expressed in weights in this way. We have not made the attempt, although we note that the exclusive or is the standard example of a table which cannot be managed by a single neuron of this sort, while a small network of such neurons manages it easily enough.

We note in passing that there are plenty of much more complicated, much more realistic models of neurons out there. Real neurons are very complicated bits of machinery which come in lots of shapes and sizes, and what we have here is an elementary abstraction drawn from the fact that suitably activated neurons fire. In what follows we also use the word ‘poke’.

Figure 3: doing basic logic with neurons
The diagram above illustrates one way to do the three logic operations. Left, we require both inputs to fire to get the output neuron to fire. Middle, we require at least one. Right, we introduce the device, in brown, of an input neuron which always fires, while the ordinary input has a negative weight, with the result that the output neuron fires just in the case that the input neuron does not fire. In a real neuron this would be called an inhibitory postsynaptic potential, as opposed to an excitatory postsynaptic potential.

Both AND and OR generalise in the obvious way to three or more inputs. But even then, we are only using a fraction of the capability of our neurons. Much more elaborate functions can be implemented by using many inputs and many weights.
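
The figure does not give its weights, but one choice which does the trick with the neuron function sketched above is 0.5 on each input for AND, 1 on each input for OR, and, for NOT, a weight of 1 on the always-firing neuron together with a weight of -1 on the ordinary input:

    # One choice of weights for the three operations of Figure 3
    def AND(a, b): return neuron([0.5, 0.5], [a, b])    # both inputs needed to reach one
    def OR(a, b):  return neuron([1.0, 1.0], [a, b])    # either input reaches one on its own
    def NOT(a):    return neuron([1.0, -1.0], [1, a])   # always-firing neuron, negative weight
                                                        # on the ordinary input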

Computed gotos

Apart from loops, an important idea in a regular computer is that of the computed goto, by which we mean something like ‘go to location X’, ‘fetch the contents of location X and put them in location Y’ or ‘write the contents of location Y to location X’, where X is expressed by a number. In the beginning one had it that there were few locations Y (central) but lots of locations X (peripheral). So the computer had to be able to convert a number standing for X to a place, where the number of such places used to be in the tens of thousands and is now in the tens of millions. Such ideas are now hidden in the arrays which one uses in languages like Visual Basic: ‘Dim X(1 To 10000) As Integer’, for example, declares an array of 10,000 integer numbers.
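
Or the same thing in Python rather than Visual Basic, with the computed goto hidden in the indexing:

    X = [0] * 10000      # an array of 10,000 integers, as in the Visual Basic above
    i = 42               # the number standing for a location
    y = X[i]             # fetch the contents of location i into y
    X[i] = y + 1         # write the new contents back to location i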

So how does one do this sort of thing in the world of neurons?

Figure 4: option 1
Here the idea is that our number is held as the number of bottom layer neurons which are firing and that every bottom layer neuron pokes every top layer neuron, a many to many array of synaptic connections. The top layer neurons then fire if they have enough input, inhibiting the ones to their right if they do fire, so that we only have one answer at the top layer. An idea which will get a bit stretched – rounding errors will creep in – if we want to address arrays with more than around ten elements.

The blue spots show the top layer neurons which will fire at the first step for the number 8. At the second step, all the numbers less than 8, all the neurons to the right of 8, will be and will remain inhibited, and the process stops there.
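
Here is a sketch in Python of this first option, using the neuron function from above. The weights – 1/k on every input to the kth top layer neuron, so that it reaches its threshold of one only if at least k bottom layer neurons fire – are ours, for illustration, and the inhibition step is collapsed into picking out the neuron for the largest number which fires:

    # Option 1: the number n held as n bottom layer neurons firing
    from fractions import Fraction

    def option_1(n, size=10):
        bottom = [1] * n + [0] * (size - n)
        # the kth top layer neuron takes every bottom layer neuron with
        # weight 1/k, so it fires only if at least k of them are firing;
        # exact fractions sidestep, for the purposes of the sketch, the
        # rounding errors mentioned above
        top = [neuron([Fraction(1, k)] * size, bottom) for k in range(1, size + 1)]
        # inhibition: a firing top layer neuron silences the neurons for
        # all the smaller numbers, leaving just the one answer
        firing = [k for k in range(1, size + 1) if top[k - 1] == 1]
        return max(firing) if firing else 0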

Figure 5: option 2
Here the idea is that our number is held in its binary form by the bottom layer neurons, each of which pokes around half of the neurons in the top layer. Inhibition the same, but now each top layer neuron is only poked by about the base two logarithm of the number of elements in our array, which would probably work better than the first idea.
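
And a sketch in Python of this second option, again using the neuron function from above. The wiring is one way of making it work, not necessarily the way the figure intends: the top layer neuron for address a takes positive weights from the bit neurons which are one in a and inhibitory weights from those which are zero, with the always-firing neuron looking after address zero:

    # Option 2: the number held in binary by the bottom layer neurons
    from fractions import Fraction

    def decoder_weights(a, n_bits):
        # weights for the top layer neuron standing for address a; the
        # first weight is for the always-firing neuron
        bits = [(a >> i) & 1 for i in range(n_bits)]
        ones = sum(bits)
        if ones == 0:
            return [Fraction(1)] + [Fraction(-1)] * n_bits
        return [Fraction(0)] + [Fraction(1, ones) if b else Fraction(-1) for b in bits]

    def option_2(n, n_bits=4):
        bottom = [1] + [(n >> i) & 1 for i in range(n_bits)]   # always-firing neuron, then the bits
        top = [neuron(decoder_weights(a, n_bits), bottom) for a in range(2 ** n_bits)]
        return top.index(1)    # exactly one top layer neuron fires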

So the answer seems to be that yes, one can do it, but it would require a lot of connections and might get a bit clumsy when numbers get large.

Conclusions 

All of which suggests that everything can indeed be done using a neural network, rather in the way that everything can be done using the Turing machine, invented by Alan Turing back in 1936 and described at reference 1.

While allowing that neural networks might have different strengths and weaknesses from those of a Turing machine, or from those of a regular computer. One might well be able to cycle from Land’s End to John o’Groats, but most people prefer to take the bus. Or to vary a well used phrase, the devil is in the scaling up.

A rather different problem is that of how a brain might – or might not – evolve to do this sort of thing.

References

Reference 1: https://www.cl.cam.ac.uk/projects/raspberrypi/tutorials/turing-machine/one.html. To make a change from Wikipedia!

Reference 2: https://en.wikipedia.org/wiki/Demis_Hassabis.
