History of the Perceptron

The artificial neuron evolved through several stages. Its roots are firmly grounded in neurological work done primarily by Santiago Ramon y Cajal and Sir Charles Scott Sherrington. Ramon y Cajal was a prominent figure in the exploration of the structure of nervous tissue and showed that, despite their ability to communicate with each other, neurons are physically separate from one another. With a greater understanding of the basic elements of the brain, efforts were made to describe how these basic neurons could result in overt behaviors, an effort to which William James was a prominent theoretical contributor.

The McCulloch-Pitts neuron, introduced by Warren McCulloch and Walter Pitts in 1943, took either a 1 or a 0 on each of its inputs, where 1 represented true and 0 represented false. The threshold was given a real value, say 1, and the neuron output 1 if the sum of its inputs met or exceeded that threshold and 0 otherwise. Thus, to represent the “and” function, we set the threshold at 2.0 and obtain the following truth table:

Input x₁	Input x₂	Output
0	0	0
0	1	0
1	0	0
1	1	1

This table shows the basic “and” function: the output is 1 only when both inputs are 1. If x₁ and x₂ are both false, or if only one of them is true, the sum of the inputs falls short of the threshold of 2 and the output is 0. If x₁ and x₂ are both true (equal to 1), the threshold of 2 is met and the output is 1.

The same holds for the “or” function if we lower the threshold value to 1. The table for the “or” function is:

Input x₁	Input x₂	Output
0	0	0
0	1	1
1	0	1
1	1	1
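
The two tables above can be reproduced with a few lines of code. This is a minimal sketch, assuming a unit with unweighted binary inputs that outputs 1 when the input sum meets or exceeds the threshold; the function name mcculloch_pitts is chosen here for illustration and does not come from the original work.

```python
def mcculloch_pitts(inputs, threshold):
    """Return 1 if the sum of the binary inputs meets or exceeds the threshold, else 0."""
    return 1 if sum(inputs) >= threshold else 0


# All four combinations of two binary inputs, in truth-table order.
PAIRS = [(0, 0), (0, 1), (1, 0), (1, 1)]

# A threshold of 2 gives "and"; a threshold of 1 gives "or".
and_outputs = [mcculloch_pitts(pair, 2) for pair in PAIRS]
or_outputs = [mcculloch_pitts(pair, 1) for pair in PAIRS]

print(and_outputs)  # [0, 0, 0, 1]
print(or_outputs)   # [0, 1, 1, 1]
```

Only the threshold changes between the two functions; the unit itself is identical.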

This type of artificial neuron could also be used to compute the “not” function, which has only one input, as well as the NOR and NAND functions. The McCulloch-Pitts neuron was therefore instrumental in advancing the artificial neuron, but it had some serious limitations. In particular, it could compute neither the “exclusive or” function (XOR) nor the “exclusive nor” function (XNOR). Even with purely binary inputs and outputs, the following truth tables, for XOR and XNOR respectively, could not be reproduced by this early artificial neuron.

Input x₁	Input x₂	Output
0	0	0
0	1	1
1	0	1
1	1	0

Input x₁	Input x₂	Output
0	0	1
0	1	0
1	0	0
1	1	1
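
A quick way to see the limitation is to try every threshold that matters. The sketch below assumes a single unit with two unweighted, excitatory binary inputs (inhibitory inputs are left out), so the input sum can only be 0, 1, or 2 and four candidate thresholds cover every distinct behavior. The names mcculloch_pitts and matching_thresholds are illustrative.

```python
def mcculloch_pitts(inputs, threshold):
    """Return 1 if the sum of the binary inputs meets or exceeds the threshold, else 0."""
    return 1 if sum(inputs) >= threshold else 0


PAIRS = [(0, 0), (0, 1), (1, 0), (1, 1)]
AND = [0, 0, 0, 1]
XOR = [0, 1, 1, 0]
XNOR = [1, 0, 0, 1]

# The input sum is 0, 1, or 2, so thresholds 0, 1, 2, 3 produce every
# behavior this unit can show (always on, "or", "and", always off).
CANDIDATE_THRESHOLDS = [0, 1, 2, 3]


def matching_thresholds(target):
    """List the candidate thresholds whose truth table equals the target."""
    return [
        t for t in CANDIDATE_THRESHOLDS
        if [mcculloch_pitts(pair, t) for pair in PAIRS] == target
    ]


print(matching_thresholds(AND))   # [2]
print(matching_thresholds(XOR))   # []
print(matching_thresholds(XNOR))  # []
```

The reason is that the output of such a unit can only switch from 0 to 1 as the input sum grows, whereas XOR must switch on at a sum of 1 and off again at a sum of 2.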

One of the difficulties with the McCulloch-Pitts neuron was its simplicity: it allowed only binary inputs and outputs, it used only the threshold step activation function, and it did not weight the different inputs.

In 1949, Donald Hebb helped to revolutionize the way that artificial neurons were perceived. In his book, The Organization of Behavior, he proposed what has come to be known as Hebb’s rule. He states, “When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A’s efficiency, as one of the cells firing B, is increased.” [1] Hebb was proposing not only that the connection between two neurons is strengthened when they fire together, but also that this activity is one of the fundamental operations necessary for learning and memory.

For the artificial neuron, this meant that the McCulloch-Pitts neuron had to be altered to accommodate this new biological proposal. The method used was to weight each of the inputs. Thus, an input of 1 may contribute more or less to the sum that is compared against the threshold, depending on its weight.
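
Hebb stated his rule in words, not as a formula. A common textbook formalization, used here purely as an assumption, increases each weight in proportion to the product of its input and the unit's output. The function name hebbian_update and the learning_rate parameter are hypothetical.

```python
def hebbian_update(weights, inputs, output, learning_rate=0.5):
    """Strengthen each weight whose input was active while the unit fired.

    Assumed update (not Hebb's own notation): w <- w + learning_rate * input * output.
    """
    return [w + learning_rate * x * output for w, x in zip(weights, inputs)]


weights = [1.0, 1.0]

# The first input is active while the unit fires; the second input is silent.
weights = hebbian_update(weights, inputs=[1, 0], output=1)

print(weights)  # [1.5, 1.0]
```

Only the connection that took part in firing the unit grows, which is the sense in which an input of 1 can come to count for more or less than another.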

Frank Rosenblatt, using the McCulloch-Pitts neuron and the findings of Hebb, went on to develop the first perceptron. This perceptron, which could learn in the Hebbian sense through the weighting of inputs, was instrumental in the later formation of neural networks. He discussed the perceptron in his 1962 book, Principles of Neurodynamics. A basic perceptron is represented as follows:

This perceptron has a total of five inputs, a1 through a5, with respective weights w1 through w5. [2] Each of the inputs is weighted and summed at the node. If the threshold is reached, an output results. Of great importance is that the inputs need not be given equal weight: the perceptron may have “learned” to weight a1 more than a2, and so on.
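
The weighting and summing at the node can be sketched as follows. The weights and threshold are invented for illustration, and the firing convention (output 1 when the weighted sum meets or exceeds the threshold) is an assumption carried over from the McCulloch-Pitts description; the function name perceptron_output is hypothetical.

```python
def perceptron_output(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs meets or exceeds the threshold, else 0."""
    weighted_sum = sum(w * a for w, a in zip(weights, inputs))
    return 1 if weighted_sum >= threshold else 0


# Illustrative weights w1..w5: the first input counts far more than the others.
weights = [2.0, 0.5, 0.5, 0.25, 1.0]
threshold = 2.5

# a1, a3, a4 active: 2.0 + 0.5 + 0.25 = 2.75, which reaches the threshold.
print(perceptron_output([1, 0, 1, 1, 0], weights, threshold))  # 1

# a2, a3, a4 active: 0.5 + 0.5 + 0.25 = 1.25, which does not.
print(perceptron_output([0, 1, 1, 1, 0], weights, threshold))  # 0
```

Both input patterns have three active inputs, yet only the one that includes the heavily weighted a1 produces an output.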

The summation formula for determining whether or not the threshold (θ) is met for the artificial neuron with N inputs (a₁, a₂, …, a_N) and their respective weights (w₁, w₂, …, w_N) is:

The activation function used by McCulloch and Pitts was the threshold step function. However, other activation functions can be used, including the sigmoid, piecewise linear, and Gaussian functions. These functions are shown below. [3] (See the glossary attached to this applet for the corresponding mathematical formulas.)
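
For readers without access to the glossary, the four activation functions can be sketched in code. The formulas below are common textbook forms and are assumptions, not necessarily the ones given in the glossary; in particular the ramp limits of the piecewise linear function and the mu and sigma parameters of the Gaussian are chosen here for illustration.

```python
import math


def threshold_step(x, theta=0.0):
    """Output 1 once the input reaches the threshold theta, else 0."""
    return 1.0 if x >= theta else 0.0


def sigmoid(x):
    """Logistic function 1 / (1 + e^-x), written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def piecewise_linear(x):
    """Linear ramp from 0 to 1 over [-0.5, 0.5], clipped to 0 and 1 outside."""
    return min(1.0, max(0.0, x + 0.5))


def gaussian(x, mu=0.0, sigma=1.0):
    """Unnormalized bell curve that peaks at 1 when x equals mu."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))


for name, f in [("step", threshold_step), ("sigmoid", sigmoid),
                ("piecewise", piecewise_linear), ("gaussian", gaussian)]:
    print(name, [round(f(x), 3) for x in (-2.0, 0.0, 2.0)])
```

The step function is the only one of the four that jumps abruptly; the others change gradually with the input, and the Gaussian is the only one that falls again as the input keeps growing.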

Despite the many changes made to the original McCulloch-Pitts neuron, the perceptron could still solve only certain functions. Unfortunately, Rosenblatt was overly enthusiastic about the perceptron and made the ill-timed proclamation that:

“Given an elementary α-perceptron, a stimulus world W, and any classification C(W) for which a solution exists; let all stimuli in W occur in any sequence, provided that each stimulus must reoccur in finite time; then beginning from an arbitrary initial state, an error correction procedure will always yield a solution to C(W) in finite time…” [4]

With remarks of this kind, Rosenblatt drew a line in the sand between those in support of perceptron-style research and the more traditional symbol manipulation projects being pursued by Marvin Minsky. As a result, in 1969, Minsky and Seymour Papert co-authored Perceptrons: An Introduction to Computational Geometry. In this work they attacked the limitations of the perceptron. They showed that the perceptron could only solve linearly separable functions. Of particular interest was the fact that the perceptron still could not solve the XOR and XNOR functions. Likewise, Minsky and Papert stated that the style of research being done on the perceptron was doomed to failure because of these limitations. This was, of course, Minsky’s equally ill-timed remark. As a result, very little research was done in the area until about the 1980s.
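
Both sides of this dispute can be seen in a small experiment. The following is a minimal sketch of an error correction procedure, not Rosenblatt's original formulation: the function name train_perceptron, the learning rate, the epoch cap, and the use of a bias term in place of an explicit threshold are all assumptions made for illustration.

```python
def train_perceptron(samples, max_epochs=100, learning_rate=1.0):
    """Error-correction training of a two-input unit.

    Returns (weights, bias, converged), where converged is True if a full
    pass over the samples produced no errors within max_epochs.
    """
    weights = [0.0, 0.0]
    bias = 0.0  # plays the role of a negated threshold
    for _ in range(max_epochs):
        errors = 0
        for inputs, target in samples:
            total = sum(w * a for w, a in zip(weights, inputs)) + bias
            output = 1 if total >= 0 else 0
            error = target - output
            if error != 0:
                # Nudge the weights toward the desired output.
                errors += 1
                weights = [w + learning_rate * error * a
                           for w, a in zip(weights, inputs)]
                bias += learning_rate * error
        if errors == 0:
            return weights, bias, True
    return weights, bias, False


AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR_SAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

_, _, and_converged = train_perceptron(AND_SAMPLES)
_, _, xor_converged = train_perceptron(XOR_SAMPLES)

print(and_converged)  # True
print(xor_converged)  # False
```

For the linearly separable “and” function a solution exists and the procedure finds one, as Rosenblatt's statement promises. For XOR no single unit of this kind has a solution, so the procedure never settles no matter how large the epoch cap is made.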

What came to resolve many of these difficulties was the creation of neural networks. These networks connect the inputs of artificial neurons to the outputs of other artificial neurons. As a result, the networks are able to solve more difficult problems, but they have also grown considerably more complex. However, many of the artificial neural networks in use today still stem from the early advances of the McCulloch-Pitts neuron and the Rosenblatt perceptron.
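
As an illustration of how connecting units overcomes the single-unit limitation, XOR can be built from three threshold units arranged in two layers. This is one hand-wired construction among many, with weights and thresholds chosen by hand for illustration; the names threshold_unit and xor_network are hypothetical.

```python
def threshold_unit(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs meets or exceeds the threshold, else 0."""
    total = sum(w * a for w, a in zip(weights, inputs))
    return 1 if total >= threshold else 0


def xor_network(x1, x2):
    """XOR as the "and" of an "or" unit and a NAND unit."""
    # First layer: both units see the raw inputs.
    or_out = threshold_unit((x1, x2), (1, 1), 1)        # fires if either input is 1
    nand_out = threshold_unit((x1, x2), (-1, -1), -1)   # fires unless both inputs are 1
    # Second layer: takes the first-layer outputs as its inputs.
    return threshold_unit((or_out, nand_out), (1, 1), 2)


print([xor_network(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # [0, 1, 1, 0]
```

Each unit on its own still computes only a linearly separable function; the XOR behavior comes from feeding the outputs of two units into a third.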

[1] Hebb, Donald O. (1949). The Organization of Behavior. New York: Wiley, p. 62.

[2] The diagram is from http://www.neuroscience.com/Technologies/nn_history.htm

[3] Graph diagrams of the functions are from http://home.cc.umanitoba.ca/~umcorbe9/neuron.html#Theory

[4] Rosenblatt, Frank (1962). Principles of Neurodynamics. New York: Spartan. Cf. Rumelhart, D. E., J. L. McClelland and the PDP Research Group (1986). Parallel Distributed Processing, vols. 1 and 2. Cambridge: MIT.

$$ b = \left( \sum_{j=1}^{N} w_j a_j \right) + \theta $$

[Figure: graphs of the Threshold Step, Sigmoid, Piecewise Linear, and Gaussian activation functions]