A formal neuron is a mathematical and computational representation of a biological neuron. It generally has several inputs and one output, which correspond respectively to the dendrites and the axon hillock of the biological neuron (the starting point of the axon). The excitatory and inhibitory actions of the synapses are represented, most of the time, by numerical coefficients (the synaptic weights) associated with the inputs. The numerical values of these coefficients are adjusted during a training phase. In its simplest version, a formal neuron computes the weighted sum of the inputs it receives, then applies an activation function, generally nonlinear, to this value. The resulting value is the output of the neuron.
The formal neuron is the basic unit of artificial neural networks, in which it is combined with other neurons of the same kind to compute arbitrarily complex functions, used for various applications in artificial intelligence.
Mathematically, the formal neuron is a real-valued function of several variables.
McCulloch and Pitts in fact studied the analogy between the human brain and universal computing machines. They showed in particular that a recurrent network (one containing loops) made up of the formal neurons they invented has the same computing power as a Turing machine.
Despite the simplicity of this model, or perhaps thanks to it, the so-called McCulloch and Pitts formal neuron remains today a basic building block of artificial neural networks. Many variants have been proposed, more or less biologically plausible, but they are generally based on the concepts invented by the two authors. It is nevertheless known today that this model is only an approximation of the functions performed by a real neuron and that it can in no way serve for a deep understanding of the nervous system.
In the McCulloch and Pitts model, each input is associated with a synaptic weight, i.e. a numerical value denoted $w_1$ for input 1 up to $w_m$ for input $m$. The first operation carried out by the formal neuron is a sum of the quantities received as inputs, weighted by the synaptic coefficients, i.e. the sum
$$w_1 x_1 + \cdots + w_m x_m = \sum_{j=1}^{m} w_j x_j.$$
A threshold $w_0$ is added to this quantity. The result is then transformed by a nonlinear activation function (sometimes called the output function), $\varphi$. The output associated with the inputs $x_1$ to $x_m$ is thus given by
$$\varphi\left(w_0 + \sum_{j=1}^{m} w_j x_j\right).$$
In the original formulation of McCulloch and Pitts, the activation function is the Heaviside function (step function), whose value is 0 or 1. In this case, one sometimes prefers to define the output by the following formula
$$\varphi\left(\sum_{j=1}^{m} w_j x_j - w_0\right),$$
which justifies the name threshold given to the value $w_0$. Indeed, if the sum $\sum_{j=1}^{m} w_j x_j$ exceeds $w_0$, the output of the neuron is 1, whereas it is 0 otherwise: $w_0$ is thus the activation threshold of the neuron, if output 0 is taken to correspond to a neuron that is "switched off".
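As a minimal sketch of the computation described above, the following Python function adds a threshold term to the weighted sum of the inputs and passes the result through a step function. The names `heaviside`, `formal_neuron`, `w0` and the example values are illustrative assumptions, and the convention that the step function returns 1 at exactly 0 is also assumed here.

```python
def heaviside(z):
    """Step function: 1 if z >= 0, else 0 (value at z == 0 is an assumed convention)."""
    return 1 if z >= 0 else 0


def formal_neuron(inputs, weights, w0):
    """Return step(w0 + sum_j weights[j] * inputs[j]).

    `w0` plays the role of the threshold term added to the weighted sum.
    """
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return heaviside(w0 + weighted_sum)


# Hypothetical weights and threshold, chosen only for illustration.
example_weights = [0.5, -1.0, 0.25]
example_w0 = -0.6

active = formal_neuron([1, 0, 1], example_weights, example_w0)    # sum 0.75, total > 0
inactive = formal_neuron([0, 1, 0], example_weights, example_w0)  # sum -1.0, total < 0
```

With these values the first call exceeds the threshold and the second does not, so the neuron is "on" in one case and "off" in the other.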
When neurons are combined into a network of formal neurons, it is important, for example, that the activation function of some of them not be a polynomial, otherwise the computing power of the resulting network is limited. An extreme case of limited power is the use of a linear activation function, such as the identity function: in that situation the overall computation carried out by the network is itself linear, so it is perfectly useless to use several neurons, since a single one gives strictly equivalent results.
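The collapse of a linear network into a single neuron can be checked on a tiny case. The sketch below chains two one-input neurons with the identity activation and compares the result with one neuron whose coefficients are obtained by expanding the composition; all coefficient values are hypothetical and chosen so that the arithmetic is exact in floating point.

```python
def linear_neuron(x, w0, w1):
    """One-input neuron with identity activation: w0 + w1 * x."""
    return w0 + w1 * x


def two_neuron_chain(x):
    """Feed the output of a first linear neuron into a second one."""
    hidden = linear_neuron(x, 1.0, 2.0)
    return linear_neuron(hidden, -3.0, 0.5)


def single_equivalent(x):
    """Single neuron obtained by expanding -3.0 + 0.5 * (1.0 + 2.0 * x) = -2.5 + 1.0 * x."""
    return linear_neuron(x, -2.5, 1.0)


samples = [-2.0, 0.0, 1.5, 4.0]
chain_outputs = [two_neuron_chain(x) for x in samples]
single_outputs = [single_equivalent(x) for x in samples]
```

Both lists are identical, which illustrates why stacking neurons with a linear activation adds no computing power.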
However, functions of the sigmoid type are generally bounded. In some applications, it is important that the outputs of the neural network not be bounded a priori: some neurons of the network must then use an unbounded activation function. The identity function is generally chosen.
It is also useful in practice for the activation function to have some form of regularity. To compute the gradient of the error made by a neural network during training, the activation function must be differentiable. To compute the Hessian matrix of the error, which is useful for certain error analyses, the activation function must be twice differentiable. Because they generally have singular points, piecewise linear functions are used relatively little in practice.
The sigmoid function, defined by
$$f_{sig}(x) = \frac{1}{1 + e^{-x}},$$
has the important properties mentioned above (it is not polynomial and is infinitely continuously differentiable). Moreover, a simple property speeds up the computation of its derivative, which reduces the computing time needed to train a neural network. Indeed,
$$\frac{d}{dx} f_{sig}(x) = f_{sig}(x)\,\bigl(1 - f_{sig}(x)\bigr).$$
The derivative of this function at a point can thus be computed very efficiently from its value at that point.
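The shortcut for the derivative can be illustrated as follows. This is a sketch assuming the usual logistic form 1 / (1 + exp(-x)) for the sigmoid; the function names and the test point are illustrative. The derivative is obtained from the already computed output and compared with a central finite difference.

```python
import math


def sigmoid(x):
    """Logistic sigmoid 1 / (1 + exp(-x)) (assumed form)."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative_from_value(y):
    """Derivative written in terms of the output y = sigmoid(x): y * (1 - y)."""
    return y * (1.0 - y)


x = 0.7                      # arbitrary test point
y = sigmoid(x)               # forward value, computed once
fast = sigmoid_derivative_from_value(y)

h = 1e-6                     # step of the finite-difference check
numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)
```

No extra exponential is needed for `fast`, which is the saving the text refers to when the forward value is already available during training.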
Moreover, the sigmoid function takes its values in the interval $[0, 1]$, which makes it possible to interpret the output of the neuron as a probability. It is also related to the logistic regression model and appears naturally when one considers the problem of optimally separating two classes with Gaussian distributions that share the same covariance matrix.
However, the numbers that computers work with make this function tricky to program.
Indeed, $e^{-x}$ falls below the precision of floating-point numbers as soon as $x$ is moderately large (around $x = 37$ in IEEE 754 double precision), so that $1 + e^{-x} = 1$ and the computed value of the sigmoid is exactly 1.
If you code a neural network without guarding against this, then after training, whatever values you feed into the network, you will always obtain the same result.
Fortunately, there are many ways to use the sigmoid function in a program all the same, some of which depend on the programming language. They generally amount to using a floating-point representation more precise than IEEE 754. Sometimes, normalizing the inputs between 0 and 1 is enough to solve the problem.
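The saturation problem and the normalization remedy mentioned above can be reproduced in a few lines. This sketch assumes Python floats (IEEE 754 double precision) and the logistic form of the sigmoid; the raw input values and the min-max scaling used to bring them into [0, 1] are illustrative assumptions.

```python
import math


def sigmoid(x):
    """Logistic sigmoid 1 / (1 + exp(-x)) (assumed form)."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical raw inputs, all large enough to saturate the sigmoid.
raw_inputs = [40.0, 55.0, 80.0]
raw_outputs = [sigmoid(x) for x in raw_inputs]      # every value rounds to 1.0

# Min-max scaling into [0, 1], one simple way to normalize the inputs.
lowest, highest = min(raw_inputs), max(raw_inputs)
scaled_inputs = [(x - lowest) / (highest - lowest) for x in raw_inputs]
scaled_outputs = [sigmoid(x) for x in scaled_inputs]  # distinct values again
```

On the raw inputs the three outputs are indistinguishable, while the scaled inputs give three different outputs, so the network can again tell its inputs apart.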
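The hyperbolic tangent function, defined by
$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}},$$
is also widely used in practice, because it shares certain practical characteristics with the sigmoid function: it is not polynomial, it is infinitely continuously differentiable, and its derivative can be computed quickly from its value, since $\tanh'(x) = 1 - \tanh^2(x)$.
However, it cannot be given such a clear probabilistic interpretation.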
Like the McCulloch and Pitts neuron, the neurons presented in this section have numerical inputs.
In a radial basis neuron, each input $x_j$ is associated with a reference value $c_j$. The comparison between the two sets of values is generally made in the sense of the Euclidean norm. More precisely, the neuron starts by computing the following quantity
$$\|x - c\| = \sqrt{\sum_{j=1}^{m} (x_j - c_j)^2}.$$
The neuron then transforms the value obtained with an activation function. Its output is thus finally given by
$$\varphi\bigl(\|x - c\|\bigr).$$
In practice, it is very common to use a Gaussian activation function, defined for example by
$$\varphi(r) = \exp\left(-\frac{r^2}{2\sigma^2}\right),$$
where $\sigma$ is a width parameter.
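A neuron of this kind can be sketched as follows. The function name `rbf_neuron`, the parameter names `centers` and `sigma`, and the particular Gaussian parametrization exp(-r^2 / (2 sigma^2)) are assumptions made for illustration; other parametrizations of the Gaussian are equally common.

```python
import math


def rbf_neuron(inputs, centers, sigma=1.0):
    """Gaussian of the Euclidean distance between the inputs and reference values.

    `centers` holds one reference value per input; `sigma` is an assumed
    width parameter controlling how fast the output decays with distance.
    """
    distance = math.sqrt(sum((x - c) ** 2 for x, c in zip(inputs, centers)))
    return math.exp(-(distance ** 2) / (2.0 * sigma ** 2))


reference = [1.0, 2.0]                        # hypothetical reference values
at_center = rbf_neuron([1.0, 2.0], reference)  # inputs equal to the references
nearby = rbf_neuron([1.5, 2.0], reference)     # distance 0.5
far_away = rbf_neuron([4.0, 6.0], reference)   # distance 5.0
```

The output is largest when the inputs coincide with the reference values and decreases as the inputs move away from them.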
Consider for example the case of two inputs. The output of a McCulloch and Pitts neuron is written in the form $\varphi(w_0 + w_1 x_1 + w_2 x_2)$, whereas that of a Sigma-pi neuron is given by
$$\varphi(w_0 + w_1 x_1 + w_2 x_2 + w_{12}\, x_1 x_2).$$
In the general case with $m$ inputs, one obtains an output of the following form
$$\varphi\left(\sum_{I \subseteq \{1, \ldots, m\}} w_I \prod_{i \in I} x_i\right).$$
The formulation used shows that there are many ways to build a Sigma-pi neuron for a given number of inputs. This stems from the exponential growth, with the number of inputs, of the number of subsets of inputs that can be used to build a Sigma-pi neuron: there are indeed $2^m$ possible combinations (counting the empty combination as the one corresponding to the threshold $w_0$). In practice, when $m$ becomes large (for example from 20 onward), it becomes practically impossible to use all the possible terms, and the products to be favored must therefore be chosen. A classic solution is to restrict oneself to subsets of $k$ inputs for a small value of $k$.
The general formulation of these neurons is also the origin of the name Sigma-pi, which refers to the capital Greek letters Sigma (Σ) and Pi (Π), used in mathematics to represent the sum and the product respectively.
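The sum-of-products structure can be sketched as follows. The representation chosen here, a dictionary mapping each subset of input indices (as a tuple, 0-based) to its coefficient, with the empty tuple holding the threshold term, is an assumption made for illustration, as are the function names and example values. The helper `subsets_up_to` shows the restriction to small subsets mentioned above.

```python
from itertools import combinations
from math import prod


def sigma_pi_output(inputs, weights, activation=lambda z: z):
    """Apply `activation` to the sum over subsets of weight * product of inputs.

    `weights` maps a tuple of input indices to a coefficient; the empty
    tuple () carries the threshold term, since an empty product equals 1.
    """
    total = 0.0
    for subset, weight in weights.items():
        total += weight * prod(inputs[i] for i in subset)
    return activation(total)


def subsets_up_to(m, k):
    """All subsets of {0, ..., m-1} with at most k elements."""
    return [s for size in range(k + 1) for s in combinations(range(m), size)]


# Two-input example with hypothetical coefficients and identity activation.
example_weights = {(): 0.5, (0,): 1.0, (1,): -2.0, (0, 1): 3.0}
value = sigma_pi_output([2.0, 1.0], example_weights)  # 0.5 + 2.0 - 2.0 + 6.0

all_terms_3 = len(subsets_up_to(3, 3))     # every subset of 3 inputs
pairs_only_20 = len(subsets_up_to(20, 2))  # subsets of size <= 2 among 20 inputs
```

Restricting 20 inputs to subsets of at most two elements leaves a few hundred terms instead of more than a million, which is the kind of saving the restriction is meant to provide.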
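Take, for illustration, a neuron with two binary inputs and unit weights $w_1 = w_2 = 1$, whose output is the Heaviside function applied to $w_0 + x_1 + x_2$. If the threshold $w_0$ is positive or zero, the output of the neuron is always 1 (in agreement with the definition of the Heaviside function).
If $-1 \le w_0 < 0$, the outputs for the inputs (0,0), (0,1), (1,0), (1,1) become 0, 1, 1, 1, and the neuron thus computes a logical OR.
If $-2 \le w_0 < -1$, the outputs become 0, 0, 0, 1, and the neuron thus computes a logical AND.
Lastly, if $w_0 < -2$, the neuron always gives a zero result.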
By the same type of reasoning, one sees that a neuron with one input can either have no effect (identity neuron) or compute a logical NOT.
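The case analysis for a two-input threshold neuron can be checked by enumeration. The sketch below rests on stated assumptions: both weights equal 1, the output is 1 when `w0 + x1 + x2 >= 0` and 0 otherwise, and the four threshold values are representative picks from each regime. The names `neuron` and `truth_table` are illustrative.

```python
def neuron(x1, x2, w0):
    """Two-input step neuron with unit weights (assumed): 1 if w0 + x1 + x2 >= 0."""
    return 1 if w0 + x1 + x2 >= 0 else 0


def truth_table(w0):
    """Outputs for the inputs (0,0), (0,1), (1,0), (1,1), in that order."""
    return [neuron(a, b, w0) for a in (0, 1) for b in (0, 1)]


always_one = truth_table(0.5)    # threshold term positive
logical_or = truth_table(-0.5)   # threshold term in [-1, 0)
logical_and = truth_table(-1.5)  # threshold term in [-2, -1)
always_zero = truth_table(-2.5)  # threshold term below -2
```

Changing only the threshold term moves the same neuron through the constant 1, OR, AND and constant 0 functions.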
By linearity of the weighted sum, if the synaptic weights and the threshold of a neuron are all multiplied by the same arbitrary positive number, the behavior of the neuron is unchanged, and the modification is completely undetectable. On the other hand, if everything is multiplied by a negative number, the behavior of the neuron is reversed, since the activation function is increasing.
By combining McCulloch and Pitts neurons, i.e. by using the outputs of certain neurons as inputs for other neurons, one can thus compute any Boolean function. If connections forming loops in the network are also allowed, one obtains a system with the same power as a Turing machine.
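As one small instance of combining neurons, the sketch below builds an exclusive OR from three step neurons: an OR neuron and an AND neuron feed a third neuron that fires when the first is on and the second is off. The weights, threshold terms and function names are illustrative assumptions, with the convention that a neuron outputs 1 when its threshold term plus weighted sum is at least 0.

```python
def step_neuron(inputs, weights, w0):
    """Step neuron: 1 if w0 + sum(weights[j] * inputs[j]) >= 0, else 0."""
    return 1 if w0 + sum(w * x for w, x in zip(weights, inputs)) >= 0 else 0


def xor(a, b):
    """Exclusive OR built from three step neurons (hypothetical weights)."""
    either = step_neuron([a, b], [1, 1], -1)   # logical OR of the inputs
    both = step_neuron([a, b], [1, 1], -2)     # logical AND of the inputs
    # Fires only when `either` is 1 and `both` is 0.
    return step_neuron([either, both], [1, -1], -1)


xor_table = [xor(a, b) for a in (0, 1) for b in (0, 1)]
```

A single threshold neuron cannot compute this function, so the example shows what feeding outputs into further neurons adds.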