Estimation of distribution algorithm
Estimation of distribution algorithms (EDA) form a family of metaheuristics inspired by genetic algorithms. They are used to solve optimization problems by manipulating a sample of the function that describes the quality of the possible solutions. Like all metaheuristics that use a population of points, they are iterative.
Unlike "classical" evolutionary algorithms, the core of the method consists in estimating the relationships between the variables of an optimization problem by estimating a probability distribution associated with each point of the sample. They therefore use no crossover or mutation operators: the sample is built directly from the distribution parameters estimated at the previous iteration.
Algorithm
The vocabulary of estimation of distribution algorithms is borrowed from that of evolutionary algorithms: one speaks of a “population of individuals” rather than a “sample of points”, and of “fitness” rather than “objective function”. These terms nevertheless have the same meaning.
General algorithm
The basic algorithm proceeds in the following way for each iteration:
- random drawing of a set of points according to a given probability distribution,
- selection of the best points,
- extraction of the parameters of the probability distribution describing how these points are distributed.
More precisely, the algorithm proceeds as follows:
- Draw M individuals at random to form a population D_0.
- i = 0
- While a stopping criterion is not met:
  - i = i + 1
  - Select N individuals (with N < M) from the previous population (D_{i-1}) to form the population D^{Se}_{i-1}.
  - Estimate a probability distribution P_i(x) describing the distribution of the population D^{Se}_{i-1}.
  - Draw M individuals at random from P_i(x) to form the population D_i.
- End of the loop.
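The loop above can be sketched in Python as follows. This is a minimal sketch, not a reference implementation: the function name `run_eda` and its parameters are assumptions made for illustration, the stopping criterion is simply a fixed iteration budget, and the distribution model is supplied by the caller through the hypothetical `sample` and `estimate` callables.

```python
import random


def run_eda(fitness, sample, estimate, init_params, m, n, max_iter=100, rng=None):
    """Generic estimation-of-distribution loop (illustrative sketch).

    fitness(x)          -> quality of individual x (higher is better)
    sample(params, rng) -> one individual drawn from the current distribution
    estimate(selected)  -> distribution parameters fitted to the selected individuals
    """
    rng = rng or random.Random()
    params = init_params
    # Draw M individuals at random to form the initial population.
    population = [sample(params, rng) for _ in range(m)]
    for _ in range(max_iter):  # assumed stopping criterion: iteration budget
        # Keep the N best individuals of the previous population.
        selected = sorted(population, key=fitness, reverse=True)[:n]
        # Estimate the distribution describing the selected individuals.
        params = estimate(selected)
        # Draw a new population of M individuals from that distribution.
        population = [sample(params, rng) for _ in range(m)]
    best = max(population, key=fitness)
    return best, params
```

Keeping the model behind two callables reflects the fact that the loop itself does not depend on which distribution is used.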
Example for the “OneMax” problem
In the “OneMax” problem, one seeks to maximize the number of 1s over a given number of dimensions. For a problem with 3 dimensions, a solution X = {1,1,0} is thus of better quality than a solution X = {0,1,0}, the optimum being X* = {1,1,1}. One thus seeks to maximize a function f(X) = x_1 + x_2 + x_3, where each x_i can take the value 0 or 1.
The first step consists in drawing the individuals at random, with a one-in-two chance of drawing a 1 or a 0 for each variable. In other words, one uses a probability distribution P(X) = p(x_1) p(x_2) p(x_3), where p(x_i) is the probability that element x_i is equal to 1. The distribution is thus factorized as a product of 3 univariate marginal Bernoulli distributions with parameter 0.5.
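As a rough illustration of this first step, the sketch below defines the OneMax quality function, draws individuals from a product of independent Bernoulli distributions, and computes the probability of a given individual under that product. The function names and the fixed random seed are assumptions chosen for the example.

```python
import random


def onemax(x):
    """Quality of a bit vector: the number of 1s it contains."""
    return sum(x)


def sample_individual(p, rng):
    """Draw one bit vector; bit i is 1 with probability p[i]."""
    return [1 if rng.random() < p_i else 0 for p_i in p]


def probability(x, p):
    """Probability of x under the product of independent Bernoulli marginals."""
    prob = 1.0
    for x_i, p_i in zip(x, p):
        prob *= p_i if x_i == 1 else 1.0 - p_i
    return prob


rng = random.Random(0)   # fixed seed, only for reproducibility
p = [0.5, 0.5, 0.5]      # one Bernoulli parameter per variable
population = [sample_individual(p, rng) for _ in range(6)]
```

With all parameters at 0.5, each of the 8 possible individuals has probability 0.125.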
Example of a draw of the initial population D_0, with a population of 6 individuals; the last line gives the probability p(x_i) for each variable:
The next step is the selection of the best individuals to form the selected population. In our example, this simply means keeping only the 3 best individuals:
Note that the three parameters p(x_i) characterizing the probability distribution P(X) have changed after the selection. Using this new distribution, one can draw a new population:
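The selection and re-estimation step might look like the sketch below. The six individuals are made up for illustration and are not the values of the article's table; the function names are likewise assumptions.

```python
def select_best(population, n):
    """Keep the n individuals with the most 1s (ties keep their original order)."""
    return sorted(population, key=sum, reverse=True)[:n]


def estimate_marginals(selected):
    """For each variable, the fraction of selected individuals holding a 1."""
    size = len(selected)
    return [sum(x[i] for x in selected) / size for i in range(len(selected[0]))]


# Hypothetical population of 6 individuals over 3 binary variables.
population = [
    [1, 1, 0],
    [0, 1, 0],
    [1, 0, 1],
    [0, 0, 0],
    [1, 1, 1],
    [0, 0, 1],
]
selected = select_best(population, 3)   # [[1, 1, 1], [1, 1, 0], [1, 0, 1]]
p = estimate_marginals(selected)        # [1.0, 2/3, 2/3]
```

In this made-up case the first parameter reaches 1.0, so every individual drawn afterwards will carry a 1 in the first variable.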
And so on until a stopping criterion is met (for example, when all the individuals are at the optimum, like individual 1 of the table above).
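Putting the steps together, a complete run on OneMax with this stopping criterion could be sketched as follows. The function name `umda_onemax`, its default values, and the extra iteration cap (added so that the loop always terminates) are assumptions for the example.

```python
import random


def umda_onemax(n_bits=3, m=6, n=3, max_iter=50, seed=0):
    """Run a univariate EDA on OneMax; returns (parameters, iterations, converged)."""
    rng = random.Random(seed)
    p = [0.5] * n_bits  # start with a one-in-two chance for every bit
    for iteration in range(1, max_iter + 1):
        # Draw M individuals from the current distribution.
        population = [[1 if rng.random() < p_i else 0 for p_i in p]
                      for _ in range(m)]
        # Stopping criterion: every individual is at the optimum.
        if all(sum(x) == n_bits for x in population):
            return p, iteration, True
        # Keep the N best individuals and re-estimate the marginals.
        selected = sorted(population, key=sum, reverse=True)[:n]
        p = [sum(x[i] for x in selected) / n for i in range(n_bits)]
    return p, max_iter, False  # iteration cap reached without convergence


params, iterations, converged = umda_onemax()
```

With such a small population, a parameter can drop to 0 and stay there, in which case the run ends at the iteration cap rather than at the optimum.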
Behavior
It has been shown (generally using Markov models or dynamical systems) that most versions for combinatorial optimization are convergent (i.e. they can find the optimum in finite time). For variants dealing with optimization in real variables, convergence is even easier to show, provided that the distribution models used allow ergodicity (i.e. any solution can be reached at each move); one is often content with quasi-ergodicity (the metaheuristic can reach any solution in a finite number of moves).
Distribution models
The behavior of estimation of distribution algorithms depends largely on the choice of the distribution model used to describe the state of the population. Pedro Larrañaga and his colleagues propose classifying these models according to how far they take the dependencies between variables into account:
- models without dependencies,
- models with bivariate dependencies,
- models with multivariate dependencies.
In models without dependencies, the probability distribution is built from a set of distributions each defined on a single variable. In other words, the distribution is factorized into independent univariate distributions, one per variable.
The example given above for the “OneMax” problem falls into this category: the probability of having a 1 in one variable does not influence the probability of having a 1 in another variable, and there is no correlation between the two variables.
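For real-valued variables, a model without dependencies could be sketched with one independent Gaussian per variable. This is only one possible choice, assumed here for illustration; the function names are hypothetical.

```python
import random
import statistics


def estimate_gaussians(selected):
    """Fit one mean and one standard deviation per variable, ignoring correlations."""
    columns = list(zip(*selected))
    means = [statistics.mean(c) for c in columns]
    stds = [statistics.pstdev(c) for c in columns]
    return means, stds


def sample_gaussians(means, stds, rng):
    """Draw each coordinate independently from its own Gaussian."""
    return [rng.gauss(mu, sigma) for mu, sigma in zip(means, stds)]


rng = random.Random(0)
means, stds = estimate_gaussians([[0.0, 2.0], [2.0, 4.0]])
point = sample_gaussians(means, stds, rng)
```

Because each coordinate is drawn on its own, any correlation present in the selected points is lost in the new sample.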
Models without dependencies are simple to handle, but they have the drawback of being poorly representative of hard optimization problems, where dependencies are often numerous. Dependencies can be handled outside the distribution model, but the algorithm may then become trickier to manipulate.
In models with bivariate dependencies, bivariate distributions can be used as building blocks. Larrañaga et al. then propose classifying the learning according to the notion of structure.
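One simple way to picture bivariate dependencies is a chain in which each binary variable depends only on the previous one. The sketch below assumes a fixed variable order and a 0.5 fallback for unobserved cases; both are illustrative choices, not a method attributed to the cited authors.

```python
import random


def estimate_chain(selected):
    """Estimate p(x_0 = 1) and p(x_i = 1 | x_{i-1} = v) from bit vectors."""
    n_vars = len(selected[0])
    p_first = sum(x[0] for x in selected) / len(selected)
    conditionals = []
    for i in range(1, n_vars):
        row = {}
        for v in (0, 1):
            matching = [x for x in selected if x[i - 1] == v]
            # Assumed fallback: 0.5 when the parent value was never observed.
            row[v] = sum(x[i] for x in matching) / len(matching) if matching else 0.5
        conditionals.append(row)
    return p_first, conditionals


def sample_chain(p_first, conditionals, rng):
    """Draw x_0 first, then each x_i given the value drawn for x_{i-1}."""
    x = [1 if rng.random() < p_first else 0]
    for row in conditionals:
        x.append(1 if rng.random() < row[x[-1]] else 0)
    return x


# Two perfectly correlated variables: a univariate model would lose this link.
p_first, conditionals = estimate_chain([[1, 1], [1, 1], [0, 0], [0, 0]])
rng = random.Random(0)
samples = [sample_chain(p_first, conditionals, rng) for _ in range(20)]
```

Every sample drawn here has two equal bits, whereas independent marginals of 0.5 would also produce mixed pairs.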
In models with multivariate dependencies, all dependencies are taken into account in the model.
Estimation of distribution in context
History
- 1965: Rechenberg designs the first algorithm using evolution strategies.
- 1975: working on cellular automata, Holland proposes the first genetic algorithms.
- 1990: ant colony algorithms are proposed by Marco Dorigo in his doctoral thesis.
- 1994: S. Baluja draws on evolutionary algorithms and proposes population-based incremental learning (“Population Based Incremental Learning”, PBIL).
- 1996: Mühlenbein and Paaß propose the estimation of distribution algorithm.
Variants
The best-known variants of estimation of distribution are population-based incremental learning (“Population Based Incremental Learning”, PBIL), the univariate marginal distribution algorithm (“Univariate Marginal Distribution Algorithm”, UMDA) and the compact genetic algorithm (“Compact Genetic Algorithm”, CGA).
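Two of these variants differ from the basic scheme mainly in how the probability vector is updated. The sketch below shows the commonly described update rules for PBIL and the compact genetic algorithm on binary variables; the function names, parameter names and default values are assumptions for illustration.

```python
def pbil_update(p, best, learning_rate=0.1):
    """PBIL: shift each probability a little toward the best individual's bit."""
    return [(1.0 - learning_rate) * p_i + learning_rate * b_i
            for p_i, b_i in zip(p, best)]


def cga_update(p, winner, loser, virtual_pop_size=10):
    """Compact GA: where winner and loser differ, move 1/size toward the winner."""
    step = 1.0 / virtual_pop_size
    new_p = []
    for p_i, w_i, l_i in zip(p, winner, loser):
        if w_i != l_i:
            p_i += step if w_i == 1 else -step
        new_p.append(min(1.0, max(0.0, p_i)))  # keep a valid probability
    return new_p


p_pbil = pbil_update([0.5, 0.5], best=[1, 0], learning_rate=0.5)
p_cga = cga_update([0.5, 0.5, 0.5], winner=[1, 0, 1], loser=[0, 1, 1],
                   virtual_pop_size=4)
```

In both cases the distribution is adjusted incrementally instead of being re-estimated from scratch at each iteration.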
There are also variants that use data clustering mechanisms for multimodal optimization, adaptations to parallel computing, etc.
Owing to the central place of its probabilistic side, estimation of distribution shares many features with evolution strategies, one of the first metaheuristics proposed, and with ant colony algorithms. One can also point out similarities with simulated annealing (which uses the objective function as a probability distribution to build a sample) and with genetic algorithms, from which estimation of distribution algorithms derive and whose selection operators they still use.
Similarly, there are many points in common between these optimization metaheuristics and machine learning tools, such as methods using decision trees or Gaussian mixture models. The difference is sometimes hard to pin down: one can find metaheuristics carrying out learning tasks, learning methods solving hard optimization problems, or learning tools used within metaheuristics.
References
Sources
- Pedro Larrañaga and José A. Lozano (editors), Estimation of Distribution Algorithms: A New Tool for Evolutionary Computation (Genetic Algorithms and Evolutionary Computation), Kluwer Academic Publishers, 416 pages, 2001. ISBN 0792374665.
- Johann Dréo, Alain Petrowski, Eric Taillard and Patrick Siarry, Métaheuristiques pour l'optimisation difficile, Eyrolles, Paris, September 2003, paperback, 356 pages. ISBN 2-212-11368-4.