Method of least squares
The method of least squares, developed independently by Gauss and Legendre, makes it possible to compare experimental data, generally affected by measurement errors, with a mathematical model intended to describe those data.
This model can take various forms. It may consist of conservation laws that the measured quantities must obey. The method of least squares then makes it possible to minimize the impact of experimental errors by “adding information” to the measurement process.
In the most common case, the theoretical model is a family of functions $f(x;\theta)$ of one or more dummy variables $x$, indexed by one or more unknown parameters $\theta$. The method of least squares makes it possible to select, among these functions, the one that best reproduces the experimental data. One then speaks of a least-squares fit (or adjustment). If the parameters $\theta$ have a physical meaning, the fitting procedure also gives an indirect estimate of their values.
The method consists of a prescription (initially empirical): the function $f(x;\theta)$ that “best” describes the data is the one that minimizes the sum of the squared deviations of the measurements from the predictions of $f(x;\theta)$. If, for example, we have $N$ measurements $(y_i)_{i=1,\dots,N}$, the “optimal” parameters $\theta$ in the least-squares sense are those that minimize the quantity:
$$S(\theta) = \sum_{i=1}^{N} \left( y_i - f(x_i;\theta) \right)^2 = \sum_{i=1}^{N} r_i^2(\theta)$$
where the $r_i(\theta)$ are the residuals of the model, i.e. the differences between the measured points $y_i$ and the model $f(x_i;\theta)$. $S(\theta)$ can be regarded as a measure of the distance between the experimental data and the theoretical model that predicts these data. The least-squares prescription requires this distance to be minimal.
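If, as is generally the case, one has an estimate of the standard deviation $\sigma_i$ of the noise affecting each measurement $y_i$, it is used to “weight” the contribution of that measurement to the $\chi^2$. The smaller the uncertainty of a measurement, the more weight it carries:
$$\chi^2(\theta) = \sum_{i=1}^{N} \left( \frac{y_i - f(x_i;\theta)}{\sigma_i} \right)^2 = \sum_{i=1}^{N} w_i \left( y_i - f(x_i;\theta) \right)^2$$
where $w_i = 1/\sigma_i^2$ is the weight of measurement $y_i$.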
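The weighted sum of squared residuals described above can be sketched in a few lines of Python. This is only an illustration: the function names `chi_square` and `line`, the straight-line model, and the data values are assumptions made for the example and do not come from the text.

```python
def chi_square(model, theta, xs, ys, sigmas):
    """Weighted sum of squared residuals for the parameters theta."""
    total = 0.0
    for x, y, sigma in zip(xs, ys, sigmas):
        residual = y - model(x, theta)      # r_i = y_i - f(x_i; theta)
        total += (residual / sigma) ** 2    # same as w_i * r_i**2, w_i = 1/sigma_i**2
    return total


def line(x, theta):
    """Hypothetical model: straight line, slope theta[0], intercept theta[1]."""
    return theta[0] * x + theta[1]


# Invented measurements with their standard deviations.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]
sigmas = [0.1, 0.1, 0.2, 0.2]

good = chi_square(line, (2.0, 1.0), xs, ys, sigmas)  # parameters close to the data
bad = chi_square(line, (1.0, 1.0), xs, ys, sigmas)   # clearly wrong slope
print(good, bad)
```

The parameter set with the smaller value is the one that is "closer" to the data in the least-squares sense; a fit consists of searching for the parameters that make this value as small as possible.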
Its great simplicity explains why this method is now very widely used in the experimental sciences. A common application is the smoothing of experimental data by an empirical function (linear function, polynomial or spline). Its most important use, however, is probably the measurement of physical quantities from experimental data. In many cases the quantity one wants to measure is not directly observable and appears only indirectly, as a parameter $\theta$ of a theoretical model $f(x;\theta)$. In this situation, it can be shown that the method of least squares yields an estimator of $\theta$ that satisfies certain optimality conditions. In particular, when the model $f(x;\theta)$ is linear in $\theta$, the Gauss-Markov theorem guarantees that the method of least squares gives the unbiased estimator with the smallest variance among linear unbiased estimators. When the model is a non-linear function of the parameters $\theta$, the estimator is generally biased. Moreover, in all cases, the estimators obtained are extremely sensitive to outliers: one expresses this by saying that they are not robust. Several techniques nevertheless make it possible to “robustify” the method.
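The sensitivity to outliers can be illustrated with a small Python sketch. It assumes equal uncertainties on all measurements, in which case the least-squares estimate of a constant is the plain arithmetic mean. The data values are invented, and the median is used only as one familiar robust alternative for comparison; it is not a technique named in this article.

```python
import statistics

# Invented measurements of a constant quantity, all with the same uncertainty.
clean = [9.8, 10.1, 10.0, 9.9, 10.2]
# The same series with a single aberrant point added.
contaminated = clean + [100.0]

mean_clean = statistics.mean(clean)            # least-squares estimate
mean_bad = statistics.mean(contaminated)       # dragged far away by one point
median_clean = statistics.median(clean)        # robust alternative
median_bad = statistics.median(contaminated)   # barely moves

print(mean_clean, mean_bad)
print(median_clean, median_bad)
```

A single aberrant point shifts the least-squares estimate by a large amount, because its squared residual dominates the sum being minimized, while the median changes very little.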
History
On New Year's Day 1801, the Italian astronomer Giuseppe Piazzi discovered the asteroid Ceres. He was able to follow its trajectory for 40 days. During that year, several scientists tried to predict its trajectory on the basis of Piazzi's observations (at that time, solving Kepler's non-linear equations of motion was a very difficult problem). Most of the predictions were wrong; the only calculation precise enough to allow Zach, a German astronomer, to locate Ceres again at the end of the year was that of Carl Friedrich Gauss, then 24 years old (he had already worked out the fundamental concepts in 1795, when he was 18). His method of least squares, however, was not published until 1809, when it appeared in volume 2 of his work on celestial mechanics, Theoria Motus Corporum Coelestium in sectionibus conicis solem ambientium. The French mathematician Adrien-Marie Legendre independently developed the same method in 1805.
In 1829, Gauss was able to explain why the method is so effective: the method of least squares is in fact optimal with respect to many criteria. This argument is now known as the Gauss-Markov theorem.
Formalism
Two simple examples
Mean of a series of independent measurements
The simplest example of a least-squares fit is probably the calculation of the mean $m$ of a set of independent measurements $(y_i)_{i=1,\dots,N}$ affected by Gaussian errors. The least-squares prescription amounts to minimizing the quantity:
$$\chi^2(m) = \sum_{i=1}^{N} \left( \frac{y_i - m}{\sigma_i} \right)^2 = \sum_{i=1}^{N} w_i (y_i - m)^2$$
where $w_i = 1/\sigma_i^2$ is the weight of measurement $y_i$.
This quantity is a positive definite quadratic form. Its minimum is found by differentiation: $\frac{d\chi^2}{dm} = -2 \sum_{i} w_i (y_i - m) = 0$. This gives the classical formula:
$$m = \frac{\sum_{i=1}^{N} w_i y_i}{\sum_{i=1}^{N} w_i}$$
In other words, the least-squares estimator of the mean of a series of measurements affected by (known) Gaussian errors is their weighted mean, i.e. their empirical mean in which each measurement is weighted by the inverse of the square of its uncertainty. The Gauss-Markov theorem guarantees that this is the best linear unbiased estimator of $m$.
The estimated mean $m$ fluctuates from one series of measurements to another. Since each measurement is affected by a random error, the mean of a first series of $N$ measurements will differ from the mean of a second series of $N$ measurements, even if both are carried out under identical conditions. It is important to quantify the size of these fluctuations, because it determines the precision with which the mean $m$ is known. Each measurement $y_i$ can be regarded as a realization of a random variable $Y_i$, with mean $\bar{y}$ and standard deviation $\sigma_i$. The estimator of the mean obtained by the method of least squares, a linear combination of random variables, is itself a random variable:
$$M = \frac{\sum_{i=1}^{N} w_i Y_i}{\sum_{i=1}^{N} w_i}$$
The standard deviation of the fluctuations of $M$ is given by (linear combination of independent random variables):
$$\sigma(M) = \left( \sum_{i=1}^{N} \frac{1}{\sigma_i^2} \right)^{-1/2}$$
Unsurprisingly, the precision of the mean of a series of measurements is thus determined by the number of measurements and by the precision of each of them. If every measurement has the same uncertainty $\sigma_i = \sigma$, the preceding formula simplifies to:
$$\sigma(M) = \frac{\sigma}{\sqrt{N}}$$
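As a minimal sketch, the weighted mean and the standard deviation of its fluctuations can be computed as follows in Python. The function name `weighted_mean` and the numerical values are assumptions chosen for illustration; the weights are taken as the inverse squares of the uncertainties, as described above.

```python
import math


def weighted_mean(ys, sigmas):
    """Least-squares estimate of the mean and the standard deviation of that estimate."""
    weights = [1.0 / s ** 2 for s in sigmas]      # w_i = 1 / sigma_i**2
    total_weight = sum(weights)
    mean = sum(w * y for w, y in zip(weights, ys)) / total_weight
    sigma_mean = 1.0 / math.sqrt(total_weight)    # (sum of 1/sigma_i**2) ** -1/2
    return mean, sigma_mean


# Two invented measurements of unequal precision: the precise one dominates.
m, sigma_m = weighted_mean([10.0, 12.0], [1.0, 2.0])
print(m, sigma_m)

# Four invented measurements with the same uncertainty sigma = 0.5:
# the uncertainty on the mean reduces to sigma / sqrt(N).
m_eq, sigma_eq = weighted_mean([1.0, 2.0, 3.0, 4.0], [0.5, 0.5, 0.5, 0.5])
print(m_eq, sigma_eq)
```

In the first case the result lies much closer to the more precise measurement; in the second the estimate is the ordinary arithmetic mean.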
Linear regression
Another example is the fit of a linear law of the type $y = \alpha x + \beta + \varepsilon$ to independent measurements $y_i$, functions of a known parameter $x$. This situation arises, for example, when one wants to calibrate a simple measuring device (ammeter, thermometer) whose response is linear. $y$ is then the instrumental reading (deflection of a needle, number of ADC counts, ...) and $x$ the physical quantity that the device is supposed to measure, generally better known if a reliable calibration source is used. The method of least squares then makes it possible to determine the calibration law of the device, to assess how well this law agrees with the calibration measurements (i.e., in this case, the linearity of the device), and to propagate the calibration errors to future measurements made with the calibrated device. Note that, in general, the errors (and correlations) affecting both the measurements $y_i$ and the measurements $x_i$ must be taken into account. We will treat this case in the following section.
For this type of model, the least-squares prescription reads:
$$\chi^2(\alpha, \beta) = \sum_{i=1}^{N} \frac{(y_i - \alpha x_i - \beta)^2}{\sigma_i^2} = \sum_{i=1}^{N} w_i (y_i - \alpha x_i - \beta)^2$$
The minimum of this expression is reached for $\frac{\partial \chi^2}{\partial \alpha} = \frac{\partial \chi^2}{\partial \beta} = 0$, which gives:
$$\begin{pmatrix} \sum w_i x_i^2 & \sum w_i x_i \\ \sum w_i x_i & \sum w_i \end{pmatrix} \begin{pmatrix} \alpha_{\min} \\ \beta_{\min} \end{pmatrix} = \begin{pmatrix} \sum w_i x_i y_i \\ \sum w_i y_i \end{pmatrix}$$
Determining the “optimal” parameters (in the least-squares sense) $\alpha_{\min}$ and $\beta_{\min}$ thus reduces to solving a system of linear equations. This is a very useful property, related to the fact that the model itself is linear. One speaks of a linear fit or linear regression. In the general case, finding the minimum of the $\chi^2$ is a more complicated problem, generally expensive in computing time (see the following sections).
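For a straight line, the system of linear equations has only two unknowns and can be solved in closed form. The following Python sketch does this with Cramer's rule. The function name `fit_line`, the intermediate sums `s_w`, `s_x`, `s_y`, `s_xx`, `s_xy`, and the calibration data are assumptions introduced for the example; only the uncertainties on the $y_i$ are taken into account.

```python
def fit_line(xs, ys, sigmas):
    """Weighted least-squares fit of y = alpha * x + beta; returns (alpha, beta)."""
    ws = [1.0 / s ** 2 for s in sigmas]                 # w_i = 1 / sigma_i**2
    s_w = sum(ws)
    s_x = sum(w * x for w, x in zip(ws, xs))
    s_y = sum(w * y for w, y in zip(ws, ys))
    s_xx = sum(w * x * x for w, x in zip(ws, xs))
    s_xy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    # Normal equations:  s_xx * alpha + s_x * beta = s_xy
    #                    s_x  * alpha + s_w * beta = s_y
    det = s_w * s_xx - s_x ** 2     # zero if all x_i are identical
    alpha = (s_w * s_xy - s_x * s_y) / det
    beta = (s_xx * s_y - s_x * s_xy) / det
    return alpha, beta


# Invented calibration points lying exactly on y = 2 x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
sigmas = [0.1, 0.2, 0.1, 0.3]
alpha, beta = fit_line(xs, ys, sigmas)
print(alpha, beta)
```

With noise-free data the fit recovers the slope and intercept whatever the weights; with real data the weights decide how much each point pulls on the line.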
The values of the parameters $\alpha_{\min}$ and $\beta_{\min}$ depend on the measurements $y_i$ actually made. Since these measurements are affected by errors, if one repeats the calibration measurements several times and performs the fit described above at the end of each series, one obtains numerically different values of $\alpha_{\min}$ and $\beta_{\min}$. The fit parameters can therefore be regarded as random variables, whose distribution is a function of the fitted model and of the distribution of the $y_i$.
It can be shown that the dispersion affecting the values of $\alpha_{\min}$ and $\beta_{\min}$ depends on the number of measurement points, $N$, and on the dispersion affecting the measurements (the less precise the measurements, the more $\alpha_{\min}$ and $\beta_{\min}$ fluctuate). Moreover, $\alpha_{\min}$ and $\beta_{\min}$ are generally not independent variables. They are generally correlated, and their correlation depends on the fitted model (we have assumed the $y_i$ to be independent).
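The dispersion and correlation of the two fitted parameters can be made concrete with a short Python sketch. It relies on a standard result that is not derived in this article: when the measurements are independent with known standard deviations, the covariance matrix of the fitted slope and intercept is the inverse of the matrix of the normal equations. The function name `line_fit_covariance` and the data are assumptions for illustration.

```python
import math


def line_fit_covariance(xs, sigmas):
    """Variances, covariance and correlation of (slope, intercept) for a line fit.

    Assumes independent measurements with known standard deviations, so that
    the covariance matrix is the inverse of [[s_xx, s_x], [s_x, s_w]].
    """
    ws = [1.0 / s ** 2 for s in sigmas]
    s_w = sum(ws)
    s_x = sum(w * x for w, x in zip(ws, xs))
    s_xx = sum(w * x * x for w, x in zip(ws, xs))
    det = s_w * s_xx - s_x ** 2
    var_slope = s_w / det
    var_intercept = s_xx / det
    cov = -s_x / det
    rho = cov / math.sqrt(var_slope * var_intercept)
    return var_slope, var_intercept, cov, rho


# Invented calibration points, all measured with sigma = 0.1.
var_a, var_b, cov_ab, rho = line_fit_covariance([1.0, 2.0, 3.0, 4.0], [0.1] * 4)
print(var_a, var_b, cov_ab, rho)

# Same spacing, but x values centred on zero: the correlation vanishes.
_, _, _, rho_centred = line_fit_covariance([-1.5, -0.5, 0.5, 1.5], [0.1] * 4)
print(rho_centred)
```

Note that the result depends only on the $x_i$ and the $\sigma_i$, not on the measured $y_i$. In the first configuration slope and intercept are strongly anti-correlated; centring the $x_i$ removes the correlation, which illustrates how it depends on the way the model is written.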
Adjustment of an arbitrary linear model
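A model $y = f(x;\theta)$ is linear if its dependence on $\theta$ is linear. Such a model is written:
$$y = \sum_{k=1}^{n} \theta_k \phi_k(x)$$
where the $\phi_k$ are $n$ arbitrary functions of the variable $x$.
If we have $N$ measurements $(x_i, y_i, \sigma_i)$, the $\chi^2$ can be written in the form:
$$\chi^2(\theta) = \sum_{i=1}^{N} w_i \left( y_i - \sum_{k=1}^{n} \theta_k \phi_k(x_i) \right)^2$$
We can exploit the linearity of the model to express the $\chi^2$ in a simpler matrix form. Indeed, defining:
$$\mathbf{y} = \begin{pmatrix} y_1 \\ \vdots \\ y_N \end{pmatrix}, \quad J = \begin{pmatrix} \phi_1(x_1) & \cdots & \phi_n(x_1) \\ \vdots & \ddots & \vdots \\ \phi_1(x_N) & \cdots & \phi_n(x_N) \end{pmatrix}, \quad \boldsymbol{\theta} = \begin{pmatrix} \theta_1 \\ \vdots \\ \theta_n \end{pmatrix}, \quad W = \mathrm{diag}(w_1, \dots, w_N)$$
the $\chi^2$ becomes:
$$\chi^2(\boldsymbol{\theta}) = (\mathbf{y} - J\boldsymbol{\theta})^T W (\mathbf{y} - J\boldsymbol{\theta})$$
By differentiating the relation above with respect to each $\theta_k$, one obtains:
$$\nabla \chi^2(\boldsymbol{\theta}) = -2 J^T W \mathbf{y} + 2 J^T W J \boldsymbol{\theta} = 0$$
and the minimum of the $\chi^2$ is therefore reached for $\boldsymbol{\theta}$ equal to:
$$\boldsymbol{\theta}_{\min} = (J^T W J)^{-1} J^T W \mathbf{y}$$
We recover the remarkable property of linear problems: the optimal model can be obtained in a single operation, namely the solution of an $n \times n$ linear system.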
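The single-step solution of a linear model can be sketched in Python using only the standard library. The code builds the weighted normal equations from a list of basis functions and solves them by Gaussian elimination. The names `solve_linear_system` and `fit_linear_model`, the choice of a quadratic polynomial basis, and the data are assumptions for the example; in practice a numerical library would normally be used for the linear algebra.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]   # fails if the system is singular
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):             # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_linear_model(basis, xs, ys, sigmas):
    """Weighted least-squares fit of y = sum_k theta_k * basis[k](x)."""
    n = len(basis)
    ws = [1.0 / s ** 2 for s in sigmas]
    rows = [[phi(x) for phi in basis] for x in xs]     # one row per measurement
    # Normal equations: (J^T W J) theta = J^T W y
    a = [[sum(w * row[j] * row[k] for w, row in zip(ws, rows))
          for k in range(n)] for j in range(n)]
    b = [sum(w * row[j] * y for w, row, y in zip(ws, rows, ys))
         for j in range(n)]
    return solve_linear_system(a, b)


# Invented data lying exactly on y = 1 - 2 x + 0.5 x**2.
basis = [lambda x: 1.0, lambda x: x, lambda x: x * x]
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [7.0, 3.5, 1.0, -0.5, -1.0]
theta = fit_linear_model(basis, xs, ys, [0.1] * 5)
print(theta)
```

The model is non-linear in $x$ but linear in the parameters, which is all that matters: changing the list of basis functions is enough to fit a different linear model.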
Adjustment of non-linear models
In many cases, the dependence of the model on $\theta$ is non-linear. This is the case, for example, when a parameter appears inside an exponential or a trigonometric function. The formalism described in the preceding section then cannot be applied directly. The usual approach is to start from an estimate of the solution, linearize the model in the $\chi^2$ around that point, solve the linearized problem, and iterate. This approach is equivalent to the Gauss-Newton minimization algorithm. Other minimization techniques exist. Some, such as the Levenberg-Marquardt algorithm, are refinements of the Gauss-Newton algorithm. Others are applicable when the derivatives of the $\chi^2$ are difficult or expensive to compute.
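The linearize, solve, iterate scheme can be sketched in Python for a two-parameter model. The exponential decay $f(x; a, k) = a\,e^{-kx}$, the function names, the starting point and the data are all assumptions chosen for illustration. This is a bare Gauss-Newton iteration with no step control, so it is only expected to work when the starting estimate is reasonably close to the solution.

```python
import math


def model(x, a, k):
    """Hypothetical non-linear model: exponential decay a * exp(-k * x)."""
    return a * math.exp(-k * x)


def gauss_newton(xs, ys, sigmas, a, k, n_iter=20, tol=1e-12):
    """Bare Gauss-Newton iteration for the model above; returns (a, k)."""
    for _ in range(n_iter):
        a11 = a12 = a22 = b1 = b2 = 0.0
        for x, y, s in zip(xs, ys, sigmas):
            w = 1.0 / s ** 2
            e = math.exp(-k * x)
            r = y - a * e                  # residual at the current estimate
            j1, j2 = e, -a * x * e         # derivatives of the model w.r.t. a and k
            a11 += w * j1 * j1
            a12 += w * j1 * j2
            a22 += w * j2 * j2
            b1 += w * j1 * r
            b2 += w * j2 * r
        # Linearized problem: (J^T W J) delta = J^T W r, solved by Cramer's rule.
        det = a11 * a22 - a12 * a12
        d_a = (b1 * a22 - a12 * b2) / det
        d_k = (a11 * b2 - a12 * b1) / det
        a, k = a + d_a, k + d_k
        if abs(d_a) + abs(d_k) < tol:      # stop when the step becomes negligible
            break
    return a, k


# Invented noise-free data generated with a = 2.0, k = 0.5.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [model(x, 2.0, 0.5) for x in xs]
a_fit, k_fit = gauss_newton(xs, ys, [0.05] * 5, a=1.5, k=0.4)
print(a_fit, k_fit)
```

Each pass solves a linear least-squares problem of exactly the kind met for linear models, with the derivatives of the model playing the role of the basis functions.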
One of the difficulties of non-linear least-squares problems is the frequent existence of several local minima. A systematic exploration of the parameter space may then be necessary.
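A small Python sketch can show what such an exploration looks like in the simplest case of a single parameter. The model $f(x;\theta) = \cos(\theta x)$, the grid, and the data are assumptions chosen because this model is known to produce several local minima; a plain grid scan is only one possible way of exploring the parameter space and becomes impractical when there are many parameters.

```python
import math


def sum_of_squares(theta, xs, ys):
    """Unweighted sum of squared residuals for the model cos(theta * x)."""
    return sum((y - math.cos(theta * x)) ** 2 for x, y in zip(xs, ys))


def scan(xs, ys, thetas):
    """Evaluate the sum of squares on a grid; return the best point and all local minima."""
    values = [sum_of_squares(t, xs, ys) for t in thetas]
    local_minima = [
        thetas[i]
        for i in range(1, len(thetas) - 1)
        if values[i] < values[i - 1] and values[i] < values[i + 1]
    ]
    best = thetas[values.index(min(values))]
    return best, local_minima


# Invented noise-free data generated with theta = 2.0.
xs = [0.5 * k for k in range(21)]
ys = [math.cos(2.0 * x) for x in xs]
thetas = [i * 0.01 for i in range(401)]     # grid from 0 to 4 in steps of 0.01

best, local_minima = scan(xs, ys, thetas)
print(best)
print(local_minima)
```

An iterative minimizer started near one of the secondary minima would stay there; the scan identifies the region of the global minimum, from which an iterative method can then be started.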
Adjustment under constraints
Adjustment of implicit models
Statistical interpretation
The χ² criterion
Optimality of the method of least squares
Robustness
Sensitivity to outliers
Robustification techniques
Related articles
- χ² distribution
- Supervised learning
- Simulation of profile