Category: Information theory
Shannon entropy, due to Claude Shannon, is a mathematical function that corresponds to the quantity of information contained in or delivered by an information source. This source can be a language, an electrical signal, or an arbitrary computer file. Under Shannon's definition, the more redundant a source is, the less information it contains. In the absence of particular constraints, the entropy is therefore maximal for a source whose symbols are all equiprobable.
This definition is used in digital electronics to digitize a source using the minimum possible number of bits without loss of information.
Lastly, it is used to determine the minimum number of bits needed to encode a file, which is very useful for knowing the limit that lossless compression algorithms (such as Zip, LZW or RLE, but not JPEG or MP3) can hope to reach. Some such algorithms are said to be optimal, i.e. the size of their output approaches the limit set by the entropy of the source.
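As a rough illustration of this lower bound, the sketch below compares the entropy of a message's symbol frequencies with the average code length of a Huffman code. Huffman coding is not named in the text above; it is used here only because it is a short, well-known prefix code that is optimal among codes assigning a whole number of bits to each symbol. The function names `entropy_bits`, `huffman_code_lengths` and `compare` are hypothetical, and the bound shown assumes that symbols are treated as independent draws from a fixed distribution.

```python
import heapq
import math
from collections import Counter


def entropy_bits(counts):
    """Shannon entropy, in bits per symbol, of the empirical distribution."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def huffman_code_lengths(counts):
    """Return {symbol: code length in bits} for a Huffman code."""
    if len(counts) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {symbol: 1 for symbol in counts}
    # Heap entries: (weight, unique tie-breaker, {symbol: depth so far}).
    heap = [(c, i, {s: 0}) for i, (s, c) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes all their symbols one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def compare(message):
    """Return (entropy, average Huffman code length), both in bits per symbol."""
    counts = Counter(message)
    lengths = huffman_code_lengths(counts)
    avg_len = sum(counts[s] * lengths[s] for s in counts) / len(message)
    return entropy_bits(counts), avg_len


if __name__ == "__main__":
    for msg in ("aaaabbcd", "abracadabra"):
        h, avg = compare(msg)
        print(f"{msg!r}: entropy = {h:.3f} bits/symbol, Huffman = {avg:.3f} bits/symbol")
```

For `"aaaabbcd"` the symbol probabilities are 1/2, 1/4, 1/8 and 1/8, so the Huffman code meets the entropy exactly at 1.75 bits per symbol. For other messages the average code length lies between the entropy and the entropy plus one bit.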
The term entropy was suggested to Shannon in 1947 by the mathematician John von Neumann, as a joke, according to Myron Tribus: "Your mathematical formula is similar to one used in statistical mechanics. People do not really understand entropy, so if you use it in an argument, you will win every time, hands down." An analogy with the concept of entropy in thermodynamics (later also used in chaos theory) did nonetheless exist: the entropy of a source has properties similar to those of thermodynamic entropy, in particular additivity.
In 1957, Edwin Thompson Jaynes showed the formal link between the macroscopic entropy introduced by Clausius in 1847, the microscopic entropy introduced by Gibbs, and Shannon's mathematical entropy. Tribus described this discovery as the "last unnoticed revolution".
Intuitively, Shannon entropy can be seen as measuring the amount of uncertainty associated with a random event, or more precisely with its distribution. Another way of looking at it is in terms of the quantity of information carried by the signal: the information provided by each new event is a function of the uncertainty about that event.
For example, imagine an urn containing several balls of various colors, from which a ball is drawn at random (with replacement). If all the balls have different colors, our uncertainty about the result of a draw is maximal. In particular, if we had to bet on the result of a draw, we could not favor one choice over another. On the other hand, if a certain color is more represented than the others (for example, if the urn contains more red balls), our uncertainty is slightly reduced: the drawn ball is more likely to be red. If we absolutely had to bet on the result of a draw, we would bet on a red ball. Thus, revealing the result of a draw provides on average more information in the first case than in the second, because the entropy of the "signal" (computable from the statistical distribution) is higher.
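The urn example can be checked numerically. The following is a minimal sketch, assuming the usual definition of entropy as minus the sum of p * log2(p) over the colors, with p the fraction of balls of each color. The function name `urn_entropy` and the two urn contents are made up for illustration.

```python
import math
from collections import Counter


def urn_entropy(balls):
    """Entropy, in bits, of the color of one ball drawn at random from the urn."""
    counts = Counter(balls)
    total = len(balls)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# First case: every ball has a different color (uniform distribution).
all_different = ["red", "green", "blue", "yellow"]

# Second case: same four colors, but red is over-represented (5 balls out of 8).
mostly_red = ["red"] * 5 + ["green", "blue", "yellow"]

print(urn_entropy(all_different))  # 2.0 bits: four equally likely colors
print(urn_entropy(mostly_red))     # lower, roughly 1.55 bits
```

With four equally likely colors the entropy is exactly 2 bits. Once red is favored, the entropy drops, which matches the intuition that a draw from the second urn is on average less surprising.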
Let us take another example: consider a French text encoded as a string of letters, spaces and punctuation marks (our signal is thus a character string). Since some characters are infrequent (e.g. “Z”) while others are very common (e.g. “E”), the character string is not as random as all that. On the other hand, as long as one cannot predict the next character, the string is, in a certain sense, random. Entropy is a measure of this randomness, proposed by Shannon in his 1948 article.
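As a sketch of this idea, the code below estimates the entropy per character of a string from its character frequencies and compares it with the value it would have if every character that occurs were equally likely. The sample sentence and the names `char_entropy` and `uniform_entropy` are assumptions made for illustration. The estimate looks at single characters only, so it ignores how well the next character can be predicted from the preceding ones.

```python
import math
from collections import Counter


def char_entropy(text):
    """Entropy in bits per character, from single-character frequencies."""
    counts = Counter(text)
    total = len(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def uniform_entropy(text):
    """Entropy the text would have if all its distinct characters were equiprobable."""
    return math.log2(len(set(text)))


# Hypothetical sample: a short French sentence written without accents.
sample = "elle est entree dans le jardin et elle a regarde les zebres"

print(f"observed: {char_entropy(sample):.3f} bits per character")
print(f"uniform:  {uniform_entropy(sample):.3f} bits per character")
```

Because "e" occurs far more often than "z" in the sample, the observed value falls below the uniform one. The gap is one way of quantifying the redundancy of the text.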
Shannon gives a definition of entropy that satisfies the following conditions:
the measure must be continuous, i.e. a small change in a probability must result in a small change in the entropy.
In 1948, while working at Bell Laboratories, the electrical engineer Claude Shannon mathematically formalized the statistical nature of "lost information" in phone-line signals. To do so, he developed the general concept of information entropy, the cornerstone of information theory. Initially, it does not seem that Shannon was particularly aware of the close relationship between his new measure and earlier work in thermodynamics. In 1949, after he had been working on his equations for some time, he visited the mathematician John von Neumann. Two different sources report their exchange about what Shannon should call the "measure of uncertainty", or attenuation, in a telephone-line signal. Here is the first:
My greatest concern was what to call it. I thought of calling it "information", but the word was overly used, so I decided to call it "uncertainty". When I discussed it with John von Neumann, he had a better idea. Von Neumann told me, "You should call it entropy, for two reasons. In the first place your uncertainty function has been used in statistical mechanics under that name, so it already has a name. In the second place, and more important, nobody knows what entropy really is, so in a debate you will always have the advantage."
According to the other source, when von Neumann asked him how he was getting on with his information theory, Shannon replied:
the theory was in excellent shape, except that he needed a good name for "missing information". "Why don't you call it entropy?", von Neumann suggested. "In the first place, a mathematical development very much like yours already exists in Boltzmann's statistical mechanics, and in the second place, no one understands entropy very well, so in any discussion you will be in a position of advantage."
Shannon's information entropy is a more general concept than thermodynamic entropy. Information entropy is present whenever there are unknown quantities that can be described only in terms of a probability distribution. In general, therefore, there is no link with thermodynamic entropy. However, as E. T. Jaynes argued in a series of articles in 1957, the statistical entropy of thermodynamics can be seen as a particular application of Shannon's information entropy.
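The Shannon entropy of a discrete random variable X with N possible states 1, ..., N is defined as follows:
$$H(X) = -\mathbb{E}\left[\log_b P(X)\right] = \sum_{i=1}^{N} P_i \log_b \frac{1}{P_i} = -\sum_{i=1}^{N} P_i \log_b P_i$$
where $\mathbb{E}$ denotes the mathematical expectation, $P_i = P(X = i)$ is the probability of state $i$, and $b$ is the base of the logarithm (base 2 gives the entropy in bits).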
Certain characteristics of this formula can be noted:
the value of H is maximal for a uniform distribution, i.e. when all the states have the same probability.
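The maximality property can be checked numerically. The sketch below assumes the standard form of the definition, H = -Σ P_i log_b P_i, with the convention that states of probability zero contribute nothing. The function name `shannon_entropy` and its `base` parameter are hypothetical.

```python
import math


def shannon_entropy(probabilities, base=2.0):
    """Entropy of a discrete distribution, in units set by the logarithm base."""
    if any(p < 0 for p in probabilities):
        raise ValueError("probabilities must be non-negative")
    if not math.isclose(sum(probabilities), 1.0):
        raise ValueError("probabilities must sum to 1")
    # States with p == 0 are skipped (convention: 0 * log 0 = 0).
    return -sum(p * math.log(p, base) for p in probabilities if p > 0)


n_states = 4
uniform = [1 / n_states] * n_states
skewed = [0.7, 0.1, 0.1, 0.1]
certain = [1.0, 0.0, 0.0, 0.0]

print(shannon_entropy(uniform))  # equals log2(4), i.e. 2 bits
print(shannon_entropy(skewed))   # strictly less than 2 bits
print(shannon_entropy(certain))  # zero: the outcome is known in advance
```

With N states, the uniform distribution gives log_b N, and any other distribution over the same states gives a smaller value. The other extreme is a distribution concentrated on a single state, for which the entropy is zero.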
The generality of Shannon's information entropy formula makes it possible to apply it in various fields:
for selecting the best viewpoint of a three-dimensional object.