Markov chain

In mathematics, a Markov chain is a stochastic process that has the Markov property. In such a process, predicting the future from the present does not require knowledge of the past. Markov chains are named after their discoverer, Andrei Markov.

A discrete-time Markov chain is a sequence X_1, X_2, X_3, ... of random variables. The set of their possible values is called the state space, the value X_n being the state of the process at time n.

If the conditional probability distribution of X_{n+1} given the past states is a function of X_n only, then:

P(X_{n+1}=x \mid X_0, X_1, X_2, \ldots, X_n) = P(X_{n+1}=x \mid X_n),

where x is any state of the process. The identity above expresses the Markov property.

Andrei Markov published the first results on these processes in 1906.

A generalization to a countably infinite state space was given by Kolmogorov in 1936.

Markov chains are related to Brownian motion and the ergodic hypothesis, two topics in statistical physics that were very important at the beginning of the 20th century.

Properties of Markov chains

A Markov chain is characterized by the conditional distribution

P(X_{n+1} \mid X_n)

which is also called the one-step transition probability of the process. The transition probabilities for two, three or more steps follow from the one-step transition probability and the Markov property:

P(X_{n+2} \mid X_n) = \int P(X_{n+2}, X_{n+1} \mid X_n) \, dX_{n+1}

= \int P(X_{n+2} \mid X_{n+1}) \, P(X_{n+1} \mid X_n) \, dX_{n+1}

In the same way,

P(X_{n+3} \mid X_n) = \int P(X_{n+3} \mid X_{n+2}) \int P(X_{n+2} \mid X_{n+1}) \, P(X_{n+1} \mid X_n) \, dX_{n+1} \, dX_{n+2}

These formulas extend to an arbitrarily distant future n + k by multiplying the transition probabilities and integrating k times.
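When the state space is finite, the integrals above reduce to sums over the intermediate states, which makes the formulas easy to check numerically. The following is a minimal sketch under that assumption; the 2-state matrix `P` and the function names `two_step` and `k_step` are illustrative and do not come from the text.

```python
# Hypothetical one-step transition probabilities for a 2-state chain:
# P[i][j] = P(X_{n+1} = j | X_n = i). The numbers are made up for illustration.
P = [[0.9, 0.1],
     [0.5, 0.5]]


def two_step(P, i, j):
    """P(X_{n+2} = j | X_n = i): sum over the intermediate state X_{n+1}."""
    return sum(P[i][m] * P[m][j] for m in range(len(P)))


def k_step(P, i, j, k):
    """P(X_{n+k} = j | X_n = i) for k >= 1, summing over the first intermediate state."""
    if k == 1:
        return P[i][j]
    return sum(P[i][m] * k_step(P, m, j, k - 1) for m in range(len(P)))


print(two_step(P, 0, 0))   # two-step probability of going from state 0 back to state 0
print(k_step(P, 0, 0, 3))  # three-step version of the same quantity
```

The recursion mirrors the nested integrals: each extra step adds one more sum over an intermediate state.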

The marginal distribution P(X_n) is the distribution of the states at time n. The initial distribution is P(X_0). The evolution of the process after one step is described by

P(X_{n+1}) = \int P(X_{n+1} \mid X_n) \, P(X_n) \, dX_n

This is a version of the Frobenius-Perron equation. There may exist one or more state distributions π such that

\pi(X) = \int P(X \mid Y) \, \pi(Y) \, dY

where Y is an arbitrary name for the variable of integration. Such a distribution π is called a stationary distribution. A stationary distribution is an eigenfunction of the conditional distribution, associated with the eigenvalue 1.
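In the finite-state case the one-step evolution and the stationarity condition can be verified directly, since the integral becomes a sum. The sketch below assumes a hypothetical 2-state matrix `P` and a candidate distribution `pi` chosen for illustration; the helper name `step` is also an assumption.

```python
def step(dist, P):
    """One step of the marginal distribution: new[j] = sum_i dist[i] * P[i][j]."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


# Hypothetical transition probabilities, P[i][j] = P(X_{n+1} = j | X_n = i).
P = [[0.9, 0.1],
     [0.5, 0.5]]

# Candidate stationary distribution for this particular P.
pi = [5 / 6, 1 / 6]

after = step(pi, P)
is_stationary = all(abs(a - b) < 1e-12 for a, b in zip(after, pi))
print(after, is_stationary)
```

A distribution that is not stationary, such as `[1.0, 0.0]`, is changed by `step`, which is what distinguishes the two cases.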

Certain properties of the process determine whether a stationary distribution exists, and whether it is unique.

  • Irreducible: every state is accessible from every other state.
  • Positive recurrent: for each state, the expected time until the return to that state is finite.
When the state space of a Markov chain is not irreducible, it can be partitioned into a set of irreducible communicating classes. The classification problem is important in the mathematical study of Markov chains and stochastic processes.

If a Markov chain is positive recurrent, then a stationary distribution exists.

If a Markov chain is positive recurrent and irreducible, then:

  • there exists a unique stationary distribution,
  • and the process built by taking the stationary distribution as the initial distribution is ergodic.

Therefore, the average of a function F over the realizations of the Markov chain is equal to its average with respect to the stationary distribution,

\lim_{n \rightarrow \infty} \; \frac{1}{n} \; \sum_{k=0}^{n-1} F(X_k)

= \int F(X) \, \pi(X) \, dX

This holds in particular when F is the identity function.

The average value of the realizations is thus, in the long run, equal to the expectation of the stationary distribution.

Moreover, this equality of averages also applies if F is the indicator function of a subset A of the state space:

\lim_{n \rightarrow \infty} \; \frac{1}{n} \; \sum_{k=0}^{n-1} \chi_A(X_k)

= \int_A \pi(X) \, dX = \mu_{\pi}(A)

where \mu_{\pi} is the measure induced by π.

This makes it possible to approximate the stationary distribution by a histogram of a particular sequence.
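The histogram approximation can be tried out by simulating one long trajectory and counting the visits to each state. This is a rough sketch, assuming a hypothetical 2-state chain whose stationary distribution is (5/6, 1/6); the function name `simulate_histogram`, the number of steps and the seed are arbitrary choices.

```python
import random


def simulate_histogram(P, start, steps, seed=0):
    """Simulate `steps` states of the chain and return the fraction of time spent in each."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    counts = [0] * len(P)
    state = start
    for _ in range(steps):
        counts[state] += 1
        # The next state depends only on the current one (Markov property).
        state = rng.choices(range(len(P)), weights=P[state])[0]
    return [c / steps for c in counts]


# Hypothetical transition probabilities, P[i][j] = P(X_{n+1} = j | X_n = i).
P = [[0.9, 0.1],
     [0.5, 0.5]]

hist = simulate_histogram(P, start=0, steps=100_000)
print(hist)  # should be close to (5/6, 1/6) for this P
```

The estimate fluctuates from run to run; a longer trajectory reduces the fluctuation.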

Markov chains with a discrete state space

If the state space is finite, then the transition probability distribution can be represented by a stochastic matrix called the transition matrix, whose (i, j)-th element is

P(X_{n+1}=j \mid X_n=i)

If the state space is finite, the integrals for the k-step transition probabilities become sums, which can be computed by raising the transition matrix to the power k. If P is the one-step transition matrix, then P^k is the k-step transition matrix.
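The k-step transition matrix can be computed with plain nested lists. The following is a minimal sketch; the matrix `P` is a hypothetical example, and `mat_mul` and `mat_pow` are illustrative helper names.

```python
def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    inner, cols = len(B), len(B[0])
    return [[sum(row[k] * B[k][j] for k in range(inner)) for j in range(cols)]
            for row in A]


def mat_pow(P, k):
    """P raised to the power k (k >= 0) by repeated multiplication."""
    n = len(P)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = mat_mul(result, P)
    return result


# Hypothetical one-step transition matrix; each row sums to 1.
P = [[0.9, 0.1],
     [0.5, 0.5]]

P2 = mat_pow(P, 2)
print(P2)  # element (i, j) is the probability of going from i to j in two steps
```

Each row of `mat_pow(P, k)` still sums to 1, because a product of stochastic matrices is stochastic.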

With P the transition matrix, a stationary distribution is a vector \pi^* that satisfies the equation

\pi^* P= \pi^*.

In this case, the stationary distribution \pi^* is an eigenvector of the transition matrix, associated with the eigenvalue 1.

If the transition matrix P is irreducible and aperiodic, then P^k converges to a matrix in which every row is the unique stationary distribution \pi^*, with

\lim_{k \rightarrow \infty} \pi P^k = \pi^*,

independently of the initial distribution \pi. This is proved by the Perron-Frobenius theorem.

A transition matrix whose elements are all strictly positive is irreducible and aperiodic.
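The independence from the initial distribution can be observed by iterating from two different starting points. This sketch assumes a hypothetical strictly positive 2-state matrix `P`; the helper names `step` and `iterate` and the choice of 50 iterations are illustrative.

```python
def step(dist, P):
    """Row vector times matrix: new[j] = sum_i dist[i] * P[i][j]."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def iterate(dist, P, k):
    """Apply k steps of the chain to the distribution `dist`, i.e. compute dist * P^k."""
    for _ in range(k):
        dist = step(dist, P)
    return dist


# Hypothetical matrix with strictly positive entries, hence irreducible and aperiodic.
P = [[0.9, 0.1],
     [0.5, 0.5]]

a = iterate([1.0, 0.0], P, 50)  # start surely in state 0
b = iterate([0.0, 1.0], P, 50)  # start surely in state 1
print(a, b)  # both are close to the same stationary distribution (5/6, 1/6)
```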

Classification of states

States i and j are said to communicate if and only if there exist n_1 and n_2 such that P(X_{n_1}=i \mid X_0=j) > 0 and P(X_{n_2}=j \mid X_0=i) > 0. This is an equivalence relation. An equivalence class of this relation is called a class.

A class is said to be final if it does not lead to any other class; otherwise, it is said to be transient.

Let N_{ij} = \{n : P(X_n=j \mid X_0=i) > 0\}. The period of a class is defined as \gcd(N_{ii}) (the value is the same for every i within the same class). If the period is 1, the class is said to be aperiodic.
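For a finite chain, the classes and the period can be computed from the pattern of non-zero entries of the transition matrix. The sketch below is one possible approach, not a reference implementation: the 3-state matrix `P` is hypothetical, the function names are illustrative, and `period` only inspects return times up to a cutoff `max_n` (by default three times the number of states, which is an assumption of this sketch).

```python
from math import gcd


def reachable(P):
    """reach[i][j] is True if j can be reached from i in one or more steps."""
    n = len(P)
    reach = [[P[i][j] > 0 for j in range(n)] for i in range(n)]
    for m in range(n):  # transitive closure (Warshall's algorithm)
        for i in range(n):
            for j in range(n):
                if reach[i][m] and reach[m][j]:
                    reach[i][j] = True
    return reach


def communicating_classes(P):
    """Group the states into classes of mutually reachable states."""
    n = len(P)
    reach = reachable(P)
    classes, seen = [], set()
    for i in range(n):
        if i in seen:
            continue
        cls = [j for j in range(n) if j == i or (reach[i][j] and reach[j][i])]
        seen.update(cls)
        classes.append(cls)
    return classes


def period(P, i, max_n=None):
    """gcd of the return times n <= max_n with P(X_n = i | X_0 = i) > 0 (0 if none)."""
    n = len(P)
    if max_n is None:
        max_n = 3 * n
    current = [j == i for j in range(n)]  # states reachable in exactly 0 steps
    d = 0
    for steps in range(1, max_n + 1):
        current = [any(current[m] and P[m][j] > 0 for m in range(n)) for j in range(n)]
        if current[i]:
            d = gcd(d, steps)
    return d


# Hypothetical chain: states 0 and 1 alternate, state 2 either stays or falls into state 0.
P = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 0.0],
     [0.5, 0.0, 0.5]]

print(communicating_classes(P))      # [[0, 1], [2]]
print(period(P, 0), period(P, 2))    # 2 1
```

In this example the class {0, 1} is final with period 2, while the class {2} is transient and aperiodic.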

Notation

In the preceding formulas, the element (i, j) is the probability of the transition from i to j. The elements of each row always sum to 1, and the stationary distribution is given by the left eigenvector of the transition matrix.

One sometimes encounters transition matrices in which the term (i, j) is the probability of the transition from j to i, in which case the transition matrix is simply the transpose of the one described here. The elements of each column then sum to 1. Moreover, the stationary distribution of the system is then given by the right eigenvector of the transition matrix, instead of the left eigenvector.
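The two conventions can be compared on a small example: multiplying a row vector by P gives the same numbers as multiplying the transpose of P by the corresponding column vector. This is a minimal sketch with a hypothetical matrix `P` and illustrative helper names.

```python
def transpose(M):
    """Swap rows and columns of a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]


def row_times_matrix(v, M):
    """Row-vector convention: (v M)[j] = sum_i v[i] * M[i][j]."""
    return [sum(v[i] * M[i][j] for i in range(len(v))) for j in range(len(M[0]))]


def matrix_times_column(M, v):
    """Column-vector convention: (M v)[i] = sum_j M[i][j] * v[j]."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


# Hypothetical row-stochastic matrix: element (i, j) is the probability of going from i to j.
P = [[0.9, 0.1],
     [0.5, 0.5]]
pi = [5 / 6, 1 / 6]  # stationary distribution of this particular P

left = row_times_matrix(pi, P)                  # pi as a left eigenvector of P
right = matrix_times_column(transpose(P), pi)   # pi as a right eigenvector of the transpose
print(left, right)
```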

Example: Doudou the hamster

Doudou the lazy hamster knows only 3 places in his cage: the wood shavings where he sleeps, the feeder where he eats and the wheel where he exercises. His days are all much alike, and his activity is easily represented by a Markov chain. Every minute, he can either change activity or continue the one he was doing. Calling this a memoryless process is no exaggeration when speaking of Doudou.

  • When he sleeps, he has 9 chances out of 10 of not waking up the following minute.
  • When he wakes up, there is 1 chance in 2 that he eats and 1 chance in 2 that he goes to exercise.
  • The meal lasts only one minute, after which he does something else.
  • After eating, there are 3 chances in 10 that he goes to run in his wheel, but above all 7 chances in 10 that he goes back to sleep.
  • Running is tiring; he thus has an 80% chance of going back to sleep after one minute. Otherwise he keeps running, forgetting that he is already a little tired.

Diagrams

The diagrams can show all the arrows, each one representing a transition probability. However, a diagram is more readable if:

  • Arrows of probability zero (impossible transitions) are not drawn.
  • Loops (arrows from a state to itself) are not drawn. They exist nonetheless; their probability is implied, because the probabilities of the arrows leaving each state must sum to 1.
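The implied loop probabilities can be recovered from the arrows that are drawn. The following sketch uses Doudou's rules as stated above; the dictionary layout and the function name `loop_probability` are illustrative assumptions.

```python
# Arrows between distinct states, from Doudou's rules.
# Waking up has probability 1/10, then eating or running each has probability 1/2.
arrows = {
    "sleep": {"eat": 0.1 * 0.5, "run": 0.1 * 0.5},
    "eat": {"sleep": 0.7, "run": 0.3},
    "run": {"sleep": 0.8},
}


def loop_probability(arrows, state):
    """Probability of staying in `state`: 1 minus the sum of the outgoing arrows."""
    return 1.0 - sum(arrows[state].values())


for state in arrows:
    # Values are subject to floating-point rounding.
    print(state, round(loop_probability(arrows, state), 10))
```

The loop on "eat" has probability zero, which matches the rule that a meal lasts only one minute.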

Transition matrix

The transition matrix of this system is the following (the rows and columns correspond, in order, to the states sleep, eat, run):

P = \begin{bmatrix} 0.9 & 0.05 & 0.05 \\ 0.7 & 0 & 0.3 \\ 0.8 & 0 & 0.2 \end{bmatrix}

Forecasts

Let us assume that Doudou is asleep during the first minute of the study.

\mathbf{X}^{(0)} = \begin{bmatrix} 1 & 0 & 0 \end{bmatrix}

After one minute, one can predict:

\mathbf{X}^{(1)} = \mathbf{X}^{(0)} P = \begin{bmatrix} 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} 0.9 & 0.05 & 0.05 \\ 0.7 & 0 & 0.3 \\ 0.8 & 0 & 0.2 \end{bmatrix} = \begin{bmatrix} 0.9 & 0.05 & 0.05 \end{bmatrix}

Thus, after one minute, there is a 90% chance that Doudou is still asleep, a 5% chance that he is eating and a 5% chance that he is running.

\mathbf{X}^{(2)} = \mathbf{X}^{(1)} P = \mathbf{X}^{(0)} P^2 = \begin{bmatrix} 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} 0.9 & 0.05 & 0.05 \\ 0.7 & 0 & 0.3 \\ 0.8 & 0 & 0.2 \end{bmatrix}^2 = \begin{bmatrix} 0.885 & 0.045 & 0.07 \end{bmatrix}

After 2 minutes, there is a 4.5% chance that the hamster is eating.
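These two forecasts can be reproduced with a few lines of code. This is a minimal sketch using the transition matrix of the example, with the states ordered sleep, eat, run; the helper name `step` is an illustrative choice.

```python
# Doudou's transition matrix; rows and columns are ordered sleep, eat, run.
P = [[0.9, 0.05, 0.05],
     [0.7, 0.0, 0.3],
     [0.8, 0.0, 0.2]]


def step(dist, P):
    """Row vector times matrix: new[j] = sum_i dist[i] * P[i][j]."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


x0 = [1.0, 0.0, 0.0]  # Doudou is asleep at the start
x1 = step(x0, P)      # distribution after one minute
x2 = step(x1, P)      # distribution after two minutes

print([round(p, 4) for p in x1])  # [0.9, 0.05, 0.05]
print([round(p, 4) for p in x2])  # [0.885, 0.045, 0.07]
```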

In general, after n minutes

\mathbf{X}^{(n)} = \mathbf{X}^{(n-1)} P

\mathbf{X}^{(n)} = \mathbf{X}^{(0)} P^n

The theory shows that after a certain time, the probability distribution is independent of the initial distribution. Let us denote it \mathbf{Q}:

\mathbf{Q} = \lim_{n \to \infty} \mathbf{X}^{(n)}

Convergence is guaranteed when the chain is aperiodic and irreducible. This is the case in our example, so one can write:

\begin{matrix}
P & = & \begin{bmatrix} 0.9 & 0.05 & 0.05 \\ 0.7 & 0 & 0.3 \\ 0.8 & 0 & 0.2 \end{bmatrix} \\
\mathbf{Q} P & = & \mathbf{Q} & \mbox{(} \mathbf{Q} \mbox{ is the invariant distribution with respect to } P \mbox{.)} \\
& = & \mathbf{Q} I \\
\mathbf{Q} (I - P) & = & \mathbf{0} \\
& = & \mathbf{Q} \left( \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} - \begin{bmatrix} 0.9 & 0.05 & 0.05 \\ 0.7 & 0 & 0.3 \\ 0.8 & 0 & 0.2 \end{bmatrix} \right) \\
& = & \mathbf{Q} \begin{bmatrix} 0.1 & -0.05 & -0.05 \\ -0.7 & 1 & -0.3 \\ -0.8 & 0 & 0.8 \end{bmatrix}
\end{matrix}

\begin{bmatrix} q_1 & q_2 & q_3 \end{bmatrix} \begin{bmatrix} 0.1 & -0.05 & -0.05 \\ -0.7 & 1 & -0.3 \\ -0.8 & 0 & 0.8 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \end{bmatrix}

Knowing that q_1 + q_2 + q_3 = 1, one obtains:

\begin{bmatrix} q_1 & q_2 & q_3 \end{bmatrix} = \begin{bmatrix} 0.884 & 0.0442 & 0.0718 \end{bmatrix}

Doudou spends 88.4% of his time sleeping!
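The same linear system can be solved exactly with rational arithmetic. The sketch below is one possible way to do it, not the method of the article: it replaces one redundant equation of Q (I - P) = 0 by the normalization q_1 + q_2 + q_3 = 1 and applies Gauss-Jordan elimination. The function name `stationary` is illustrative, and the code assumes the stationary distribution is unique.

```python
from fractions import Fraction as F

# Doudou's transition matrix with exact fractions; order: sleep, eat, run.
P = [[F(9, 10), F(1, 20), F(1, 20)],
     [F(7, 10), F(0), F(3, 10)],
     [F(8, 10), F(0), F(2, 10)]]


def stationary(P):
    """Solve Q (I - P) = 0 together with sum(Q) = 1 by Gauss-Jordan elimination."""
    n = len(P)
    # Row j holds the equation for column j: sum_i q_i * (delta_ij - P[i][j]) = 0.
    A = [[(F(1) if i == j else F(0)) - P[i][j] for i in range(n)] for j in range(n)]
    b = [F(0)] * n
    # The n equations are linearly dependent; replace the last by the normalization.
    A[n - 1] = [F(1)] * n
    b[n - 1] = F(1)
    for col in range(n):
        # Find a non-zero pivot (raises StopIteration if the system is singular).
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        inv = 1 / A[col][col]
        A[col] = [a * inv for a in A[col]]
        b[col] = b[col] * inv
        for r in range(n):
            if r != col and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [a - factor * c for a, c in zip(A[r], A[col])]
                b[r] = b[r] - factor * b[col]
    return b


Q = stationary(P)
print(Q)                                  # exact values 160/181, 8/181, 13/181
print([round(float(q), 4) for q in Q])    # [0.884, 0.0442, 0.0718]
```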

Illustration of the impact of the model

The purpose of the following example is to show the importance of how the system is modeled. A good model makes it possible to answer complex questions with simple calculations.

We study a (fictitious) civilization made up of several social classes, in which individuals can move from one class to another. Each step represents one year. We consider a lineage rather than an individual, to avoid obtaining two-hundred-year-old citizens. There are 4 social statuses:

  • Slave
  • Free man
  • Citizen
  • Senior civil servant

In this society:

  • Slaves can remain slaves or become free men (by buying their freedom or by being freed by their master).
  • Free men can remain free, sell their freedom (to pay their debts, etc.) or become citizens (again by merit or by buying the title of citizen).
  • Citizens are citizens for life and pass their citizenship on to their lineage. (One might think that the number of citizens tends to increase and that after a certain time everyone is a citizen, but historically, in civilizations that followed this pattern, citizens are decimated by wars and new slaves regularly arrive from abroad.) They can also stand as candidates in the annual elections to become senior civil servants (magistrates).
  • At the end of their term, civil servants can be re-elected or become ordinary citizens again.

To complicate the example a little, and thereby show the range of applications of Markov chains, we will assume that civil servants are elected for several years. Consequently, the future of an individual civil servant depends on how long he has been a civil servant. We are thus in the case of a nonhomogeneous Markov chain. Fortunately, we can easily reduce it to a homogeneous chain. Indeed, it suffices to add an artificial state for each year of the term. Instead of having a single state 4: Civil servant, we will have the states

  • 4: Civil servant at the beginning of the term
  • 5: Civil servant in the second year of the term
  • etc.
The probabilities connecting two consecutive artificial states (third and fourth year, for example) are equal to 1, because any term that is started is assumed to be completed (one could model the opposite by changing the value of these probabilities). Let us fix the term of office at two years, with half of the civil servants being renewed each year. This gives the following graph:
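The idea of adding one artificial state per year of the term can be sketched in code. Only the structure is taken from the description above: every number other than the probability 1 between the two years of a term is a hypothetical placeholder, since the text gives no values, and the state names and the helper `to_matrix` are illustrative.

```python
# States of the homogeneous chain; the civil servant status is split by year of the term.
states = ["slave", "free", "citizen", "servant_year_1", "servant_year_2"]

# Transition probabilities per year. All values are HYPOTHETICAL placeholders,
# except the 1.0 from year 1 to year 2 (a started term is always completed).
transitions = {
    "slave": {"slave": 0.90, "free": 0.10},
    "free": {"slave": 0.05, "free": 0.85, "citizen": 0.10},
    "citizen": {"citizen": 0.95, "servant_year_1": 0.05},
    "servant_year_1": {"servant_year_2": 1.0},
    "servant_year_2": {"servant_year_1": 0.30, "citizen": 0.70},
}


def to_matrix(states, transitions):
    """Build the transition matrix, with missing arrows counted as probability 0."""
    return [[transitions[src].get(dst, 0.0) for dst in states] for src in states]


P = to_matrix(states, transitions)
for name, row in zip(states, P):
    print(name, row)
```

With this expansion the next state depends only on the current state, so the chain is homogeneous even though the original "civil servant" status depended on the time already served.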
