Fundamental Theorem of Markov Chains, PageRank

In this lecture, we will prove the Fundamental Theorem of Markov Chains and discuss the PageRank algorithm. In order to prove the Fundamental Theorem of Markov Chains, we need to review some concepts from linear algebra.

Linear Algebra Review

Eigenvalues, Eigenvectors, and Spectral Radius

Given a square matrix $A \in \mathbb{R}^{n \times n}$, a scalar $\lambda \in \mathbb{C}$ is called an eigenvalue of $A$ if there exists a non-zero unit vector $v \in \mathbb{C}^n$ (that is, $\|v\|_2 = 1$) such that $Av = \lambda v$. The vector $v$ is called an eigenvector of $A$ corresponding to the eigenvalue $\lambda$.

The eigenvalues of a matrix $A$ are the roots of the characteristic polynomial $\det(A - tI) = 0$, where $I$ is the identity matrix of size $n \times n$. The characteristic polynomial is a univariate polynomial of degree $n$ in the variable $t$.

The eigenspace corresponding to an eigenvalue $\lambda$ is the set of all eigenvectors corresponding to $\lambda$, together with the zero vector; it is a subspace of $\mathbb{C}^n$.

There are two ways of defining the multiplicity of an eigenvalue $\lambda$:

  1. The algebraic multiplicity of $\lambda$ is the multiplicity of $\lambda$ as a root of the characteristic polynomial.
  2. The geometric multiplicity of $\lambda$ is the dimension of the eigenspace corresponding to $\lambda$.

These two notions of multiplicity are equal for symmetric matrices (by the spectral theorem), but can differ for non-symmetric matrices. For instance, the matrix $A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ has a single eigenvalue $\lambda = 1$ with algebraic multiplicity 2 and geometric multiplicity 1.
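As a quick numerical aside (an illustration, not part of the lecture; it assumes NumPy is available), we can verify both multiplicities for this matrix: the algebraic multiplicity is read off from the repeated eigenvalue, and the geometric multiplicity equals the nullity of $A - I$.

```python
import numpy as np

# The matrix A = [[1, 1], [0, 1]] from the text: its characteristic
# polynomial is (1 - t)^2, so lambda = 1 has algebraic multiplicity 2.
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])

# Both eigenvalues equal 1 (algebraic multiplicity 2).
eigenvalues = np.linalg.eigvals(A)
assert np.allclose(eigenvalues, [1.0, 1.0])

# The geometric multiplicity is the dimension of the null space of A - I,
# i.e. n - rank(A - I). Here rank(A - I) = 1, so the eigenspace is 1-dimensional.
geometric_multiplicity = A.shape[0] - np.linalg.matrix_rank(A - np.eye(2))
assert geometric_multiplicity == 1
```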

The spectral radius of a matrix $A$ is defined as $\rho(A) = \max\{|\lambda| : \lambda \text{ is an eigenvalue of } A\}$.

The Frobenius norm of a matrix $A$ is defined as $$\|A\|_F = \sqrt{\sum_{i=1}^n \sum_{j=1}^n A_{ij}^2} = \sqrt{\operatorname{trace}(A^T A)}.$$

Note that the Frobenius norm of a matrix upper bounds its spectral radius, i.e., $\rho(A) \le \|A\|_F$. One can see this as follows: let $\lambda$ be an eigenvalue of $A$ with unit eigenvector $v$. Then, we have $$|\lambda|^2 = \|Av\|_2^2 = \langle Av, Av \rangle = \langle v, A^T A v \rangle = \operatorname{trace}(v^T A^T A v) = \operatorname{trace}(A^T A\, v v^T) \le \operatorname{trace}(A^T A \cdot I) = \|A\|_F^2.$$

Note that the above argument also shows that the following inequality holds for any unit vector $v$: $\|Av\|_2 \le \|A\|_F$.
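Both inequalities are easy to check numerically (an illustrative aside with an arbitrary test matrix, assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))   # an arbitrary test matrix

spectral_radius = max(abs(np.linalg.eigvals(A)))
frobenius_norm = np.linalg.norm(A, "fro")

# rho(A) <= ||A||_F
assert spectral_radius <= frobenius_norm

# ||Av||_2 <= ||A||_F for any unit vector v
v = rng.standard_normal(5)
v /= np.linalg.norm(v)
assert np.linalg.norm(A @ v) <= frobenius_norm
```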


Proposition 1 (Gelfand’s formula): For any matrix $A \in \mathbb{R}^{n \times n}$, we have $\rho(A) = \lim_{k \to \infty} \|A^k\|_F^{1/k}$.
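Since $\rho(A)^k = \rho(A^k) \le \|A^k\|_F$, each term $\|A^k\|_F^{1/k}$ is an upper bound on $\rho(A)$, and Gelfand’s formula says these bounds converge to it. A small numerical sketch (an aside with a hypothetical $2 \times 2$ matrix, assuming NumPy):

```python
import numpy as np

A = np.array([[0.5, 0.4],
              [0.3, 0.2]])
rho = max(abs(np.linalg.eigvals(A)))   # spectral radius, approx 0.7275

# ||A^k||_F^{1/k} is an upper bound for every k and converges to rho(A).
for k in (1, 10, 100):
    estimate = np.linalg.norm(np.linalg.matrix_power(A, k), "fro") ** (1.0 / k)
    assert estimate >= rho - 1e-12

# by k = 100 the estimate is very close to the spectral radius
assert abs(estimate - rho) < 1e-3
```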


For two vectors $u, v \in \mathbb{R}^n$, we say that $u \ge v$ if $u_i \ge v_i$ for all $i \in [n]$. We say that $u > v$ if $u \ge v$ and $u \ne v$. With this definition at hand, we have the following easy lemma.


Lemma 2 (Positivity Lemma): Let $A \in \mathbb{R}^{n \times n}$ be a positive matrix, i.e., $A_{ij} > 0$ for all $i, j \in [n]$. Let $u, v \in \mathbb{R}^n$ be distinct vectors such that $u \ge v$. Then, we have $(Au)_i > (Av)_i$ for all $i \in [n]$; in particular, $Au > Av$. Moreover, there is $\varepsilon > 0$ such that $Au > (1+\varepsilon) Av$.


Proof: Since $u \ge v$ and $u \ne v$, we have $u - v \ge 0$ and $u - v \ne 0$. Let $\alpha := \min_{i,j \in [n]} A_{ij} > 0$. Then, for every $i \in [n]$, we have $$(A(u-v))_i = \sum_{j=1}^n A_{ij}(u_j - v_j) \ge \alpha \sum_{j=1}^n (u_j - v_j) > 0.$$ Therefore, $Au > Av$. The moreover part follows from taking a small enough $\varepsilon$.

We are now ready to state and prove the main tool that we will use to prove the Fundamental Theorem of Markov Chains.

Perron-Frobenius Theorem

We begin with Perron’s theorem for positive matrices.


Theorem 3 (Perron’s Theorem): Let $A \in \mathbb{R}^{n \times n}$ be a positive matrix. Then, the following hold:

  1. The spectral radius $\rho(A)$ is an eigenvalue of $A$, and it has a positive eigenvector $v \in \mathbb{R}^n_{>0}$.
  2. $\rho(A)$ is the only eigenvalue of $A$ on the circle $\{z \in \mathbb{C} : |z| = \rho(A)\}$.
  3. $\rho(A)$ has geometric multiplicity 1.
  4. $\rho(A)$ is simple, i.e., its algebraic multiplicity is 1.

Proof: By definition of $\rho(A)$, there exists an eigenvalue $\lambda \in \mathbb{C}$ of $A$ such that $|\lambda| = \rho(A)$. Let $v \in \mathbb{C}^n$ be an eigenvector corresponding to $\lambda$, and let $u \in \mathbb{R}^n$ be defined by $u_i := |v_i|$ for all $i \in [n]$. Then, we have $$(Au)_i = \sum_{j=1}^n A_{ij} u_j \ge \Big| \sum_{j=1}^n A_{ij} v_j \Big| = |\lambda v_i| = \rho(A) |v_i| = \rho(A) u_i.$$ Therefore, $Au \ge \rho(A) u$. If $Au \ne \rho(A) u$, then by Lemma 2 we have $A^2 u > \rho(A) A u$, and there is some $\varepsilon > 0$ such that $A^2 u > (1+\varepsilon) \rho(A) A u$. By induction, we have $A^{k+1} u > (1+\varepsilon)^k \rho(A)^k A u$ for all $k \in \mathbb{N}$. Hence, setting $w := Au / \|Au\|_2$, Gelfand’s formula gives $$\rho(A) = \lim_{k \to \infty} \|A^k\|_F^{1/k} \ge \lim_{k \to \infty} \|A^k w\|_2^{1/k} \ge \lim_{k \to \infty} \big( (1+\varepsilon)^k \rho(A)^k \big)^{1/k} = (1+\varepsilon) \rho(A),$$ which is a contradiction. Therefore, the inequality $Au \ge \rho(A) u$ must be an equality, and $u$ is a non-negative eigenvector of $A$ corresponding to $\rho(A)$. Moreover, since $A$ is positive, the eigenvector $u$ must be positive, as $\rho(A) u_i = (Au)_i = \sum_{j=1}^n A_{ij} u_j > 0$ for all $i \in [n]$. This proves the first part of the theorem.

To prove the second part, let $\lambda \in \mathbb{C}$ be an eigenvalue of $A$ such that $|\lambda| = \rho(A)$, but $\lambda \ne \rho(A)$. Let $z \in \mathbb{C}^n$ be an eigenvector corresponding to $\lambda$, and let $w \in \mathbb{R}^n$ be defined by $w_i := |z_i|$ for all $i \in [n]$. Then, by the above discussion, we must have $Aw = \rho(A) w$, that is, $$\sum_{j=1}^n A_{ij} w_j = \rho(A) w_i = \rho(A) |z_i| = |\lambda z_i| = \Big| \sum_{j=1}^n A_{ij} z_j \Big|$$ for all $i \in [n]$.

From the above conditions, we can deduce that there is $\alpha \in \mathbb{C}$ such that $\alpha z = w$: the triangle inequality $|\sum_{j} A_{ij} z_j| \le \sum_{j} A_{ij} |z_j|$ must hold with equality, which forces all the $z_j$ to have the same complex argument. But in this case, we have $$\lambda \alpha z = \alpha A z = A w = \rho(A) w = \rho(A) \alpha z \implies \lambda = \rho(A),$$ which is a contradiction. This proves the second part of the theorem.

Now we are ready to prove item 3: the geometric multiplicity of ρ(A) is 1.

Suppose, for the sake of contradiction, that the geometric multiplicity of $\rho(A)$ is greater than 1. Let $u, v \in \mathbb{R}^n$ be linearly independent eigenvectors corresponding to $\rho(A)$ (by the above discussion, we know that such eigenvectors must be real vectors). Let $\beta > 0$ be such that $u - \beta v \ge 0$ and at least one of the components of $u - \beta v$ is zero. Note that $u - \beta v \ne 0$, as $u$ and $v$ are linearly independent. Then, by Lemma 2, we have $$\rho(A)(u - \beta v) = A(u - \beta v) > 0$$ entrywise, which contradicts the fact that $u - \beta v$ has a zero component. This proves the third part of the theorem.

Finally, we prove the fourth part of the theorem: the algebraic multiplicity of ρ(A) is 1.

Let $v \in \mathbb{R}^n$ be a positive eigenvector corresponding to $\rho(A)$, and let $u \in \mathbb{R}^n$ be a positive eigenvector of $A^T$ corresponding to $\rho(A^T)$ (which is equal to $\rho(A)$, by Gelfand’s formula, since $\|(A^T)^k\|_F = \|A^k\|_F$). We know $u$ exists by applying the first part of the theorem to $A^T$.

Claim: the space $u^{\perp} := \{x \in \mathbb{R}^n : u^T x = 0\}$ is invariant under $A$.

Proof of Claim: Let $x \in u^{\perp}$. Then, we have $u^T A x = (A^T u)^T x = \rho(A^T)\, u^T x = 0$.

Note that $u^{\perp}$ is a subspace of $\mathbb{R}^n$ of dimension $n-1$, and $v \notin u^{\perp}$, as $u^T v > 0$, since both vectors are positive. Hence, we have that $\mathbb{R}^n$ is the direct sum of $u^{\perp}$ and $\operatorname{span}(v)$. Let $w_2, \ldots, w_n$ be a basis of $u^{\perp}$, and let $B \in \mathbb{R}^{n \times n}$ be the matrix whose columns are $v, w_2, \ldots, w_n$.

By the above, $B$ is invertible, and we have that $B^{-1} A B$ leaves the subspaces $B^{-1} \operatorname{span}(v) = \operatorname{span}(e_1)$ and $B^{-1} u^{\perp} = \operatorname{span}(e_2, \ldots, e_n)$ invariant. Thus, $B^{-1} A B$ is a block matrix of the form $$B^{-1} A B = \begin{pmatrix} \rho(A) & 0 \\ 0 & C \end{pmatrix}.$$

Since $A$ and $B^{-1} A B$ are similar, they have the same eigenvalues. Moreover, we have $$\det(A - tI) = \det(B^{-1} A B - tI) = (\rho(A) - t) \det(C - tI).$$ Thus, if $\rho(A)$ had algebraic multiplicity greater than 1, then $C$ would have $\rho(A)$ as an eigenvalue, and therefore $A$ would have $\rho(A)$ as an eigenvalue with geometric multiplicity greater than 1, which is a contradiction. This proves the fourth part of the theorem.
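The conclusions of Perron’s theorem are easy to observe numerically. A small sketch (an illustrative aside with a hypothetical positive matrix, assuming NumPy): the eigenvalue of maximum modulus is real and positive, it is simple, and its eigenvector can be scaled to be entrywise positive.

```python
import numpy as np

# A hypothetical entrywise-positive matrix; its eigenvalues are the
# roots of t^2 - 5t - 2, i.e. (5 +- sqrt(33)) / 2.
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

eigenvalues, eigenvectors = np.linalg.eig(A)
i = np.argmax(abs(eigenvalues))
perron_value = eigenvalues[i]
perron_vector = eigenvectors[:, i].real
perron_vector /= perron_vector[np.argmax(abs(perron_vector))]  # largest entry -> +1

# Part 1: the eigenvalue of maximum modulus is real, positive, and its
# eigenvector can be scaled to be entrywise positive.
assert abs(perron_value - (5 + np.sqrt(33)) / 2) < 1e-12
assert np.all(perron_vector > 0)

# Part 2: every other eigenvalue has strictly smaller modulus.
assert all(abs(mu) < abs(perron_value)
           for j, mu in enumerate(eigenvalues) if j != i)
```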


The Perron-Frobenius theorem is a generalization of Perron’s theorem to non-negative matrices.


Theorem 4 (Perron-Frobenius Theorem): Let $A \in \mathbb{R}^{n \times n}$ be a non-negative matrix which is irreducible and aperiodic. Then, the following hold:

  1. The spectral radius $\rho(A)$ is an eigenvalue of $A$, and it has a positive eigenvector $v \in \mathbb{R}^n_{>0}$.
  2. $\rho(A)$ is the only eigenvalue of $A$ on the circle $\{z \in \mathbb{C} : |z| = \rho(A)\}$.
  3. $\rho(A)$ has geometric multiplicity 1.
  4. $\rho(A)$ is simple, i.e., its algebraic multiplicity is 1.

Proof: By Lemma 1 of Lecture 9, we know that there is a positive integer $m$ such that $A^m$ is positive. Apply Perron’s theorem to $A^m$, and note that the eigenvalues of $A^m$ are the $m$-th powers of the eigenvalues of $A$, with the same eigenvectors.
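As a numerical aside (an illustration with a hypothetical matrix, assuming NumPy), consider a non-negative matrix with a zero entry that is nonetheless irreducible and aperiodic; already its square is positive, and the Perron-Frobenius conclusions hold for it:

```python
import numpy as np

# A non-negative matrix with a zero entry; the underlying graph is
# strongly connected (irreducible) and state 2 has a self-loop (aperiodic).
A = np.array([[0.0, 1.0],
              [1.0, 1.0]])

# Some power of A is entrywise positive -- the fact used in the proof:
assert np.all(np.linalg.matrix_power(A, 2) > 0)

# The Perron eigenvalue is the golden ratio (1 + sqrt(5)) / 2, with a
# positive eigenvector, even though A itself has a zero entry.
eigenvalues, eigenvectors = np.linalg.eig(A)
i = np.argmax(abs(eigenvalues))
v = eigenvectors[:, i]
v = v / v[np.argmax(abs(v))]   # scale so the largest entry is +1
assert abs(eigenvalues[i] - (1 + np.sqrt(5)) / 2) < 1e-12
assert np.all(v > 0)
```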

Fundamental Theorem of Markov Chains

We are now ready to prove (most of) the Fundamental Theorem of Markov Chains.


Theorem 5 (Fundamental Theorem of Markov Chains): Let P be the transition matrix of a finite, irreducible and aperiodic Markov chain. Then, the following statements hold:

  1. There exists a unique stationary distribution $\pi$ of the Markov chain, where $\pi_i > 0$ for all $i \in [n]$, where $n$ is the number of states of the Markov chain.
  2. For any initial distribution $p_0$, we have $\lim_{t \to \infty} \Delta_{TV}(P^t p_0, \pi) = 0$.
  3. The stationary distribution $\pi$ is given by $\pi_i = \lim_{t \to \infty} (P^t)_{ii} = \frac{1}{\tau_{ii}}$, where $\tau_{ii}$ is the expected return time to state $i$.

Proof: We will prove items 1 and 2 of the theorem. As $P$ is the transition matrix of an irreducible and aperiodic Markov chain, we know that $P$ is non-negative, irreducible, and aperiodic. By the Perron-Frobenius theorem, we know that there exists a (up to scaling) unique positive eigenvector $v \in \mathbb{R}^n_{>0}$ of $P$ corresponding to the spectral radius $\rho(P)$. Moreover, we know that $\rho(P) = 1$, since for any non-negative vector $u \in \mathbb{R}^n$ with $\|u\|_1 = 1$, we have $\|Pu\|_1 = 1$, as $Pu$ is the probability distribution of the next state of the Markov chain. Hence $\pi := v / \|v\|_1$ is the unique stationary distribution of the Markov chain.
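A concrete sketch of item 1 (an illustrative aside with a hypothetical 3-state chain, assuming NumPy; the matrix is written in the column-stochastic convention used above, where columns sum to 1 and $Pu$ is the next-state distribution):

```python
import numpy as np

# A hypothetical 3-state chain, column-stochastic: columns sum to 1.
# All entries are positive, so the chain is irreducible and aperiodic.
P = np.array([[0.6, 0.1, 0.2],
              [0.3, 0.8, 0.3],
              [0.1, 0.1, 0.5]])

eigenvalues, eigenvectors = np.linalg.eig(P)
i = np.argmax(eigenvalues.real)
v = eigenvectors[:, i].real
pi = v / v.sum()   # normalize the Perron eigenvector to a distribution

assert abs(eigenvalues[i] - 1.0) < 1e-12   # rho(P) = 1
assert np.all(pi > 0)                      # pi is entrywise positive
assert np.allclose(P @ pi, pi)             # P pi = pi: stationarity
# For this particular chain one can solve (P - I) pi = 0 by hand:
assert np.allclose(pi, [7/30, 18/30, 5/30])
```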

To prove item 2, let $B$ be the change of basis matrix used in the proof of Perron’s theorem. Then, we have that $B^{-1} P B$ is a block matrix of the form $$B^{-1} P B = \begin{pmatrix} 1 & 0 \\ 0 & C \end{pmatrix},$$ where $C$ is a matrix of size $(n-1) \times (n-1)$ whose eigenvalues lie strictly inside the unit circle, which implies that $\lim_{t \to \infty} C^t = 0$. Thus, we have $$P^t = B \begin{pmatrix} 1 & 0 \\ 0 & C^t \end{pmatrix} B^{-1} \quad \text{and therefore} \quad \lim_{t \to \infty} P^t = B \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} B^{-1}.$$
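This convergence can be observed by power iteration (an illustrative aside with a hypothetical 3-state column-stochastic chain, assuming NumPy): $P^t p_0$ approaches $\pi$ from any start, and $P^t$ itself approaches the rank-one matrix whose columns are all $\pi$.

```python
import numpy as np

# A hypothetical 3-state column-stochastic chain (columns sum to 1);
# its stationary distribution works out to pi = (7/30, 18/30, 5/30).
P = np.array([[0.6, 0.1, 0.2],
              [0.3, 0.8, 0.3],
              [0.1, 0.1, 0.5]])
pi = np.array([7/30, 18/30, 5/30])

# Power iteration: P^t p0 converges to pi regardless of the start p0.
p = np.array([1.0, 0.0, 0.0])   # start concentrated on state 1
for _ in range(200):
    p = P @ p
assert np.allclose(p, pi)

# P^t itself converges to the rank-one matrix whose columns are all pi.
Pt = np.linalg.matrix_power(P, 200)
assert np.allclose(Pt, np.outer(pi, np.ones(3)))
```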

PageRank Algorithm
