Pages

Basic properties of Markov Matrices

Image result for markov chains

What's up people? A few minutes ago, while I was browsing AoPS, I saw a post about Markov matrices. Being the curious guy that I am, I immediately jumped on Wikipedia to see what they are all about. Apparently, a Markov matrix, also known as a Stochastic matrix (you've got to admit though, Markov matrix sounds a little bit more complex) is a matrix used in probability and statistics in order to find out how a system develops. A Markov matrix usually encodes a Markov chain (this guy surely has many things named after him). Initially, Andrej Markov wanted to use these matrices for linguistic analysis and card shuffling, yet they ended up being used in many other fields. In this article, I want to discuss several interesting properties of these matrices. 

First of all, let me define a Markov matrix. A matrix \(A=(a_{i,j})_{0<i,j<n+1}\) with positive elements is a right stochastic matrix if the sum of the elements on each row is 1, and A is a left stochastic matrix if the sum of the elements on each column is 1. A first, rather obvious property, would be the following:

1. If \(A∈M_n(R)\) is a right stochastic matrix, then \(A^T\) is a left stochastic matrix.

The proof of this property is rather simple and straightforward.

Since the Markov matrix represents how a system goes from one state to another in the Markov chain, the product of two stochastic matrices is also a stochastic matrices.

2. If \(A, B\) are right/left stochastic matrices, then \(AB\) is also a right/left stochastic matrix.

We shall prove that the sum of the elements on the first row is 1. One can observe that the other rows have the same property. 

Let \(C=AB\). Thus, \(c_{11}=a_{11}b_{11}+a_{12}b_{21}+...+a_{1n}b_{n1}\), and, in the end, after summing all the elements and taking the common factor, we'll get exactly what we wanted, that the sum of the elements on the row is 1.

Now, we shall get to the eigenvalues of a Markov matrix.

3. Let \(A\) be a right stochastic matrix. Then all its eigenvalues have absolute value smaller or equal to 1.

Let us consider an eigenvector \(v\) and let \(v_i\) be its greatest element in absolute value. We know that \(Av=av\), where \(a\) is an eigenvalue of \(A\). By finding the value of the product on the i-th column, we get that \(a_{i1}v_1+...+a_{in}v_n=av_i\). We shall write this as a module, and, thus, \(|a_{i1}v_1+...+a_{in}v_n|=a|v_i|≤a_{i1}|v_1|+...+a_{in}|v_n|≤|v_i|\), so \(|a|≤1\), as required.

In fact, I believe that a stronger result can be established.

4. Let \(A\) be both a right and a left stochastic matrix. If \(a\) is an eigenvalue for which there is an eigenvector with the property that the sum of its elements is different than 0. Then, \(a=1\).

Yet again, we look at \(Av=av\). For every row, we have that (a_{i1}v_1+...+a_{in}v_n=av_i\). We sum up these relations for each row, and we get that \(v_1+v_2+...+v_n=a(v_1+v_2+...+v_n)\), because the sum on columns is equal to 1. Since the sum of the elements of the eigenvector is different from 0, we can just divide by it and obtain that \(a=1\).

This property is really interesting, because it tells us that the Markov matrices have one known eigenvalue, if such an eigenvector exists. In fact...

5. 1 is always an eigenvalue of a Markov matrix.

In order to prove this, let us first observe that, if \(A\) is a left stochastic matrix, then \(A^T\) is a right stochastic matrix. By hand, we can check that \([1 1 ... 1]^T\) is an eigenvector of \(A^T\) with eigenvalue 1. Since \(A\) and \(A^T\) have the same eigenvalues, we conclude that \(A\) has eigenvalue 1.

The property that \(A\) and \(A^T\) have the same eigenvalues is well known, yet I believe that a short explanation is in order. Take the characteristic polynomial of \(A\), \(p(X)\). We know that \(p(A)=O\), and, by transposing, \(p(A^T)=O\). Thus, we know that not only does \(A^T\) have the same eigenvalues as \(A\), but the eigenvalues also have the same multiplicities.

This was but an introduction in  the study of Markov matrices. I believe that the properties I described are good enough for a first impression on this topic. I don't know too much probability (you may have seen that I only tackled the STEP 2 probability problems, because I don't know advanced probability), but when I find out more, I'll be sure to discover more properties of these matrices. Tune in next time for probably something regarding the IMO or the IMC (since the IMC starts this week), or something completely random!

Cris.




Matrix problem regarding rational points and a number theory lemma

Image result for competencia iberoamericana interuniversitaria de matemáticas 2018

Eyy there. I was gonna write an article about Markov matrices (which is almost complete, btw), but then, while scrolling through undergrad contests, I found a really neat solution for a problem at the CIIM from last year. The CIIM is a competition for Latin American undergrad students, similar to IMC, Putnam, Seemous and Traian Lalescu. From what I've seen, the problems are a bit harder than those at IMC, but, then again, my opinion isn't necessarily a very competent one. These are the problems (great for you if you know Spanish).

Image result for competencia iberoamericana interuniversitaria de matemáticas 2018

Today, we shall solve the first one (because you know I like matrices). For those that don't know Spanish, here's what it says:

Problem: Prove that there exists a \(2*2\) matrix of order 6 with rational entries, such that the sum of its entries is 2018.

Solution: Let \(A\) be a matrix with the required properties. Since \(A^6=I_2\), we get that \((A-I_2)(A+I_2)(A^2+A+I_2)(A^2-A+I_2)=O_2\). The minimal polynomial of \(A\) is, thus, one of those appearing in the decomposition, since \(A\) is rational. If \(A+I_2=O_2\), then the order of \(A\) is 2. If \(A-I_2=O_2\), then the order of \(A\) is 1. If \(A^2+A+I_2=O_2\), then \(A^3=I_2\). Thus, the only possibility is that \(A^2-A+I_2=O_2\), so \(Tr(A)=1\), \(det(A)=1\). Let \(a, b, c, d\) be the elements of \(A\), in that order. Thus, we know that \(a+d=1\), and \(ad-bc=1\). By writing \(d=1-a\), we get that \(a(1-a)-bc=1\), so \(a-a^2-bc=1\). We want to prove that there exist matrices with these properties such that \(a+b+c+d=2018\), which is equivalent to \(c=2017-b\). Thus, our end goal is proving that \(a-a^2-2017b+b^2=1\) has rational solutions.

Now, we shall write this as the equation of a hyperbola. \(\frac{1}{4}+a-a^2-2017b+b^2+2017^2=\frac{5}{4}+2017^2\), and, by multiplying by 4, we get that \((2b-4034)^2-(2a-1)^2=5+4*2017^2\). However, we know that any odd number can be written as a difference of squares (\(2k+1=(k+1)^2-k^2\), so there exist two real squares for which we get that conclusion. In fact, we can even calculate \(a\) and \(b\), but that is left as an exercise for the reader. Since we found that \(a\) and \(b\) exist, and we found all the rational entries of the matrix according to the rational numbers \(a\) and \(b\), we conclude that there is indeed a matrix with the required properties.

As a fun fact, initially, I made a calculation mistake. Instead of writing the equation as a hyperbola, I wrote it as the equation of a circle. Thus, I got that \((2b-4034)^2+(2a-1)^2=5+4*2017^2\). I do not know whether the above equation has rational solutions, but, while trying to solve it, I remembered of an old problem from Gazeta Matematica 11/2017, proposed by Marian Andronache, which shall be treated as a lemma:

Lemma: Let \(n\) be a natural number bigger than 1 with the sum of its divisors not divisible by 4. Prove that \(n\) can be written as the sum of two squares.

Proof: We shall use the well-known 'sum of two squares theorem', which states that a number can be written as a sum of two squares if and only if any prime divisor of the form \(4k+3\) appears in the decomposition at an even power. I won't get into details regarding the solution to this, but you can see a proof of this theorem here: https://www.math.uchicago.edu/~may/VIGRE/VIGRE2008/REUPapers/Bhaskar.pdf

Getting back to the lemma, let us presume by absurd that \(n\) can't be written as a sum of two squares. Thus, there is an odd number of apparitions of a prime number\(p_i\) of the form \(4m+3\). We know that the sum the divisors is \(S(n)=\prod_{k=1}^{w(n)} \frac{p_k^{a_k+1}-1}{p_k-1}\), where \(a_k+1\) is the power at which a prime number appears in the decomposition. The term from the product associated with \(p_i\) is equal to \(1+p_i+p_i^2+...+p_i^{a_i}\), and, because \(a_i\) is odd, we get that the sum is a multiple of 4, which contradicts our choice of \(n\). The sum is a multiple of 4, because \(p^{2k}\) is a multiple of 4 plus 1, while \(p^{2k+1}\) is a multiple of 4 plus 3, and we can group the terms together.

I was a bit sad to find out that the above lemma doesn't work on the problem. Indeed, the sum of all divisors of \(5+4*2017^2\) is a multiple of 4. It does, however, work on, for example, \(1+4*2017^2\). I was just 4 units away. The lemma that I wrote has some interesting applications, and I might search for some problems to solve using it for the next articles. Thinking about that, I haven't posted any number theory problem so far. I might post something after I'm done with the Markov matrices article. Tune in next time for some properties of stochastic matrices, or something completely random!

Cris.