The Cayley-Hamilton Theorem

As promised in class, I will present a proof of the Cayley-Hamilton Theorem. This theorem holds over any commutative ring (don’t worry if you don’t know what that is) in particular it holds over \mathbb{R} and \mathbb{C}. I’ll go ahead and prove it over \mathbb{C}.

First a quick reminder of the characteristic polynomial, and the method for computing the eigenvalues/vectors of a matrix you learned in class.

We want to solve Av = \lambda v and hence we rearrange to get

Av-\lambda v =0

(A-\lambda I)v=0

Now the equation Mx=0 has a non-zero solution if and only if det M =0 (exercise!), and hence we set

det(A-\lambda I)=0

We have seen that the determinant is a polynomial in the entries of the matrix, and so this equation above is a polynomial, called the characteristic polynomial, with a variable \lambda that we are going to try and find. This \lambda will be an eigenvalue of the matrix A, and the corresponding v (which we can find by substituting for \lambda) is an eigenvector.

e.g. Let’s do a quick example.

Let A=\left(\begin{array}{cc}2 & 1 \\1 & 2\end{array}\right) so the characteristic polynomial is given by

det\left(\begin{array}{cc}2 & 1 \\1 & 2\end{array}\right) = (2-\lambda)(2-\lambda)-1=\lambda^2-4\lambda + 3=0

Let’s give this polynomial a name, write p(x)=x^2-4x+3. We see the roots are 1 and 3, so these are the eigenvalues, and we could go ahead and use these to find their corresponding eigenvectors (exercise). Look what happens when we feed our matrix A into this polynomial, we get,

A^2-4A+3I=\left(\begin{array}{cc}5 & 4 \\4 & 5\end{array}\right)-4\left(\begin{array}{cc}2 & 1 \\1 & 2\end{array}\right)+3I=0

So the matrix A satisfied it’s own characteristic polynomial, p(A)=0. It turns out that this is true in general, and that is the statement of the Cayley-Hamilton theorem.

Theorem. (Cayley-Hamilton) Let A be an n \times n square matrix with characteristic polynomial p(x). Then p(A)=0.

The proof of this theorem will go in the following way. First we’ll show it for diagonalizable matrices. This case will be easy. We will then use the fact that we can approximate all matrices by diagonalizable ones, that is, we can find a sequence of diagonalizable matrices that converge to A.

Pf. Suppose A is a diagonalisable matrix, that is, there exists P such that


with D diagonal of the form diag(\lambda_1, \ldots, \lambda_n) where \lambda_i are the eigenvalues of A. We know then (or else, exercise) that p(A) = Pp(D)P^{-1}, and that p(D)=diag(p(\lambda_1),\ldots,p(\lambda_n))=0 since \lambda_i are the roots of the characteristic polynomial. Hence we have shown that p(A)=0 for all diagonalizable matrices.

Now let A be any n \times n complex matrix. I claim that there exists a sequence of matrices \{A_k\} with the following properties:

  1. Each A_k is diagonalisable
  2. \lim_{k \rightarrow \infty} A_k = A that is, the (pointwise) limit (meaning, fix an entry of the matrix, say the upper left entry, and consider the sequence of complex numbers defined by \lim A_k, this sequence converges to the upper left entry in A)

There are a few ways to see why this should be true. Something you have probably heard me say before is that the set of diagonalizable matrices is dense in the set of matrices. Points 1 and 2 are exactly what that means. Once we learn about the Jordan-Normal form for matrices we can prove this precisely, but for now let me justify this as follows: a matrix is diagonalizable if it has distinct eigenvalues. The idea is, if we don’t have distinct eigenvalues, we can make small perturbations (that is, jiggle around the entries a tiny bit) until we get one with distinct entries.

Okay, if we believe 1 and 2 (again, don’t worry if you don’t right now, we will make this precise later) we can complete the proof. Approximate A = \lim_{k\rightarrow \infty} A_k. Let p_k(x) be the characteristic polynomial of A_k. We have \lim_{k \rightarrow \infty} p_k(A_k) = p(A). But p_k(A_k)=0 (since they are diagonalizable by assumption), and thus we have that p(A)=0. This completes the proof of the Cayley-Hamilton theorem.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s