As promised in class, I will present a proof of the Cayley-Hamilton Theorem. This theorem holds over any commutative ring (don’t worry if you don’t know what that is) in particular it holds over and . I’ll go ahead and prove it over .
First a quick reminder of the characteristic polynomial, and the method for computing the eigenvalues/vectors of a matrix you learned in class.
We want to solve and hence we rearrange to get
Now the equation has a non-zero solution if and only if (exercise!), and hence we set
We have seen that the determinant is a polynomial in the entries of the matrix, and so this equation above is a polynomial, called the characteristic polynomial, with a variable that we are going to try and find. This will be an eigenvalue of the matrix , and the corresponding (which we can find by substituting for ) is an eigenvector.
e.g. Let’s do a quick example.
Let so the characteristic polynomial is given by
Let’s give this polynomial a name, write . We see the roots are 1 and 3, so these are the eigenvalues, and we could go ahead and use these to find their corresponding eigenvectors (exercise). Look what happens when we feed our matrix into this polynomial, we get,
So the matrix satisfied it’s own characteristic polynomial, . It turns out that this is true in general, and that is the statement of the Cayley-Hamilton theorem.
Theorem. (Cayley-Hamilton) Let be an square matrix with characteristic polynomial . Then .
The proof of this theorem will go in the following way. First we’ll show it for diagonalizable matrices. This case will be easy. We will then use the fact that we can approximate all matrices by diagonalizable ones, that is, we can find a sequence of diagonalizable matrices that converge to .
Pf. Suppose is a diagonalisable matrix, that is, there exists such that
with diagonal of the form where are the eigenvalues of . We know then (or else, exercise) that , and that since are the roots of the characteristic polynomial. Hence we have shown that for all diagonalizable matrices.
Now let be any complex matrix. I claim that there exists a sequence of matrices with the following properties:
- Each is diagonalisable
- that is, the (pointwise) limit (meaning, fix an entry of the matrix, say the upper left entry, and consider the sequence of complex numbers defined by , this sequence converges to the upper left entry in )
There are a few ways to see why this should be true. Something you have probably heard me say before is that the set of diagonalizable matrices is dense in the set of matrices. Points 1 and 2 are exactly what that means. Once we learn about the Jordan-Normal form for matrices we can prove this precisely, but for now let me justify this as follows: a matrix is diagonalizable if it has distinct eigenvalues. The idea is, if we don’t have distinct eigenvalues, we can make small perturbations (that is, jiggle around the entries a tiny bit) until we get one with distinct entries.
Okay, if we believe 1 and 2 (again, don’t worry if you don’t right now, we will make this precise later) we can complete the proof. Approximate . Let be the characteristic polynomial of . We have . But (since they are diagonalizable by assumption), and thus we have that . This completes the proof of the Cayley-Hamilton theorem.