description/proof of Sylvester's law of inertia for the signature of a Hermitian matrix
Topics
About: matrices space
The table of contents of this article
Starting Context
- The reader knows a definition of Hermitian matrix.
- The reader admits the proposition that the Hermitian conjugate of the product of any complex matrices is the product of the Hermitian conjugates of the constituents in the reverse order.
- The reader admits the proposition that the Laplace expansion of the determinant of any square matrix over any commutative ring holds, and its corollary.
- The reader admits the proposition that for any invertible complex matrix, the inverse of the Hermitian conjugate of the matrix is the Hermitian conjugate of the inverse of the matrix.
- The reader admits the proposition that the rank of any matrix over any field is conserved by multiplying any invertible matrices from left and right.
- The reader admits the rank-nullity law for any linear map between finite-dimensional vector spaces.
- The reader admits the proposition that for any vector space over any field and any square matrix over the field with dimension equal to or smaller than the dimension of the vector space, the matrix is invertible if it maps a linearly-independent set of vectors to a linearly-independent set of vectors, and if the matrix is invertible, it maps any linearly-independent set of vectors to a linearly-independent set of vectors.
Target Context
- The reader will have a description and a proof of Sylvester's law of inertia for the signature of a Hermitian matrix.
Orientation
There is a list of definitions discussed so far in this site.
There is a list of propositions discussed so far in this site.
Main Body
1: Structured Description
Here are the rules of Structured Description.
Entities:
\(M\): \(\in \{\text{ the } d \times d \text{ Hermitian matrices }\}\)
\((p, n, z)\): \(= \text{ the signature of } M\)
//
Statements:
\(\forall N \in \{\text{ the invertible matrices }\} \text{ such that } N^* M N = P \text{ where } P \text{ is the diagonal matrix with the diagonal elements, } (\rho_1, ..., \rho_d): (\rho_1, ..., \rho_d) \text{ has } p \text{ positives, } n \text{ negatives, and } z \text{ } 0 \text{ s}\)
//
2: Note
In fact, there are some such \(N\) s, because any Hermitian matrix can be diagonalized with real diagonal elements by an invertible matrix, as is well known.
As an immediate corollary, for \(M' = N'^* M N'\) where \(N'\) is any invertible matrix, \(M'\) has the same signature as \(M\): \(M'\) is Hermitian, because \(M'^* = (N'^* M N')^* = N'^* M^* {N'^*}^* = N'^* M N' = M'\), and for any diagonalization, \(\Lambda' = O'^* M' O'\), \(\Lambda' = O'^* N'^* M N' O' = (N' O')^* M N' O'\), so, \(\Lambda'\) has \(p\) positives, \(n\) negatives, and \(z\) \(0\) s, by this proposition, which means that \(M'\) has the signature, \((p, n, z)\).
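This corollary lends itself to a numerical check. Here is a minimal sketch in Python with NumPy (the dimension, the random seed, and the forced zero eigenvalue are arbitrary choices for illustration, not part of the proposition): it builds a Hermitian \(M\), a random invertible \(N'\), and compares the sign counts of the eigenvalues of \(M\) and of \(M' = N'^* M N'\).

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5

# Build a Hermitian M, forcing one zero eigenvalue so that z > 0.
A = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
H = A + A.conj().T
w, V = np.linalg.eigh(H)
w[0] = 0.0
M = V @ np.diag(w) @ V.conj().T          # Hermitian with eigenvalues w

def signature(X, tol=1e-9):
    """(p, n, z): the counts of positive, negative, and zero eigenvalues."""
    ev = np.linalg.eigvalsh(X)
    return (int((ev > tol).sum()), int((ev < -tol).sum()), int((np.abs(ev) <= tol).sum()))

# A random complex matrix is invertible with probability 1.
Nprime = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
Mprime = Nprime.conj().T @ M @ Nprime

print(signature(M), signature(Mprime))   # the two signatures agree
```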
3: Proof
Whole Strategy: Step 1: see that the elements of \((\rho_1, ..., \rho_d)\) are all reals; Step 2: see that we can assume without loss of generality that \((\rho_1, ..., \rho_{p'})\) are positive, \((\rho_{p' + 1}, ..., \rho_{p' + n'})\) are negative, and \((\rho_{p' + n' + 1}, ..., \rho_{p' + n' + z'})\) are \(0\); Step 3: let the eigenvalues of \(M\) be \((\lambda_1, ..., \lambda_p, \lambda_{p + 1}, ..., \lambda_{p + n}, \lambda_{p + n + 1}, ..., \lambda_{p + n + z})\) and see that there is an invertible matrix, \(O\), such that \(O^* M O = \Lambda\) where \(\Lambda\) is the diagonal matrix with the diagonal elements, \((\lambda_1, ..., \lambda_d)\); Step 4: see that \(z' = z\); Step 5: suppose that \(p \lt p'\), and find a contradiction, by taking the matrix, \(S := (O_1, ..., O_p, N_{p' + 1}, ..., N_{p' + n'})\), the map, \(f: \mathbb{C}^d \to \mathbb{C}^{p + n'}, v \mapsto S^* M v\), and a \(v_0 \in Ker (f)\), and seeing that \({v_0}^* M v_0\) is both negative and positive.
Step 1:
Let us see that the elements of \((\rho_1, ..., \rho_d)\) are indeed all reals.
\(P^* = (N^* M N)^* = N^* M^* {N^*}^*\), by the proposition that the Hermitian conjugate of the product of any complex matrices is the product of the Hermitian conjugates of the constituents in the reverse order, \(= N^* M N = P\).
Especially, \({P^*}^j_j = P^j_j\), which means that \(\overline{\rho_j} = \rho_j\), which means that \(\rho_j\) is real.
Step 2:
If \((\rho_1, ..., \rho_d)\) is not in any order such that \((\rho_1, ..., \rho_{p'})\) are positive, \((\rho_{p' + 1}, ..., \rho_{p' + n'})\) are negative, and \((\rho_{p' + n' + 1}, ..., \rho_{p' + n' + z'})\) are \(0\), there is a permutation, \(\sigma: \{1, ..., d\} \to \{1, ..., d\}\), such that \((\rho_{\sigma_1}, ..., \rho_{\sigma_d})\) is in such an order.
Let us see that there is an invertible matrix, \(N'\), such that \((N N')^* M N N' = P'\) where \(P'\) is the diagonal matrix with the diagonal elements, \((\rho_{\sigma_1}, ..., \rho_{\sigma_d})\).
Let \(N'\) be such that \(N'^j_l = \delta_{j, \sigma_l}\), which means that the \(l\)-th column is such that only the \(\sigma_l\)-th row is \(1\) with the others \(0\). \(N'\) is invertible, because, by iteratively using the proposition that the Laplace expansion of the determinant of any square matrix over any commutative ring holds and its corollary, as the 1st column of \(N'\) has the single \(1\), \(det N'\) is \(\pm 1\) times the \((\sigma_1, 1)\) minor, but as the 1st column of that minor has a single \(1\), the minor is expanded likewise, and so on; after all, the last minor is \(1\), so, \(det N'\) is \(1\) or \(-1\).
\((N N')^* M N N' = N'^* N^* M N N' = N'^* P N'\), and \((N'^* P N')^j_l = {N'^*}^j_o P^o_q N'^q_l = \delta_{o, \sigma_j} P^o_q \delta_{q, \sigma_l} = P^{\sigma_j}_{\sigma_l} = \rho_{\sigma_j} \delta_{\sigma_j, \sigma_l} = \rho_{\sigma_j} \delta_{j, l}\), which means that \((N N')^* M N N' = P'\).
So, if \(P\) exists, \(P'\) exists, and if the proposition holds for \(P'\), it holds for \(P\), because \((\rho_1, ..., \rho_d)\) is just a permutation of \((\rho_{\sigma_1}, ..., \rho_{\sigma_d})\) and so has the same numbers of positives, negatives, and \(0\) s.
So, it suffices to prove the proposition for the case such that \((\rho_1, ..., \rho_{p'})\) are positive, \((\rho_{p' + 1}, ..., \rho_{p' + n'})\) are negative, and \((\rho_{p' + n' + 1}, ..., \rho_{p' + n' + z'})\) are \(0\).
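The permutation matrix of this step can be sketched numerically as follows (Python with NumPy; the diagonal \((\rho_1, ..., \rho_d)\) is an arbitrary example): \(N'\) is built with \(N'^j_l = \delta_{j, \sigma_l}\), and the checks confirm that \(N'^* P N'\) carries the reordered diagonal and that \(det N' = \pm 1\).

```python
import numpy as np

rho = np.array([-2.0, 3.0, 0.0, 1.0, -5.0])   # an arbitrary real diagonal of P
P = np.diag(rho)
d = len(rho)

# sigma lists the indices of the positives, then the negatives, then the 0s.
sigma = np.concatenate([np.flatnonzero(rho > 0),
                        np.flatnonzero(rho < 0),
                        np.flatnonzero(rho == 0)])

Nprime = np.zeros((d, d))
for l in range(d):
    Nprime[sigma[l], l] = 1.0   # the l-th column has its single 1 in the sigma_l-th row

print(np.diag(Nprime.T @ P @ Nprime))   # [ 3.  1. -2. -5.  0.]: the reordered diagonal
print(np.linalg.det(Nprime))            # 1.0 or -1.0
```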
Step 3:
Let the eigenvalues of \(M\) be positive \((\lambda_1, ..., \lambda_p)\), negative \((\lambda_{p + 1}, ..., \lambda_{p + n})\), and \(0\) \((\lambda_{p + n + 1}, ..., \lambda_{p + n + z})\).
There is an invertible matrix, \(O\), such that \(O^* M O = \Lambda\) where \(\Lambda\) is the diagonal matrix with the diagonal elements, \((\lambda_1, ..., \lambda_d)\), because while there is a unitary matrix, \(U\), such that \(U^* M U\) is a diagonal matrix with the diagonal elements, \((\lambda_1, ..., \lambda_d)\), in some order, as is well known, the order can be changed to \((\lambda_1, ..., \lambda_d)\) by an invertible matrix, as in Step 2.
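In practice, such an \(O\) can be obtained from a numerical eigendecomposition; here is a minimal sketch (Python with NumPy; `diagonalize_signed` is a hypothetical helper name): `numpy.linalg.eigh` supplies the unitary \(U\), and a column permutation as in Step 2 puts the eigenvalues into the positives-negatives-zeros order.

```python
import numpy as np

def diagonalize_signed(M, tol=1e-9):
    """Return (O, lam) with O invertible and O^* M O = diag(lam),
    where lam lists the positive eigenvalues, then the negative ones, then the 0s."""
    w, U = np.linalg.eigh(M)                  # U is unitary and U^* M U = diag(w)
    order = np.concatenate([np.flatnonzero(w > tol),
                            np.flatnonzero(w < -tol),
                            np.flatnonzero(np.abs(w) <= tol)])
    O = U[:, order]                           # a column permutation of U: still invertible
    return O, w[order]

# Usage: O, lam = diagonalize_signed(M);
# then np.allclose(O.conj().T @ M @ O, np.diag(lam)) holds.
```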
Step 4:
So, we have \(N^* M N = P\) and \(O^* M O = \Lambda\), so, \(M = {O^*}^{-1} \Lambda O^{-1}\), and \(P = N^* {O^*}^{-1} \Lambda O^{-1} N = N^* {O^{-1}}^* \Lambda O^{-1} N\), by the proposition that for any invertible complex matrix, the inverse of the Hermitian conjugate of the matrix is the Hermitian conjugate of the inverse of the matrix, \(= (O^{-1} N)^* \Lambda O^{-1} N\).
So, the rank of \(P\) equals the rank of \(\Lambda\), by the proposition that the rank of any matrix over any field is conserved by multiplying any invertible matrices from left and right.
But \(Rank (P) = p' + n'\) and \(Rank (\Lambda) = p + n\), because the rank of a diagonal matrix is the number of its nonzero diagonal elements, so, \(p' + n' = p + n\), and as \(d = p' + n' + z' = p + n + z\), \(z = z'\).
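The rank invariance used here can be checked numerically; a minimal sketch (Python with NumPy; the diagonal and the dimension are arbitrary choices): multiplying \(\Lambda\) by an invertible matrix and its transpose conjugate leaves the rank, and hence \(d\) minus the number of \(0\) s, unchanged.

```python
import numpy as np

rng = np.random.default_rng(1)
Lam = np.diag([2.0, 1.0, -3.0, 0.0, 0.0])     # p = 2, n = 1, z = 2, so Rank = 3
G = rng.standard_normal((5, 5))               # invertible with probability 1
P = G.T @ Lam @ G                             # a congruence of Lam

print(np.linalg.matrix_rank(Lam), np.linalg.matrix_rank(P))  # 3 3, so z' = z = 2
```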
Step 5:
Let us suppose that \(p \lt p'\).
Let us take the \(d \times (p + n')\) matrix, \(S := (O_1, ..., O_p, N_{p' + 1}, ..., N_{p' + n'})\) where \(O_j\) is the \(j\)-th column of \(O\) and \(N_j\) is the \(j\)-th column of \(N\).
\(S^* = \begin{pmatrix} {O_1}^* \\ ... \\ {O_p}^* \\ {N_{p' + 1}}^* \\ ... \\ {N_{p' + n'}}^* \end{pmatrix}\).
Let us take the map, \(f: \mathbb{C}^d \to \mathbb{C}^{p + n'}, v \mapsto S^* M v\).
\(f\) is linear.
So, \(Nullity (f) = d - Rank (f)\), by the rank-nullity law for any linear map between finite-dimensional vector spaces.
But \(Rank (f) \le p + n' \lt p' + n'\).
So, \(z = z' = d - (p' + n') \lt d - Rank (f) = Nullity (f)\).
As \(O\) is invertible, \(\{O_1, ..., O_d\}\) is a basis for \(\mathbb{C}^d\), by the proposition that for any vector space over any field and any square matrix over the field with dimension equal to or smaller than the dimension of the vector space, the matrix is invertible if it maps a linearly-independent set of vectors to a linearly-independent set of vectors, and if the matrix is invertible, it maps any linearly-independent set of vectors to a linearly-independent set of vectors: \((O_1, ..., O_d)\) is the image of \((\begin{pmatrix} 1 \\ 0 \\ ... \\ 0 \end{pmatrix}, ..., \begin{pmatrix} 0 \\ 0 \\ ... \\ 1 \end{pmatrix})\), which is obviously linearly independent.
As \(N\) is invertible, \(\{N_1, ..., N_d\}\) is a basis for \(\mathbb{C}^d\), likewise.
So, for each \(v \in \mathbb{C}^d\), \(v = r^j O_j = s^j N_j\).
There is a nonzero \(v_0 = r^j O_j = s^j N_j \in Ker (f)\) such that \(r^j \neq 0\) for a \(1 \le j \le p + n\) and \(s^j \neq 0\) for a \(1 \le j \le p' + n'\), because otherwise, \(Ker (f)\) would be contained in \(Span (\{O_{p + n + 1}, ..., O_d\}) \cup Span (\{N_{p' + n' + 1}, ..., N_d\})\), but as no vector space over \(\mathbb{C}\) is the union of 2 proper subspaces, \(Ker (f)\) would be contained in \(Span (\{O_{p + n + 1}, ..., O_d\})\) or in \(Span (\{N_{p' + n' + 1}, ..., N_d\})\), which would mean that \(Nullity (f) \le z\) or \(Nullity (f) \le z' = z\), a contradiction against \(z \lt Nullity (f)\).
\(f (v_0) = 0\), but \(f (v_0) = S^* M v_0 = S^* M r^j O_j = r^j S^* M O_j = \sum_{j \in \{1, ..., d\}} r^j \begin{pmatrix} {O_1}^* M O_j \\ ... \\ {O_p}^* M O_j \\ {N_{p' + 1}}^* M O_j \\ ... \\ {N_{p' + n'}}^* M O_j \end{pmatrix} = \sum_{j \in \{1, ..., d\}} r^j \begin{pmatrix} \lambda_j \delta_{1, j} \\ ... \\ \lambda_j \delta_{p, j} \\ {N_{p' + 1}}^* M O_j \\ ... \\ {N_{p' + n'}}^* M O_j \end{pmatrix}\).
So, for each \(1 \le l \le p\), \(0 = \sum_{j \in \{1, ..., d\}} r^j \lambda_j \delta_{l, j} = r^l \lambda_l\), but as \(0 \lt \lambda_l\), \(r^l = 0\).
As \(r^j \neq 0\) for a \(1 \le j \le p + n\), \(r^j \neq 0\) for a \(p + 1 \le j \le p + n\).
Likewise, \(0 = f (v_0) = S^* M v_0 = S^* M s^j N_j = s^j S^* M N_j = \sum_{j \in \{1, ..., d\}} s^j \begin{pmatrix} {O_1}^* M N_j \\ ... \\ {O_p}^* M N_j \\ {N_{p' + 1}}^* M N_j \\ ... \\ {N_{p' + n'}}^* M N_j \end{pmatrix} = \sum_{j \in \{1, ..., d\}} s^j \begin{pmatrix} {O_1}^* M N_j \\ ... \\ {O_p}^* M N_j \\ \rho_j \delta_{p' + 1, j} \\ ... \\ \rho_j \delta_{p' + n', j} \end{pmatrix}\).
So, for each \(1 \le l \le n'\), \(0 = \sum_{j \in \{1, ..., d\}} s^j \rho_j \delta_{p' + l, j} = s^{p' + l} \rho_{p' + l}\), but as \(\rho_{p' + l} \lt 0\), \(s^{p' + l} = 0\).
As \(s^j \neq 0\) for a \(1 \le j \le p' + n'\), \(s^j \neq 0\) for a \(1 \le j \le p'\).
Now, take \({v_0}^* M v_0 = (r^j O_j)^* M (r^l O_l) = \overline{r^j} r^l {O_j}^* M O_l\), but \({O_j}^* M O_l\) is in fact the \((j, l)\) component of \(O^* M O = \Lambda\), which is \(\lambda_j \delta_{j, l}\), so, \(= \sum_{j, l} \overline{r^j} r^l \lambda_j \delta_{j, l} = \sum_{j \in \{1, ..., d\}} \overline{r^j} r^j \lambda_j\), but as \(r^j = 0\) for each \(1 \le j \le p\), \(= \sum_{j \in \{p + 1, ..., d\}} \overline{r^j} r^j \lambda_j\), which is negative, because \(\lambda_j \le 0\) for each \(p + 1 \le j \le d\) while \(r^j \neq 0\) for a \(p + 1 \le j \le p + n\), where \(\lambda_j \lt 0\).
On the other hand, \({v_0}^* M v_0 = (s^j N_j)^* M (s^l N_l) = \overline{s^j} s^l {N_j}^* M N_l\), but \({N_j}^* M N_l\) is in fact the \((j, l)\) component of \(N^* M N = P\), which is \(\rho_j \delta_{j, l}\), so, \(= \sum_{j, l} \overline{s^j} s^l \rho_j \delta_{j, l} = \sum_{j \in \{1, ..., d\}} \overline{s^j} s^j \rho_j\), but as \(s^j = 0\) for each \(p' + 1 \le j \le p' + n'\) and \(\rho_j = 0\) for each \(p' + n' + 1 \le j \le d\), \(= \sum_{j \in \{1, ..., p'\}} \overline{s^j} s^j \rho_j\), which is positive, because \(0 \lt \rho_j\) for each \(1 \le j \le p'\) while \(s^j \neq 0\) for a \(1 \le j \le p'\).
So, we have found a contradiction.
So, \(p' \le p\).
But, by symmetry, \(p \le p'\).
So, \(p' = p\).
So, \(n' = n\), because \(p' + n' = p + n\).
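The whole law can be illustrated end to end: two different invertible congruences that diagonalize the same \(M\) produce different diagonal elements but the same counts of positives, negatives, and \(0\) s. A minimal sketch (Python with NumPy; the column scaling \(c\) is an arbitrary way to manufacture a second diagonalizing \(N\)):

```python
import numpy as np

rng = np.random.default_rng(2)
d = 4
A = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
M = A + A.conj().T                       # Hermitian

w, U = np.linalg.eigh(M)                 # U^* M U = diag(w)

# Two different invertible matrices that both diagonalize M by congruence:
N1 = U
c = rng.uniform(0.5, 2.0, size=d)
N2 = U @ np.diag(c)                      # N2^* M N2 = diag(c**2 * w)

def sign_counts(diag, tol=1e-9):
    return (int((diag > tol).sum()), int((diag < -tol).sum()), int((np.abs(diag) <= tol).sum()))

D1 = np.real(np.diag(N1.conj().T @ M @ N1))
D2 = np.real(np.diag(N2.conj().T @ M @ N2))
print(D1)                                # one set of diagonal elements
print(D2)                                # a different set ...
print(sign_counts(D1), sign_counts(D2))  # ... with the same (p, n, z)
```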