2025-12-28

1522: Antisymmetric Real Matrix Can Be Block-Diagonalized by Orthogonal Matrix

<The previous article in this series | The table of contents of this series | The next article in this series>

description/proof of that antisymmetric real matrix can be block-diagonalized by orthogonal matrix

Topics


About: matrices space

The table of contents of this article


Starting Context



Target Context


  • The reader will have a description and a proof of the proposition that any antisymmetric real matrix can be block-diagonalized by an orthogonal matrix.

Orientation


There is a list of definitions discussed so far in this site.

There is a list of propositions discussed so far in this site.


Main Body


1: Structured Description


Here are the rules of Structured Description.

Entities:
\(M\): \(\in \{\text{ the } n \times n \text{ real antisymmetric matrices } \}\)
//

Statements:
\(\exists O \in \{\text{ the orthogonal matrices }\} (O^t M O = \begin{pmatrix} 0 & \sqrt{\lambda_1} & & & & & \\ - \sqrt{\lambda_1} & 0 & & & & & \\ & & \ddots & & & & \\ & & & 0 & \sqrt{\lambda_{2 m - 1}} & & \\ & & & - \sqrt{\lambda_{2 m - 1}} & 0 & & \\ & & & & & 0 & \\ & & & & & & \ddots \end{pmatrix})\), where \((\lambda_1, ..., \lambda_{2 m}, 0, ..., 0)\) are the decreasingly ordered eigenvalues of \(M^t M\) with \(0 \lt \lambda_j\) and \(\lambda_{2 r - 1} = \lambda_{2 r}\) for each \(r \in \{1, ..., m\}\), and the blank components are \(0\)
//


2: Note


"block-diagonalized" means that the result has some diagonal blocks (which means blocks at diagonal positions, not blocks having diagonal shapes), each of which is \(\begin{pmatrix} 0 & \sqrt{\lambda_j} \\ - \sqrt{\lambda_j} & 0 \end{pmatrix}\) or \(\begin{pmatrix} 0 \end{pmatrix}\), with the other components \(0\).

Obviously, any nonzero antisymmetric matrix cannot be diagonalized by any orthogonal matrix: \(O^t M O\) is antisymmetric, so, its diagonal components are all \(0\); so, if \(O^t M O\) were diagonal, it would be the \(0\) matrix, and then, \(M = O O^t M O O^t = 0\), a contradiction.
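As a quick numerical sanity check (not part of the proof; the matrix size and random seed below are arbitrary choices), conjugating an antisymmetric matrix by an orthogonal matrix indeed yields an antisymmetric matrix whose diagonal components are all \(0\):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
M = A - A.T                                       # a random real antisymmetric matrix
O, _ = np.linalg.qr(rng.standard_normal((4, 4)))  # a random orthogonal matrix

B = O.T @ M @ O
assert np.allclose(B, -B.T)             # O^t M O is again antisymmetric,
assert np.allclose(np.diag(B), 0.0)     # so its diagonal components are all 0
```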


3: Proof


Whole Strategy: Step 1: see that \(M^t M\) is symmetric and has the decreasingly ordered eigenvalues (with any duplications), \((\lambda_1, ..., \lambda_k, 0, ..., 0)\), where \(0 \lt \lambda_j\), with some eigenvectors, \((e_1, ..., e_n)\); Step 2: see that \(M e_j\) is an eigenvector for \(\lambda_j\) orthogonal to \(e_j\), so, \((e_j, M e_j)\) forms a pair for the same \(\lambda_j\); Step 3: take some orthonormal eigenvectors of \(M^t M\), \((O_1, ..., O_{2 m}, O_{2 m + 1}, ..., O_n)\), with the eigenvalues, \((\lambda_1, ..., \lambda_{2 m}, 0, ..., 0)\); Step 4: take \(O\) as \(\begin{pmatrix} O_1 & ... & O_n \end{pmatrix}\); Step 5: see that \(O^t M O\) is as is demanded.

Step 1:

\(M^t M\) is symmetric, because \((M^t M)^t = M^t {M^t}^t\), by the proposition that for any commutative ring, the transpose of the product of any matrices is the product of the transposes of the constituents in the reverse order, \(= M^t M\).

So, \(M^t M\) has the eigenvalues (ordered decreasingly for our convenience), \((\lambda_1, ..., \lambda_n)\), with any duplications, with some eigenvectors, \((e_1, ..., e_n)\), as is well known.

Let us see that \(0 \le \lambda_j\) for each \(j \in \{1, ..., n\}\).

\(M^t M e_j = \lambda_j e_j\).

\({e_j}^t M^t M e_j = (M e_j)^t M e_j = \Vert M e_j \Vert^2\), which is non-negative.

But \({e_j}^t M^t M e_j = {e_j}^t \lambda_j e_j = \lambda_j {e_j}^t e_j = \lambda_j \Vert e_j \Vert^2\).

So, \(0 \le \lambda_j \Vert e_j \Vert^2\), which implies that \(0 \le \lambda_j\).

So, \((\lambda_1, ..., \lambda_n) = (\lambda_1, ..., \lambda_k, 0, ..., 0)\) where \(0 \lt \lambda_j\), where the "\(0, ..., 0\)" part does not really exist when \(k = n\).
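The claims of Step 1 can be checked numerically (a NumPy sketch, not part of the proof; the size and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 6))
M = A - A.T                             # a random real antisymmetric matrix

S = M.T @ M
assert np.allclose(S, S.T)              # M^t M is symmetric

lams = np.linalg.eigvalsh(S)            # real eigenvalues, in increasing order
assert np.all(lams >= -1e-12)           # 0 <= lambda_j for each j
```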

Step 2:

Let \(j \in \{1, ..., k\}\) be any.

Let us see that \(M e_j\) is an eigenvector for \(\lambda_j\) orthogonal to \(e_j\).

\(M^t M (M e_j) = - M^t (- M) (M e_j) = - (- M) (M^t) (M e_j) = M (M^t M e_j) = M (\lambda_j e_j) = \lambda_j (M e_j)\).

On the other hand, \(M (M e_j) = - (- M M e_j) = - (M^t M e_j) = - \lambda_j e_j \neq 0\), which implies that \(M e_j \neq 0\).

\(e_j = - 1 / \lambda_j M (M e_j)\).

\({e_j}^t (M e_j) = (- 1 / \lambda_j M (M e_j))^t M e_j = - 1 / \lambda_j (M (M e_j))^t M e_j = - 1 / \lambda_j (M e_j)^t M^t M e_j = - 1 / \lambda_j (M e_j)^t \lambda_j e_j = - (M e_j)^t e_j = - ((M e_j)^t e_j)^t\), because the transpose of any scalar is the scalar, \(= - {e_j}^t ((M e_j)^t)^t = - {e_j}^t (M e_j)\), which implies that \({e_j}^t (M e_j) = 0\).

So, \(M e_j\) is an eigenvector for \(\lambda_j\) orthogonal to \(e_j\).

So, \(\{e_j, M e_j\}\) is linearly independent, and \((e_j, M e_j)\) forms a pair of eigenvectors for \(\lambda_j\).
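Step 2 can likewise be sketched numerically (NumPy; arbitrary size and seed): \(M e_j\) is again an eigenvector of \(M^t M\) for \(\lambda_j\), and it is orthogonal to \(e_j\):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
M = A - A.T                             # a random real antisymmetric matrix

lams, E = np.linalg.eigh(M.T @ M)       # eigenvalues in increasing order
j = np.argmax(lams)                     # an index with 0 < lambda_j
e, lam = E[:, j], lams[j]
Me = M @ e

assert np.allclose(M.T @ M @ Me, lam * Me)   # M e_j is an eigenvector for lambda_j
assert np.isclose(e @ Me, 0.0)               # M e_j is orthogonal to e_j
```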

Step 3:

\(O_1 := e_1 / \Vert e_1 \Vert\) is a normal eigenvector for \(\lambda_1\).

Let us take \(O_2 := - 1 / \sqrt{\lambda_1} M O_1\), which is an eigenvector for \(\lambda_1\) orthogonal to \(O_1\) by Step 2.

\({O_2}^t O_2 = (- 1 / \sqrt{\lambda_1} M O_1)^t (- 1 / \sqrt{\lambda_1} M O_1) = 1 / \lambda_1 (M O_1)^t M O_1 = 1 / \lambda_1 {O_1}^t M^t M O_1 = 1 / \lambda_1 {O_1}^t \lambda_1 O_1 = {O_1}^t O_1 = 1\), so, \(O_2\) is a normal eigenvector for \(\lambda_1\) orthogonal to \(O_1\).

Note that \(M O_2 = M (- 1 / \sqrt{\lambda_1} M O_1) = - 1 / \sqrt{\lambda_1} M M O_1 = 1 / \sqrt{\lambda_1} M^t M O_1 = 1 / \sqrt{\lambda_1} \lambda_1 O_1 = \sqrt{\lambda_1} O_1\), because \(- M = M^t\).
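A numerical sketch of this pair construction (NumPy, not part of the proof; size and seed are arbitrary): \(O_2\) is normal, orthogonal to \(O_1\), and \(M O_2 = \sqrt{\lambda_1} O_1\):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((6, 6))
M = A - A.T                             # a random real antisymmetric matrix

lams, E = np.linalg.eigh(M.T @ M)
lam1 = lams[-1]                         # the largest (positive) eigenvalue
O1 = E[:, -1] / np.linalg.norm(E[:, -1])
O2 = -M @ O1 / np.sqrt(lam1)            # O_2 := -1/sqrt(lambda_1) M O_1

assert np.isclose(O2 @ O2, 1.0)         # O_2 is normal,
assert np.isclose(O1 @ O2, 0.0)         # orthogonal to O_1,
assert np.allclose(M @ O2, np.sqrt(lam1) * O1)  # and M O_2 = sqrt(lambda_1) O_1
```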

If there is no more duplication of \(\lambda_1\), \((\lambda_1, \lambda_2 = \lambda_1)\) are all the duplications of \(\lambda_1\).

Let us suppose that there is another duplication of \(\lambda_1\).

A normal eigenvector, \(O_3\), can be taken to be orthogonal to \((O_1, O_2)\), by the definition of Gram-Schmidt orthonormalization of countable subset of vectors space with inner product.

Then, let us take \(O_4 := - 1 / \sqrt{\lambda_1} M O_3\), a normal eigenvector for \(\lambda_1\) orthogonal to \(O_3\), as before.

Let us see that \(O_4\) is orthogonal also to \(O_1\) and \(O_2\).

For \(j \in \{1, 2\}\), \({O_j}^t O_4 = {O_j}^t (- 1 / \sqrt{\lambda_1} M O_3) = - 1 / \sqrt{\lambda_1} {O_j}^t M O_3 = - 1 / \sqrt{\lambda_1} {O_j}^t {M^t}^t O_3 = - 1 / \sqrt{\lambda_1} (M^t O_j)^t O_3 = - 1 / \sqrt{\lambda_1} (- M O_j)^t O_3 = 1 / \sqrt{\lambda_1} (M O_j)^t O_3\), but \(M O_j\) is a scalar multiple of \(O_1\) or \(O_2\), so, \(= 0\).

And so on, after all, \(\lambda_1\) has some even duplications, \((\lambda_1, \lambda_2 = \lambda_1, ..., \lambda_{2 l - 1} = \lambda_1, \lambda_{2 l} = \lambda_1)\), with the orthonormal eigenvectors, \((O_1, O_2, ..., O_{2 l - 1}, O_{2 l})\).

Doing likewise for each eigenvalue-positive-duplications, we have the eigenvalues, \((\lambda_1, ..., \lambda_{2 m})\) with the orthonormal eigenvectors, \((O_1, ..., O_{2 m})\): any 2 eigenvectors with different eigenvalues, \(O_j, O_l\), are inevitably orthogonal to each other, because \((\lambda_l - \lambda_j) {O_j}^t O_l = \lambda_l {O_j}^t O_l - \lambda_j {O_j}^t O_l = {O_j}^t M^t M O_l - (M^t M O_j)^t O_l = ((M^t M)^t O_j)^t O_l - (M^t M O_j)^t O_l = (M^t M O_j)^t O_l - (M^t M O_j)^t O_l = 0\), which implies that \({O_j}^t O_l = 0\).

For the eigenvalue-0-duplications, we take any orthonormal eigenvectors, by the definition of Gram-Schmidt orthonormalization of countable subset of vectors space with inner product.

So, we have the eigenvalues \((\lambda_1, ..., \lambda_{2 m}, 0, ..., 0)\) with the orthonormal eigenvectors, \((O_1, ..., O_{2 m}, O_{2 m + 1}, ..., O_n)\): any 2 eigenvectors with different eigenvalues, \(O_j, O_l\), are inevitably orthogonal to each other, as before.

Step 4:

Let us take \(O := \begin{pmatrix} O_1 & ... & O_n \end{pmatrix}\).

\(O\) is an orthogonal matrix, because \((O_1, ..., O_n)\) is orthonormal: \((O_1, ..., O_n)\)'s being orthonormal is nothing but \(O^t O = I\).

Step 5:

Let us see that \(O^t M O\) is as is demanded.

For each \(2 m \lt j\), \(M O_j = 0\), because \(M^t M O_j = 0\), so, \({O_j}^t M^t M O_j = 0\), but the left hand side is \((M O_j)^t M O_j = \Vert M O_j \Vert^2\), so, \(\Vert M O_j \Vert^2 = 0\), which implies that \(M O_j = 0\).

Let us see that \((O^t M O)^j_l = {O_j}^t M O_l\).

\((O^t M O)^j_l = (O^t)^j (M O)_l\), where \((O^t)^j\) denotes the \(j\)-th row of \(O^t\) and \((M O)_l\) denotes the \(l\)-th column of \(M O\).

\((O^t)^j = {O_j}^t\).

\((M O)_l = M O_l\).

So, \((O^t M O)^j_l = {O_j}^t M O_l\).

For each \(j = 2 r + 1\) for each \(r \in \{0, ..., m - 1\}\), for \(l = j + 1\), \({O_j}^t M O_l = {O_j}^t \sqrt{\lambda_j} O_j = \sqrt{\lambda_j}\), and for any other \(l\), \({O_j}^t M O_l = 0\), because when \(l \le 2 m\), \(M O_l\) is a scalar multiple of the other in the pair to which \(O_l\) belongs, and when \(2 m \lt l\), \(M O_l = 0\).

For each \(j = 2 r + 2\) for each \(r \in \{0, ..., m - 1\}\), for \(l = j - 1\), \({O_j}^t M O_l = {O_j}^t (- \sqrt{\lambda_j} O_j) = - \sqrt{\lambda_j}\), and for any other \(l\), \({O_j}^t M O_l = 0\), because when \(l \le 2 m\), \(M O_l\) is a scalar multiple of the other in the pair to which \(O_l\) belongs, and when \(2 m \lt l\), \(M O_l = 0\).

For each \(j\) such that \(2 m \lt j\), for each \(l \in \{1, ..., n\}\), \({O_j}^t M O_l = 0\), because when \(l \le 2 m\), \(M O_l\) is a scalar multiple of the other in the pair to which \(O_l\) belongs, and when \(2 m \lt l\), \(M O_l = 0\).

That means that \(O^t M O\) is as is demanded.
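The whole construction of the proof can be sketched as a NumPy routine (an illustrative implementation, not a robust library function; the function name, tolerance, and test matrix are arbitrary choices): eigendecompose \(M^t M\) decreasingly, pair each not-yet-spanned unit eigenvector \(O_{2 r + 1}\) for a positive eigenvalue \(\lambda\) with \(O_{2 r + 2} := - 1 / \sqrt{\lambda} M O_{2 r + 1}\), and append the eigenvalue-\(0\) eigenvectors at the end:

```python
import numpy as np

def block_diagonalize_antisymmetric(M, tol=1e-8):
    """Orthogonal O with O^t M O in the 2x2-block form, following the proof.

    Step 1: eigendecompose M^t M (symmetric, eigenvalues >= 0), decreasingly.
    Step 3: each new unit eigenvector o1 for a positive eigenvalue gets the
            partner o2 = -M o1 / sqrt(lam); eigenvectors already spanned by
            earlier pairs are skipped (each positive lam has even duplication).
    """
    n = M.shape[0]
    lams, E = np.linalg.eigh(M.T @ M)   # increasing order
    order = np.argsort(-lams)           # reorder decreasingly, as in Step 1
    lams, E = lams[order], E[:, order]

    cols = []
    for j in range(n):
        v = E[:, j]
        for c in cols:                  # Gram-Schmidt against the chosen columns
            v = v - (c @ v) * c
        nv = np.linalg.norm(v)
        if nv < tol:                    # already spanned by an earlier pair
            continue
        o1 = v / nv
        cols.append(o1)
        if lams[j] > tol:               # positive eigenvalue: append the partner
            cols.append(-M @ o1 / np.sqrt(lams[j]))
    return np.column_stack(cols)

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 5))
M = A - A.T                             # generic 5x5: m = 2 pairs, one 0 eigenvalue
O = block_diagonalize_antisymmetric(M)
B = O.T @ M @ O

assert np.allclose(O.T @ O, np.eye(5))  # O is orthogonal
off = np.abs(B).copy()                  # only the 2x2 blocks (0, s; -s, 0) survive
for r in range(2):
    assert B[2*r, 2*r+1] > 0            # the sqrt(lambda) entry
    assert np.isclose(B[2*r+1, 2*r], -B[2*r, 2*r+1])
    off[2*r, 2*r+1] = off[2*r+1, 2*r] = 0.0
assert np.allclose(off, 0.0)            # every other component is 0
```

The Gram-Schmidt skip mirrors Step 3: once a pair \((O_{2 r + 1}, O_{2 r + 2})\) is chosen, the remaining computed eigenvectors of that eigenvalue are already in its span and contribute nothing new.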


References


<The previous article in this series | The table of contents of this series | The next article in this series>