A description/proof of the Smith normal form theorem for rectangle matrix over principal integral domain
Topics
About: ring
The table of contents of this article
- Starting Context
- Target Context
- Orientation
- Main Body
- 1: Structured Description
- 2: Natural Language Description
- 3: Proof
Starting Context
- The reader knows a definition of principal integral domain.
- The reader admits the proposition that any principal integral domain is a greatest common divisors domain, and for each 2 elements on the principal integral domain, each of the greatest common divisors of the 2 elements is a one by which the sum of the principal ideals by the 2 elements is the principal ideal.
- The reader admits the proposition that for any rectangle matrix over any principal integral domain, there are some types of rows or columns operations each of which can be expressed as the multiplication by an invertible matrix from left or right.
- The reader admits the proposition that for any principal integral domain, any rectangle matrix over the domain, and any invertible rectangular-matrix-columns-dimensional or rectangular-matrix-rows-dimensional square matrix over the domain, the sum of the principal ideals by the specified-dimensional subdeterminants of the product is the sum of the principal ideals by the same-dimensional subdeterminants of the rectangle matrix.
- The reader admits the proposition that for any principal integral domain and any finite subset, the sum of the principal ideals by the elements of the subset is the principal ideal by any of the greatest common divisors of the subset.
- The reader admits the proposition that for any integral domain, if any principal ideal by any element is also by any another element, the elements are associates with each other, and the principal ideal is by any associate.
Target Context
- The reader will have a description and a proof of the proposition that the Smith normal form theorem holds for any rectangle matrix over any principal integral domain.
Orientation
There is a list of definitions discussed so far in this site.
There is a list of propositions discussed so far in this site.
Main Body
1: Structured Description
Here are the rules of Structured Description.
Entities:
\(R\): \(\in \{\text{ the principal integral domains }\}\)
\(M\): \(\in \{\text{ the } m \times n \text{ matrices over } R\}\)
//
Statements:
\(\exists N \in \{\text{ the } n \times n \text{ invertible matrices over } R\}, \exists O \in \{\text{ the } m \times m \text{ invertible matrices over } R\}\)
(
\(\exists k \in \mathbb{N} \text{ such that } 0 \le k \le \min (m, n)\)
(
\(O M N \text{ is such that the top left } k \times k \text{ submatrix is diagonal with the nonzero diagonal components, } (d_1, ..., d_k) \text{ such that } d_j \vert d_{j + 1}\), with all the other components \(0\)
\(\land\)
\(\text{ the form is unique up to multiplication of } d_j \text{ by a unit }\)
)
)
//
2: Natural Language Description
For any principal integral domain, \(R\), and any \(m \times n\) matrix over \(R\), \(M\), there are an \(n \times n\) invertible matrix over \(R\), \(N\), an \(m \times m\) invertible matrix over \(R\), \(O\), and a \(k \in \mathbb{N}\) such that \(0 \le k \le \min (m, n)\) and \(O M N\) is such that the top left \(k \times k\) submatrix is diagonal with the nonzero diagonal components, \((d_1, ..., d_k)\), such that \(d_j \vert d_{j + 1}\), with all the other components \(0\), and the form is unique up to multiplication of each \(d_j\) by a unit.
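The statement can be checked on a concrete instance over \(\mathbb{Z}\), which is a principal integral domain. This is an illustrative sketch with a hand-chosen matrix: for the hypothetical \(M\) below, the matrices \(O\) (determinant \(-1\), a unit) and \(N\) (determinant \(1\)) are invertible over \(\mathbb{Z}\), and \(O M N\) is the Smith normal form with \(d_1 = 2\), \(d_2 = 4\), where \(2 \vert 4\).

```python
def mat_mul(A, B):
    """Multiply 2 matrices given as lists of rows."""
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

M = [[2, 4], [6, 8]]
O = [[1, 0], [3, -1]]   # row operation: the 2nd row becomes 3 * (1st row) - (2nd row)
N = [[1, -2], [0, 1]]   # column operation: the 2nd column becomes (2nd column) - 2 * (1st column)

print(mat_mul(mat_mul(O, M), N))  # -> [[2, 0], [0, 4]]
```

The determinants of \(O\) and \(N\) are \(-1\) and \(1\), which are units of \(\mathbb{Z}\), so both matrices are indeed invertible over \(\mathbb{Z}\).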
3: Proof
Let us prove it inductively with respect to \(\max (m, n)\).
When \(\max (m, n) = 1\), \(M\) is a \(1 \times 1\) matrix, and the proposition obviously holds: \(N\) and \(O\) are elements of \(R\), and their being invertible means that they are units.
Let us suppose that the proposition holds through \(\max (m, n) = n'\).
Let us suppose that \(\max (m, n) = n' + 1\).
Let each component of \(M\) be denoted as \(M^j_l\).
\(R\) is a greatest common divisors domain, by the proposition that any principal integral domain is a greatest common divisors domain, and for each 2 elements on the principal integral domain, each of the greatest common divisors of the 2 elements is a one by which the sum of the principal ideals by the 2 elements is the principal ideal.
Each nonzero element of \(R\) is decomposed into some irreducible elements as factors, and the number of the factors is unique, because any principal integral domain is a unique factorization domain. So, let us think of the map, \(f: R \setminus \{0\} \to \mathbb{N}\), which maps each nonzero element to the number of the irreducible elements as factors, counted with multiplicity, with each unit mapped to \(0\).
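Over \(\mathbb{Z}\), taken as the example principal integral domain, the irreducible elements are the primes up to sign, so \(f\) counts prime factors with multiplicity, and the units, \(\pm 1\), map to \(0\). A minimal sketch by trial division:

```python
def f(r):
    """Number of irreducible (prime) factors of a nonzero integer r,
    counted with multiplicity; the units +1 and -1 map to 0."""
    r = abs(r)
    count = 0
    p = 2
    while p * p <= r:
        while r % p == 0:   # divide out each prime factor p
            r //= p
            count += 1
        p += 1
    if r > 1:               # a remaining prime factor larger than sqrt
        count += 1
    return count

print(f(1), f(7), f(12))  # -> 0 1 3, as 12 = 2 * 2 * 3
```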
Hereafter, we use the proposition that for any rectangle matrix over any principal integral domain, there are some types of rows or columns operations each of which can be expressed as the multiplication by an invertible matrix from left or right, which guarantees that the rows or columns operations we will use are expressed as multiplications by invertible matrices.
When all the components of \(M\) are \(0\), the proposition trivially holds.
Hereafter, let us suppose that at least 1 of the components of \(M\) is nonzero.
Take any nonzero component of \(M\) such that its value under \(f\) is the minimum among all the nonzero components: if there are multiple such components, choose any one of them.
Let us move the component to the \((1, 1)\) position by swapping the rows and the columns, and let the matrix be denoted as \(M'\).
There are 2 cases: the case 1: all the components are divisible by \(M'^1_1\); the case 2: otherwise.
Let us suppose the case 1.
For each \(M'^j_1\) such that \(1 \lt j\), \(M'^j_1 = q_j M'^1_1\), and let us add \(- q_j\) multiple of the 1st row to the \(j\)-th row. Then, \(M'^j_1 - q_j M'^1_1 = 0\). So, the 1st column is now \(0\) except the \((1, 1)\) component.
For each \(M'^1_l\) such that \(1 \lt l\), \(M'^1_l = q_l M'^1_1\), and let us add \(- q_l\) multiple of the 1st column to the \(l\)-th column. Then, \(M'^1_l - q_l M'^1_1 = 0\). So, the 1st row is now \(0\) except the \((1, 1)\) component. Note that the 1st column has not been disrupted by that.
The remaining \((m - 1) \times (n - 1)\) matrix can be transformed to a Smith normal form with the diagonal components, \(d_2, ..., d_k\), by the induction hypothesis, because \(\max (m - 1, n - 1) \le n'\) (when \(\min (m, n) = 1\), there is no remaining matrix, and we are already done). That can be done by an invertible \(m \times m\) matrix and an invertible \(n \times n\) matrix whose 1st rows and 1st columns are \(0\) except the \((1, 1)\) components, which are \(1\), so the 1st row and the 1st column of the whole matrix are not disrupted.
Now, naming \(M'^1_1 = d_1\), the result is a Smith normal form with the diagonal components, \(d_1, ..., d_k\): \(d_1 \vert d_2\), because \(d_2\) is a linear combination of the components of \(M'\) (the operations take only linear combinations of components) while each component of \(M'\) is divisible by \(d_1\).
Let us suppose the case 2.
There is at least \(1\) component that is not divisible by \(M'^1_1\).
There are 3 sub-cases: case 2-1: the 1st column has a non-divisible component; case 2-2: not the case 2-1, but the 1st row has a non-divisible component; case 2-3: not the case 2-1 or the case 2-2.
Let us suppose the case 2-1.
The 1st column has a non-divisible component, \(M'^j_1\). There are a \(d \in \gcd (M'^1_1, M'^j_1)\) and some \(w, x \in R\) such that \(w M'^1_1 + x M'^j_1 = d\), by the proposition that any principal integral domain is a greatest common divisors domain, and for each 2 elements on the principal integral domain, each of the greatest common divisors of the 2 elements is a one by which the sum of the principal ideals by the 2 elements is the principal ideal. Let us replace the 1st row by the 1st row multiplied by \(w\) plus the \(j\)-th row multiplied by \(x\), and replace the \(j\)-th row by the 1st row multiplied by \(- M'^j_1 / d\) plus the \(j\)-th row multiplied by \(M'^1_1 / d\). That is the multiplication from the left by an invertible matrix, because the corresponding \(2 \times 2\) block has the determinant \(w M'^1_1 / d + x M'^j_1 / d = d / d = 1\). Now, the \((1, 1)\) component is \(d\) and the \((j, 1)\) component is \(- (M'^j_1 / d) M'^1_1 + (M'^1_1 / d) M'^j_1 = 0\). Note that \(f (d) \lt f (M'^1_1)\), because \(d\) divides \(M'^1_1\) but is not an associate of it (as \(M'^1_1\) does not divide \(M'^j_1\)), so \(d\) contains fewer irreducible elements as factors.
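The \(2 \times 2\) block used here can be checked over \(\mathbb{Z}\). This is an illustrative sketch with hypothetical components \(a = M'^1_1 = 4\) and \(b = M'^j_1 = 6\) (chosen so that \(a\) does not divide \(b\)); the `bezout` helper is the extended Euclidean algorithm, which realizes the cited proposition over \(\mathbb{Z}\):

```python
def bezout(a, b):
    """Extended Euclid: return (d, w, x) with w*a + x*b == d, a gcd of a and b."""
    if b == 0:
        return (a, 1, 0)
    d, w, x = bezout(b, a % b)
    return (d, x, w - (a // b) * x)

a, b = 4, 6                        # a = M'^1_1, b = M'^j_1; a does not divide b
d, w, x = bezout(a, b)
T = [[w, x], [-b // d, a // d]]    # the 2x2 block acting on the 1st and j-th rows
det_T = T[0][0] * T[1][1] - T[0][1] * T[1][0]
new_first = w * a + x * b                     # the new (1, 1) component
new_jth = (-b // d) * a + (a // d) * b        # the new (j, 1) component
print(d, det_T, new_first, new_jth)  # -> 2 1 2 0
```

Here \(f (d) = f (2) = 1 \lt f (4) = 2\), matching the note above, and the determinant \(1\) confirms that the block is invertible over \(\mathbb{Z}\).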
Now let us return to the cases check (if it is the case 1, the procedure will end).
Let us suppose the case 2-2.
The 1st row has a non-divisible component, \(M'^1_l\). There are a \(d \in \gcd (M'^1_1, M'^1_l)\) and some \(w, x \in R\) such that \(w M'^1_1 + x M'^1_l = d\), likewise. Let us replace the 1st column by the 1st column multiplied by \(w\) plus the \(l\)-th column multiplied by \(x\), and replace the \(l\)-th column by the 1st column multiplied by \(- M'^1_l / d\) plus the \(l\)-th column multiplied by \(M'^1_1 / d\), which is the multiplication from the right by an invertible matrix, likewise. Now, the \((1, 1)\) component is \(d\) and the \((1, l)\) component is \(0\). Note that \(f (d) \lt f (M'^1_1)\), because as \(M'^1_1\) does not divide \(M'^1_l\), \(d\) contains fewer irreducible elements as factors.
Now let us return to the cases check (if it is the case 1, the procedure will end).
Let us suppose the case 2-3.
Let us do the procedure of the case 1 for the 1st column and the 1st row, which is possible because all the components of the 1st column and the 1st row are divisible by \(M'^1_1\).
Then, if that procedure has eliminated all the non-divisible components, let us return to the cases check; it is the case 1, and the procedure will end.
Otherwise, there is a non-divisible component at a \((j, l)\) position with \(1 \lt j\) and \(1 \lt l\). Let us add the \(l\)-th column to the 1st column (the \((1, 1)\) component is not disrupted, because the 1st row is \(0\) except the \((1, 1)\) component). Now, the \((j, 1)\) component is the non-divisible component, and let us do the case 2-1.
As \(0 \le f (d) \lt f (M'^1_1)\), the value of \(f\) at the \((1, 1)\) component strictly decreases each time we return to the cases check via the case 2, so the case 1 will be inevitably realized, at the latest when \(f (d) = 0\), because \(f (d) = 0\) implies that \(d\) is a unit, which divides every component.
So, the Smith normal form has been realized as \(O M N\).
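The existence half of the proof is an algorithm, and it can be sketched as executable code. This is a minimal sketch over \(\mathbb{Z}\), the example principal integral domain, with 2 simplifications: the absolute value plays the role of \(f\) (over \(\mathbb{Z}\), the Euclidean remainder strictly decreases it, which suffices for termination), and the invertible factors \(O\) and \(N\) stay implicit in the elementary row and column operations, since only the diagonal components are returned:

```python
from itertools import product

def smith_diagonal(M):
    """Diagonal entries d_1, ..., d_k (with d_j | d_{j+1}) of the Smith
    normal form of an integer matrix, following the proof's procedure."""
    A = [row[:] for row in M]
    m, n = len(A), len(A[0])
    diag, t = [], 0
    while t < min(m, n):
        # take a nonzero component of the remaining matrix with minimal |value|
        nz = [(abs(A[i][j]), i, j)
              for i, j in product(range(t, m), range(t, n)) if A[i][j]]
        if not nz:
            break                          # the remaining matrix is zero
        _, i0, j0 = min(nz)
        A[t], A[i0] = A[i0], A[t]          # move it to position (t, t)
        for row in A:
            row[t], row[j0] = row[j0], row[t]
        while True:
            changed = False
            for i in range(t + 1, m):      # clear column t (cases 1 and 2-1)
                q, r = divmod(A[i][t], A[t][t])
                for j in range(t, n):
                    A[i][j] -= q * A[t][j]
                if r:                      # the remainder is a smaller pivot
                    A[t], A[i] = A[i], A[t]
                    changed = True
            for j in range(t + 1, n):      # clear row t (cases 1 and 2-2)
                q, r = divmod(A[t][j], A[t][t])
                for i in range(t, m):
                    A[i][j] -= q * A[i][t]
                if r:
                    for i in range(t, m):
                        A[i][t], A[i][j] = A[i][j], A[i][t]
                    changed = True
            if not changed:                # case 2-3: a non-divisible rest
                for i, j in product(range(t + 1, m), range(t + 1, n)):
                    if A[i][j] % A[t][t]:
                        for jj in range(t, n):
                            A[t][jj] += A[i][jj]   # pull the bad row up
                        changed = True
                        break
            if not changed:
                break
        diag.append(abs(A[t][t]))
        t += 1
    return diag

print(smith_diagonal([[2, 4], [6, 8]]))   # -> [2, 4]
print(smith_diagonal([[2, 0], [0, 3]]))   # -> [1, 6]
```

The 2nd example exercises the case 2-3: neither diagonal component of \(diag (2, 3)\) divides the other, so a row addition and a Bezout step are needed before the clearing succeeds.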
Let us see that the form is unique up to multiplication of \(d_j\) by a unit.
By the proposition that for any principal integral domain, any rectangle matrix over the domain, and any invertible rectangular-matrix-columns-dimensional or rectangular-matrix-rows-dimensional square matrix over the domain, the sum of the principal ideals by the specified-dimensional subdeterminants of the product is the sum of the principal ideals by the same-dimensional subdeterminants of the rectangle matrix, for each \(1 \le j \le \min (m, n)\), \(I_j (O M N) = I_j (M)\), where \(I_j (M)\) denotes the sum of the principal ideals by the \(j \times j\) subdeterminants of \(M\).
Each nonzero \(j \times j\) subdeterminant of \(O M N\) is by a diagonal submatrix. That is because the \((1, 1)\) component of the submatrix has to be nonzero: while the 1st row of the submatrix has to have a nonzero component, if the 1st nonzero component was the \((1, l)\) component where \(1 \lt l\), the 1st column of the submatrix would be all zero, because no nonzero component could appear on any column more left than the \(l\)-th column, which would mean that the determinant was zero. Likewise, the \((2, 2)\) component has to be nonzero, because otherwise, the bottom right \((j - 1) \times (j - 1)\) subdeterminant would be zero, which would make the \(j \times j\) determinant zero by the Laplace expansion; and so on.
\(I_j (O M N)\) is the principal ideal by a greatest common divisor of the \(j \times j\) subdeterminants, by the proposition that for any principal integral domain and any finite subset, the sum of the principal ideals by the elements of the subset is the principal ideal by any of the greatest common divisors of the subset. Such a greatest common divisor is \(d_1 ... d_j\) when \(1 \le j \le k\) (because \(d_l \vert d_{l + 1}\), the diagonal subdeterminant \(d_1 ... d_j\) divides every other nonzero diagonal subdeterminant) and \(0\) when \(k \lt j\).
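Over \(\mathbb{Z}\), a generator of \(I_j (M)\) is the greatest common divisor of the \(j \times j\) subdeterminants, so the \(d_j\) can be recovered as the quotients of successive determinantal divisors. This is an illustrative sketch with the hypothetical matrix \(M\) below, whose Smith normal form over \(\mathbb{Z}\) is \(diag (2, 4)\):

```python
from math import gcd
from functools import reduce
from itertools import combinations

def det(A):
    """Determinant by Laplace expansion along the 1st row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))

def determinantal_divisor(M, j):
    """gcd of all j x j subdeterminants of M: a generator of I_j(M) over Z."""
    m, n = len(M), len(M[0])
    subs = [det([[M[r][c] for c in cols] for r in rows])
            for rows in combinations(range(m), j)
            for cols in combinations(range(n), j)]
    return reduce(gcd, (abs(s) for s in subs))

M = [[2, 4], [6, 8]]
D1 = determinantal_divisor(M, 1)   # gcd of the components: d_1
D2 = determinantal_divisor(M, 2)   # |det M|: d_1 * d_2
print(D1, D2 // D1)                # -> 2 4, i.e., d_1 = 2, d_2 = 4
```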
Let us suppose that there is another Smith normal form, \(O' M N'\), with the diagonal components, \(d'_1, ..., d'_{k'}\).
Then, \(I_j (O M N) = I_j (O' M N')\). Especially, \(I_1 (O M N) = I_1 (O' M N')\), which is the principal ideal by \(d_1\) and also the principal ideal by \(d'_1\). Then, \(d'_1 = u_1 d_1\) for a unit, \(u_1\), by the proposition that for any integral domain, if any principal ideal by any element is also by any another element, the elements are associates with each other, and the principal ideal is by any associate. Likewise, \(d'_1 d'_2 = u_1 d_1 d'_2 = u_2 d_1 d_2\) for a unit, \(u_2\), which implies that \(d'_2 = {u_1}^{-1} u_2 d_2\), because \(d_1\) can be canceled on the integral domain. And so on. After all, each \(d'_j\) is an associate of \(d_j\). Inevitably, \(k' = k\), because \(I_j (O M N)\) is nonzero if and only if \(j \le k\) while \(I_j (O' M N')\) is nonzero if and only if \(j \le k'\).