5.2 Cauchy–Binet and related formulas
We work over a commutative ring \(K\). We formalize the Cauchy–Binet formula for determinants of products of non-square matrices, the expansion of \(\det (A+B)\) as a sum over pairs of subsets, and several applications: formulas for \(\det (A+D)\) when \(D\) is diagonal, the characteristic polynomial in terms of principal minors, and the determinant of the Pascal matrix via LU decomposition.
Throughout, \([n] = \{ 1,2,\ldots ,n\} \) denotes the set of the first \(n\) positive integers (in the Lean formalization, indexing is \(0\)-based). For any finite set \(S\) of integers we write \(\operatorname {sum} S := \sum _{s \in S} s\).
5.2.1 Submatrices and minors
Let \(n,m\in \mathbb {N}\). Let \(A\) be an \(n\times m\)-matrix.
Let \(U\) be a subset of \([n]\). Let \(V\) be a subset of \([m]\).
Writing the two sets \(U\) and \(V\) as
\[U = \{u_1, u_2, \ldots, u_{|U|}\} \qquad \text{and} \qquad V = \{v_1, v_2, \ldots, v_{|V|}\}\]
with
\[u_1 < u_2 < \cdots < u_{|U|} \qquad \text{and} \qquad v_1 < v_2 < \cdots < v_{|V|},\]
we set
\[\operatorname{sub}_U^V A := \left( A_{u_i, v_j} \right)_{1 \le i \le |U|,\ 1 \le j \le |V|} \in K^{|U| \times |V|}.\]
Roughly speaking, \(\operatorname {sub}_U^V A\) is the matrix obtained from \(A\) by focusing only on the \(i\)-th rows for \(i\in U\) (that is, removing all the other rows) and only on the \(j\)-th columns for \(j\in V\) (that is, removing all the other columns).
This matrix \(\operatorname {sub}_U^V A\) is called the submatrix of \(A\) obtained by restricting to the \(U\)-rows and the \(V\)-columns. If this matrix is square (i.e., if \(|U|=|V|\)), then its determinant \(\det (\operatorname {sub}_U^V A)\) is called a minor of \(A\).
- AlgebraicCombinatorics.CauchyBinet.submatrixOfFinset A U V = A.submatrix ⇑(U.orderEmbOfFin ⋯) ⇑(V.orderEmbOfFin ⋯)
The minor of the empty sets is \(1\): \(\det (\operatorname {sub}_{\varnothing }^{\varnothing } A) = 1\).
The \(0 \times 0\) matrix has determinant \(1\). This is Matrix.det_isEmpty from Mathlib.
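For example, with \(n = m = 3\), \(U = \{1,3\}\), and \(V = \{2,3\}\),
\[\operatorname{sub}_{\{1,3\}}^{\{2,3\}} A = \begin{pmatrix} A_{1,2} & A_{1,3} \\ A_{3,2} & A_{3,3} \end{pmatrix},\]
and \(\det(\operatorname{sub}_{\{1,3\}}^{\{2,3\}} A) = A_{1,2}A_{3,3} - A_{1,3}A_{3,2}\) is a minor of \(A\).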
5.2.2 The Cauchy–Binet formula
Helpers for the Cauchy–Binet formula
Let \(A\in K^{n\times m}\), \(B\in K^{m\times n}\), and let \(f\colon [n]\to [m]\) be a non-injective function. Then
\[\sum_{\sigma \in S_n} (-1)^{\sigma} \prod_{i=1}^{n} A_{\sigma(i), f(i)}\, B_{f(i), i} = 0.\]
Since \(f\) is not injective, there exist \(i\ne j\) with \(f(i)=f(j)\). Pairing each \(\sigma \) with \(\sigma \cdot \operatorname {swap}(i,j)\) gives a sign-reversing involution, so the sum cancels to \(0\).
For a fixed \(n\)-element subset \(S\subseteq [m]\),
\[\sum_{\tau \in S_n} \sum_{\sigma \in S_n} (-1)^{\sigma} \prod_{i=1}^{n} A_{\sigma(i),\, S_{\tau(i)}}\, B_{S_{\tau(i)},\, i} \;=\; \det(\operatorname{cols}_S A)\cdot \det(\operatorname{rows}_S B),\]
where \(S_k\) denotes the \(k\)-th element of \(S\) in sorted order.
Split the product \(\prod _i A_{\sigma (i),e(\tau (i))}\, B_{e(\tau (i)),i}\) into a product of \(A\)-factors and a product of \(B\)-factors. The \(B\)-factors depend only on \(\tau \) and can be pulled out of the \(\sigma \)-sum. The remaining \(\sigma \)-sum is a Leibniz expansion for \(\det (\operatorname {cols}_S A)\), and summing the \(\tau \)-terms gives \(\det (\operatorname {rows}_S B)\).
Let \(n,m\in \mathbb {N}\). Let \(A\in K^{n\times m}\) be an \(n\times m\)-matrix, and let \(B\in K^{m\times n}\) be an \(m\times n\)-matrix. Then,
\[\det(AB) \;=\; \sum_{1 \le g_1 < g_2 < \cdots < g_n \le m} \det(\operatorname{cols}_{g_1,\ldots,g_n} A)\cdot \det(\operatorname{rows}_{g_1,\ldots,g_n} B).\]
Here, \(\operatorname {cols}_{g_1,\ldots ,g_n} A\) is the \(n\times n\)-matrix obtained from \(A\) by keeping only columns \(g_1,g_2,\ldots ,g_n\), and \(\operatorname {rows}_{g_1,\ldots ,g_n} B\) is the \(n\times n\)-matrix obtained from \(B\) by keeping only rows \(g_1,g_2,\ldots ,g_n\).
Equivalently, the sum runs over all \(n\)-element subsets \(S\) of \([m]\):
\[\det(AB) \;=\; \sum_{\substack{S \subseteq [m] \\ |S| = n}} \det(\operatorname{cols}_S A)\cdot \det(\operatorname{rows}_S B).\]
The proof expands \(\det (AB)\) via the Leibniz formula:
\[\det(AB) = \sum_{\sigma\in S_n} (-1)^{\sigma} \prod_{i=1}^{n} (AB)_{\sigma(i),i} = \sum_{\sigma\in S_n} (-1)^{\sigma} \prod_{i=1}^{n} \sum_{k=1}^{m} A_{\sigma(i),k}\, B_{k,i}.\]
Expanding the product over \(i\) as a sum over functions \(f\colon [n]\to [m]\) and swapping sums gives
\[\det(AB) = \sum_{f\colon [n]\to[m]} \sum_{\sigma\in S_n} (-1)^{\sigma} \prod_{i=1}^{n} A_{\sigma(i),f(i)}\, B_{f(i),i}.\]
When \(f\) is not injective, the inner sum vanishes by Lemma 5.37.
Therefore only injective \(f\) contribute. Each injective \(f\) has image \(S = \operatorname {im}(f)\) with \(|S|=n\). The injective functions with a given image \(S\) are in bijection with permutations of \([n]\). Partitioning by \(S\) and reindexing via this bijection, the sum transforms into a sum over \(n\)-element subsets \(S\), and for each \(S\) the inner sum factors as \(\det (\operatorname {cols}_S A)\cdot \det (\operatorname {rows}_S B)\) by Lemma 5.38.
When \(m=n\), the Cauchy–Binet formula specializes to \(\det (AB) = \det A \cdot \det B\).
This is Mathlib’s Matrix.det_mul.
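As a concrete sanity check of the Cauchy–Binet formula, the following plain-Python sketch verifies \(\det(AB) = \sum_{S} \det(\operatorname{cols}_S A)\,\det(\operatorname{rows}_S B)\) for a small integer example. The helper names (`det`, `matmul`, `cauchy_binet_rhs`) are ad hoc and not part of the Lean development.

```python
# Numerical check of Cauchy-Binet for a 2x3 matrix A and 3x2 matrix B.
from itertools import combinations, permutations

def det(M):
    """Leibniz-formula determinant of a square matrix (list of rows)."""
    n = len(M)
    total = 0
    for perm in permutations(range(n)):
        # Sign via inversion count.
        sign = 1
        for i in range(n):
            for j in range(i + 1, n):
                if perm[i] > perm[j]:
                    sign = -sign
        prod = 1
        for i in range(n):
            prod *= M[i][perm[i]]
        total += sign * prod
    return total

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def cauchy_binet_rhs(A, B):
    """Sum of det(cols_S A) * det(rows_S B) over n-element subsets S of [m]."""
    n, m = len(A), len(B)
    total = 0
    for S in combinations(range(m), n):
        colsA = [[A[i][s] for s in S] for i in range(n)]   # keep columns in S
        rowsB = [[B[s][j] for j in range(n)] for s in S]   # keep rows in S
        total += det(colsA) * det(rowsB)
    return total

A = [[1, 2, 3], [4, 5, 6]]        # 2x3
B = [[1, 0], [2, 1], [0, 3]]      # 3x2
assert det(matmul(A, B)) == cauchy_binet_rhs(A, B)
```

Here both sides evaluate to \(-39\), with the three \(2\)-element subsets of \(\{0,1,2\}\) contributing \(-3\), \(-18\), and \(-18\).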
5.2.3 The formula for \(\det (A+B)\)
Basic definitions
For a subset \(P\subseteq [n]\), define \(\operatorname {sum}(P) = \sum _{i\in P} i\).
For subsets \(P,Q\subseteq [n]\), define \(\operatorname {Perm}(P,Q) = \{ \sigma \in S_n : \sigma (P)=Q\} \) where \(\sigma (P) = \{ \sigma (i) \mid i \in P\} \) denotes the image of \(P\) under \(\sigma \).
- AlgebraicCombinatorics.CauchyBinet.finsetSumFin P = ∑ i ∈ P, ↑i
- AlgebraicCombinatorics.CauchyBinet.permsMapping P Q = {σ : Equiv.Perm (Fin n) | AlgebraicCombinatorics.CauchyBinet.imageFinset σ P = Q}
- AlgebraicCombinatorics.CauchyBinet.imageFinset σ P = Finset.map { toFun := ⇑σ, inj' := ⋯ } P
If \(|P| \ne |Q|\), then \(\operatorname {Perm}(P,Q) = \varnothing \).
A permutation is a bijection, so \(|\sigma (P)| = |P|\). If \(|P|\ne |Q|\) then \(\sigma (P) \ne Q\) for all \(\sigma \).
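For example, with \(n=3\), \(P=\{1\}\), and \(Q=\{3\}\),
\[\operatorname{Perm}(\{1\},\{3\}) = \{\sigma \in S_3 : \sigma(1) = 3\},\]
which contains exactly the two permutations \((1\,3)\) and \((1\,3\,2)\), while \(\operatorname{Perm}(\{1\},\{2,3\}) = \varnothing\) by the lemma.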
Expansion steps
For a permutation \(\sigma \in S_n\) and a subset \(P\subseteq [n]\), \(\sigma (\overline{P}) = \overline{\sigma (P)}\).
Since \(\sigma \) is a bijection on \([n]\), the complement is preserved.
Permutation decomposition infrastructure
The proof of Theorem 5.61 requires decomposing permutations \(\sigma \in S_n\) with \(\sigma (P) = Q\) into independent parts acting on \(P\) and on \(\overline{P}\).
Let \(P,Q\subseteq [n]\) with \(|P|=|Q|=k\) and let \(\sigma \in S_n\) satisfy \(\sigma (P) = Q\). Write the elements of \(P\) in increasing order as \(p_1 {\lt} \cdots {\lt} p_k\), of \(Q\) as \(q_1 {\lt} \cdots {\lt} q_k\), of \(\overline{P}\) as \(p'_1 {\lt} \cdots {\lt} p'_\ell \), and of \(\overline{Q}\) as \(q'_1 {\lt} \cdots {\lt} q'_\ell \).
Define \(\alpha _\sigma \in S_k\) by \(\sigma (p_i) = q_{\alpha _\sigma (i)}\) and \(\beta _\sigma \in S_\ell \) by \(\sigma (p'_i) = q'_{\beta _\sigma (i)}\).
- AlgebraicCombinatorics.CauchyBinet.extractAlpha P Q hcard σ hσ = Equiv.ofBijective (fun (i : Fin P.card) => (Q.orderIsoOfFin ⋯).symm ⟨σ ((P.orderEmbOfFin ⋯) i), ⋯⟩) ⋯
- AlgebraicCombinatorics.CauchyBinet.extractBeta P Q hcard σ hσ = Equiv.ofBijective (fun (i : Fin Pᶜ.card) => (Qᶜ.orderIsoOfFin ⋯).symm ⟨σ ((Pᶜ.orderEmbOfFin ⋯) i), ⋯⟩) ⋯
Given \(P,Q\subseteq [n]\) with \(|P|=|Q|\) and permutations \(\alpha \in S_k\), \(\beta \in S_\ell \) (with \(k=|P|\), \(\ell =|\overline{P}|\)), define \(\sigma \in S_n\) by:
\[\sigma(p_i) = q_{\alpha(i)} \quad (1 \le i \le k), \qquad \sigma(p'_i) = q'_{\beta(i)} \quad (1 \le i \le \ell).\]
This is the inverse of the map \(\sigma \mapsto (\alpha _\sigma ,\beta _\sigma )\).
The maps \(\sigma \mapsto (\alpha _\sigma , \beta _\sigma )\) and \((\alpha ,\beta ) \mapsto \sigma _{(\alpha ,\beta )}\) are mutual inverses:
Extracting \(\alpha \) from the permutation constructed from \((\alpha ,\beta )\) returns \(\alpha \).
Extracting \(\beta \) from the permutation constructed from \((\alpha ,\beta )\) returns \(\beta \).
Constructing a permutation from the components \((\alpha _\sigma , \beta _\sigma )\) extracted from \(\sigma \) returns \(\sigma \).
By unfolding the definitions and showing that the sorted-order embeddings and their inverses compose to the identity.
For \(\sigma \) with \(\sigma (P) = Q\):
\(\sigma (p_j) = q_{\alpha _\sigma (j)}\) for all \(j\),
\(\sigma (p'_j) = q'_{\beta _\sigma (j)}\) for all \(j\).
By definition of \(\alpha _\sigma \) and \(\beta _\sigma \), using the order-embedding properties of sorted finsets.
For \(\sigma \) with \(\sigma (P) = Q\):
\(\displaystyle \prod _{i\in P} A_{\sigma (i),i} = \prod _{j=1}^{k} A_{q_{\alpha _\sigma (j)},\, p_j}\),
\(\displaystyle \prod _{i\in \overline{P}} B_{\sigma (i),i} = \prod _{j=1}^{\ell } B_{q'_{\beta _\sigma (j)},\, p'_j}\).
Reindex the product over sorted elements of \(P\) (resp. \(\overline{P}\)) using Lemma 5.47.
Sign decomposition infrastructure
The sign decomposition (Lemma 5.59) is proved by reducing any \((P,Q)\) to the base case \(P = Q = [k]\) via “left shifts” and “left co-shifts.”
Let \(0 {\lt} k {\lt} n\) and let \(P\subseteq [n]\) with \(|P| = k\) and \(P \ne [k]\). Then there exists \(i\) such that \(i \notin P\) and \(i+1 \in P\) (with \(i+1 {\lt} n\)).
Since \(P \ne [k]\), some element of \(P\) is “too large”: there exists \(j\) with \(p_j > j\) (the \(j\)-th smallest element lies strictly above position \(j\)). A downward-closure argument then produces the required adjacent pair \(i \notin P\), \(i+1 \in P\).
If \(i\notin P\) and \(i+1\in P\), let \(P' = s_i(P)\) where \(s_i = \operatorname {swap}(i,i+1)\). Then \(\operatorname {sum}(P') = \operatorname {sum}(P) - 1\).
\(P'\) is obtained from \(P\) by replacing \(i+1\) with \(i\), so the sum decreases by \((i+1)-i = 1\).
If \(i\notin P\) and \(i+1\in P\), let \(P' = s_i(P)\) and \(\sigma ' = \sigma \cdot s_i\). Then
\[(-1)^{\sigma'}\,(-1)^{\operatorname{sum} P'} \;=\; (-1)^{\sigma}\,(-1)^{\operatorname{sum} P}.\]
\((-1)^{\sigma '} = (-1)^{\sigma }\cdot (-1)^{s_i} = -(-1)^{\sigma }\) (since adjacent transpositions are odd), and \((-1)^{\operatorname {sum} P'} = -(-1)^{\operatorname {sum} P}\) (since \(\operatorname {sum} P' = \operatorname {sum} P - 1\) by Lemma 5.50). The two sign flips cancel.
Under the left shift \(P' = s_i(P)\), \(\sigma ' = \sigma \cdot s_i\):
\((-1)^{\alpha _{\sigma '}} = (-1)^{\alpha _\sigma }\),
\((-1)^{\beta _{\sigma '}} = (-1)^{\beta _\sigma }\).
The sorted order of \(P'\) is \(s_i\) applied to the sorted order of \(P\): the \(j\)-th element of \(P'\) is \(s_i(p_j)\) (since replacing \(i{+}1\) by \(i\) preserves relative order among all other elements). Therefore \(\sigma '(s_i(p_j)) = \sigma (s_i(s_i(p_j))) = \sigma (p_j)\), so \(\alpha _{\sigma '} = \alpha _\sigma \) (and similarly for \(\beta \)).
If the sign decomposition formula holds for \((P', Q, \sigma ')\) (where \(P' = s_i(P)\), \(\sigma ' = \sigma \cdot s_i\)), then it holds for \((P, Q, \sigma )\).
If the sign decomposition formula holds for \((P, Q', \sigma ')\) (where \(Q' = s_i(Q)\), \(\sigma ' = s_i\cdot \sigma \) for \(i\notin Q\), \(i{+}1\in Q\)), then it holds for \((P, Q, \sigma )\).
Analogous to Lemma 5.53: the combined sign is preserved under left co-shifts, and the extracted component permutations \(\alpha \) and \(\beta \) are unchanged.
When \(P = Q = [k]\) (the prefix finset), the sign decomposition formula
\[(-1)^{\sigma} \;=\; (-1)^{\operatorname{sum} P + \operatorname{sum} Q}\,(-1)^{\alpha_\sigma}\,(-1)^{\beta_\sigma}\]
holds, where the sign factor is trivial because \(\operatorname{sum}[k] + \operatorname{sum}[k] = k(k-1)\) is even.
Since \(\sigma \) preserves \([k]\), it decomposes as \(\sigma = \alpha \oplus \beta \) (a direct sum of the restrictions to \([k]\) and its complement). The sign of a direct sum satisfies \((-1)^\sigma = (-1)^\alpha \cdot (-1)^\beta \) (this is Equiv.Perm.sign_subtypeCongr from Mathlib). The factor \((-1)^{k(k-1)} = 1\) since \(k(k-1)\) is even.
For \(n\times n\)-matrices \(A\) and \(B\),
\[\det(A+B) \;=\; \sum_{\sigma\in S_n} (-1)^{\sigma} \sum_{P\subseteq [n]} \prod_{i\in P} A_{\sigma(i),i} \prod_{i\in \overline{P}} B_{\sigma(i),i}.\]
Expand \(\det (A+B) = \sum _{\sigma } (-1)^{\sigma } \prod _i (A_{\sigma (i),i} + B_{\sigma (i),i})\) and distribute each factor using the product rule for sums (this is Fintype.prod_add from Mathlib).
Swapping summation order:
\[\det(A+B) \;=\; \sum_{P\subseteq [n]} \sum_{\sigma\in S_n} (-1)^{\sigma} \prod_{i\in P} A_{\sigma(i),i} \prod_{i\in \overline{P}} B_{\sigma(i),i}.\]
By swapping the order of summation (this is Finset.sum_comm from Mathlib).
Partitioning by \(Q = \sigma (P)\):
\[\det(A+B) \;=\; \sum_{P\subseteq [n]} \sum_{\substack{Q\subseteq [n] \\ |Q| = |P|}} \sum_{\substack{\sigma\in S_n \\ \sigma(P)=Q}} (-1)^{\sigma} \prod_{i\in P} A_{\sigma(i),i} \prod_{i\in \overline{P}} B_{\sigma(i),i}.\]
Split \(\sum _{\sigma \in S_n}\) according to the value of \(Q = \sigma (P)\), using pairwise disjointness of the fibers.
Let \(P,Q\subseteq [n]\) with \(|P|=|Q|\) and let \(\sigma \in S_n\) satisfy \(\sigma (P)=Q\). Write the elements of \(P\) in increasing order as \(p_1{\lt}p_2{\lt}\cdots {\lt}p_k\), the elements of \(Q\) as \(q_1{\lt}q_2{\lt}\cdots {\lt}q_k\), the elements of \(\overline{P}\) as \(p'_1{\lt}\cdots {\lt}p'_{\ell }\), and those of \(\overline{Q}\) as \(q'_1{\lt}\cdots {\lt}q'_{\ell }\).
Define \(\alpha _{\sigma }\in S_k\) by \(\sigma (p_i) = q_{\alpha _{\sigma }(i)}\) and \(\beta _{\sigma }\in S_{\ell }\) by \(\sigma (p'_i) = q'_{\beta _{\sigma }(i)}\).
Then
\[(-1)^{\sigma} \;=\; (-1)^{\operatorname{sum} P + \operatorname{sum} Q}\,(-1)^{\alpha_{\sigma}}\,(-1)^{\beta_{\sigma}}.\]
The proof uses strong induction on \(\operatorname {sum} P + \operatorname {sum} Q\).
Base case. When \(P = Q = [k]\), the formula holds by Lemma 5.55.
Inductive case. If \(P \ne [k]\), Lemma 5.49 provides an index \(i\) with \(i\notin P\) and \(i+1\in P\). The left shift \(P' = s_i(P)\), \(\sigma ' = \sigma \cdot s_i\) reduces \(\operatorname {sum} P\) by \(1\), so the induction hypothesis applies to \((P', Q, \sigma ')\). Lemma 5.53 lifts the result back to \((P, Q, \sigma )\).
Similarly, if \(P = [k]\) but \(Q \ne [k]\), Lemma 5.49 applied to \(Q\) gives an index for a left co-shift, and Lemma 5.54 lifts the result.
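As an illustration of Lemma 5.59, take \(n=4\), \(P=\{1,2\}\), \(Q=\{3,4\}\), and the \(4\)-cycle \(\sigma = (1\,4\,2\,3)\), so that \(\sigma(1)=4\), \(\sigma(2)=3\), \(\sigma(3)=1\), \(\sigma(4)=2\), and indeed \(\sigma(P)=Q\). Here \(p_1=1\), \(p_2=2\), \(q_1=3\), \(q_2=4\), and \(\sigma(p_1)=q_2\), \(\sigma(p_2)=q_1\), so \(\alpha_\sigma\) is the transposition in \(S_2\); on the complements, \(\sigma(3)=q'_1\) and \(\sigma(4)=q'_2\), so \(\beta_\sigma = \operatorname{id}\). The formula checks out:
\[(-1)^{\operatorname{sum} P + \operatorname{sum} Q}\,(-1)^{\alpha_\sigma}\,(-1)^{\beta_\sigma} = (-1)^{3+7}\cdot(-1)\cdot 1 = -1 = (-1)^{\sigma},\]
since the \(4\)-cycle \(\sigma\) is odd.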
For fixed subsets \(P,Q\subseteq [n]\) with \(|P|=|Q|\):
\[\sum_{\substack{\sigma\in S_n \\ \sigma(P)=Q}} (-1)^{\sigma} \prod_{i\in P} A_{\sigma(i),i} \prod_{i\in \overline{P}} B_{\sigma(i),i} \;=\; (-1)^{\operatorname{sum} P + \operatorname{sum} Q}\,\det(\operatorname{sub}_P^Q A)\cdot \det(\operatorname{sub}_{\overline{P}}^{\overline{Q}} B).\]
The proof uses the bijection \(\sigma \mapsto (\alpha _{\sigma }, \beta _{\sigma })\) (Definition 5.44) from \(\{ \sigma \in S_n : \sigma (P) = Q\} \) to \(S_k \times S_{\ell }\), with inverse given by the construction in Definition 5.45. The round-trip properties (Lemma 5.46) establish that this is a bijection.
After reindexing by \((\alpha ,\beta )\):
\[\sum_{\substack{\sigma\in S_n \\ \sigma(P)=Q}} (-1)^{\sigma} \prod_{i\in P} A_{\sigma(i),i} \prod_{i\in \overline{P}} B_{\sigma(i),i} \;=\; (-1)^{\operatorname{sum} P + \operatorname{sum} Q} \sum_{\alpha\in S_k} \sum_{\beta\in S_\ell} (-1)^{\alpha}\,(-1)^{\beta} \prod_{j=1}^{k} A_{q_{\alpha(j)},\, p_j} \prod_{j=1}^{\ell} B_{q'_{\beta(j)},\, p'_j}.\]
The sign identity (Lemma 5.59) gives \((-1)^{\sigma } = (-1)^{\operatorname {sum} P+\operatorname {sum} Q} \cdot (-1)^{\alpha }\cdot (-1)^{\beta }\).
The products over \(P\) and \(\overline{P}\) factor (Lemma 5.48).
The double sum \(\sum _{\alpha }\sum _{\beta }\) factors into a product of two Leibniz expansions, giving \(\det (\operatorname {sub}_P^Q A)\cdot \det (\operatorname {sub}_{\overline{P}}^{\overline{Q}} B)\).
Let \(n\in \mathbb {N}\). For any subset \(I\) of \([n]\), let \(\widetilde{I} = [n]\setminus I\) denote its complement.
Let \(A\) and \(B\) be two \(n\times n\)-matrices in \(K^{n\times n}\). Then,
\[\det(A+B) \;=\; \sum_{P\subseteq [n]} \sum_{\substack{Q\subseteq [n] \\ |Q| = |P|}} (-1)^{\operatorname{sum} P + \operatorname{sum} Q}\,\det(\operatorname{sub}_P^Q A)\cdot \det(\operatorname{sub}_{\widetilde{P}}^{\widetilde{Q}} B).\]
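For example, for \(n=2\) the six pairs \((P,Q)\) with \(|P|=|Q|\) give
\[\det(A+B) = \det B + A_{11}B_{22} - A_{12}B_{21} - A_{21}B_{12} + A_{22}B_{11} + \det A,\]
where the middle four terms come from the singleton pairs \((\{1\},\{1\})\), \((\{1\},\{2\})\), \((\{2\},\{1\})\), \((\{2\},\{2\})\); expanding \((A_{11}+B_{11})(A_{22}+B_{22}) - (A_{12}+B_{12})(A_{21}+B_{21})\) directly confirms this.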
5.2.4 Minors of diagonal matrices
Let \(n\in \mathbb {N}\). Let \(d_1,d_2,\ldots ,d_n\in K\). Let
\[D = \operatorname{diag}(d_1, d_2, \ldots, d_n) = \begin{pmatrix} d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d_n \end{pmatrix}.\]
(a) We have \(\det (\operatorname {sub}_P^P D) = \prod _{i\in P} d_i\) for any subset \(P\) of \([n]\).
(b) Let \(P\) and \(Q\) be two distinct subsets of \([n]\) satisfying \(|P|=|Q|\). Then, \(\det (\operatorname {sub}_P^Q D) = 0\).
(a) The submatrix \(\operatorname {sub}_P^P D\) is itself diagonal with entries \(d_i\) for \(i\in P\). Thus its determinant is \(\prod _{i\in P} d_i\).
The submatrix of a diagonal matrix via an order-preserving embedding is again diagonal; then the determinant of a diagonal matrix is the product of its diagonal entries (this is Matrix.det_diagonal from Mathlib).
(b) Since \(P\ne Q\) but \(|P|=|Q|\), there exists \(i\in P\setminus Q\). The row of \(\operatorname {sub}_P^Q D\) corresponding to \(i\) is all zeros (since \(D_{i,j}=0\) for \(j\ne i\) and \(i\notin Q\)), so the determinant is \(0\) (by Matrix.det_eq_zero_of_row_eq_zero from Mathlib).
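For instance, with \(n=3\) and \(D=\operatorname{diag}(d_1,d_2,d_3)\): for \(P=\{1,3\}\),
\[\operatorname{sub}_P^P D = \begin{pmatrix} d_1 & 0 \\ 0 & d_3 \end{pmatrix}, \qquad \det(\operatorname{sub}_P^P D) = d_1 d_3,\]
while for \(P=\{1,3\}\), \(Q=\{1,2\}\) the row of \(\operatorname{sub}_P^Q D\) corresponding to \(i=3\in P\setminus Q\) is zero:
\[\operatorname{sub}_P^Q D = \begin{pmatrix} d_1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad \det(\operatorname{sub}_P^Q D) = 0.\]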
5.2.5 Determinant of \(A+D\)
Let \(n\in \mathbb {N}\). Let \(A\) and \(D\) be two \(n\times n\)-matrices in \(K^{n\times n}\) such that the matrix \(D\) is diagonal. Let \(d_1,d_2,\ldots ,d_n\) be the diagonal entries of the diagonal matrix \(D\). Then,
\[\det(A+D) \;=\; \sum_{P\subseteq [n]} \left(\prod_{i\in \widetilde{P}} d_i\right) \det(\operatorname{sub}_P^P A).\]
The minors \(\det (\operatorname {sub}_P^P A)\) are called the principal minors of \(A\).
Theorem 5.61 (applied to \(B=D\)) yields a double sum. By Lemma 5.62(b), the terms with \(P\ne Q\) vanish (since \(\det (\operatorname {sub}_{\widetilde{P}}^{\widetilde{Q}} D)=0\)). In the surviving terms (\(P=Q\)), the sign factor \((-1)^{\operatorname {sum} P + \operatorname {sum} P} = 1\), and Lemma 5.62(a) gives \(\det (\operatorname {sub}_{\widetilde{P}}^{\widetilde{P}} D) = \prod _{i\in \widetilde{P}} d_i\).
5.2.6 Determinant of \(x + D\)
The matrix \(F\) with entries \(F_{i,j} = x + d_i\, [i{=}j]\) decomposes as \(F = A + D\) where \(A\) is the \(n\times n\) constant matrix with all entries equal to \(x\) and \(D = \operatorname {diag}(d_1,\ldots ,d_n)\).
Entry-wise verification.
For a singleton \(P = \{ p\} \), the \(1\times 1\) submatrix of the constant-\(x\) matrix has determinant \(x\).
The \(1\times 1\) matrix is \((x)\), which has determinant \(x\) (this is Matrix.det_unique from Mathlib).
Let \(A\) be the \(n\times n\) constant matrix with all entries equal to \(x\). If \(P\subseteq [n]\) has \(|P| \ge 2\), then \(\det (\operatorname {sub}_P^P A) = 0\).
When \(|P|\ge 2\), the submatrix \(\operatorname {sub}_P^P A\) is a constant matrix of size \(\ge 2\). All its rows are identical, so the determinant is \(0\).
Let \(n\in \mathbb {N}\). Let \(d_1,d_2,\ldots ,d_n\in K\) and \(x\in K\). Let \(F\) be the \(n\times n\)-matrix
\[F = \begin{pmatrix} x + d_1 & x & \cdots & x \\ x & x + d_2 & \cdots & x \\ \vdots & \vdots & \ddots & \vdots \\ x & x & \cdots & x + d_n \end{pmatrix}, \qquad \text{i.e.,}\quad F_{i,j} = x + d_i\,[i{=}j].\]
Then,
\[\det F \;=\; d_1 d_2 \cdots d_n + \sum_{i=1}^{n} x\, d_1 \cdots \widehat{d_i} \cdots d_n,\]
where the hat over “\(d_i\)” means “omit the \(d_i\) factor.”
Write \(F = A + D\) (Lemma 5.64) where \(A\) is the \(n\times n\) constant matrix with all entries \(x\) and \(D = \operatorname {diag}(d_1,\ldots ,d_n)\). By Theorem 5.63,
\[\det F \;=\; \sum_{P\subseteq [n]} \left(\prod_{i\in \widetilde{P}} d_i\right) \det(\operatorname{sub}_P^P A).\]
For \(|P|\ge 2\), \(\det (\operatorname {sub}_P^P A) = 0\) (Lemma 5.66), so only \(P=\varnothing \) and singletons \(P=\{ p\} \) contribute:
\[\det F \;=\; \prod_{i=1}^{n} d_i + \sum_{p=1}^{n} x \prod_{i\ne p} d_i.\]
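For \(n=2\) the theorem can be checked by hand:
\[\det\begin{pmatrix} x+d_1 & x \\ x & x+d_2 \end{pmatrix} = (x+d_1)(x+d_2) - x^2 = d_1 d_2 + x(d_1 + d_2).\]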
5.2.7 Characteristic polynomial coefficients
Let \(n\in \mathbb {N}\). Let \(A\in K^{n\times n}\) be an \(n\times n\)-matrix. Let \(x\in K\). Let \(I_n\) denote the \(n\times n\) identity matrix. Then,
\[\det(A + x I_n) \;=\; \sum_{P\subseteq [n]} x^{\,n-|P|}\,\det(\operatorname{sub}_P^P A) \;=\; \sum_{k=0}^{n} \left(\sum_{\substack{P\subseteq [n] \\ |P| = n-k}} \det(\operatorname{sub}_P^P A)\right) x^{k}.\]
The matrix \(xI_n\) is diagonal with all entries equal to \(x\). Theorem 5.63 (applied to \(D = xI_n\) and \(d_i = x\)) yields
\[\det(A + x I_n) \;=\; \sum_{P\subseteq [n]} \left(\prod_{i\in \widetilde{P}} x\right) \det(\operatorname{sub}_P^P A) \;=\; \sum_{P\subseteq [n]} x^{\,n-|P|}\,\det(\operatorname{sub}_P^P A).\]
This is the first equality. The second follows by grouping the sum according to the value of \(|P|\) and substituting \(n-k\) for \(k\).
In the Lean formalization, \(x I_n\) is written as \(x \cdot I_n\) and the diagonal form is obtained from the identity \(x \cdot I_n = \operatorname {diag}(x,x,\ldots ,x)\). The second form regroups by partitioning \(\mathcal{P}([n])\) into fibers by cardinality.
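For \(n=2\) the theorem gives
\[\det(A + xI_2) = x^2 + (A_{11} + A_{22})\,x + \det A:\]
the coefficient of \(x^{k}\) is the sum of the principal minors of \(A\) of size \(n-k\) (here the trace for \(k=1\) and \(\det A\) for \(k=0\)).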
5.2.8 The Pascal matrix
Define the \(n\times n\) Pascal matrix, the lower triangular factor \(L\), and the upper triangular factor \(U\) by:
\[A_{i,j} = \binom{i+j}{i}, \qquad L_{i,k} = \binom{i}{k}, \qquad U_{k,j} = \binom{j}{k},\]
for \(0\le i,j,k \le n-1\).
\(L\) is lower triangular (\(L_{i,k} = 0\) for \(i {\lt} k\)) with \(L_{i,i} = 1\). \(U\) is upper triangular (\(U_{k,j} = 0\) for \(k {\gt} j\)) with \(U_{k,k} = 1\).
\(\binom {i}{k} = 0\) when \(i {\lt} k\), and \(\binom {i}{i} = 1\). Similarly for \(U\).
Then \(A = L \cdot U\) (over \(\mathbb {Z}\)).
For all \(i,j\), we have (by Vandermonde’s identity)
\[(LU)_{i,j} \;=\; \sum_{k=0}^{n-1} \binom{i}{k}\binom{j}{k} \;=\; \binom{i+j}{i} \;=\; A_{i,j}.\]
The key combinatorial identity \(\sum _{k} \binom {m}{k}\binom {n}{k} = \binom {m+n}{n}\) (Vandermonde’s identity) is used to establish the LU decomposition.
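The factorization can also be checked mechanically for small sizes. The following plain-Python sketch (the helper name `pascal_lu_check` is ad hoc, not from the formalization) verifies \(A = LU\) entrywise for \(n \le 6\), using \(0\)-based indices as in the definitions above.

```python
# Entrywise check of the LU factorization A = L * U of the Pascal matrix.
from math import comb

def pascal_lu_check(n):
    A = [[comb(i + j, i) for j in range(n)] for i in range(n)]
    L = [[comb(i, k) for k in range(n)] for i in range(n)]   # lower triangular
    U = [[comb(j, k) for j in range(n)] for k in range(n)]   # upper triangular
    LU = [[sum(L[i][k] * U[k][j] for k in range(n))
           for j in range(n)] for i in range(n)]
    return LU == A

assert all(pascal_lu_check(n) for n in range(1, 7))
```

Since \(L\) and \(U\) are triangular with unit diagonals, this factorization immediately yields \(\det A = 1\).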
The lower triangular factor \(L\) has \(\det L = 1\).
\(L\) is lower triangular (\(\binom {i}{k} = 0\) when \(i {\lt} k\)) and its diagonal entries are \(L_{i,i} = \binom {i}{i} = 1\) (Lemma 5.70). The determinant of a triangular matrix is the product of its diagonal entries, which is \(1\).
The upper triangular factor \(U\) has \(\det U = 1\).
\(U\) is upper triangular (\(\binom {j}{k} = 0\) when \(k {\gt} j\)) and its diagonal entries are \(U_{k,k} = \binom {k}{k} = 1\) (Lemma 5.70). The determinant of a triangular matrix is the product of its diagonal entries, which is \(1\).
Let \(n\in \mathbb {N}\). Let \(A\) be the \(n\times n\)-matrix
\[A = \left( \binom{i+j}{i} \right)_{0 \le i, j \le n-1}.\]
Then \(\det A = 1\): by the factorization \(A = L \cdot U\) above, \(\det A = \det L \cdot \det U = 1 \cdot 1 = 1\).