Formalization Blueprint for “An Introduction to Algebraic Combinatorics” by Darij Grinberg

1.6 Substitution and evaluation of power series

1.6.1 Defining substitution

Definition 1.161
#

Let \(f\) and \(g\) be two FPSs in \(K[[x]]\). Assume that \([x^0]g = 0\) (that is, \(g = g_1 x + g_2 x^2 + g_3 x^3 + \cdots \) for some \(g_1, g_2, g_3, \ldots \in K\)).

We then define an FPS \(f[g] \in K[[x]]\) as follows:

Write \(f\) in the form \(f = \sum _{n \in \mathbb {N}} f_n x^n\) with \(f_0, f_1, f_2, \ldots \in K\). (That is, \(f_n = [x^n]f\) for each \(n \in \mathbb {N}\).) Then, set

\[ f[g] := \sum _{n \in \mathbb {N}} f_n g^n. \]

(This sum is well-defined, as we will see in Proposition 1.163 (b) below.)

This FPS \(f[g]\) is also denoted by \(f \circ g\), and is called the composition of \(f\) with \(g\), or the result of substituting \(g\) for \(x\) in \(f\).

Equivalently, the \(n\)-th coefficient of \(f[g]\) is the finite sum

\[ [x^n](f[g]) = \sum _{d=0}^{n} f_d \cdot [x^n](g^d). \]
Lemma 1.162

The composition \(f[g]\) defined above equals the standard substitution operation on formal power series.

Proof

By definition.

1.6.2 Well-definedness of substitution

Proposition 1.163

Let \(f\) and \(g\) be two FPSs in \(K[[x]]\). Assume that \([x^0]g = 0\). Write \(f\) in the form \(f = \sum _{n \in \mathbb {N}} f_n x^n\) with \(f_0, f_1, f_2, \ldots \in K\). Then:

(a) For each \(n \in \mathbb {N}\), the first \(n\) coefficients of the FPS \(g^n\) are \(0\).

(b) The sum \(\sum _{n \in \mathbb {N}} f_n g^n\) is well-defined, i.e., the family \((f_n g^n)_{n \in \mathbb {N}}\) is summable.

(c) We have \([x^0](\sum _{n \in \mathbb {N}} f_n g^n) = f_0\).

Proof

(a) The FPS \(g\) has constant term \([x^0]g = 0\). Hence, by Lemma 1.128 (applied to \(a = g\)), there exists an \(h \in K[[x]]\) such that \(g = xh\). Now, let \(n \in \mathbb {N}\). From \(g = xh\), we obtain \(g^n = (xh)^n = x^n h^n\). However, Lemma 1.129 (applied to \(k = n\) and \(a = h^n\)) yields that the first \(n\) coefficients of \(x^n h^n\) are \(0\). In other words, the first \(n\) coefficients of \(g^n\) are \(0\).

(b) This follows from part (a). We must prove that for each \(n \in \mathbb {N}\), all but finitely many \(i \in \mathbb {N}\) satisfy \([x^n](f_i g^i) = 0\). Fix \(n \in \mathbb {N}\). Let \(i \in \mathbb {N}\) satisfy \(i {\gt} n\). Then \(n {\lt} i\), and by part (a) (applied to \(i\) instead of \(n\)), the first \(i\) coefficients of \(g^i\) are \(0\). Since \(n {\lt} i\), the coefficient \([x^n](g^i) = 0\). Thus \([x^n](f_i g^i) = f_i \cdot 0 = 0\).

(c) Let \(n\) be a positive integer. By part (a), the first \(n\) coefficients of \(g^n\) are \(0\), so \([x^0](g^n) = 0\) and thus \([x^0](f_n g^n) = 0\). Therefore,

\[ [x^0]\left(\sum _{n \in \mathbb {N}} f_n g^n\right) = [x^0](f_0 \underbrace{g^0}_{= 1}) + \sum _{n {\gt} 0} \underbrace{[x^0](f_n g^n)}_{= 0} = f_0. \]
Lemma 1.164

Let \(g \in K[[x]]\) with \([x^0]g = 0\). Then \(0[g] = 0\).

Proof

Immediate from the definitions: \(0[g] = \sum _{n \in \mathbb {N}} 0 \cdot g^n = 0\).

Lemma 1.165

Let \(g \in K[[x]]\) with \([x^0]g = 0\). Then \(\underline{1}[g] = \underline{1}\).

Proof

Immediate from the definitions: \(\underline{1}[g] = 1 \cdot g^0 + \sum _{n {\gt} 0} 0 \cdot g^n = 1\).

1.6.3 Lemma for first \(k\) coefficients

Lemma 1.166

Let \(f, g \in K[[x]]\) satisfy \([x^0]g = 0\). Let \(k \in \mathbb {N}\) be such that the first \(k\) coefficients of \(f\) are \(0\). Then, the first \(k\) coefficients of \(f \circ g\) are \(0\).

theorem AlgebraicCombinatorics.fps_fg_coeffs_zero {K : Type u_1} [CommRing K] (f g : PowerSeries K) (hg : PowerSeries.constantCoeff g = 0) (k : ) (hf : m < k, (PowerSeries.coeff m) f = 0) (m : ) :
Proof

We have \([x^0]g = 0\). Hence, by Lemma 1.128 (applied to \(a = g\)), there exists an \(h \in K[[x]]\) such that \(g = xh\).

Write \(f = (f_0, f_1, f_2, \ldots )\). The first \(k\) coefficients of \(f\) are \(0\), so \(f_n = 0\) for each \(n {\lt} k\). Now,

\[ f \circ g = \sum _{n \in \mathbb {N}} f_n g^n = \sum _{\substack{[n, , \in , \mathbb , {, N, }, ;, , \\ , , n, , {\lt}, , k]}} \underbrace{f_n}_{= 0} g^n + \sum _{\substack{[n, , \in , \mathbb , {, N, }, ;, , \\ , , n, , \geq , k]}} f_n \underbrace{g^n}_{= x^n h^n} = \sum _{\substack{[n, , \in , \mathbb , {, N, }, ;, , \\ , , n, , \geq , k]}} f_n x^n h^n. \]

For any \(m {\lt} k\), each term \(f_n x^n h^n\) with \(n \geq k\) satisfies \([x^m](x^n h^n) = 0\) (since \(m {\lt} k \leq n\)). Thus \([x^m](f \circ g) = 0\) for each \(m {\lt} k\).

1.6.4 Kronecker delta notation

Definition 1.167
#

If \(i\) and \(j\) are any objects, then \(\delta _{i,j}\) means the element

\[ \delta _{i,j} = \begin{cases} 1, & \text{if } i = j; \\ 0, & \text{if } i \neq j \end{cases} \]

of \(K\).

Lemma 1.168

\(\delta _{i,i} = 1\).

theorem AlgebraicCombinatorics.kroneckerDelta_self {K : Type u_1} [CommRing K] {α : Type u_2} [DecidableEq α] (i : α) :
Proof

Immediate from the definition.

Lemma 1.169

If \(i \neq j\), then \(\delta _{i,j} = 0\).

theorem AlgebraicCombinatorics.kroneckerDelta_ne {K : Type u_1} [CommRing K] {α : Type u_2} [DecidableEq α] {i j : α} (h : i j) :
Proof

Immediate from the definition.

Lemma 1.170

\(\delta _{i,j} = \delta _{j,i}\).

Proof

Case split on whether \(i = j\).

Lemma 1.171

\(\delta _{i,j} \cdot a = \begin{cases} a & \text{if } i = j, \\ 0 & \text{if } i \neq j. \end{cases}\)

theorem AlgebraicCombinatorics.kroneckerDelta_mul_left {K : Type u_1} [CommRing K] {α : Type u_2} [DecidableEq α] (i j : α) (a : K) :
kroneckerDelta i j * a = if i = j then a else 0
Proof

Case split on whether \(i = j\).

Lemma 1.172

\(a \cdot \delta _{i,j} = \begin{cases} a & \text{if } i = j, \\ 0 & \text{if } i \neq j. \end{cases}\)

theorem AlgebraicCombinatorics.kroneckerDelta_mul_right {K : Type u_1} [CommRing K] {α : Type u_2} [DecidableEq α] (i j : α) (a : K) :
a * kroneckerDelta i j = if i = j then a else 0
Proof

Case split on whether \(i = j\).

Lemma 1.173

If \(\alpha \) is a finite type, then \(\sum _{i \in \alpha } \delta _{i,j} = 1\).

theorem AlgebraicCombinatorics.sum_kroneckerDelta_left {K : Type u_1} [CommRing K] {α : Type u_2} [DecidableEq α] [Fintype α] (j : α) :
i : α, kroneckerDelta i j = 1
Proof

All terms vanish except \(i = j\), which contributes \(1\).

Lemma 1.174

If \(\alpha \) is a finite type, then \(\sum _{j \in \alpha } \delta _{i,j} = 1\).

theorem AlgebraicCombinatorics.sum_kroneckerDelta_right {K : Type u_1} [CommRing K] {α : Type u_2} [DecidableEq α] [Fintype α] (i : α) :
j : α, kroneckerDelta i j = 1
Proof

By symmetry (\(\delta _{i,j} = \delta _{j,i}\)), this reduces to Lemma 1.173.

Lemma 1.175

If \(\alpha \) is a finite type, then \(\sum _{i \in \alpha } \delta _{i,j} \cdot f(i) = f(j)\).

theorem AlgebraicCombinatorics.sum_kroneckerDelta_mul {K : Type u_1} [CommRing K] {α : Type u_2} [DecidableEq α] [Fintype α] (j : α) (f : αK) :
i : α, kroneckerDelta i j * f i = f j
Proof

By the multiplication property, \(\delta _{i,j} \cdot f(i) = 0\) for \(i \neq j\) and \(f(j)\) for \(i = j\). The sum collapses to the single term \(f(j)\).

Lemma 1.176

If \(\alpha \) is a finite type, then \(\sum _{j \in \alpha } f(j) \cdot \delta _{i,j} = f(i)\).

theorem AlgebraicCombinatorics.sum_mul_kroneckerDelta {K : Type u_1} [CommRing K] {α : Type u_2} [DecidableEq α] [Fintype α] (i : α) (f : αK) :
j : α, f j * kroneckerDelta i j = f i
Proof

Analogous to Lemma 1.175.

Lemma 1.177

For natural numbers \(n,m\in \mathbb {N}\), we have \(\delta _{n,m} = \begin{cases} 1 & \text{if } n = m, \\ 0 & \text{if } n \neq m. \end{cases}\)

Proof

This is an immediate specialization of the definition to natural numbers.

Lemma 1.178

For integers \(n,m\in \mathbb {Z}\), we have \(\delta _{n,m} = \begin{cases} 1 & \text{if } n = m, \\ 0 & \text{if } n \neq m. \end{cases}\)

Proof

This is an immediate specialization of the definition to integers.

Lemma 1.179

For a nontrivial ring \(K\) and any \(i,j\), we have \(\delta _{i,j} = 0\) if and only if \(i \neq j\).

Proof

If \(i \neq j\), then \(\delta _{i,j} = 0\) by definition. Conversely, if \(i = j\), then \(\delta _{i,j} = 1 \neq 0\) (since \(K\) is nontrivial).

Lemma 1.180

For a nontrivial ring \(K\) and any \(i,j\), we have \(\delta _{i,j} = 1\) if and only if \(i = j\).

theorem AlgebraicCombinatorics.kroneckerDelta_eq_one_iff {K : Type u_1} [CommRing K] {α : Type u_2} [DecidableEq α] [Nontrivial K] {i j : α} :
kroneckerDelta i j = 1 i = j
Proof

If \(i = j\), then \(\delta _{i,j} = 1\) by definition. Conversely, if \(i \neq j\), then \(\delta _{i,j} = 0 \neq 1\) (since \(K\) is nontrivial).

1.6.5 Laws of substitution

Proposition 1.181

Composition of FPSs satisfies the rules you would expect it to satisfy:

(a) If \(f_1, f_2, g \in K[[x]]\) satisfy \([x^0]g = 0\), then \((f_1 + f_2) \circ g = f_1 \circ g + f_2 \circ g\).

(b) If \(f_1, f_2, g \in K[[x]]\) satisfy \([x^0]g = 0\), then \((f_1 \cdot f_2) \circ g = (f_1 \circ g) \cdot (f_2 \circ g)\).

(c) If \(f_1, f_2, g \in K[[x]]\) satisfy \([x^0]g = 0\), then \(\frac{f_1}{f_2} \circ g = \frac{f_1 \circ g}{f_2 \circ g}\), as long as \(f_2\) is invertible. (In particular, \(f_2 \circ g\) is automatically invertible under these assumptions.)

(d) If \(f, g \in K[[x]]\) satisfy \([x^0]g = 0\), then \(f^k \circ g = (f \circ g)^k\) for each \(k \in \mathbb {N}\).

(e) If \(f, g, h \in K[[x]]\) satisfy \([x^0]g = 0\) and \([x^0]h = 0\), then \([x^0](g \circ h) = 0\) and \((f \circ g) \circ h = f \circ (g \circ h)\).

(f) We have \(\underline{a} \circ g = \underline{a}\) for each \(a \in K\) and \(g \in K[[x]]\).

(g) We have \(x \circ g = g \circ x = g\) for each \(g \in K[[x]]\).

(h) If \((f_i)_{i \in I} \in K[[x]]^I\) is a summable family of FPSs, and if \(g \in K[[x]]\) is an FPS satisfying \([x^0]g = 0\), then the family \((f_i \circ g)_{i \in I} \in K[[x]]^I\) is summable as well and we have \((\sum _{i \in I} f_i) \circ g = \sum _{i \in I} f_i \circ g\).

theorem AlgebraicCombinatorics.fps_subs_sum {K : Type u_1} [CommRing K] {ι : Type u_2} (s : Finset ι) (f : ιPowerSeries K) (g : PowerSeries K) (hg : PowerSeries.constantCoeff g = 0) :
PowerSeries.subst g (∑ is, f i) = is, PowerSeries.subst g (f i)
theorem AlgebraicCombinatorics.fps_subs_summable {K : Type u_1} [CommRing K] {ι : Type u_2} (f : ιPowerSeries K) (g : PowerSeries K) (hg : PowerSeries.constantCoeff g = 0) (hf : FPS.SummableFPS f) :
FPS.SummableFPS fun (i : ι) => PowerSeries.subst g (f i)
Proof

(a) Write \(f_1 = \sum _{n \in \mathbb {N}} f_{1,n} x^n\) and \(f_2 = \sum _{n \in \mathbb {N}} f_{2,n} x^n\). Then \(f_1 + f_2 = \sum _{n \in \mathbb {N}} (f_{1,n} + f_{2,n}) x^n\), so

\[ (f_1 + f_2)[g] = \sum _{n \in \mathbb {N}} (f_{1,n} + f_{2,n}) g^n = \sum _{n \in \mathbb {N}} f_{1,n} g^n + \sum _{n \in \mathbb {N}} f_{2,n} g^n = f_1[g] + f_2[g]. \]

(f) We have \(\underline{a} = \sum _{n \in \mathbb {N}} a \delta _{n,0} x^n\). Hence \(\underline{a}[g] = \sum _{n \in \mathbb {N}} a \delta _{n,0} g^n = a \cdot 1 + 0 = \underline{a}\).

(g) We have \(x = \sum _{n \in \mathbb {N}} \delta _{n,1} x^n\), so \(x[g] = \sum _{n \in \mathbb {N}} \delta _{n,1} g^n = g^1 = g\). Also, writing \(g = \sum _{n \in \mathbb {N}} g_n x^n\), we get \(g[x] = \sum _{n \in \mathbb {N}} g_n x^n = g\).

(b) Write \(f_1 = \sum _{i \in \mathbb {N}} f_{1,i} x^i\) and \(f_2 = \sum _{j \in \mathbb {N}} f_{2,j} x^j\). Then

\[ f_1[g] \cdot f_2[g] = \left(\sum _{i \in \mathbb {N}} f_{1,i} g^i\right) \left(\sum _{j \in \mathbb {N}} f_{2,j} g^j\right) = \sum _{n \in \mathbb {N}} \left(\sum _{\substack{[(, i, ,, j, ), , \in , \mathbb , {, N, }, ^, 2, ;, , \\ , , i, +, j, =, n]}} f_{1,i} f_{2,j}\right) g^n. \]

The same computation with \(g\) replaced by \(x\) gives \(f_1 \cdot f_2 = \sum _{n \in \mathbb {N}} c_n x^n\) where \(c_n = \sum _{\substack{[(, i, ,, j, ), , \in , \mathbb , {, N, }, ^, 2, ;, , \\ , , i, +, j, =, n]}} f_{1,i} f_{2,j}\). Thus \((f_1 \cdot f_2)[g] = \sum _{n \in \mathbb {N}} c_n g^n = f_1[g] \cdot f_2[g]\).

The interchange of infinite summation signs (discrete Fubini) is justified because the family \((f_{1,i} f_{2,j} g^{i+j})_{(i,j) \in \mathbb {N}^2}\) is summable: for any \(m \in \mathbb {N}\) and any \((i,j)\) with \(m {\lt} i+j\), we have \([x^m](g^{i+j}) = 0\) by Proposition 1.163 (a).

(c) Assume \(f_2\) is invertible. By part (b) applied to \(f_2^{-1}\), \((f_2^{-1} \cdot f_2) \circ g = (f_2^{-1} \circ g) \cdot (f_2 \circ g)\), so \((f_2^{-1} \circ g) \cdot (f_2 \circ g) = \underline{1} \circ g = \underline{1}\) by part (f). Hence \(f_2 \circ g\) is invertible. Then by part (b) applied to \(f_1/f_2\), \((f_1/f_2 \cdot f_2) \circ g = (f_1/f_2 \circ g) \cdot (f_2 \circ g)\), i.e., \(f_1 \circ g = (f_1/f_2 \circ g) \cdot (f_2 \circ g)\). Dividing by \(f_2 \circ g\) gives \(f_1/f_2 \circ g = (f_1 \circ g)/(f_2 \circ g)\).

(d) By induction on \(k\). Base case: \(f^0 \circ g = \underline{1} \circ g = \underline{1} = (f \circ g)^0\) by part (f). Induction step: \(f^{m+1} \circ g = (f \cdot f^m) \circ g = (f \circ g) \cdot (f^m \circ g) = (f \circ g) \cdot (f \circ g)^m = (f \circ g)^{m+1}\) by part (b).

(h) First, we show \((f_i \circ g)_{i \in I}\) is summable. Since \((f_i)_{i \in I}\) is summable, for each \(n \in \mathbb {N}\) there is a finite \(I_n \subseteq I\) such that \([x^n] f_i = 0\) for all \(i \in I \setminus I_n\). For \(i \in I \setminus (I_0 \cup I_1 \cup \cdots \cup I_n)\), the first \(n+1\) coefficients of \(f_i\) are all \(0\), so by Lemma 1.166, \([x^n](f_i \circ g) = 0\). Thus the family \((f_i \circ g)_{i \in I}\) is summable.

The equality \((\sum _{i \in I} f_i) \circ g = \sum _{i \in I} f_i \circ g\) follows by writing each \(f_i = \sum _{n \in \mathbb {N}} f_{i,n} x^n\) and interchanging summation signs (justified by summability of the family \((f_{i,n} g^n)_{(i,n) \in I \times \mathbb {N}}\)).

(e) Write \(g = \sum _{n \in \mathbb {N}} g_n x^n\). By Proposition 1.163 (c) (applied to \(g, h\)), \([x^0](g \circ h) = g_0 = [x^0]g = 0\).

For associativity: \(f \circ g = \sum _{n \in \mathbb {N}} f_n g^n\), and the family \((f_n g^n)_{n \in \mathbb {N}}\) is summable by Proposition 1.163 (b). By part (h),

\[ (f \circ g) \circ h = \left(\sum _{n \in \mathbb {N}} f_n g^n\right) \circ h = \sum _{n \in \mathbb {N}} (f_n g^n) \circ h. \]

By parts (b), (d), and (f), \((f_n g^n) \circ h = (\underline{f_n} \cdot g^n) \circ h = (\underline{f_n} \circ h) \cdot (g^n \circ h) = \underline{f_n} \cdot (g \circ h)^n = f_n (g \circ h)^n\). Hence \((f \circ g) \circ h = \sum _{n \in \mathbb {N}} f_n (g \circ h)^n = f \circ (g \circ h)\).

1.6.6 Example: Fibonacci generating function

Definition 1.182
#

The geometric series is the FPS \(\frac{1}{1-x} = (1-x)^{-1} \in K[[x]]\) (where \(K\) is a field).

Lemma 1.183

Substituting \(x + x^2\) into the geometric series \(\frac{1}{1-x}\) yields the generating function for shifted Fibonacci numbers:

\[ \frac{1}{1-x}\Big[x + x^2\Big] = \frac{1}{1 - x - x^2}. \]
Proof

By Proposition 1.181 (c), substitution preserves inverses when the original has invertible constant coefficient. Since \([x^0](1-x) = 1 \neq 0\), we have \(\frac{1}{1-x}\big[x+x^2\big] = \frac{1}{(1-x)[x+x^2]}\). Now \((1-x)[x+x^2] = 1 - (x+x^2) = 1 - x - x^2\) by the linearity of substitution, so the result follows.