1.6 Substitution and evaluation of power series
1.6.1 Defining substitution
Let \(f\) and \(g\) be two FPSs in \(K[[x]]\). Assume that \([x^0]g = 0\) (that is, \(g = g_1 x + g_2 x^2 + g_3 x^3 + \cdots \) for some \(g_1, g_2, g_3, \ldots \in K\)).
We then define an FPS \(f[g] \in K[[x]]\) as follows:
Write \(f\) in the form \(f = \sum _{n \in \mathbb {N}} f_n x^n\) with \(f_0, f_1, f_2, \ldots \in K\). (That is, \(f_n = [x^n]f\) for each \(n \in \mathbb {N}\).) Then, set
(This sum is well-defined, as we will see in Proposition 1.163 (b) below.)
This FPS \(f[g]\) is also denoted by \(f \circ g\), and is called the composition of \(f\) with \(g\), or the result of substituting \(g\) for \(x\) in \(f\).
Equivalently, the \(n\)-th coefficient of \(f[g]\) is the finite sum
- AlgebraicCombinatorics.fps_comp f g _hg = PowerSeries.subst g f
The composition \(f[g]\) defined above equals the standard substitution operation on formal power series.
By definition.
1.6.2 Well-definedness of substitution
Let \(f\) and \(g\) be two FPSs in \(K[[x]]\). Assume that \([x^0]g = 0\). Write \(f\) in the form \(f = \sum _{n \in \mathbb {N}} f_n x^n\) with \(f_0, f_1, f_2, \ldots \in K\). Then:
(a) For each \(n \in \mathbb {N}\), the first \(n\) coefficients of the FPS \(g^n\) are \(0\).
(b) The sum \(\sum _{n \in \mathbb {N}} f_n g^n\) is well-defined, i.e., the family \((f_n g^n)_{n \in \mathbb {N}}\) is summable.
(c) We have \([x^0](\sum _{n \in \mathbb {N}} f_n g^n) = f_0\).
(a) The FPS \(g\) has constant term \([x^0]g = 0\). Hence, by Lemma 1.128 (applied to \(a = g\)), there exists an \(h \in K[[x]]\) such that \(g = xh\). Now, let \(n \in \mathbb {N}\). From \(g = xh\), we obtain \(g^n = (xh)^n = x^n h^n\). However, Lemma 1.129 (applied to \(k = n\) and \(a = h^n\)) yields that the first \(n\) coefficients of \(x^n h^n\) are \(0\). In other words, the first \(n\) coefficients of \(g^n\) are \(0\).
(b) This follows from part (a). We must prove that for each \(n \in \mathbb {N}\), all but finitely many \(i \in \mathbb {N}\) satisfy \([x^n](f_i g^i) = 0\). Fix \(n \in \mathbb {N}\). Let \(i \in \mathbb {N}\) satisfy \(i {\gt} n\). Then \(n {\lt} i\), and by part (a) (applied to \(i\) instead of \(n\)), the first \(i\) coefficients of \(g^i\) are \(0\). Since \(n {\lt} i\), the coefficient \([x^n](g^i) = 0\). Thus \([x^n](f_i g^i) = f_i \cdot 0 = 0\).
(c) Let \(n\) be a positive integer. By part (a), the first \(n\) coefficients of \(g^n\) are \(0\), so \([x^0](g^n) = 0\) and thus \([x^0](f_n g^n) = 0\). Therefore,
Let \(g \in K[[x]]\) with \([x^0]g = 0\). Then \(0[g] = 0\).
Immediate from the definitions: \(0[g] = \sum _{n \in \mathbb {N}} 0 \cdot g^n = 0\).
Let \(g \in K[[x]]\) with \([x^0]g = 0\). Then \(\underline{1}[g] = \underline{1}\).
Immediate from the definitions: \(\underline{1}[g] = 1 \cdot g^0 + \sum _{n {\gt} 0} 0 \cdot g^n = 1\).
1.6.3 Lemma for first \(k\) coefficients
Let \(f, g \in K[[x]]\) satisfy \([x^0]g = 0\). Let \(k \in \mathbb {N}\) be such that the first \(k\) coefficients of \(f\) are \(0\). Then, the first \(k\) coefficients of \(f \circ g\) are \(0\).
We have \([x^0]g = 0\). Hence, by Lemma 1.128 (applied to \(a = g\)), there exists an \(h \in K[[x]]\) such that \(g = xh\).
Write \(f = (f_0, f_1, f_2, \ldots )\). The first \(k\) coefficients of \(f\) are \(0\), so \(f_n = 0\) for each \(n {\lt} k\). Now,
For any \(m {\lt} k\), each term \(f_n x^n h^n\) with \(n \geq k\) satisfies \([x^m](x^n h^n) = 0\) (since \(m {\lt} k \leq n\)). Thus \([x^m](f \circ g) = 0\) for each \(m {\lt} k\).
1.6.4 Kronecker delta notation
If \(i\) and \(j\) are any objects, then \(\delta _{i,j}\) means the element
of \(K\).
\(\delta _{i,i} = 1\).
Immediate from the definition.
If \(i \neq j\), then \(\delta _{i,j} = 0\).
Immediate from the definition.
\(\delta _{i,j} = \delta _{j,i}\).
Case split on whether \(i = j\).
\(\delta _{i,j} \cdot a = \begin{cases} a & \text{if } i = j, \\ 0 & \text{if } i \neq j. \end{cases}\)
Case split on whether \(i = j\).
\(a \cdot \delta _{i,j} = \begin{cases} a & \text{if } i = j, \\ 0 & \text{if } i \neq j. \end{cases}\)
Case split on whether \(i = j\).
If \(\alpha \) is a finite type, then \(\sum _{i \in \alpha } \delta _{i,j} = 1\).
All terms vanish except \(i = j\), which contributes \(1\).
If \(\alpha \) is a finite type, then \(\sum _{j \in \alpha } \delta _{i,j} = 1\).
By symmetry (\(\delta _{i,j} = \delta _{j,i}\)), this reduces to Lemma 1.173.
If \(\alpha \) is a finite type, then \(\sum _{i \in \alpha } \delta _{i,j} \cdot f(i) = f(j)\).
By the multiplication property, \(\delta _{i,j} \cdot f(i) = 0\) for \(i \neq j\) and \(f(j)\) for \(i = j\). The sum collapses to the single term \(f(j)\).
If \(\alpha \) is a finite type, then \(\sum _{j \in \alpha } f(j) \cdot \delta _{i,j} = f(i)\).
Analogous to Lemma 1.175.
For natural numbers \(n,m\in \mathbb {N}\), we have \(\delta _{n,m} = \begin{cases} 1 & \text{if } n = m, \\ 0 & \text{if } n \neq m. \end{cases}\)
This is an immediate specialization of the definition to natural numbers.
For integers \(n,m\in \mathbb {Z}\), we have \(\delta _{n,m} = \begin{cases} 1 & \text{if } n = m, \\ 0 & \text{if } n \neq m. \end{cases}\)
This is an immediate specialization of the definition to integers.
For a nontrivial ring \(K\) and any \(i,j\), we have \(\delta _{i,j} = 0\) if and only if \(i \neq j\).
If \(i \neq j\), then \(\delta _{i,j} = 0\) by definition. Conversely, if \(i = j\), then \(\delta _{i,j} = 1 \neq 0\) (since \(K\) is nontrivial).
For a nontrivial ring \(K\) and any \(i,j\), we have \(\delta _{i,j} = 1\) if and only if \(i = j\).
If \(i = j\), then \(\delta _{i,j} = 1\) by definition. Conversely, if \(i \neq j\), then \(\delta _{i,j} = 0 \neq 1\) (since \(K\) is nontrivial).
1.6.5 Laws of substitution
Composition of FPSs satisfies the rules you would expect it to satisfy:
(a) If \(f_1, f_2, g \in K[[x]]\) satisfy \([x^0]g = 0\), then \((f_1 + f_2) \circ g = f_1 \circ g + f_2 \circ g\).
(b) If \(f_1, f_2, g \in K[[x]]\) satisfy \([x^0]g = 0\), then \((f_1 \cdot f_2) \circ g = (f_1 \circ g) \cdot (f_2 \circ g)\).
(c) If \(f_1, f_2, g \in K[[x]]\) satisfy \([x^0]g = 0\), then \(\frac{f_1}{f_2} \circ g = \frac{f_1 \circ g}{f_2 \circ g}\), as long as \(f_2\) is invertible. (In particular, \(f_2 \circ g\) is automatically invertible under these assumptions.)
(d) If \(f, g \in K[[x]]\) satisfy \([x^0]g = 0\), then \(f^k \circ g = (f \circ g)^k\) for each \(k \in \mathbb {N}\).
(e) If \(f, g, h \in K[[x]]\) satisfy \([x^0]g = 0\) and \([x^0]h = 0\), then \([x^0](g \circ h) = 0\) and \((f \circ g) \circ h = f \circ (g \circ h)\).
(f) We have \(\underline{a} \circ g = \underline{a}\) for each \(a \in K\) and \(g \in K[[x]]\).
(g) We have \(x \circ g = g \circ x = g\) for each \(g \in K[[x]]\).
(h) If \((f_i)_{i \in I} \in K[[x]]^I\) is a summable family of FPSs, and if \(g \in K[[x]]\) is an FPS satisfying \([x^0]g = 0\), then the family \((f_i \circ g)_{i \in I} \in K[[x]]^I\) is summable as well and we have \((\sum _{i \in I} f_i) \circ g = \sum _{i \in I} f_i \circ g\).
(a) Write \(f_1 = \sum _{n \in \mathbb {N}} f_{1,n} x^n\) and \(f_2 = \sum _{n \in \mathbb {N}} f_{2,n} x^n\). Then \(f_1 + f_2 = \sum _{n \in \mathbb {N}} (f_{1,n} + f_{2,n}) x^n\), so
(f) We have \(\underline{a} = \sum _{n \in \mathbb {N}} a \delta _{n,0} x^n\). Hence \(\underline{a}[g] = \sum _{n \in \mathbb {N}} a \delta _{n,0} g^n = a \cdot 1 + 0 = \underline{a}\).
(g) We have \(x = \sum _{n \in \mathbb {N}} \delta _{n,1} x^n\), so \(x[g] = \sum _{n \in \mathbb {N}} \delta _{n,1} g^n = g^1 = g\). Also, writing \(g = \sum _{n \in \mathbb {N}} g_n x^n\), we get \(g[x] = \sum _{n \in \mathbb {N}} g_n x^n = g\).
(b) Write \(f_1 = \sum _{i \in \mathbb {N}} f_{1,i} x^i\) and \(f_2 = \sum _{j \in \mathbb {N}} f_{2,j} x^j\). Then
The same computation with \(g\) replaced by \(x\) gives \(f_1 \cdot f_2 = \sum _{n \in \mathbb {N}} c_n x^n\) where \(c_n = \sum _{\substack{[(, i, ,, j, ), , \in , \mathbb , {, N, }, ^, 2, ;, , \\ , , i, +, j, =, n]}} f_{1,i} f_{2,j}\). Thus \((f_1 \cdot f_2)[g] = \sum _{n \in \mathbb {N}} c_n g^n = f_1[g] \cdot f_2[g]\).
The interchange of infinite summation signs (discrete Fubini) is justified because the family \((f_{1,i} f_{2,j} g^{i+j})_{(i,j) \in \mathbb {N}^2}\) is summable: for any \(m \in \mathbb {N}\) and any \((i,j)\) with \(m {\lt} i+j\), we have \([x^m](g^{i+j}) = 0\) by Proposition 1.163 (a).
(c) Assume \(f_2\) is invertible. By part (b) applied to \(f_2^{-1}\), \((f_2^{-1} \cdot f_2) \circ g = (f_2^{-1} \circ g) \cdot (f_2 \circ g)\), so \((f_2^{-1} \circ g) \cdot (f_2 \circ g) = \underline{1} \circ g = \underline{1}\) by part (f). Hence \(f_2 \circ g\) is invertible. Then by part (b) applied to \(f_1/f_2\), \((f_1/f_2 \cdot f_2) \circ g = (f_1/f_2 \circ g) \cdot (f_2 \circ g)\), i.e., \(f_1 \circ g = (f_1/f_2 \circ g) \cdot (f_2 \circ g)\). Dividing by \(f_2 \circ g\) gives \(f_1/f_2 \circ g = (f_1 \circ g)/(f_2 \circ g)\).
(d) By induction on \(k\). Base case: \(f^0 \circ g = \underline{1} \circ g = \underline{1} = (f \circ g)^0\) by part (f). Induction step: \(f^{m+1} \circ g = (f \cdot f^m) \circ g = (f \circ g) \cdot (f^m \circ g) = (f \circ g) \cdot (f \circ g)^m = (f \circ g)^{m+1}\) by part (b).
(h) First, we show \((f_i \circ g)_{i \in I}\) is summable. Since \((f_i)_{i \in I}\) is summable, for each \(n \in \mathbb {N}\) there is a finite \(I_n \subseteq I\) such that \([x^n] f_i = 0\) for all \(i \in I \setminus I_n\). For \(i \in I \setminus (I_0 \cup I_1 \cup \cdots \cup I_n)\), the first \(n+1\) coefficients of \(f_i\) are all \(0\), so by Lemma 1.166, \([x^n](f_i \circ g) = 0\). Thus the family \((f_i \circ g)_{i \in I}\) is summable.
The equality \((\sum _{i \in I} f_i) \circ g = \sum _{i \in I} f_i \circ g\) follows by writing each \(f_i = \sum _{n \in \mathbb {N}} f_{i,n} x^n\) and interchanging summation signs (justified by summability of the family \((f_{i,n} g^n)_{(i,n) \in I \times \mathbb {N}}\)).
(e) Write \(g = \sum _{n \in \mathbb {N}} g_n x^n\). By Proposition 1.163 (c) (applied to \(g, h\)), \([x^0](g \circ h) = g_0 = [x^0]g = 0\).
For associativity: \(f \circ g = \sum _{n \in \mathbb {N}} f_n g^n\), and the family \((f_n g^n)_{n \in \mathbb {N}}\) is summable by Proposition 1.163 (b). By part (h),
By parts (b), (d), and (f), \((f_n g^n) \circ h = (\underline{f_n} \cdot g^n) \circ h = (\underline{f_n} \circ h) \cdot (g^n \circ h) = \underline{f_n} \cdot (g \circ h)^n = f_n (g \circ h)^n\). Hence \((f \circ g) \circ h = \sum _{n \in \mathbb {N}} f_n (g \circ h)^n = f \circ (g \circ h)\).
1.6.6 Example: Fibonacci generating function
The geometric series is the FPS \(\frac{1}{1-x} = (1-x)^{-1} \in K[[x]]\) (where \(K\) is a field).
Substituting \(x + x^2\) into the geometric series \(\frac{1}{1-x}\) yields the generating function for shifted Fibonacci numbers:
By Proposition 1.181 (c), substitution preserves inverses when the original has invertible constant coefficient. Since \([x^0](1-x) = 1 \neq 0\), we have \(\frac{1}{1-x}\big[x+x^2\big] = \frac{1}{(1-x)[x+x^2]}\). Now \((1-x)[x+x^2] = 1 - (x+x^2) = 1 - x - x^2\) by the linearity of substitution, so the result follows.