Official

G - Balls and Boxes Editorial by en_translator


This problem is an introductory problem for the technique called generating functions and Formal Power Series (FPS). We hope that you make use of this problem and editorial to learn the concept of generating functions.

What are generating functions?

G.f. is the projection of combinatorial structures. — Elegia

Simple put, a generating function (GF) is a function-like form that stores the information of a sequence.
For an infinite sequence \(a_0, a_1, a_2, \dots\), define its GF \(A(x)\) as:

\[A(x) = a_0 + a_1 x + a_2 x^2 + \dots = \sum_{i=0}^\infty a_i x^i.\]

In fact, there are several kinds of GFs. The definition above is the most general definition of a GF. For distinction, this is sometimes called an ordinary generating function (OGF).

Additionally, a polynomial with possibly infinite terms, like \(A(x)\), is called a formal power series (FPS). (Although GF and FPS are confused in the context of competitive programming, but these are different concepts as you see.) Also, we define an operation symbol \(\lbrack x^n \rbrack\) as the operation to obtain a coefficient of an FPS:

\[\lbrack x^n \rbrack A(x) := a_n.\]

(Addendum that may be skipped) Here we briefly mention the topic that might strikes you later. Strictly speaking, an FPS is not a function, where you can assign a value to \(x\). Roughly speaking, \(x\) is nothing but a symbol. This is not a meaningless symbol though; under the definition of multiplication, \(x\) and \(x^n\) can be properly defined. This treatment of FPS allows us to ignore analytical discussions like radius of convergence, and FPS can be simply regarded as an algebraic concept on which we can define the addition and multiplication. What we are interested in is not the value \(A(x)\) of a function, but the coefficients \(a_n\) of \(A(x)\), so there is no issue here.

Now we have described that a GF is simply defined by a sequence. But GFs are not merely an inorganic concept that can be automatically obtained from a sequence; GF can rather be interpreted as a “projection” of combinatoric objects, which we want to count (like graphs, strings, or operation sequences), according to how we want to count them. The method of counting objects based on represents of combinatoric structures as GFs are very widely applicable. By this high applicability, combinatoric ideas using GFs have become widely known in the field of competitive programming.

In the following chapter, we describe why a GF can be regarded as a “projection,” and how objects to be counted and their sets relate to GFs.

Relation between ODF and combinatoric objects

This chapter is a modified version of ABC230 Editorial.

For clarity, in the following editorial we will consistently use the following cases and typefaces for sequences, GFs, and sets, described as follows:

  • Sequence \(\to\) lowercase italic (\(a, b, c, \dots\))
  • GF \(\to\) uppercase italic (\(A, B, C, \dots\))
  • Set \(\to\) uppercase calligraphic (\(\mathcal{A}, \mathcal{B}, \mathcal{C}, \dots\) )

In combinatoric problems, there are always “objects” (like graphs, strings, operation sequences, …) to be counted, and the “collections” (sets) of objects. When this set has a good property, sets can be replaced with GFs.

For example, let us first consider the following easy problem.

Takahashi decided to buy some number of bentos (a “bento” is a Japanese lunch box).
Cafeteria A sells \(a_i\) number of bentos for \(i\) yen (the currency in Japan) each.
Cafeteria B sells \(b_j\) number of bentos for \(j\) yen each.

(1) Takahashi decided to choose either Cafeteria A or B and buy a lunch box there. What are the possible combinations?
(2) Takahashi decided to two bentos from both of the cafeterias. What are the possible combinations? (3) Consider the GF of “the number \(a_n\) of combinations of bentos that cost a total of \(n\) yen.” Express the GF of (1) and (2) as GFs.

For example, let us formulate the following case: suppose that Cafeteria A sells two bentos, one for \(1\) yen and the other for \(2\) yen, and Cafeteria B sells two bentos, one for \(1\) en and the other for \(3\) yen.
Let us use a notation like \(1\) for a bento from Cafeteria A, and \(1^\ast\) for Cafeteria B. Then, the sets \(\mathcal{A}\) and \(\mathcal{B}\) of bentos of Cafeteria A and B are:

\[ \mathcal{A} = \lbrace 1, 2 \rbrace , \mathcal{B} = \lbrace 1^\ast , 3^\ast \rbrace. \]

Next, let us express the objects to count in (1) and (2) as sets, too.
The operation in (1) corresponds to an operation of merging two disjoint sets, and the operation in (2) corresponds to an operation of obtaining a combination by choosing one element from each of given sets.
In a broad sense, these operations can be regarded as a direct sum (a disjoint union) and a direct product of sets, so let us denote the operation of (1) by \(+\), and that of (2) by \(\times\).
In the example above, \(\mathcal{A} + \mathcal{B}\) and \(\mathcal{A} \times \mathcal{B}\) can be expressed as follows.

\[ \begin{aligned} \mathcal{A} + \mathcal{B} &= \lbrace 1,2,1^\ast, 3^\ast \rbrace \\ \mathcal{A} \times \mathcal{B} &= \lbrace 1, 2 \rbrace \times \lbrace 1^\ast , 3^\ast \rbrace \\ &= \lbrace (1,1^\ast), (1,3^\ast), (2,1^\ast), (2,3^\ast) \rbrace \end{aligned} \]

Next, let us replace them with the GFs.

Let \(A(x)\) and \(B(x)\) be the GFs of \(\mathcal{A}\) and \(\mathcal{B}\); then

\[ A(x) = \sum_i a_i x^i, B(x) = \sum_j b_j x^j.\]

The GFs corresponding to (1) and (2) can be actually described with \(A(x)\) and \(B(x)\).
First, consider \(\mathcal{A} + \mathcal{B}\). If \(\mathcal{A}\) and \(\mathcal{B}\) are disjoint, then

  • (the number of combinations in \(\mathcal{A}\) for a total of \(n\) yen) + (the number of combinations in \(\mathcal{B}\) for a total of \(n\) yen) = (the number of combinations in \(\mathcal{A} + \mathcal{B}\) for a total of \(n\) yen),

so, denoting by \(c_i\) the answer for a total of \(i\) yen, it holds that

\[ c_i = a_i + b_i.\]

By this equation, the GF \(C(x)\) for (1) is obtained as

\[ \begin{aligned} C(x) &= \sum_{i} (a_i + b_i) x^i \\ &= A(x) + B(x). \end{aligned} \]

Next, \(\mathcal{A} \times \mathcal{B}\). Let \(d_i\) be the answer for the total of \(d_i\) yen in (2), then

\[ d_k = \sum_{i + j = k} a_i b_j,\]

so it is derived that

\[ \begin{aligned} D(x) &= \sum_k \left(\sum_{i+j = k} a_i b_j\right) x^k \\ &= \sum_i a_i x^i \sum_j b_j x^j \\ &= A(x) \times B(x). \end{aligned} \]

In conclusion,

\[\mathcal{C} = \mathcal{A} + \mathcal{B} \iff C(x) = A(x) + B(x);\]

\[\mathcal{D} = \mathcal{A} \times \mathcal{B} \iff D(x) = A(x) \times B(x).\]

For example, for the case where \(\mathcal{A} = \lbrace 1, 2 \rbrace ,\mathcal{B} = \lbrace 1^\ast , 3^\ast \rbrace\),

\[ A(x) = x + x^2, B(x) = x + x^3;\]

\[C(x) = A(x) + B(x) = 2x + x^2 + x^3;\]

\[D(x) = A(x) \times B(x) = x^2 + x^3 + x^4 + x^5.\]

Comparing with the example above, it can be asserted that \(C(x)\) and \(D(x)\) are the GFs of \(\mathcal{A} + \mathcal{B}\) and \(\mathcal{A} \times \mathcal{B}\), respectively.

This example may have been too obvious that it is unclear why they are useful, so next we will introduce a bit more useful example.

Takahashi is going to climb a staircase, advancing one or two steps at once. For \(N = 0,1,\dots, n\), find the number of ways to climb a staircase of \(N\) steps.

Consider expressing “a way of climbing a staircase” a sequence. A sequence satisfying the condition consists of \(1\) or \(2\), and has a length at least \(0\). Therefore, the set of counted objects \(\mathcal{F}\) can be expressed as

\[\mathcal{F} = \lbrace \emptyset, (1), (2), (1, 1), (1, 2),(2,1), (2,2), \dots \rbrace.\]

Here, let us classify the elements of \(\mathcal{F}\) by whether he advanced \(1\) step or \(2\) steps in his first move. Then, by the recursive property of \(\mathcal{F}\), it can be described as

\[ \begin{aligned} &\mathcal{F} \\ &= \lbrace \emptyset \rbrace + \lbrace (1) \rbrace \times \lbrace \emptyset, (1), (2),\dots, \rbrace + \lbrace (2) \rbrace \times \lbrace \emptyset, (1), (2),\dots, \rbrace \\ &= \lbrace \emptyset \rbrace + \lbrace (1) \rbrace \times \mathcal{F} + \lbrace (2) \rbrace \times \mathcal{F}. \end{aligned} \]

Next, focusing on the “number of steps”, let us express \(\mathcal{F}\) as a GF. If we represent the number of steps as the exponent of \(x\), \(\emptyset\) can be replaced with \(x^0 = 1\), \((1)\) with \(x\), and \((2)\) with \(x^2\), so the GF \(F(x)\) corresponding to \(\mathcal{F}\) is written as

\[ \begin{aligned} &F(x) = 1 + xF(x) + x^2 F(x) \\ &\to F(x) = \frac{1}{1 - x - x^2}. \end{aligned} \]

Indeed, the series expansion of \(\frac{1}{1 - x - x^2}\) is

\[F(x) = 1 + x + 2x^2 + 3x^3 + 5x^4 + 8x^5 + \dots,\]

so it can be asserted that the desired sequence \((f_n)\) is equal to the Fibonacci sequence,

\[ (f_n) = 1, 1, 2, 3, 5, 8, 13, 21, \dots.\]

The two examples imply that a GF is a minimum container that stores only necessary information (this time, like price or number of steps) of objects to be counted.
If the counted objects have a certain good property, the size of set of objects can be expressed by the elementary operations like additions and multiplications of GFs, which turn out to be computable. One of the advantages of using GFs is that such operations can be performed mechanically.

Combinatorial “structure” and function composition

This chapter is a modified version of ABC230 Editorial.

The previous chapter introduces only the following two operations: \(+\) and \(\times\). Now let us introduce a little more large “building block”.

A set of counted object is often defined on a combinatorial “structure” (e.g. on a sequence, circumference, set, …). Such structures can be treated as a map from a set to another, \(\mathrm{C}(\mathcal{A})\).

It is worth noting that, if we know what kind of function \(\mathrm{C}\) is, then we can immediately illustrate the set of \(\mathrm{C}(\mathcal{A})\) by assigning \(\mathcal{A}\) to \(\mathrm{C}\).
So, no matter how complex structure the counted object has, once we know what kind of sets it is composed of, we may “decompose” it into some simple components so that the observation is drastically simplified.

As an easy example of “structure”, for some set \(\mathcal{A}\), consider a structure \(\mathrm{SEQ}(\mathcal{A})\) corresponding to “a sequence consisting of elements of \(\mathcal{A}\) with zero or more length.” A sequence of length \(i\) corresponds to \(\mathcal{A}^i\), so \(\mathrm{SEQ}(\mathcal{A})\) can be expressed as follows, where \(\mathcal{E}\) denotes the set of empty sequence:

\[\mathrm{SEQ}(\mathcal{A}) = \mathcal{E} + \mathcal{A} + \mathcal{A} \times \mathcal{A} + \mathcal{A} \times \mathcal{A} \times \mathcal{A} + \dots = \sum_{i=0}^\infty \mathcal{A}^i.\]

In the world of GF, it is transformed to

\[S(x) = 1 + A(x) + A(x)^2 + \dots = \sum_{i=0}^\infty A(x) = \frac{1}{1 - A(x)}\]

by the formula of geometric series.

Let’s apply the structure \(\mathrm{SEQ}(\mathcal{A})\) to the problem. The previous example of Fibonacci sequence can be processed by assigning to \(\mathrm{SEQ}(\mathcal{A})\), too. “A way of climbing a staircase” can be regarded as a sequence of moves of length zero ore more, and the GF of operation \(\mathcal{A}\) of “advancing one or two steps” is

\[A(x) = x + x^2.\]

By assigning it to the formula above, the GF \(F(x)\) of Fibonacci sequence can be immediately obtained as

\[F(x) = \frac{1}{1 - A(x)} = \frac{1}{1 - x - x^2}.\]

As you can see, when the objects to be counted is composed of some “structures”, if we know the function of each of the “structures”, we can immediately obtain the GF of the object to be counted by composing these functions.

Exponential Generating Function

When considering a GF of a sequence, sometimes the GF representation of \(a_0, a_1, \dots, a_n, \dots, \) itself can be complex, while that of \(\dfrac{a_0}{0!}, \dfrac{a_1}{1!}, \dots, \dfrac{a_n}{n!},\dots\) is relatively easy to handle. Since this often happens, this type of GFs are named exponential generating function (EGF). That is, the following expression \(\hat{A}(x)\) is an EGF:

\[\hat{A}(x) = \sum_{n=0}^\infty \frac{a_n}{n!} x^n\]

Define an operation symbol \(\left\lbrack \dfrac{x^n}{n!} \right\rbrack\) as the operation to obtain a coefficient of an EGF:

\[\left\lbrack \dfrac{x^n}{n!} \right\rbrack \hat{A}(x) := a_n\]

An important property of an EGF is that the convolution of EGF is a weighted convolution of an EGF; that is:

\[ \begin{aligned} &\left\lbrack \dfrac{x^k}{k!} \right\rbrack \hat{A}(x) \hat{B}(x) \\ &= \left\lbrack \dfrac{x^k}{k!} \right\rbrack \left(\sum_{i=0}^\infty \frac{a_i}{i!}x^i\right) \left(\sum_{j=0}^\infty \frac{b_j}{j!}x^j\right)\\ &= k! \sum_{i+j=k} \frac{a_i}{i!} \frac{b_j}{j!}\\ &= \sum_{i+j=k} \binom{k}{i} a_i b_j. \end{aligned} \]

We have gone through the properties of an EGF. Next, we will introduce the relations between EGFs and combinatoric structures, and why EGFs have a good property.

Relations between EGFs and combinatoric structures

As mentioned above, the addition and multiplication on an OGF corresponds to the disjoint union and direct product of sets. In fact, the addition and multiplication of the addition and multiplication of a EGF can be interpreted as the disjoint union and direct product of sets of labeled objects. We will take a closer look on what it means.

First, define a labeled object \(a\) as follows:

  • \(a\) is an object that contains \(\vert a \vert\) “containers” (like vertices of a graph). The container of \(a\) is labeled \(1, 2, \dots, |a|\).

Next, define an operation of merging labeled objects \(a\) and \(b\) to obtain a new set of objects, \(a \otimes b\).

There are \(\binom{|a| + |b|}{|a|}\) ways to divide \(|a|+|b|\) containers labeled \(1, 2, \dots, |a| + |b|\) into two sets, one of size \(|a|\) and the other of \(|b|\). For each division, generate a new object obtained by copying an object \(a\) to the former set and an object \(b\) to the latter set. Let \(a \otimes b\) be the set of the \(\binom{|a| + |b|}{|a|}\) objects obtained this way.

For example, let \(a\) be three cells painted red, and \(b\) be one cell painted blue. If we denote red by \(\text{R}\), blue by \(\text{B}\), and correspond the \(i\)-th character of a string to the cell labeled \(i\), then \(a \otimes b\) can be written as

\[a \otimes b = \lbrace \text{RRRB, RRBR, RBRR, BRRR}\rbrace.\]

This must be a natural definition of a union of labeled objects.

Now, let us express a merging operation of objects as a GF by considering the number of containers in an object. In case of ordinary unlabeled objects, a direct product of size-\(1\) set is again a size-\(1\) set, so the representation is straightforward:

\[x^{|a| + |b|} = x^{|a|} \times x^{|b|}.\]

However, this time, one pair of objects generates \(\binom{|a| + |b|}{|a|}\), so the expression above does not work well.

Instead, let us consider whether we can define weights on a GF. That is, consider defining a weight \(w(|a|)\) on a size-\(|a|\) object and correspond it to a GF of the form \(w(|a|) x^{|a|}\).

Which function is appropriate for \(w\)? The desired function \(w\) must satisfy

\[\binom{|a| + |b|}{|a|} w(|a| + |b|) x^{|a|+|b|} = w(|a|) x^{|a|} \times w(|b|) x^{|b|}.\]

Since \(\binom{|a| + |b|}{|a|} = \frac{(|a|+|b|)!}{|a|!|b|!}\), we can prove that \(w(n) = \frac{1}{n!}\) works universally.
That is, by corresponding a labeled object \(a\) to

\[\frac{x^{|a|}}{|a|!},\]

the composition of objects,

\[\binom{|a| + |b|}{|a|} = |a \otimes b|,\]

can be represented using GFs as

\[\binom{|a| + |b|}{|a|} \frac{x^{|a|+|b|}}{(|a|+|b|)!} = \frac{x^{|a|}}{|a|!} \times \frac{x^{|b|}}{|b|!}.\]

Since this is an identity, everything works fine.

So far, we have considered only compositions of a pair of labeled objects, but this definition can be extended to a direct product of a pair of sets of paired objects.

Define the GF of a set of labeled object \(\mathcal{A}\) as:

\[A(x) = \sum_{a \in \mathcal{A}} \frac{x^{|a|}}{|a|!} = \sum_{n=0}^\infty \frac{a_n}{n!} x^n.\]

Then, for two disjoint labeled objects \(\mathcal{A}\) and \(\mathcal{B}\), define

\[\mathcal{A} + \mathcal{B} = \mathcal{A} \cup \mathcal{B},\]

\[\mathcal{A} \times \mathcal{B} = \bigcup_{a \in \mathcal{A}, b \in \mathcal{B}} a \otimes b.\]

Then one can assert that

\[\mathcal{C} = \mathcal{A} + \mathcal{B} \iff C(x) = A(x) + B(x),\]

\[\mathcal{D} = \mathcal{A} \times \mathcal{B} \iff D(x) = A(x) \times B(x).\]

Here we see that, while ODFs and EGFs look different at a glance, these are actually similar concept in that the addition and multiplication correspond to the disjoint union and direct product operation of a set.

This concludes the explanation of combinatoric background of GFs. We hope that this editorial helps understanding GFs.

The answer to the original problem

Now let us return to the original problem. It turns out that Problem 1 asks to count (unlabeled) objects, and Problem 2 asks to count labeled objects. Therefore, one can adopt OGF for Problem 1 and EGF for Problem 2.

Specifically, the answer to Problem 1 is

\[\lbrack x^N \rbrack F_1(x) F_2(x) F_3(x)\]

where

\[F_1(x) = \sum_{i=0}^\infty x^{iA}, F_2(x) = \sum_{i=0}^\infty x^{iB}, F_3(x) = \sum_{i=0}^\infty x^{iC},\]

and the answer to Problem 2 is

\[\left\lbrack \frac{x^N}{N!} \right\rbrack G_1(x) G_2(x) G_3(x)\]

where

\[\lbrack x^N \rbrack F_1(x) F_2(x) F_3(x).\]

The convolution of GFs modulo \(998244353\) can be computed using a convolution library like atcoder::convolution in \(\mathrm{O}(n \log n)\) terms, where \(n\) is the number of terms in GFs. Therefore, one can compute the first \((N+1)\) terms of \(F_1(x), F_2(x), F_3(x), G_1(x), G_2(x)\), and \(G_3(x)\) and convolve them to solve our problems in a total of \(\mathrm{O}(N \log N)\) time, which is fast enough.

  • Sample code (C++)
#include <iostream>
#include <vector>
using namespace std;

#include "atcoder/convolution.hpp"
#include "atcoder/modint.hpp"
using mint = atcoder::modint998244353;

int main() {
  int N, A, B, C;
  cin >> N >> A >> B >> C;
  vector<mint> fac(N + 1), invfac(N + 1);
  fac[0] = 1;
  for (int i = 1; i <= N; i++) fac[i] = fac[i - 1] * i;
  invfac[N] = fac[N].inv();
  for (int i = N; i >= 1; i--) invfac[i - 1] = invfac[i] * i;

  {
    vector<mint> F1(N + 1), F2(N + 1), F3(N + 1);
    for (int i = 0; i <= N; i++) {
      if (i % A == 0) F1[i] = 1;
      if (i % B == 0) F2[i] = 1;
      if (i % C == 0) F3[i] = 1;
    }
    auto mid = atcoder::convolution(F1, F2);
    auto F = atcoder::convolution(mid, F3);
    cout << F[N].val() << "\n";
  }

  {
    vector<mint> G1(N + 1), G2(N + 1), G3(N + 1);
    for (int i = 0; i <= N; i++) {
      if (i % A == 0) G1[i] = invfac[i];
      if (i % B == 0) G2[i] = invfac[i];
      if (i % C == 0) G3[i] = invfac[i];
    }
    auto mid = atcoder::convolution(G1, G2);
    auto G = atcoder::convolution(mid, G3);
    cout << (G[N] * fac[N]).val() << "\n";
  }
}

posted:
last update: