Editorial - AtCoder Beginner Contest 265

Contest Duration: 2022-08-21 21:00:00+0900 - 2022-08-21 22:40:00+0900 (local time) (100 minutes) Back to Home

Official

Ex - No-capture Lance Game Editorial by en_translator

First of all, if you want to read the editorial for this problem, we recommend you start reading from “Editorial for this problem.”

This problem can be solved if you understand transformations like Fourier transform, multi-dimensional Fourier transform, and Hadamard transform. In this editorial, we aim at having you understand

how do an “ordinary convolution (product of polynomials)” and “XOR convolution” work?

First, we will explain the preliminaries.

Discrete Fourier transform

We will first explain discrete Fourier transform.
The discrete Fourier transform of length \(N\) (we simply call it FT) transforms a given input sequence \(f_0, f_1, \dots, f_{N-1}\) to an output sequence \(F_0, F_1, \dots, F_{N-1}\), where the general term \(F_k\) is defined by the following equation. Here, \(W_N\) (or simply \(W\)) denotes the \(N\)-th root of \(1\).

\[F_k = \sum_{n=0}^{N-1} f_n W_N^{nk}.\]

The equation alone may be hard to understand, so we try explaining intuitively. In terms of competitive programming, we think it is the most convent to consider FT as a “multi-point evaluation of a polynomial.” Here, “evaluation” means obtaining \(f(a)\) given a function \(f(x)\) and a value \(a\). Specifically, we consider FT as follows:

Consider a polynomial \(f(x) = f_0 + f_1 x + f_2 x^2 + \dots + f_{N-1} x^{N-1}\).
Then, the aforementioned \(F_k\) equals \(f(W^K)\). Thus, a FT is a function that computes \(f(W^0), f(W^1), \dots, f(W^{N-1})\) given \(f(x)\).

In competitive programming, we often use FT for computing the convolution of polynomials (in the text below, we regard \(f\) as \(f(x)\) many times without mentioning), so we think it is a good idea to understand FT by this definition in order to reduce the gap between theory and practice (in competing).

Also, the inverse transformation of FT (IFT) is defined as follows. (One can verify this is indeed an inverse operation by assigning the equation of FT into \(F_k\) below and asserting that the both hand sides coincide.)

\[f_n = \frac{1}{N} \sum_{k=0}^{N-1} F_k W_N^{nk}\]

An IFT is, in terms of polynomials as we did in FT, an “interpolation of a polynomial”:

Given \(N\) evaluations \(f(W^0), f(W^1),\dots,f(W^{N-1})\) of a polynomial \(f(x) = f_0 + f_1 x + f_2 x^2 + \dots + f_{N-1} x^{N-1}\),
IFT interpolates the original function \(f(x)\) from these multi-point evaluations.

The FT and IFT of length \(N\) can be computed in a total of \(\mathrm{O}(N \log N)\) time with an algorithm called Fast Fourier Transform (FFT).
Especially, when \(N\) is a power of two, there is an algorithm with a good constant factor, which is widely used, so when it comes to FFT in competitive programming, it implies that \(N\) is a power of two.

Calculating a convolution using FT

Now we will explain convolution, the most widely-used application of FT in competitive programming.

A convolution is an operation to, “given two sequences as an input, obtain a new sequence of sum of products with the same sum of indices.” Formally speaking, given sequences \(f\) and \(g\) and a binary operation \(\circ\), the operation outputs a sequence \(h\) such that

\[h_k = \sum_{i \circ j = k} f_i g_j.\]

In AtCoder especially, we often see the case where \(\circ\) is \(+\). When \(\circ\) is \(+\), it becomes

\[h_k = \sum_{i + j = k} f_i g_j.\]

Here, considering the generating functions \(f(x) = \sum_i f_i x^i, g(x) = \sum_j g_j x^j, h(x) = \sum_k h_k x^k\) of \(f, g, h\), we have

\[h(x) = f(x) g(x).\]

In other words, when \(\circ\) is \(+\), a convolution is a product of polynomials.

Other kinds of convolution and related problems is found in ABC-Ex (H) several times. For example:

if \(\circ\) is bitwise xor: XOR convolution (See also ABC212H)
if \(\circ\) is bitwise or / and: OR convolution, AND convolution (See also ABC220H)
if \(\circ\) is \(\times\): Dirichlet convolution (See also ABC239Ex)
miscellaneous: Concave Max Plus Convolution (See also ABC218H)

In these problems, one of the main theme is an optimization of “reducing a naive \(\mathrm{O}(N^2)\) convolution to a complexity of \(\mathrm{o}(N^2)\) with some tweaks.” A number of other examples of convolution is found in “畳み込めるものまとめ” (convolution gallery) by noshi91 (Japanese).

An ordinary convolution (the product of polynomials) can be computed with FT in a total of \(\mathrm{O}(N \log N)\) time, where \(N = \deg(f) + \deg(g)\). How can we compute the convolution?

We will denote by \(\mathcal{F}_N\) (or simply by \(\mathcal{F}\)) the function that means FT of length \(N\), and the IFT by \(\mathcal{F}_N^{-1}\). Also, given sequences \(f=(f_0, f_1, \dots, f_{N-1})\) and \(g=(g_0, g_1, \dots, g_{N-1})\), we define \(f \ast g\) as follows (\(f \ast g\) is an operation that is often referred to by the “pointwise product”):

\[(f \ast g) = (f_0 g_0, f_1 g_1, \dots, f_{N-1} g_{N-1}).\]

Consider an operation of obtaining sequence \(h\) from sequences \(f\) and \(g\) by the following transformation:

\[h = \mathcal{F}^{-1} (\mathcal{F}(f) \ast \mathcal{F}(g))\]

What is the relation between \(h\) obtained by this operation and \(f,g\)?

First, let us consider an easy case, \(N=2\).

Given \(f = (f_0, f_1)\) and \(g = (g_0, g_1)\), evaluate

\[h = \mathcal{F}^{-1} (\mathcal{F}(f) \ast \mathcal{F}(g)).\]

The FT \(\mathcal{F}(f)\) of length \(2\) is an operation of obtaining \(f(W^0), f(W^1)\) from \(f(x)\). Thus, it holds that

\[\mathcal{F}(f) \ast \mathcal{F}(g) = (f(W^0) g(W^0), f(W^1)g(W^1)).\]

Applying \(\mathcal{F}^{-1}\) to both hand sides yields \(h\). The IFT \(\mathcal{F}^{-1}(F)\) was to obtain the original polynomial \(f_0 + f_1 x\) from \(F_ 0 = f(W^0),F_1 = f(W^1)\).
Thus, in terms of polynomials, finding \(h\) is equivalent to:

Find a function \(h\) of first order such that \(h(W^0) = f(W^0)g(W^0), h(W^1) = f(W^1)g(W^1)\).

If \(h\) does not need to be a function of first order, by the polynomial remainder theorem, we may express the polynomial \(h(x)\) by the following general term:

\(h(x) = f(x) g(x) + C(x)(x - W^0)(x - W^1)\), where \(C(x)\) is an arbitrary polynomial

If \(N=2\), then \(W_2=-1\), so the equation above is equivalent to

\[h(x) = f(x) g(x) + C(x) (x^2 -1).\]

Considering that \(h\) is a function of first order, we can see that \(h(x)\) equals the remainder when dividing \(f(x)g(x)\) by \(x^2-1\), that is:

\[h(x) = f(x) g(x) \bmod{(x^2 - 1)}\]

Therefore, we could find that when \(N=2\), we can express \(h\) as

\[ \begin{aligned} h(x) &= f(x) g(x) \bmod{(x^2 - 1)}\\ &= (f_0 + f_1 x)(g_0 + g_1 x) \bmod{(x^2-1)} \\ &= (f_0 g_0 + f_1 g_1) + (f_0 g_1 + f_1 g_0) x. \end{aligned} \]

The same discussion is valid for general \(N\). By the polynomial remainder theorem, the following equation of \(h\) holds (where \(C(x)\) is an arbitrary polynomial):

\[h(x) = f(x) g(x) + C(x)\prod_{i=0}^{N-1}(x-W_N^i)\]

where the second term \(\prod_{i=0}^{N-1}(x-W_N^i)\) can be proved to be

\[\prod_{i=0}^{N-1}(x-W_N^i) = x^N - 1,\]

so we obtain the relation

\[h(x) = f(x) g(x) \bmod{(x^N - 1)}.\]

(Here, \(\bmod\) is a binomial mod operation corresponding to % in C++.)

What does \(\bmod{(x^N - 1)}\) on the right hand side actually mean? Since we have

\[x^{i + N} \equiv x^i \pmod{x^N - 1},\]

the operation of “modulo \(x^N - 1\)” is equivalent to “while as there is a term \(x^d, d \geq N\), adding its coefficient to that of \(x^{d-N}\) and removing it.” Therefore, considering \(h(x) = f(x) g(x) \bmod{(x^N - 1)}\) as a sequence, its term is written as an expression of convolution:

\[h_k = \sum_{i + j \equiv k \pmod{N}} f_i g_j.\]

Such a convolution on the addition \(\mod N\) is equivalent to “a circular convolution of period \(N\).”

Summary: Circular convolution of period \(N\)

Given sequences \(f\) and \(g\) of length \(N\), the sequence \(h\) obtained by

\[h = \mathcal{F}_N^{-1} (\mathcal{F}_N(f) \ast \mathcal{F}_N(g))\]
(in other words, the sequence \(h\) obtained as the IFT of pointwise product of FTs of \(f\) and \(g\)) satisfies:

\[h_k = \sum_{i + j \equiv k \pmod{N}} f_i g_j.\]
The operation of obtaining \(h\) from \(f\) and \(g\) is called “a circular convolution of period \(N\).”

For example, the circular convolution of period \(3\) of \(f(x) = 1 + 2x^2\) and \(g(x) = 1 + 2x + 3x^2\) is \(h(x) = 5 + 8x + 5x^2\). If you want to calculate it by hand, we can do the following column multiplication:

As you can see, colloquially speaking, a circular convolution of period \(N\) is understood by the image of “plucking off the terms of degree \(N\) and higher, and adding them to the lower terms.”

Now, how can we use a circular convolution to compute the ordinary product of polynomials, \(h(x) = f(x) g(x)\)?

As we described in the figure above, a circular convolution is roughly speaking “\(h\) is a product of polynomials, where something odd happens when it has terms of degree \(N\) or higher.” Therefore, if the period \(N\) is greater than \(\deg(f) + \deg(g)\), nothing peculiar happens, and it results in an ordinary product of polynomials!
Therefore, you can find the product of polynomials by the following algorithm in a total of \(\mathrm{O}(N \log N)\) time:

The algorithm of finding the product of polynomials:

Input: \(f, g\)
Output: \(h\) such that \(h_k = \sum_{i+j=k} f_i g_j\)

Take an integer \(N\) that greater than or equal to \(|f| + |g| - 1\).

Note that the degree of a sequence of length \(f\) is not \(|f|\), but \(|f| - 1\).

Zero-pad \(f, g\) to make the lengths \(N\).

Use FFT to obtain \(h'\), the circular convolution of period \(N\) of \(f\) and \(g\).

Let \(h\) be the first \(|f| + |g| - 1\) terms of \(h\).

As a sample code, here is a quote from AtCoder Library. Here, internal::butterfly is a function corresponding to discrete FT, and internal::butterfly_inv to inverse discrete FT (without multiplying by \(\frac{1}{N}\) at last).

Source: convolution.hpp, line 198-214

template <class mint, internal::is_static_modint_t<mint>* = nullptr>
std::vector<mint> convolution_fft(std::vector<mint> a, std::vector<mint> b) {
    int n = int(a.size()), m = int(b.size());
    int z = 1 << internal::ceil_pow2(n + m - 1);
    a.resize(z);
    internal::butterfly(a);
    b.resize(z);
    internal::butterfly(b);
    for (int i = 0; i < z; i++) {
        a[i] *= b[i];
    }
    internal::butterfly_inv(a);
    a.resize(n + m - 1);
    mint iz = mint(z).inv();
    for (int i = 0; i < n + m - 1; i++) a[i] *= iz;
    return a;
}

Multi-dimensional Fourier Transform

(This chapter is not ready yet. It will be published later.)

XOR convolution and Hadamard transform

An XOR convolution is, given sequences \(f\) and \(g\) of length \(2^n\), an operation of obtaining the sequence \(h\) of length \(2^n\) such that:

\[h_k = \sum_{i \oplus j = k} f_i g_j.\]

In the context of competitive programming, XOR convolution tends to be understood independently from FT, but in fact XOR convolution is a convolution using multi-dimensional FT.

Let us consider a circular convolution of length \(2\) again.

Given sequences \(f = (f_0, f_1)\) and \(g = (g_0, g_1)\) of length \(2\), the circular convolution of them yields

\[h = (f_0 g_0 + f_1 g_1, f_0 g_1 + f_1 g_0).\]

Taking a closer look, we find that this \(h\) satisfies

\[h_k = \sum_{0 \leq i \lt 2, 0 \leq j \lt 2, i \oplus j} f_i g_j.\]

That is, a circular convolution of period \(2\) is a 1-bit XOR convolution!

Since bitwise XOR is of course bitwise independent, the \(n\)-dimensional extension of circular convolution of length \(2\) is equivalent to \(n\)-bit XOR convolution.

The \(n\)-dimensional discrete FT of length \(2\) is called Hadamard transform (Hadamard-Walsh transform), which may be more widely known.

See also: Wikipedia

A common implementation of Hadamard conversion follows. Staring at the implementation, you will see that the process coincides with multi-dimensional FFT.

template <typename T>
void walsh_hadamard_transform(vector<T>& f, bool inverse = false) {
  int n = f.size();
  for (int i = 1; i < n; i <<= 1) {
    for (int j = 0; j < n; j++) {
      if ((j & i) == 0) {
        T x = f[j], y = f[j | i];
        f[j] = x + y, f[j | i] = x - y;
      }
    }
  }
  if (inverse) {
    T invn = T{1} / T{f.size()};
    for (auto& x : f) x *= invn;
  }
}

Editorial for this problem

The gang’s all here; time to explain the problem.

First, let us consider the conditions of the initial state that the first player will win.
For each row, let us define the evaluation value corresponding to the state of the row as follows:

Suppose that the first player’s and second player’s pieces are placed at \((i, j)\) and \((i, k)\), respectively. We define the pair of evaluation values \((s_i, g_i)\) of the \(i\)-th row as follows:

if \(k \lt j\): \((s_i, g_i) = (0, j - k - 1)\);

if \(k \gt j\): \((s_i, g_i) = ((j-1) - (W-k), 0)\).

Also, let us define the evaluation value of the entire game by:

\[(S, G) = (\sum_{i=1}^H s_i, \bigoplus_{i=1}^H g_i).\]

Then, the conditions that the first player will win is:

\[(\text{The first player will win}) \iff (S \gt 0) \vee \left\lbrace (S = 0) \wedge (G \gt 0) \right\rbrace.\]

It can be justified by a careful discussion; it can also be understand by the fact that \(s_i\) corresponds to Surreal Number that is used to evaluate a partisan game, and \(g_i\) to Grundy Number that is used for a impartial game. (For the editorial of Surreal Number, see ABC229-H)

Based on this fact, the problem can be solved by a naive DP in a total of \(\mathrm{O}(H^2 W^3)\) time. Moreover, with a two-dimensional convolution that computes the ordinary convolution along the first axis and the XOR convolution on the second, it can be sped up to \(\mathrm{O}(HW^2 \log (HW))\) time.

As this can be considered as a convolution using \((\left\lceil \log_2{W} \right\rceil + 1)\)-dimensional FFT, so the algorithm we explained so far can be directly applied for the implementation.

With a more careful observation on the property of the problem, it can also be solved in about \(\mathrm{O}(HW \text{poly}(\log(HW)))\) time, but we will omit the details.

Sample code, C++

posted:
last update: