Official

E - Not Dyed by Majority (Cubic Graph) Editorial by ygussany

Supplement

This is a supplement to the editorial.

When \(N\) is large, at least about \(\displaystyle\frac{3}{4}\) of the color sequences are eligible.

Take \(n\) vertices \(r_1, r_2, \dots, r_n\) from the \(N\) vertices such that each pair of them has distance at least \(5\). From the degree constraint, at most \(3 + 6 + 12 + 24 = 45\) vertices are within distance \(4\) of any fixed vertex, so a greedy choice (each pick excludes at most \(46\) vertices, including the picked vertex itself) lets us assume \(n \ge \left\lceil\displaystyle\frac{N}{46}\right\rceil\). We then show that the ratio of the eligible color sequences is at least

\[\frac{3}{16} \sum_{k = 1}^{n} \left(\frac{3}{4}\right)^{k-1} = \frac{3}{4}\left(1 - \left(\frac{3}{4}\right)^n\right).\]

This value is at least \(0.707\) when \(n \ge 10 \ (N \ge 416)\), and at least \(0.747\) when \(n \ge 20 \ (N \ge 876)\).
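These numbers are easy to sanity-check numerically (an illustrative check, not part of the proof):

```python
# Check the closed form (3/4) * (1 - (3/4)^n) against the partial sum
# (3/16) * sum_{k=1}^{n} (3/4)^(k-1), and evaluate it for n = 10 and n = 20.
def lower_bound(n: int) -> float:
    return 0.75 * (1.0 - 0.75 ** n)

def partial_sum(n: int) -> float:
    return (3.0 / 16.0) * sum(0.75 ** (k - 1) for k in range(1, n + 1))

for n in (10, 20):
    assert abs(lower_bound(n) - partial_sum(n)) < 1e-12  # geometric series identity
    print(n, round(lower_bound(n), 3))  # 0.708 for n = 10, 0.748 for n = 20
```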

Consider the directed graph whose vertex set is the set of color sequences and each of whose edges represents a transition by the operation. Then, an eligible color sequence corresponds to a vertex of in-degree \(0\). In this directed graph, exactly one edge leaves each vertex, so the number of vertices equals the number of edges. Thus, the number of vertices of in-degree \(0\) equals the sum of "the in-degree minus \(1\)" taken over all the vertices of in-degree at least \(2\). The latter corresponds to the sum of "the number of color sequences resulting in the same color sequence by the operation, minus \(1\)". Based on this observation, instead of counting eligible color sequences directly, we count as many pairs of color sequences \((S, S')\) that result in the same color sequence by the operation as possible, without creating a cycle (so that the undirected graph formed by the pairs \((S, S')\) is a forest).
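The counting identity used here (in a directed graph where every vertex has out-degree exactly \(1\), the number of vertices of in-degree \(0\) equals the sum of "in-degree minus \(1\)" over the vertices of in-degree at least \(2\)) can be checked on random functional graphs; the graphs below are arbitrary, not the actual transition graph of the problem:

```python
import random

# In a functional graph (every vertex has out-degree exactly 1),
# #(vertices with in-degree 0) == sum of (indeg - 1) over vertices with indeg >= 2.
# Reason: the sum of in-degrees equals n, so both sides equal n - #(indeg >= 1).
random.seed(0)
for _ in range(100):
    n = random.randint(1, 50)
    succ = [random.randrange(n) for _ in range(n)]  # the unique out-edge of each vertex
    indeg = [0] * n
    for v in succ:
        indeg[v] += 1
    sources = sum(1 for d in indeg if d == 0)
    excess = sum(d - 1 for d in indeg if d >= 2)
    assert sources == excess
print("identity holds on 100 random functional graphs")
```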

For \(k = 1, 2, \dots, n\), we make pairs as follows. Let \(a_k, b_k, c_k\) be the three neighbors of \(r_k\), and consider color sequences \(S\) such that

  • \(S[a_k] = S[b_k] = S[c_k]\) holds, and
  • for \(j = 1, 2, \dots, k - 1\), at least one of \(S[a_j] \neq S[b_j]\), \(S[b_j] \neq S[c_j]\), and \(S[c_j] \neq S[a_j]\) holds.

Since \(r_k\) and \(r_j \ (j \neq k)\) have distance at least \(5\), these conditions can be imposed independently. Exactly a \(\left(\displaystyle\frac{1}{4}\right)\left(\displaystyle\frac{3}{4}\right)^{k-1}\) fraction of all color sequences satisfies them, and the two conditions together guarantee that no color sequence is counted for two different values of \(k\).

For such a color sequence \(S\), let \(S'\) be the color sequence obtained by flipping the color of one of the three neighbors \(a_k, b_k, c_k\) of \(r_k\). We count the pairs \((S, S')\) such that \(S\) and \(S'\) result in the same color sequence by the operation. In every case we have \(S[a_k] = S[b_k] = S[c_k]\), while at least one of \(S'[a_k] \neq S'[b_k]\), \(S'[b_k] \neq S'[c_k]\), and \(S'[c_k] \neq S'[a_k]\) holds, so together with the second condition these pairs \((S, S')\) form no cycle even when counted for every \(k\) independently. (For each \(S\), pairs are made only for the minimum \(k\) such that \(S[a_k] = S[b_k] = S[c_k]\), so if some color sequence is reachable from another through these pairs, the connecting path is unique.)

By symmetry, we may consider only the case \(S'[a_k] \neq S[a_k]\) and multiply the result by \(3\). Let \(x, y\) be the neighbors of \(a_k\) other than \(r_k\). The operation yields the same result on \(S\) and \(S'\) if and only if, for each of \(x\) and \(y\), its two neighbors other than \(a_k\) have the same color in \(S\) (this common color then becomes the color of \(x\), resp. \(y\), after the operation, regardless of the color of \(a_k\)). The vertices \(x, y\) may coincide with \(b_k\) or \(c_k\), but neither they nor their neighbors can coincide with any of \(r_j, a_j, b_j, c_j \ (j \neq k)\). (Recall that \(r_k\) and \(r_j\) have distance at least \(5\).) Thus, among the color sequences satisfying the earlier conditions, the ratio of those also satisfying this extra condition is at least \(\left(\displaystyle\frac{1}{2}\right)^2 = \displaystyle\frac{1}{4}\), and hence at least a \(\left(\displaystyle\frac{1}{4}\right)^2\left(\displaystyle\frac{3}{4}\right)^{k-1}\) fraction of all color sequences gives pairs \((S, S')\) with the same result of the operation. Multiplying by \(3\) and summing over \(k\) yields the statement at the beginning.

Estimation of the success probability when \(N\) is small

We use the following Chernoff bound.

Let \(X\) be a random variable, and let \(s, t\) be positive reals. Then, the following inequality holds:

\[\mathrm{Pr}(X \le s) \le e^{ts} \cdot \mathbf{E}[e^{-tX}].\]

Let us consider inputs in which every test case satisfies \(N \le 416\). The sum of \(N\) is at most \(5 \times 10^4\), and it suffices to consider the maximum case. For simplicity, we consider the case when \(N\) is the same for all the test cases, i.e., \(T \approx \displaystyle\frac{5 \times 10^4}{N} \ge 120\). (Otherwise, e.g., classify \(N\) into several intervals, and for each interval \([N_\mathrm{min}, N_\mathrm{max}]\) consider the case when the input contains \(\displaystyle\frac{5 \times 10^4}{N_\mathrm{min}}\) test cases of size \(N_\mathrm{max}\). This worsens the situation, but it is still fine because the following estimation has a large margin.)

Let \(p\) be the success probability of one trial, and let \(q = 1 - p\) be the failure probability. Ignoring the constant factor, we regard the computational time of one trial as \(N\), and estimate the probability that the computational time for one input becomes substantially large. For one test case, the expected computational time until the first success is

\[\sum_{k = 1}^\infty pq^{k-1} kN = \frac{pN}{1 - q}\left(\sum_{k = 1}^\infty kq^{k-1} - \sum_{k=1}^\infty kq^k\right) = N\sum_{k=0}^\infty q^k = \frac{N}{p},\]

so by the linearity of expectation the expectation of the total computational time is \(\displaystyle\frac{TN}{p}\). In what follows, take a suitable real \(a > \displaystyle\frac{1}{p}\), suppose that the number of trials for one input is \(aT\), and estimate the probability that the number of successes is less than \(T\).
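The closed form \(N/p\) can be confirmed by truncating the series numerically; the values \(p = 3/16\) and \(N = 400\) below are merely illustrative:

```python
# Truncated evaluation of sum_{k>=1} p * q^(k-1) * k * N, which equals N / p.
p = 3.0 / 16.0   # illustrative success probability
q = 1.0 - p
N = 400          # illustrative test-case size
series = sum(p * q ** (k - 1) * k * N for k in range(1, 2000))
assert abs(series - N / p) < 1e-6  # N / p = 6400 / 3, about 2133.33
print(series)
```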

Let \(X_i \ (i = 1, 2, \dots, aT)\) be the random variable that takes \(1\) if the \(i\)-th trial succeeds and \(0\) if it fails. Then, the number of successes is \(X = X_1 + X_2 + \cdots + X_{aT}\). We have

\[\mathbf{E}[e^{-tX_i}] = pe^{-t} + q,\]

and \(X_1, X_2, \dots, X_{aT}\) are independent, so we have

\[\mathbf{E}[e^{-tX}] = \mathbf{E}\left[e^{-t\sum_{i=1}^{aT} X_i}\right] = \mathbf{E}\left[\prod_{i=1}^{aT} e^{-tX_i}\right] = \prod_{i=1}^{aT} \mathbf{E}[e^{-tX_i}] = \left(pe^{-t} + q\right)^{aT}.\]
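Since \(X\) is binomially distributed, this identity is an instance of the binomial theorem, and it can be checked exactly for small parameters (the values of \(p\), \(t\), and the number of trials below are illustrative):

```python
from math import comb, exp

# Exact check of E[exp(-t * X)] for X ~ Binomial(m, p)
# against the product form (p * e^(-t) + q)^m.
p, t, m = 3.0 / 16.0, 0.5, 30  # illustrative parameters
q = 1.0 - p
lhs = sum(comb(m, k) * p ** k * q ** (m - k) * exp(-t * k) for k in range(m + 1))
rhs = (p * exp(-t) + q) ** m
assert abs(lhs - rhs) < 1e-12
```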

By applying the Chernoff bound, we obtain

\[\mathrm{Pr}(X < T) \le \mathrm{Pr}(X \le T) \le e^{tT} \left(pe^{-t} + q\right)^{aT}.\]

We want the right-hand side of this inequality to be small, so let us consider which \(t\) attains the minimum.

Let \(f(t)\) be the right-hand side of the inequality. We then have

\[\log f(t) = tT + aT\cdot \log \left(pe^{-t} + q\right),\]

and its derivative is

\[\frac{\mathrm{d}}{\mathrm{d}t}\log f(t) = T + aT\cdot\frac{-pe^{-t}}{pe^{-t} + q}.\]

The right-hand side tends to \(T - apT < 0\) as \(t \to +0\) and to \(T > 0\) as \(t \to +\infty\), and it vanishes at the \(t^*\) satisfying

\[pe^{-t^*} = \frac{q}{a-1}.\]

Hence, this \(t^*\) minimizes \(f(t)\), and the minimum value is

\[f(t^*) = \left(\frac{p}{q}\cdot(a - 1)\right)^T\left(q\cdot\left(1 + \frac{1}{a-1}\right)\right)^{aT} = \left(pq^{a-1}a\cdot\left(1 + \frac{1}{a-1}\right)^{a-1}\right)^T.\]

When \(p = \displaystyle\frac{3}{16}\) and \(q = 1 - p = \displaystyle\frac{13}{16}\), by taking \(a = 10\) for example, the quantity inside the \(T\)-th power is about \(\displaystyle\frac{3}{4}\), and the whole value is about \(10^{-15}\) when \(T = 120\). Thus, when the input consists only of test cases with small \(N\), the probability that the number of trials exceeds \(10T\) (i.e., an average of more than \(9\) failures per test case) is negligible, so we can get AC within the time limit with a large margin.
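As a final sanity check (again illustrative, using the same parameters), the closed-form minimum and the direct evaluation of \(e^{tT}(pe^{-t} + q)^{aT}\) at \(t^*\) can be compared numerically:

```python
from math import exp, log

# Evaluate (p * q^(a-1) * a * (1 + 1/(a-1))^(a-1))^T for p = 3/16, a = 10, T = 120,
# and compare with e^(t* T) * (p * e^(-t*) + q)^(a T), where p * e^(-t*) = q / (a-1).
p, a, T = 3.0 / 16.0, 10, 120
q = 1.0 - p
base = p * q ** (a - 1) * a * (1.0 + 1.0 / (a - 1)) ** (a - 1)
t_star = -log(q / ((a - 1) * p))
direct = exp(t_star * T) * (p * exp(-t_star) + q) ** (a * T)
assert abs(base - 0.75) < 0.01                     # inside of the T-th power is about 3/4
assert abs(direct - base ** T) < 1e-6 * base ** T  # the two evaluations agree
assert 1e-16 < base ** T < 1e-14                   # the bound is about 10^-15
```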
