Official

E - Not Dyed by Majority (Cubic Graph) Editorial by ygussany

Supplement

This is a supplement to the editorial.

When \(N\) is large, at least about \(\displaystyle\frac{3}{4}\) of the color sequences are eligible.

Take \(n\) vertices \(r_1, r_2, \dots, r_n\) from the \(N\) vertices such that each pair of them has distance at least \(5\). From the degree constraint, at most \(3 + 6 + 12 + 24 = 45\) vertices are within distance \(4\) of any fixed vertex, so a greedy choice (each pick excludes at most \(46\) vertices, including the picked vertex itself) lets us assume \(n \ge \left\lceil\displaystyle\frac{N}{46}\right\rceil\). We then show that the ratio of the eligible color sequences is at least

\[\frac{3}{16} \sum_{k = 1}^{n} \left(\frac{3}{4}\right)^{k-1} = \frac{3}{4}\left(1 - \left(\frac{3}{4}\right)^n\right).\]

This value is at least \(0.707\) when \(n \ge 10 \ (N \ge 416)\), and at least \(0.747\) when \(n \ge 20 \ (N \ge 876)\).
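These numbers are easy to sanity-check numerically (an illustrative check, not part of the proof):

```python
# Check the closed form (3/4) * (1 - (3/4)^n) against the partial sum
# (3/16) * sum_{k=1}^{n} (3/4)^(k-1), and evaluate it for n = 10 and n = 20.
def lower_bound(n: int) -> float:
    return 0.75 * (1.0 - 0.75 ** n)

def partial_sum(n: int) -> float:
    return (3.0 / 16.0) * sum(0.75 ** (k - 1) for k in range(1, n + 1))

for n in (10, 20):
    assert abs(lower_bound(n) - partial_sum(n)) < 1e-12  # geometric series identity
    print(n, round(lower_bound(n), 3))  # 0.708 for n = 10, 0.748 for n = 20
```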

Consider the directed graph whose vertex set is the set of color sequences and each of whose edges represents a transition by the operation. Then, an eligible color sequence corresponds to a vertex of in-degree \(0\). In this directed graph, exactly one edge leaves each vertex, so the number of vertices equals the number of edges. Thus, the number of vertices of in-degree \(0\) equals the sum of "the in-degree minus \(1\)" taken over all the vertices of in-degree at least \(2\). The latter corresponds to the sum of "the number of color sequences resulting in the same color sequence by the operation, minus \(1\)". Based on this observation, instead of counting eligible color sequences directly, we count as many pairs of color sequences \((S, S')\) that result in the same color sequence by the operation as possible, without creating a cycle (so that the undirected graph formed by the pairs \((S, S')\) is a forest).
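The counting identity used here (in a directed graph where every vertex has out-degree exactly \(1\), the number of vertices of in-degree \(0\) equals the sum of "in-degree minus \(1\)" over the vertices of in-degree at least \(2\)) can be checked on random functional graphs; the graphs below are arbitrary, not the actual transition graph of the problem:

```python
import random

# In a functional graph (every vertex has out-degree exactly 1),
# #(vertices with in-degree 0) == sum of (indeg - 1) over vertices with indeg >= 2.
# Reason: the sum of in-degrees equals n, so both sides equal n - #(indeg >= 1).
random.seed(0)
for _ in range(100):
    n = random.randint(1, 50)
    succ = [random.randrange(n) for _ in range(n)]  # the unique out-edge of each vertex
    indeg = [0] * n
    for v in succ:
        indeg[v] += 1
    sources = sum(1 for d in indeg if d == 0)
    excess = sum(d - 1 for d in indeg if d >= 2)
    assert sources == excess
print("identity holds on 100 random functional graphs")
```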

For \(k = 1, 2, \dots, n\), we make pairs as follows. Let \(a_k, b_k, c_k\) be the three neighbors of \(r_k\), and consider color sequences \(S\) such that

  • \(S[a_k] = S[b_k] = S[c_k]\) holds, and
  • for \(j = 1, 2, \dots, k - 1\), at least one of \(S[a_j] \neq S[b_j]\), \(S[b_j] \neq S[c_j]\), and \(S[c_j] \neq S[a_j]\) holds.

Since \(r_k\) and \(r_j \ (j \neq k)\) have distance at least \(5\), these conditions can be imposed independently. Exactly a \(\left(\displaystyle\frac{1}{4}\right)\left(\displaystyle\frac{3}{4}\right)^{k-1}\) fraction of all color sequences satisfies them, and the two conditions together guarantee that no color sequence is counted for two different values of \(k\).

For such a color sequence \(S\), let \(S'\) be the color sequence obtained by flipping the color of one of the three neighbors \(a_k, b_k, c_k\) of \(r_k\). We count the pairs \((S, S')\) such that \(S\) and \(S'\) result in the same color sequence by the operation. In every case we have \(S[a_k] = S[b_k] = S[c_k]\), while at least one of \(S'[a_k] \neq S'[b_k]\), \(S'[b_k] \neq S'[c_k]\), and \(S'[c_k] \neq S'[a_k]\) holds, so together with the second condition these pairs \((S, S')\) form no cycle even when counted for every \(k\) independently. (For each \(S\), pairs are made only for the minimum \(k\) such that \(S[a_k] = S[b_k] = S[c_k]\), so if some color sequence is reachable from another through these pairs, the connecting path is unique.)

By symmetry, we may consider only the case \(S'[a_k] \neq S[a_k]\) and multiply the result by \(3\). Let \(x, y\) be the neighbors of \(a_k\) other than \(r_k\). The operation yields the same result on \(S\) and \(S'\) if and only if, for each of \(x\) and \(y\), its two neighbors other than \(a_k\) have the same color in \(S\) (this common color then becomes the color of \(x\), resp. \(y\), after the operation, regardless of the color of \(a_k\)). The vertices \(x, y\) may coincide with \(b_k\) or \(c_k\), but neither they nor their neighbors can coincide with any of \(r_j, a_j, b_j, c_j \ (j \neq k)\). (Recall that \(r_k\) and \(r_j\) have distance at least \(5\).) Thus, among the color sequences satisfying the earlier conditions, the ratio of those also satisfying this extra condition is at least \(\left(\displaystyle\frac{1}{2}\right)^2 = \displaystyle\frac{1}{4}\), and hence at least a \(\left(\displaystyle\frac{1}{4}\right)^2\left(\displaystyle\frac{3}{4}\right)^{k-1}\) fraction of all color sequences gives pairs \((S, S')\) with the same result of the operation. Multiplying by \(3\) and summing over \(k\) yields the statement at the beginning.

Estimation of the success probability when \(N\) is small

We use the following Chernoff bound.

Let \(X\) be a random variable, and let \(s, t\) be positive reals. Then, the following inequality holds:

\[\mathrm{Pr}(X \le s) \le e^{ts} \cdot \mathbf{E}[e^{-tX}].\]

Let us consider inputs in which every test case satisfies \(N \le 416\). The sum of \(N\) is at most \(5 \times 10^4\), and it suffices to consider the maximum case. For simplicity, we consider the case when \(N\) is the same for all the test cases, i.e., \(T \approx \displaystyle\frac{5 \times 10^4}{N} \ge 120\). (Otherwise, e.g., classify \(N\) into several intervals, and for each interval \([N_\mathrm{min}, N_\mathrm{max}]\) consider the case when the input contains \(\displaystyle\frac{5 \times 10^4}{N_\mathrm{min}}\) test cases of size \(N_\mathrm{max}\). This worsens the situation, but it is still fine because the following estimation has a large margin.)

Let \(p\) be the success probability of one trial, and let \(q = 1 - p\) be the failure probability. Ignoring the constant factor, we regard the computational time of one trial as \(N\), and estimate the probability that the computational time for one input becomes substantially large. For one test case, the expected computational time until the first success is

\[\sum_{k = 1}^\infty pq^{k-1} kN = \frac{pN}{1 - q}\left(\sum_{k = 1}^\infty kq^{k-1} - \sum_{k=1}^\infty kq^k\right) = N\sum_{k=0}^\infty q^k = \frac{N}{p},\]

so by the linearity of expectation the expectation of the total computational time is \(\displaystyle\frac{TN}{p}\). In what follows, take a suitable real \(a > \displaystyle\frac{1}{p}\), suppose that the number of trials for one input is \(aT\), and estimate the probability that the number of successes is less than \(T\).
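The closed form \(N/p\) can be confirmed by truncating the series numerically; the values \(p = 3/16\) and \(N = 400\) below are merely illustrative:

```python
# Truncated evaluation of sum_{k>=1} p * q^(k-1) * k * N, which equals N / p.
p = 3.0 / 16.0   # illustrative success probability
q = 1.0 - p
N = 400          # illustrative test-case size
series = sum(p * q ** (k - 1) * k * N for k in range(1, 2000))
assert abs(series - N / p) < 1e-6  # N / p = 6400 / 3, about 2133.33
print(series)
```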

Let \(X_i \ (i = 1, 2, \dots, aT)\) be the random variable that takes \(1\) if the \(i\)-th trial succeeds and \(0\) if it fails. Then, the number of successes is \(X = X_1 + X_2 + \cdots + X_{aT}\). We have

\[\mathbf{E}[e^{-tX_i}] = pe^{-t} + q,\]

and \(X_1, X_2, \dots, X_{aT}\) are independent, so we have

\[\mathbf{E}[e^{-tX}] = \mathbf{E}\left[e^{-t\sum_{i=1}^{aT} X_i}\right] = \mathbf{E}\left[\prod_{i=1}^{aT} e^{-tX_i}\right] = \prod_{i=1}^{aT} \mathbf{E}[e^{-tX_i}] = \left(pe^{-t} + q\right)^{aT}.\]
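Since \(X\) is binomially distributed, this identity is an instance of the binomial theorem, and it can be checked exactly for small parameters (the values of \(p\), \(t\), and the number of trials below are illustrative):

```python
from math import comb, exp

# Exact check of E[exp(-t * X)] for X ~ Binomial(m, p)
# against the product form (p * e^(-t) + q)^m.
p, t, m = 3.0 / 16.0, 0.5, 30  # illustrative parameters
q = 1.0 - p
lhs = sum(comb(m, k) * p ** k * q ** (m - k) * exp(-t * k) for k in range(m + 1))
rhs = (p * exp(-t) + q) ** m
assert abs(lhs - rhs) < 1e-12
```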

By applying the Chernoff bound, we obtain

\[\mathrm{Pr}(X < T) \le \mathrm{Pr}(X \le T) \le e^{tT} \left(pe^{-t} + q\right)^{aT}.\]

We want the right-hand side of this inequality to be small, so let us consider which \(t\) attains the minimum.

Let \(f(t)\) be the right-hand side of the inequality. We then have

\[\log f(t) = tT + aT\cdot \log \left(pe^{-t} + q\right),\]

and its derivative is

\[\frac{\mathrm{d}}{\mathrm{d}t}\log f(t) = T + aT\cdot\frac{-pe^{-t}}{pe^{-t} + q}.\]

The right-hand side tends to \(T - apT < 0\) as \(t \to +0\) and to \(T > 0\) as \(t \to +\infty\), and it vanishes at the \(t^*\) satisfying

\[pe^{-t^*} = \frac{q}{a-1}.\]

Hence, this \(t^*\) minimizes \(f(t)\), and the minimum value is

\[f(t^*) = \left(\frac{p}{q}\cdot(a - 1)\right)^T\left(q\cdot\left(1 + \frac{1}{a-1}\right)\right)^{aT} = \left(pq^{a-1}a\cdot\left(1 + \frac{1}{a-1}\right)^{a-1}\right)^T.\]

When \(p = \displaystyle\frac{3}{16}\) and \(q = 1 - p = \displaystyle\frac{13}{16}\), by taking \(a = 10\) for example, the quantity inside the \(T\)-th power is about \(\displaystyle\frac{3}{4}\), and the whole value is about \(10^{-15}\) when \(T = 120\). Thus, when the input consists only of test cases with small \(N\), the probability that the number of trials exceeds \(10T\) (i.e., an average of more than \(9\) failures per test case) is negligible, so we can get AC within the time limit with a large margin.
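As a final sanity check (again illustrative, using the same parameters), the closed-form minimum and the direct evaluation of \(e^{tT}(pe^{-t} + q)^{aT}\) at \(t^*\) can be compared numerically:

```python
from math import exp, log

# Evaluate (p * q^(a-1) * a * (1 + 1/(a-1))^(a-1))^T for p = 3/16, a = 10, T = 120,
# and compare with e^(t* T) * (p * e^(-t*) + q)^(a T), where p * e^(-t*) = q / (a-1).
p, a, T = 3.0 / 16.0, 10, 120
q = 1.0 - p
base = p * q ** (a - 1) * a * (1.0 + 1.0 / (a - 1)) ** (a - 1)
t_star = -log(q / ((a - 1) * p))
direct = exp(t_star * T) * (p * exp(-t_star) + q) ** (a * T)
assert abs(base - 0.75) < 0.01                     # inside of the T-th power is about 3/4
assert abs(direct - base ** T) < 1e-6 * base ** T  # the two evaluations agree
assert 1e-16 < base ** T < 1e-14                   # the bound is about 10^-15
```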
