Editorial - AtCoder Regular Contest 161

Contest Duration: 2023-05-28 21:00:00+0900 - 2023-05-28 23:00:00+0900 (local time) (120 minutes) Back to Home

Official

E - Not Dyed by Majority (Cubic Graph) Editorial by evima

The intended solution for this problem is to come up with a random selection through experimentation (or ESP). Since there is a case with $N = 10$ in the sample, if you try the operation for all $2^{10}$ color sequences, you can confirm that more than half of them are suitable as solutions (cannot appear after the operation). Furthermore, if you generate and test some small input examples with $N \le 20$, you can confirm that more than half of them are suitable (and the ratio increases with respect to $N$). So, if you expect this to hold for large $N$, you can consider a solution that “randomly selects a color sequence, checks if it is suitable as a solution, and prints it (if not, try again)”, and you can actually get AC with this method.

Also, you might be able to come up with this idea by thinking, “What’s going on with the judge for this problem?”

Sample Implementation (C, 57 ms)

Judge (Determining whether a solution is suitable or not)

First, let’s think about the judge for this problem. Given a color sequence $S$, we want to determine if there is a color sequence $T$ such that the color sequence becomes $S$ after the operation. This problem can be reduced to a 2-SAT with $N$ variables and $3N$ clauses, which can be reduced to a strongly connected component decomposition of a directed graph with $2N$ vertices and $6N$ edges, and can be solved in $\mathrm{O}(N)$ time.

Let $R[i]$ denote the $i$-th character of the string $R$ (corresponding to the color of vertex $i$ in the color sequence $R$). Prepare variables $t_i \ (i = 1, 2, \dots, N)$, where the value is true if $T[i] = {}$B and false if $T[i] = {}$W. Let $a, b, c$ be the vertices adjacent to vertex $i$. If $S[i] = {}$B, at least two of $T[a], T[b], T[c]$ must be B, which can be expressed as the constraint

\[(t_a \vee t_b) \wedge (t_b \vee t_c) \wedge (t_c \vee t_a).\]

These clauses represent the condition “if any one of them is false, the remaining two must be true”, which is equivalent to “more than half are true”. If $S[i] = {}$W, prepare clauses with all variables negated, and the reduction to 2-SAT is complete.

Ratio of suitable color sequences as solutions

In this problem, the ratio of suitable color sequences as solutions is sufficiently large. Experimentally, the minimum is $\displaystyle\frac{30}{64} = 0.48675$ for $N = 6$, at least $\displaystyle\frac{3}{5}$ for $N \ge 12$, and at least $\displaystyle\frac{4}{5}$ for $N \ge 20$, and this ratio increases with respect to $N$. Theoretically, as mentioned in the supplement, it can be proven that for large $N$, more than about $\displaystyle\frac{3}{4}$ of the color sequences are suitable as solutions. Specifically, for $n = \left\lceil\displaystyle\frac{N}{46}\right\rceil$, the ratio is at least the following:

\[\frac{3}{16} \sum_{k = 1}^{n} \left(\frac{3}{4}\right)^{k-1} = \frac{3}{4}\left(1 - \left(\frac{3}{4}\right)^n\right).\]

For $n \ge 10 \ (N \ge 416)$, it is more than $0.707$, and for $n \ge 20 \ (N \ge 876)$, it is more than $0.747$.

Estimating the success probability

Let’s consider the probability of obtaining a correct answer for all test cases within the time limit. (For more details, see the supplement.) For simplicity, let’s consider inputs consisting only of cases with large $N$’s inputs consisting only of cases with small $N$’s. (For inputs with a mix of both, consider inputs whose sums of large $N$’s and small $N$’s are both around $5 \times 10^4$, which would be worse than any valid input, and evaluate the probability with about twice the time to spare.)

Cases with large $N$'s

First, let’s consider the case where $N$ is large. For $N \ge 416$, the failure probability is at most $0.293$ from the aforementioned ratio of solutions. For example, assuming (with some margin) that you will get TLE if you fail $25$ times in a row for the maximum case, the probability of such a failure is at most $0.293^{25} < 5 \times 10^{-14}$. There can be at most $\displaystyle\frac{5 \times 10^4}{416} \approx 120$ test cases with $N \ge 416$ in each input, and even if there are $100$ inputs, there are only about $1.2 \times 10^4$ cases in total. From the Union-Bound, the probability of failing $25$ times in a row for at least one of them is at most around the following:

\[5 \times 10^{-14} \times 1.2 \times 10^4 = 6 \times 10^{-10}.\]

Thus, in these cases where the input consists only of large $N$ test cases, you can obtain AC with a sufficiently high probability. (In practice, the success rate is much higher because you just need to succeed early on average within each input, and our solution ratio evaluation is loose.)

Cases with small $N$'s

Next, let’s consider the case where $N$ is small. For $N \le 46$, the theoretical lower bound of the solution ratio is $\displaystyle\frac{3}{16}$, and since the number of test cases can be large, the evaluation may not be as reassuring as in the case of large $N$. Let’s think of another method.

The total sum of $N$ in one input is at most $5 \times 10^4$, and it is sufficient to consider the maximum case. In the case of $N < 416$, there must be at least $\displaystyle\frac{5 \times 10^4}{416} \approx 120$ test cases, and by the law of large numbers, the probability of the total execution time deviating significantly from its expected value is very low. The expected execution time for each test case is $\displaystyle\frac{N}{p}$, ignoring constant factors, where $p$ is the success probability, and by the linearity of expectation, the total execution time is the sum of these expected values. As mentioned earlier, $p$ is at least $\displaystyle\frac{3}{16}$ (experimentally, at least $0.48$) even for small $N$, so $\displaystyle\frac{1}{p}$ can be considered a sufficiently small constant, and intuitively, this is enough. In the supplement, we show a more rigorous evaluation using concentration inequalities. (There may be a better method.)

(Above is a modification of a translation by GPT-4.)

posted:
last update: