
E - Procrastinate Counter Editorial by evima


When computing \(f(P)\), even though the changes to \(P\) during the operations look complicated, the number of times each element is moved to the end in fact has a simple structure.

For example, suppose \(P_N = 1\). In that case, \(f(P)\) is always \((0,1,2,\dots,N-1)\). Let us show why. First, consider the steps until \(1\) reaches the front. Each of the first \(N-1\) elements \(P_1, P_2, \dots, P_{N-1}\) is moved to the end exactly once during this process. Among them, the value \(2\) will not be moved until the sequence looks like \((2,1,\dots)\), so by the time \(1\) arrives at the front, the element at the end of the sequence must be \(2\). In subsequent steps, the \(1\) at the beginning can clearly be ignored, so the rest of the sequence can be seen as a permutation whose smallest element is at the end, and from here the argument proceeds by induction.

Generalizing this observation, we see the following about how \(f(P)\) is computed:

Let \(k\) be the largest integer such that \((1,2,\dots,k)\) is a subsequence of \(P\). Let \(A_1 < A_2 < \dots < A_k\) be the indices in \(P\) with \(P_{A_i} = i\ (i=1,2,\dots,k)\). For convenience, set \(A_0 = 0, A_{k+1} = N+1\). Then, for each \(i\), the order in which the elements of \(P\) at indices \(A_{i-1}+1\) through \(A_i - 1\) appear in the original permutation has no effect on \(f(P)\).

To see this, note that until \(1\) reaches the front, the first \(A_1 - 1\) elements will all be moved to the end exactly once. Once \(1\) first reaches the front, the end of the sequence must be the minimum among the first \(A_1 - 1\) elements, and we can apply induction.

Based on this, an algorithm to compute \(f(P)\) is:

  1. Let \(k\) be the largest integer such that \((1,2,\dots,k)\) is a subsequence of \(P\). Let \(A_1 < A_2 < \dots < A_k\) be the indices in \(P\) with \(P_{A_i} = i\ (i=1,2,\dots,k)\). For convenience, set \(A_0 = 0, A_{k+1} = N+1\).
  2. For each \(i=1,2,\dots,k+1\), let \(S_i\) be the set of values in \(P\) from index \(A_{i-1}+1\) to \(A_i\) (for \(i=k+1\), the range ends at index \(N\)). Put \(S_1,S_2,\dots,S_{k+1}\) into a queue in this order.
  3. For \(x=1,2,\dots,N\):
    • Repeatedly pop from the front of the queue until you pop the set that contains \(x\). Remove \(x\) from the union of the popped sets and push the resulting set into the back of the queue.
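Steps 1–3 can be sketched as follows. This is only an illustration: the exact definition of \(f(P)\) is in the problem statement, and the sketch assumes \(f(P)_x\) is the number of times the value \(x\) has been moved to the end, counting one move for every element of a merged set each time it is pushed to the back of the queue. Under this interpretation the code reproduces the example above, e.g. \(f(P)=(0,1,\dots,N-1)\) whenever \(P_N=1\).

```python
from collections import deque

def f(P):
    """Compute f(P) by the queue-of-sets procedure (steps 1-3 above)."""
    N = len(P)
    # Step 1: the largest k with (1, 2, ..., k) a subsequence of P,
    # together with the 1-based indices A_1 < ... < A_k of 1, 2, ..., k.
    A = [0]                              # sentinel A_0 = 0
    for idx, v in enumerate(P, start=1):
        if v == len(A):                  # v is the next value we are looking for
            A.append(idx)
    A.append(N + 1)                      # sentinel A_{k+1} = N + 1
    # Step 2: S_i = values at 1-based indices A_{i-1}+1 .. A_i, queued in order.
    queue = deque(set(P[A[i - 1]:A[i]]) for i in range(1, len(A)))
    # Step 3: remove x = 1, ..., N in order, tracking moves to the end.
    moves = {x: 0 for x in P}
    result = []
    for x in range(1, N + 1):
        union = set()
        while x not in union:            # pop until we pop the set containing x
            union |= queue.popleft()
        result.append(moves[x])          # f(P)_x = moves of x before its removal
        union.discard(x)
        for y in union:                  # pushing to the back = one more move
            moves[y] += 1
        queue.append(union)
    return result
```

For instance, \(f((2,3,1)) = f((3,2,1)) = (0,1,2)\), illustrating that the order of the elements strictly between consecutive \(A_i\) does not affect \(f(P)\).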

When analyzing the possible outcomes of \(f(P)\), we can ignore which elements are in each set in the queue, and retroactively decide which set \(x\) belongs to. Hence, we can focus on how many times each set has been moved to the end. By observing the operations, you see that the move counts in the queue take the shape \((t,t,\dots,t,(t \text{ or } t+1),t+1,\dots,t+1)\) (the \(t \text{ or } t+1\) set occurs when the set for \(x = K + 1\) gets merged). Summarizing these points, an algorithm to generate the possible values of \(f(P)\) other than \((0,0,\dots,0)\) is:

  1. Choose an integer \(K\) with \(1 \leq K \leq N-1\). Initialize \(C = (0,0,\dots,0,1)\) and \(Q = ((1,1),(1,1),\dots,(1,1),(1,2))\). Here, there are \(K\) zeros in \(C\), and \(K\) pairs \((1,1)\) in \(Q\).
  2. Repeat until \(C\) has length \(N\):
    • Choose an integer \(k\) with \(1 \leq k \leq |Q|\). Let the first and \(k\)-th elements of \(Q\) be \((a,b)\) and \((c,d)\), respectively. Append either \(c\) or \(d\) to \(C\). Then, pop the first \(k\) elements from \(Q\), and push \((a+1,d+1)\) onto the back.

Here, the contents of \(Q\) will look like \(((t,t),(t,t),\dots,(t,t),(t,t+1),(t+1,t+1),\dots,(t+1,t+1))\). Let \([t,n,m]\) denote the state where there are \(n\) pairs \((t,t)\) and \(m\) pairs \((t+1,t+1)\) in the queue.
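For small \(N\), the nondeterministic procedure above can be enumerated exhaustively, which is useful for checking the analysis. A minimal sketch (`possible_outcomes` is an illustrative name; it assumes the choice of \(k\) in step 2 ranges over \(1 \leq k \leq |Q|\)):

```python
def possible_outcomes(N):
    """Enumerate every C the generator can produce, over all K and all choices."""
    results = set()
    for K in range(1, N):
        # Step 1: K zeros and a one in C; K pairs (1,1) and one (1,2) in Q.
        stack = [((0,) * K + (1,), ((1, 1),) * K + ((1, 2),))]
        # Step 2: depth-first search over all choices of k and of c vs. d.
        while stack:
            C, Q = stack.pop()
            if len(C) == N:
                results.add(C)
                continue
            a, b = Q[0]
            for k in range(1, len(Q) + 1):
                c, d = Q[k - 1]
                newQ = Q[k:] + ((a + 1, d + 1),)
                for val in {c, d}:       # append either c or d to C
                    stack.append((C + (val,), newQ))
    return results
```

For \(N=3\) this yields \(\{(0,0,1),(0,1,1),(0,1,2)\}\), which together with \((0,0,0)\) matches the four values of \(f(P)\) obtained from the six permutations directly.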

Consider counting distinct outcomes when we generate \(n\) terms from the state \([t,k,0]\). If we want the first term to be \(t+1\), it is clearly advantageous to choose \((t,t+1)\). Now, consider when we want the first \(i\) terms to be \(t\). For the first \(i-1\) terms, it is advantageous to choose the first \((t,t)\). The question is whether to take the \(t\) for the \(i\)-th term from the first \((t,t)\) or from \((t,t+1)\). By default, let us take the \(t\) from \((t,t+1)\) (Option 1), and consider which sequences \(C\) can be generated only by taking it from \((t,t)\) (Option 2).

First, after \(i\) additions of \(t\), the state is \([t,k-i,i]\); besides \(t\), we have no choice but to append \(t+1\). Clearly, the number of distinct sequences \(C\) that can be generated from the state \([val,x,0]\) increases when \(x\) increases, so it is advantageous to choose \(t+1\) from \((t,t+1)\), and the state will transition to \([t+1,i,0]\). Now, we have three cases based on the number of times we append \(t+1\):

  • \(i-2\) times or fewer: Suppose we append \(t+1\) exactly \(j \leq i-2\) times. We transition from \([t+1,i,0]\) to \([t+2,j-1,0]\) or \([t+1,i-j,j]\rightarrow [t+2,j,0]\), but either state can also be reached via Option 1, e.g. \([t,k,0]\rightarrow [t+1,i-1,0] \rightarrow [t+1,i-2,0]\rightarrow\cdots\), since \(j \leq i-2\). Thus, the sequences \(C\) that can be generated in this case can also be generated via Option 1.

  • \(i-1\) times: In this case, we can transition from \([t+1,i,0]\) to \([t+2,i-2,0]\) or \([t+1,1,i-1]\rightarrow [t+2,i-1,0]\); the former state can also be reached via Option 1, so we only consider the latter. Whatever can be generated from this state by \(i-2\) or fewer additions of \(t+2\) can also be generated via Option 1, by the same argument as above, so we ignore that case. For \(i-1\) additions of \(t+2\), we similarly only need to consider the transition \([t+2,0,i-1]\rightarrow [t+3,i-1,0]\), which is the same situation as before. Therefore, the sequences \(C\) that can only be generated via Option 2 are those obtained by appending \(i\) copies of each of \(t+1,t+2,\dots,t+a-1\) and then \(i+1\) copies of \(t+a\), after which the state is \([t+a+1,i-1,0]\), and then continuing the algorithm.

  • \(i\) or \(i+1\) times: In this case, the resulting \(C\) cannot be generated via Option 1. For \(i+1\) additions, the state always transitions to \([t+2,i,0]\); for \(i\) additions, we must again consider both Options 1 and 2.

Therefore, if we let

\(dp[0][k][n]:\) the number of possible outcomes when generating \(n\) terms from the state \([t,k,0]\), and

\(dp[1][k][n]:\) the number of possible outcomes when generating \(n\) terms from the state \([t,k,0]\) after choosing Option 2,

we have the following for \(0 < k\):

\[\displaystyle dp[0][k][n] = dp[0][0][n-1] + \sum_{i=1}^{k+1} dp[0][i-1][n-i] + \sum_{i=1}^{k}dp[1][i][n-i-1],\]

\[\displaystyle dp[1][k][n] = \left(\sum_{2\leq i} dp[0][k-1][n-ik]\right) + (dp[0][k-1][n-k] + dp[1][k][n-k-1]) + dp[0][k][n-k-1].\]

This allows us to compute the answer in \(O(N^3)\) time, which is sufficiently fast under the constraints of the problem.
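For validating such a DP on small inputs, one can brute-force \(f(P)\) over all \(N!\) permutations and count distinct outcomes. The sketch below re-implements the queue-of-sets computation of \(f(P)\) described earlier, under the same assumption that \(f(P)_x\) counts how many times the value \(x\) is moved to the end, and assumes the quantity being counted is the number of distinct values of \(f(P)\) (`count_distinct_outcomes` is an illustrative name):

```python
from collections import deque
from itertools import permutations

def f(P):
    # Queue-of-sets computation of f(P), as described earlier in the editorial.
    N = len(P)
    A = [0]                              # sentinel A_0 = 0
    for idx, v in enumerate(P, start=1):
        if v == len(A):                  # indices of the subsequence 1, 2, ..., k
            A.append(idx)
    A.append(N + 1)                      # sentinel A_{k+1} = N + 1
    queue = deque(set(P[A[i - 1]:A[i]]) for i in range(1, len(A)))
    moves = {x: 0 for x in P}
    result = []
    for x in range(1, N + 1):
        union = set()
        while x not in union:            # pop until we pop the set containing x
            union |= queue.popleft()
        result.append(moves[x])
        union.discard(x)
        for y in union:                  # pushing to the back = one more move
            moves[y] += 1
        queue.append(union)
    return tuple(result)

def count_distinct_outcomes(N):
    # Number of distinct values of f(P) over all permutations of (1, ..., N).
    return len({f(p) for p in permutations(range(1, N + 1))})
```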

Furthermore, taking differences yields the following:

\[dp[0][k][n] = dp[0][k-1][n] + dp[0][k][n-k-1] + dp[1][k][n-k-1]\]

\[\displaystyle dp[1][k][n] = (dp[1][k][n-k] - dp[1][k][n-2k-1] - dp[0][k][n-2k-1] + dp[0][k-1][n-k]) + dp[1][k][n-k-1] + dp[0][k][n-k-1]\]
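For completeness, here is how the second identity is obtained (the first follows by subtracting the recurrence for \(dp[0][k-1][n]\) from that for \(dp[0][k][n]\)). Substituting \(n-k\) for \(n\) in the original recurrence for \(dp[1]\) gives

\[\displaystyle dp[1][k][n-k] = \left(\sum_{2\leq i} dp[0][k-1][n-k-ik]\right) + dp[0][k-1][n-2k] + dp[1][k][n-2k-1] + dp[0][k][n-2k-1],\]

and since \(\sum_{2\leq i} dp[0][k-1][n-ik] = dp[0][k-1][n-2k] + \sum_{2\leq i} dp[0][k-1][n-k-ik]\) (shift the summation index by one), combining the two gives

\[\displaystyle \sum_{2\leq i} dp[0][k-1][n-ik] = dp[1][k][n-k] - dp[1][k][n-2k-1] - dp[0][k][n-2k-1].\]

Substituting this into the original recurrence for \(dp[1][k][n]\) yields the stated formula.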

The final value we want is \(\sum_{i}dp[0][i][N-2-i]\). We can accelerate such DP computations similarly to how partition-number DP is handled (see https://degwer.hatenablog.com/entry/20170829 (Japanese)). That is, if we let \(B=\lceil \sqrt{N} \rceil\) and \(sdp[p][t][s] = \sum_{B\leq x} dp[p][x][s-tx]\), then only the values with \(t \leq B\) are needed, so it suffices to compute \(dp[p][k][n]\) for \(k \leq B\), leading to an \(O(N \sqrt{N})\) time solution.
