F - Square Subsequence Editorial by en_translator
We denote by \(\sigma(i, c)\) the leftmost position of a lowercase English letter \(c\) occurring in the \(i\)-th and later characters of \(S\): \(S_{i+1} S_{i+2} \ldots S_N\). If the \(i\)-th and later characters do not contain \(c\), we let \(\sigma(i, c) \coloneqq \infty\).
This problem asks to find the number of (distinct) non-empty subsequences \(T\) of \(S\) that is represented as \(TT\) by a string \(T\).
As described in Sample Input 2, there can be multiple ways to extract a string from \(S\), so we need to avoid counting the same subsequence multiple times. To avoid it, we impose the following rule:
(\(\spadesuit\)): extract as left characters of \(S\) as possible
In other words, when we extract a subsequence \(TT = T_1T_2\ldots T_nT_1T_2\ldots T_n\) from \(S\),
we first focus on the first \(T\) of \(TT\):
- extract the \(p_1\)-th character of \(S\), where \(p_1 \coloneqq \sigma(0, T_1)\);
- extract the \(p_2\)-th character of \(S\), where \(p_2 \coloneqq \sigma(p_1, T_2)\);
- extract the \(p_3\)-th character of \(S\), where \(p_3 \coloneqq \sigma(p_2, T_3)\);
- \(\cdots\)
- extract the \(p_n\)-th character of \(S\), where \(p_n \coloneqq \sigma(p_{n-1}, T_n)\).
Then we focus on the second \(T\) of \(TT\):
- extract the \(q_1\)-th character of \(S\), where \(q_1 \coloneqq \sigma(p_n, T_1)\);
- extract the \(q_2\)-th character of \(S\), where \(q_2 \coloneqq \sigma(q_1, T_2)\);
- extract the \(q_3\)-th character of \(S\), where \(q_3 \coloneqq \sigma(q_2, T_3)\);
- \(\cdots\)
- extract the \(q_n\)-th character of \(S\), where \(q_n \coloneqq \sigma(q_{n-1}, T_n)\);
This rule uniquely determines the way to extract a subsequence for every substring of \(S\), so each subsequence \(T\) can be counted without duplicates. Specifically, we try to count the following number for each fixed integer \(x\):
(\(\star\)): the number of subsequences \(TT\) such that \(q_1 = x\) when extracting it from \(S\) along the rule (\(\spadesuit\)).
By taking the sum of (\(\star\)) over all possible positions \(x\) that can be \(x\).
For a fixed \(q_1 = x\), we have \(T_1 \coloneqq S_x\) and \(p_1 \coloneqq \sigma(0, T_1) = \sigma(0, S_x)\). Consider determining the \(2\)-nd and succeeding characters of \(T\) in order while extracting from \(S\) the first half \(T\) of \(TT\) starting from \(p_1\) and the second half \(T\) of \(TT\) starting from \(q_1\), in parallel. Precisely,
- We first determine which lowercase English character should be \(T_2\). Let \(p_2 = \sigma(p_1, T_2)\) and \(q_2 = \sigma(q_1, T_2)\), and extract the \(p_2\)-th and \(q_2\)-th characters of \(S\).
- Then we determine which lowercase English character should be \(T_3\). Let \(p_3 = \sigma(p_2, T_3)\) and \(q_3 = \sigma(q_2, T_3)\), and extract the \(p_3\)-th and \(q_3\)-th characters of \(S\).
- \(\cdots\)
- Then we determine which lowercase English character should be \(T_n\). Let \(p_n = \sigma(p_{n-1}, T_n)\) and \(q_n = \sigma(q_{n-1}, T_n)\), and extract the \(p_n\)-th and \(q_n\)-th characters of \(S\).
In order to find (\(\star\)), it is sufficient to count the strings \(T\) constructed by the process above such that \(\sigma(p_n, T_1) = q_1\) (a necessary condition of the rule \(\spadesuit\)).
This can be computed by a Dynamic Programming (DP). Specifically, define a DP table by
\(dp[i][j] = \) (the number of \(T = T_1T_2\ldots T_n\) such that \(p_n = i\) and \(q_n = j\));
then, for each \((i, j)\), perform the following \(26\) transitions:
\(dp[\sigma(i, c)][\sigma(j, c)] \leftarrow dp[\sigma(i, c)][\sigma(j, c)] + dp[i][j]\),
which corresponds to the choice of the \((n+1)\)-th character \(T_{n+1}\) of \(T\). (Here, we ignore \(c\) such that \(\sigma(i, c) = \infty\) or \(\sigma(j, c) = \infty\).)
The desired (\(\star\)) is obtained as the number of \(T\) such that \(\sigma(p_n, T_1) = q_1\); that is, \(\sum_{\sigma(i, S_x) = x}\sum_j dp[i][j]\).
The time complexity of this DP is \(O(N^2)\). Computing this for all positions \(x\) that can be \(q_1\) (\(O(N)\) cases), we can solve this problem in a total of \(O(N^3)\) time.
posted:
last update: