Official

G - Random Student ID Editorial by en_translator


We first consider the expected value of student ID of the \(i\)-th new student for a fixed \(i\). The student ID of New Student \(i\) equals the number of new students with names that is lexicographically smaller than or equal to that of New Student \(i\). So, its expected value is the sum of “the probability \(P_{i, j}\) that \(S_j\) is smaller than \(S_i\)” for all \(j = 1, 2, \ldots, N\). \(P_{i, j}\) is found as follows:

  1. If \(S_j\) is a prefix of \(S_i\) (including \(S_j = S_i\)): \(S_j\) is always lexicographically smaller than or equal to \(S_i\), so \(P_{i, j} = 1\).

  2. If \(S_i\) is a prefix of \(S_j\) (excluding \(S_j = S_i\)): \(S_j\) is always lexicographically larger than \(S_i\), so \(P_{i, j} = 0\).

  3. If neither 1. or 2. above are met: there exists \(k\) such that \(S_i\) and \(S_j\) have the same first \(k\) characters in common but have different ones as the \((k+1)\)-th characters. Thus, \(S_i\) and \(S_j\) is ordered by the precedence of their \((k+1)\)-th characters. However, the English characters are treated symmetrically, so one is larger than another with a equal probability \(1/2\). Thus, \(P_{i, j} = 1/2\).

Therefore, denoting by \(A_i\) the number of \(j\) satisfying 1. above and by \(B_j\) that of satisfying 2., the answer for New Student \(1\) is

\[ \sum_{j=1}^N P_{i, j} =A_i \times 1 + B_i \times 0 + (N-A_i-B_i) \times \frac{1}{2} = \frac{A_i - B_i + N}{2}. \]

Therefore, in order to solve this problem, it is sufficient to find \(A_i\) and \(B_i\) for all \(i = 1, 2, \ldots, N\) fast enough. This can be accomplished by representing the set of the names of the \(N\) people on a Trie to boil down the suffix/prefix relations of the strings onto the ancestor/descendant relations on Trie, and performing a Breadth-First Search rooted at the root of the Trie.

Hence the problem can be solved in a total of \(\mathrm{O}(\sum_{i=1}^N |S_i|)\) time.

posted:
last update: