Official

D - Distinct Elements on Subsegments Editorial by antontrygubO_o


For an array \(A\) of length \(N+K-1\), define binary sequences \(L\) and \(R\) of length \(N+K-1\), as follows:

  • For \(1 \le i \le N+K-1\), let \(R_i\) be \(1\) if there exists an element equal to \(A_i\) in the nearest \(K-1\) elements to the right of it (that is, in the range \(A[i+1, min(i+K-1, N+K-1)]\). Otherwise, let it be \(0\).

  • Similarly, let \(L_i\) be \(1\) if there exists an element equal to \(A_i\) in the nearest \(K-1\) elements to the left of it (that is, in the range \(A[max(1, i - K+1), i-1]\). Otherwise, let it be \(0\).

Now, let’s make a few observations about these values.

  • For \(1 \le i \le N-1\) holds \(B_i - (1 - R_i) = B_{i+1} - (1 - L_{i+K}) = \) the number of distinct elements on the segment \(A[i+1:i+K-1]\). Equivalently, \(L_{i+K} - R_i = B_i - B_{i+1}\).

  • \(B_1 = K - L_1 - L_2 - \ldots - L_K\). Indeed, the number of distinct elements on this segment is just the number of \(1 \le i \le K\) with \(L_i = 1\).

  • Similarly, \(B_K = K - R_N - R_{N+1} - \ldots - R_{N+K-1}\).

Now, consider all positions where \(R\) is equal to \(1\), and all positions where \(L\) is equal to \(1\). Then:

  • The number of such positions in \(L\) and in \(R\) are the same. Denote them \(X_1, X_2, \ldots, X_T\) for \(R\) and \(Y_1, Y_2, \ldots, Y_T\) for \(L\).

  • For each \(i\), \(1 \le Y_i - X_i \le K-1\).

To see this, consider any element that appears in the sequence \(A\), suppose that it appears at positions \(P_1 < P_2 < \ldots < P_W\). Then, for each \(1 \le i \le W-1\), if \(P_{i+1} - P_i > K-1\), we would have \(R_{P_i} = L_{P_{i+1}} = 0\), else we would have \(R_{P_i} = L_{P_{i+1}} = 1\). So, all ones in \(X\) and \(Y\) can be split into pairs of form \((L_i, R_j)\) with \(1 \le j-i \le K-1\). It’s easy to see that these conditions would remain after considering the positions of ones in the sorted order.

Now, I claim, that, for given array \(B\), if the sequences \(L\) and \(R\) satisfying all \(5\) conditions above exist, then we can construct the corresponding array \(A\). This is actually very easy to see: given such sequences \(L, R\), construct elements of \(A\) one by \(1\). If \(L_i\) is \(0\), just make \(A_i\) equal to some new element that hasn’t appeared among the previous \(K-1\) elements. Otherwise \(i = Y_j\) for some \(j\). Then just set \(A_{Y_j}\) to \(A_{X_j}\).

Now we have to check whether such sequences \(L\) and \(R\) even exist. What freedom do we actually have in choosing \(L, R\)?

If \(B_i>B_{i+1}\) or \(B_i<B_{i+1}\), then we can deduce \(R_i\) and \(L_{i+K}\). If \(B_i = B_{i+1} = k\), then we have to set \(R_i = L_{i+K} = 0\), otherwise we have a choice: to set them both to \(0\) or to \(1\). I claim that we can simply set them both to \(1\).

Proof: suppose that there exist some \(L, R\) satisfying all of the conditions above, in which \(R_i = L_{i+K} = 0\). As \(B_i < K\), there are two equal elements in \(A[i+1:i+K-1]\), wlog \(A_u, = A_v\), which are “paired” ones in \(R\) and \(L\). But then we can set \(R_i = L_{i+K} = 1\) and pair \(i\) in \(R\) with \(v\) in \(L\) and \(u\) in \(R\) with \(i+K\) in \(L\).

As for the condition \(L_1 + L_2 + \ldots + L_K = B_1 - K\), we are basically given the number of ones among the first \(K\) elements of \(L\). It’s clear that it’s optimal (for our conditions) to put all those ones as the suffix of \(L[1:K]\). Similarly, put all needed ones in \(R[N-K+1:N]\) as prefix.

Therefore, if such array \(A\) exists, we can find some arrays \(L\) and \(R\) for which all of those conditions hold. Therefore, just construct these arrays and check whether the conditions indeed hold (and if they do, construct \(A\) as described earlier).

posted:
last update: