F - K-Medians Clustering Editorial by sumitacchan
Since each of \(c_1, \dots , c_K\) must be contained in \(S\), let us consider the other \(N' := N - K\) elements. The necessary and sufficient condition for the existence of \((S_1, \dots, S_K)\) satisfying the conditions is that we can remove all \(N'\) elements by the following operations.
- Remove two elements \(x, y\) such that \(x \le c_k \le y\) holds for some \(k\ (1 \le k \le K)\).
(Corresponding to adding \(x, y\) to \(S_k\)) - Remove one element \(x\) such that \(c_k \le x\) holds for some \(k\ (1 \le k \le K)\). This operation can be performed at most only once for each \(k\).
(Corresponding to adding \(x\) to \(S_k\) and make the number of elements in \(S_k\) even)
We assume \(c\) is sorted in ascending order and we difine \(c_0 := 0, c_{K+1} := M + 1\).
Suppose that the number of elements \(T_k\) in the open interval \((c_k, c_{k+1})\) satisfies \(T_k - k > N' - T_k\), then even if the operations in 2 are performed to the maximum, we cannot remove all of these elements with the operations in 1. Conversely, if there is no such \(k \ (0 \le k \le K)\), then it is possible to remove all \(N'\) elements.
Since it never happens that \(T_k - k > N' - T_k\) holds for more than one \(k\) at the same time, we can count up independently. We should count up the number of shortest paths from \((0,0)\) to \((N'-1, M)\) on a grid, which pass through \((x_0, y_0) := (c_{k+1} - c_{k} - 2, \lfloor (N' + k) / 2 \rfloor + 1 )\) or its upper. Since passing through \((x_0, y_0)\) or its upper is equivalent to passing through \((x_0, y_0)\) or its left, we can count up as the sum of \(c_{k+1} - c_k - 1\) terms for each \(k\), and in total \(O(M)\) terms.
We can answer this problem in \(O(N + M + K \log K)\) time.
posted:
last update: