Contest Duration: - (local time) (100 minutes) Back to Home
Official

## G - Colorful Candies 2 Editorial by en_translator

Let $$C$$ be the total number of distinct kinds of candies. Without loss of generality, we assume there are candies with Color $$1$$ to Color $$C$$. Obviously, the answer does not depend on the order of candies, but the number of candies with each color. For $$i = 1, 2, \ldots, C$$, let $$n_i$$ be the number of candies with Color $$i$$.

Let us first fix the number of candies $$K$$ that Takahashi chooses, and find the expected value of the number of distinct colors of candies in Takahashi’s $$K$$ candies.
For $$i = 1, 2, \ldots, C$$, let us define a random variable $$X_i$$ to be $$X_i = 1$$ if Takahashi’s $$K$$ candies has one or more candy with Color $$i$$, or $$X_i = 0$$ otherwise.
Then, the number of distinct colors of candies in the $$K$$ candies can be expressed as $$\sum_{i = 1}^C X_i$$. What we want is its expected value $$E\lbrack \sum_{i=1}^C X\rbrack$$.
By the linearity of expected value, $$E\lbrack \sum_{i=1}^C X \rbrack = \sum_{i=1}^C E\lbrack X_i \rbrack$$. Also, by the definition of $$X_i$$, $$E\lbrack X_i \rbrack$$, is equal to “the probability that $$K$$ candies has one or more candy with Color $$i$$,” so $$E\lbrack X_i \rbrack = \lbrace \binom{N}{K} - \binom{N-n_i}{K} \rbrace / \binom{N}{K}$$.
Therefore, $$E\lbrack \sum_{i=1}^C X \rbrack = \sum_{i=1}^C E\lbrack X_i \rbrack = \sum_{i=1}^C \lbrace \binom{N}{K} - \binom{N-n_i}{K} \rbrace / \binom{N}{K}$$, so finding $$E\lbrack \sum_{i=1}^C X\rbrack$$ boils down to computing $$\sum_{i=1}^C \lbrace \binom{N}{K} - \binom{N-n_i}{K} \rbrace$$.
Since the number of kinds of candies $$C$$ can be at most $$N$$, so if we compute this sum naively, it costs $$\mathrm{\Theta}(N)$$ computation time. Since it costs $$\mathrm{\Theta}(N)$$ time for one $$K$$, if we do this for all $$K = 1, 2, \ldots, N$$ it costs $$\mathrm{\Theta}(N^2)$$ at worst, which does not fit in the time limit.

Now we want to compute $$\sum_{i=1}^C \lbrace \binom{N}{K} - \binom{N-n_i}{K} \rbrace$$ faster. Since $$N$$ is a constant given as an input, and $$K$$ is currently fixed, the value $$\binom{N}{K} - \binom{N-n_i}{K}$$ only depends on the value $$n_i$$. That is, if $$n_i = n_j$$, then $$\binom{N}{K} - \binom{N-n_i}{K} = \binom{N}{K} - \binom{N-n_j}{K}$$. From this viewpoint, we can combine colors $$i$$ with the same $$n_i$$. Namely, let $$a_x$$ be the number of integers $$1 \leq i \leq C$$ such that $$n_i = x$$, and express the value as $$\sum_{i=1}^C \lbrace \binom{N}{K} - \binom{N-n_i}{K} \rbrace = \sum_{x = 1}^N a_x \lbrace \binom{N}{K} - \binom{N-x}{K} \rbrace$$.
Here, since the total number of candies can be expressed as $$N = a_1 + 2 a_2 + 3 a_3 + \cdots N a_N$$, there are at most $$\mathrm{O}(\sqrt{N})$$ positive values in $$a_1, a_2, \ldots, a_N$$. (This is because, letting $$M$$ be the number of positive elements, $$N = a_1 + 2 a_2 + 3 a_3 + \cdots N a_N \geq a_1 + 2 a_2 + 3 a_3 + \cdots + M a_M \geq 1 + 2 + 3 + \cdots + M$$)
Therefore, $$\sum_{x = 1}^N a_x \lbrace \binom{N}{K} - \binom{N-x}{K} \rbrace$$ can be expressed as a sum of $$\mathrm{O}(\sqrt{N})$$ terms, which can be computed in an $$\mathrm{O}(\sqrt{N})$$ time.

Now that we can find the answer in an $$\mathrm{O}(\sqrt{N})$$ for a single $$K$$, we can do this for each $$K = 1, 2, \ldots, N$$ to solve this problem in a total of $$\mathrm{O}(N\sqrt{N})$$ time.

posted:
last update: