Official

F - Counting Subsets Editorial by evima


Let us fix the smallest number \(a\) in \(S\) that is not a power of two and find the count for each \(a\).

Consider the sequence denoting the number of ways to represent each integer as the sum of some elements of \(S\). In the end, its values at the indices \(0, 1, ..., N\) must be \(1\) or \(2\). Now, let us add elements to \(S\) one by one in ascending order and consider what this sequence is at each step. When the elements up to \(a\) have been added, it looks like this: \(a\) occurrences of \(1\), followed by \((2^k-a)\) occurrences of \(2\), followed by \(a\) occurrences of \(1\). (\(2^k\) is the smallest power of two that is at least \(a\).)

If we add another element, it will change as follows.

  • Copy the current sequence. Then, “overlay” the leftmost \(a\) occurrences of \(1\) in the copy on the rightmost \(a\) occurrences of \(1\) in the original sequence to obtain the new sequence.

To “overlay” parts of two sequences is to concatenate the two sequences while replacing these parts with \(b\) occurrences of \(1\), followed by \((a-b)\) occurrences of \(2\), followed by \(b\) occurrences of \(1\). (\(b\) is an arbitrary non-negative integer at most \(a\).)

Noting that integers greater than \(N\) may have three or more representations, what we should find is as follows.

  • Repeat the above procedure until the length of the sequence is \((N+1)\) or more. We should find the sum, over all sequences that can be obtained this way, of distances from the index \(N\) to the nearest index to the left that contains \(2\).

We can consider a full binary tree corresponding to the process of constructing the sequence. Each vertex corresponds to a (contiguous) subsequence of the sequence, as follows.

  • Each leaf of the full binary tree corresponds to a subsequence that is a copy of the “\((2^k-a)\) occurrences of \(2\)” in the initial sequence.
  • Each vertex at height \(k\) of the full binary tree corresponds to a subsequence that is a copy of the “overlayed” part in the \(k\)-th procedure.

Now, let us fix the number of vertices that correspond to subsequences positioned within the range between \(0\) and \(N\). Since a subsequence corresponding to a vertex at height \(1\) or above is at least \(a\), so there are only at most \(2N/a + 1\) such vertices.

Here, the length of the subsequences corresponding to the vertices at each height can be specified to be any value between \(a\) and \(2a\). Thus, with simple dynamic programming, we can count the number of ways to specify the lengths so that the total length of the subsequences is at most \((N+1)\) and would exceed \((N+1)\) if the next subsequence to the right were included.

In reality, we have to find the sum, over such sequences, of distances from the index \(N\) to the nearest index to the left that contains \(2\), but it can also be done similarly with dynamic programming.

Since the heights are \(O(\log N)\), so the time complexity of this solution for each fixed setting is \(O(N\log N)\). There are \(O(\sum N/a)=O(N \log N)\) settings, so it runs in \(O(N^2 \log ^2 N)\) time in total.

It is also possible to find the answer in \(O(N^2 \log N)\) time by incrementing the number of vertices contained in the subsequences positioned within the range \(0, ..., N\) one by one and reusing the results from the dynamic programming at each step. However, the solution earlier mentioned is already fast enough since the constant factor is low.

posted:
last update: