Official

E - XOR Matching Editorial by evima


[1] Reformulation using a frequency array

For \(0\leq x \leq 2 ^ M - 1\), let

\[C[x]\]

denote the number of indices \(i\) satisfying \(A_i = x\). Writing the answer as \(\mathrm{ANS}\),

\[ \begin{aligned} \mathrm{ANS}[0] &= \sum_i \lfloor C[i]/2\rfloor, \\ \mathrm{ANS}[X]& = \sum_{i<j, i\oplus j=X} \min(C[i],C[j])\quad (X\neq 0) \end{aligned} \]

is immediately clear. Since \(\mathrm{ANS}[0]\) is easy to compute, the challenge is to compute the latter type efficiently.


[2] Computation by exhaustive enumeration

Enumerating all pairs \((i,j)\) and performing the above computation takes \(O((2^M)^2)\) time.

Also, by enumerating only pairs \((i,j)\) with \(C[i], C[j]>0\), the answer can be computed in \(O(N^2)\) time.


[3] Using XOR convolution

For integers \(a,b\), let us write

\[ [a\leq b]=\begin{cases}1 & (a\leq b) \\ 0 & (a > b)\end{cases}. \]

Then we can rewrite

\[ \min(C[i],C[j]) = \sum_{x=1}^{\infty} [C[i]\geq x]\cdot [C[j]\geq x]. \]

Explanation of the rewrite

Here, when we "earn $\min(C[i],C[j])$ points," we decompose this as:

  • If $\min(C[i],C[j])\geq 1$, earn $1$ point.
  • If $\min(C[i],C[j])\geq 2$, earn $1$ point.
  • If $\min(C[i],C[j])\geq 3$, earn $1$ point.
  • $\cdots$

The condition $\min(a,b)\geq x$ can be decomposed into the independent per-component condition "$a\geq x$ and $b\geq x$," which is often easier to handle.

Therefore, to compute the answer, for \(x=1,2,\ldots,\max(C)\), we consider the sequence \(S\) defined by \(S_i = [C[i]\geq x]\), and compute the XOR convolution of \(S\) with itself, that is,

\[ T_k=\sum_{i\oplus j=k}S_iS_j \]

(to extract the \(i < j\) case, divide this by \(2\)).

The time complexity of this approach is \(O(M\cdot 2^M\cdot \max(C))\).

We can also note that the same sequence \(S\) may arise from different values of \(x\), and replace \(\max(C)\) with “the number of distinct values in \(C\)” (which is \(O(\sqrt{N})\)).


[4] Combining the two approaches

We use these two computation methods selectively. Choose an appropriate threshold \(K\) and compute as follows:

  • For \(x=1,2,\ldots,K\), use the method in [3].
  • At this point, the cases that have not yet been correctly computed are those where \(\min(C[i],C[j])> K\). Enumerate all such pairs \((i,j)\) exhaustively and correct the computation for those pairs.

For example, taking \(K\approx 10\) gives a solution sufficiently fast for the constraints of this problem. It has been confirmed that, depending on the implementation, a wide range of \(K\) is acceptable, roughly \(2\leq K\leq 500\).

The time complexity is

\[ O(K\cdot M2^M + (N/K)^2), \]

and by taking \(K = (N^2/M2^M)^{1/3}\), this becomes

\[ O((NM2^M)^{2/3}). \]

posted:
last update: