Ex - XOR Sum of Arrays Editorial by spheniscine


This is essentially a hashing problem. The challenge lies in finding a hash scheme that distributes over pointwise xor (the function \(S\) in the problem); formally, we want \(H(S(a, b)) = H(a) \oplus H(b)\).

We may think of interpreting the bit representations of \(A_i\) as members of the ring of polynomials \(\text{mod } 2\), taken modulo \(x^k - 1\), where \(k\) is a convenient bit width (e.g. \(64\)). Addition in this ring is thus equivalent to xor, and multiplication is easily implemented in \(O(k)\) using bit tests, rotations, and xors. (see footnote)
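
For concreteness, here is a minimal sketch of this multiplication for \(k = 64\), storing the coefficient of \(x^i\) in bit \(i\) of a u64 (the function name poly_mul is just for illustration):

```rust
/// Multiply two elements of the ring of polynomials mod 2, mod x^64 - 1.
/// Addition in this ring is xor, and multiplying by x^i is a left-rotation
/// by i bits, since x^64 ≡ 1.
fn poly_mul(a: u64, b: u64) -> u64 {
    let mut res = 0u64;
    for i in 0..64 {
        if (b >> i) & 1 == 1 {
            res ^= a.rotate_left(i); // add a · x^i
        }
    }
    res
}
```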

Now, we might try to solve it in a manner similar to Rabin-Karp (with xor and polynomial-mul-mod instead of the usual addition and multiplication with a modulus), where we generate a random polynomial \(g\) and define the hash of the substring starting at index \(a\) and of length \(l\) as \(h(a, l) = \sum^{l-1}_{i = 0} A_{a+i} \cdot g^{l-1-i}\), then use binary search to find the lowest index where the values differ. Except, this doesn’t work. If you generate several random \(g\)s and check the sequence of powers \(g^i\), you might notice that they tend to either converge to \(0\), or repeat with a really short period (about \(k\) in length). This is no coincidence: \(\text{mod } 2\), \(x^k - 1 = (x+1)^k\) when \(k\) is a power of two, so every unit of the ring satisfies \(g^k = 1\) and every non-unit satisfies \(g^k = 0\). Thus it is easy to generate strings that will collide and compare equal when they shouldn’t.

Instead, forget about trying to find the hash of an arbitrary-length substring. We define hashes only for lengths that are a power of \(2\). We generate \(\lfloor \log N \rfloor\) random polynomials \(g_i\), then define the hash function as \(h(a, 1) = A_a\) and \(h(a, 2^i) = h(a, 2^{i-1}) \cdot g_i + h(a + 2^{i-1}, 2^{i-1})\). These can be precalculated in a manner similar to a sparse table, in \(O(Nk \log N)\) time.
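
As a sketch of this precomputation (names like build_hashes are illustrative; poly_mul is the ring multiplication sketched above, and g is assumed to hold the \(\lfloor \log N \rfloor\) random polynomials):

```rust
/// h[i][j] = hash of the length-2^i block of `a` starting at index j,
/// following h(j, 1) = A_j and
/// h(j, 2^i) = h(j, 2^{i-1}) · g_i + h(j + 2^{i-1}, 2^{i-1}).
fn build_hashes(a: &[u64], g: &[u64]) -> Vec<Vec<u64>> {
    let n = a.len();
    let mut h = vec![a.to_vec()]; // level 0: blocks of length 1
    for i in 1..=g.len() {
        let half = 1usize << (i - 1);
        if 2 * half > n {
            break; // no block of length 2^i fits
        }
        let prev = &h[i - 1];
        let mut row = vec![0u64; n - 2 * half + 1];
        for j in 0..row.len() {
            row[j] = poly_mul(prev[j], g[i - 1]) ^ prev[j + half];
        }
        h.push(row);
    }
    h
}
```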

We can then answer each query by “binary lifting” to find the first index where the substrings differ; for this we only need hashes of power-of-two-length substrings. This way each query can be answered in \(O(\log N)\).
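
As an illustrative sketch (not the exact query routine, whose details depend on how \(S\) enters the queries): given two tables built by the build_hashes sketch above with the same random polynomials, the binary lifting could look like the following. Note that in this problem the hash of an xor-combined block is simply the xor of the two blocks' hashes, thanks to \(H(S(a, b)) = H(a) \oplus H(b)\), so no extra precomputation is needed for the combined sequences.

```rust
/// Length of the longest common prefix of the hashed sequences starting at
/// index i (in table `ha`) and index j (in table `hb`). Both tables must be
/// built with the SAME random polynomials. Hash equality is probabilistic,
/// so a collision can in principle give a wrong answer.
fn lcp(ha: &[Vec<u64>], hb: &[Vec<u64>], mut i: usize, mut j: usize) -> usize {
    let (na, nb) = (ha[0].len(), hb[0].len());
    let mut len = 0;
    for lvl in (0..ha.len().min(hb.len())).rev() {
        let block = 1usize << lvl;
        // Greedily extend by 2^lvl whenever the next blocks' hashes are equal.
        if i + block <= na && j + block <= nb && ha[lvl][i] == hb[lvl][j] {
            i += block;
            j += block;
            len += block;
        }
    }
    len
}
```

The element right after the common prefix (or the lengths, if one sequence is a prefix of the other) then decides the comparison.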

Final time complexity is \(O((Nk + Q) \log N)\), while final space complexity is \(O(N \log N)\).

Footnote: As AtCoder runs on a 64-bit x86 machine, it is also possible to use the CLMUL instruction set (“carry-less multiplication”, essentially multiplication of polynomials \(\text{mod } 2\)) to perform this multiplication in effectively \(O(1)\); to perform the modulo step, just xor the two 64-bit halves of the 128-bit result together. Figuring out how to do this in your chosen language is another matter though; if it is even possible, you may need to invoke assembly code or LLVM intrinsics directly. In Rust, the relevant intrinsic is called core::arch::x86_64::_mm_clmulepi64_si128.
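
A sketch in Rust (assuming a recent toolchain, where the intrinsic's immediate operand is a const generic, and a CPU with the pclmulqdq feature; the name poly_mul_clmul is just for illustration):

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::{_mm_clmulepi64_si128, _mm_set_epi64x};

/// Multiplication of polynomials mod 2, mod x^64 - 1, via one carry-less
/// multiply. The caller must ensure the CPU supports pclmulqdq, e.g. via
/// is_x86_feature_detected!("pclmulqdq").
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "pclmulqdq")]
unsafe fn poly_mul_clmul(a: u64, b: u64) -> u64 {
    // 128-bit carry-less product of the two 64-bit polynomials.
    let prod = _mm_clmulepi64_si128::<0>(_mm_set_epi64x(0, a as i64), _mm_set_epi64x(0, b as i64));
    // Reduce mod x^64 - 1: since x^64 ≡ 1, fold the high half onto the low half.
    let halves: [u64; 2] = std::mem::transmute(prod);
    halves[0] ^ halves[1]
}
```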

Addendum: there is another solution that uses nimber multiplication for a Rabin-Karp style rolling hash. The nimbers form a field rather than merely a ring, and thus a uniformly random nonzero element has multiplicative order \(\large\frac{2^k-1}{\gcd(2^k-1, x)}\), where \(x\) is a uniformly-distributed random number between \(1\) and \(2^k-1\). Thus this scheme is unlikely to encounter the short-period problem that the polynomial-ring scheme had.
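
A minimal recursive sketch of 64-bit nimber multiplication (illustrative names; a fast implementation would additionally memoize products of small halves). It relies on the standard identities for a Fermat 2-power \(D = 2^{2^m}\): \(D \otimes x = D \cdot x\) for \(x < D\), and \(D \otimes D = D + D/2\).

```rust
/// Nimber (nim-)multiplication of two 64-bit values.
fn nim_mul(a: u64, b: u64) -> u64 {
    nim_mul_rec(a, b, 64)
}

/// Multiply a, b < 2^bits, where bits is a power of two, by splitting each
/// into halves of width bits/2 and using a Karatsuba-style recursion.
fn nim_mul_rec(a: u64, b: u64, bits: u32) -> u64 {
    if a == 0 || b == 0 {
        return 0;
    }
    if bits == 1 {
        return a & b; // 1 ⊗ 1 = 1
    }
    let half = bits / 2;
    let mask = (1u64 << half) - 1;
    let (a1, a0) = (a >> half, a & mask);
    let (b1, b0) = (b >> half, b & mask);
    let hh = nim_mul_rec(a1, b1, half);
    let ll = nim_mul_rec(a0, b0, half);
    // (a1 ⊕ a0) ⊗ (b1 ⊕ b0) = hh ⊕ ll ⊕ cross, where cross = a1⊗b0 ⊕ a0⊗b1.
    let m = nim_mul_rec(a1 ^ a0, b1 ^ b0, half);
    // With D = 2^half: result = (hh ⊕ cross)·D ⊕ ll ⊕ hh ⊗ (D/2),
    // and hh ⊕ cross simplifies to m ⊕ ll.
    let hi = m ^ ll;
    let lo = ll ^ nim_mul_rec(hh, 1u64 << (half - 1), half);
    (hi << half) ^ lo
}
```

For example, nim_mul(2, 2) evaluates to 3 and nim_mul(4, 4) to 6, matching the usual nimber multiplication tables.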

You may also construct a field by instead using an irreducible polynomial \(\text{mod } 2\) (a polynomial that can’t be factored as a product of lower-degree polynomials \(\text{mod } 2\), somewhat analogous to a prime number) as the polynomial modulus, e.g. \(x^{64} + x^4 + x^3 + x + 1\) (source). This field is isomorphic to the nimber field (all finite fields of the same order / size are isomorphic), but the field elements are mapped to the bit representations, or labels, differently.
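
A sketch of multiplication in this field (the low-order part of the modulus, \(x^4 + x^3 + x + 1\), corresponds to the constant 0x1B; the name gf64_mul is just for illustration). One could also combine CLMUL with a more involved reduction here, but a shift-and-xor loop keeps the sketch portable and \(O(k)\):

```rust
/// Multiplication in GF(2^64) represented as polynomials mod 2,
/// mod x^64 + x^4 + x^3 + x + 1, with the coefficient of x^i in bit i.
fn gf64_mul(mut a: u64, b: u64) -> u64 {
    const LOW_TERMS: u64 = 0x1B; // x^4 + x^3 + x + 1
    let mut res = 0u64;
    for i in 0..64 {
        if (b >> i) & 1 == 1 {
            res ^= a; // at this point a holds (original a) · x^i reduced mod the modulus
        }
        // Multiply a by x; if the x^63 coefficient overflows, use
        // x^64 ≡ x^4 + x^3 + x + 1 (mod the modulus).
        let carry = a >> 63;
        a <<= 1;
        if carry == 1 {
            a ^= LOW_TERMS;
        }
    }
    res
}
```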
