G - Range Shuffle Query Editorial by en_translator
This problem features a technique called square-root decomposition (of a sequence).
What is square-root decomposition?
A square-root decomposition (of a sequence) is a data structure that handles queries against a length-\(N\) sequence by splitting it into \(\mathrm{O}(\sqrt{N})\) buckets and managing each bucket separately.
Let us consider an easy example. Given a sequence \((A_1, A_2, \dots, A_{100})\), suppose you want to process the following two kinds of queries:
- Given \(i\) and \(x\), set \(A_i\) to \(x\).
- Given \(L\) and \(R\), compute \(\sum_{i=L}^R A_i\).
When we apply the square-root decomposition, the elements are divided into \(10\) buckets, where the sum of the elements in each bucket is managed separately. If we appropriately manage \(B_1, B_{2}, \dots, B_{10}\) defined as
- \(B_1 = A_1 + A_2 + \dots + A_{10}\)
- \(B_2 = A_{11} + A_{12} + \dots + A_{20}\)
- \(\vdots\)
- \(B_{10} = A_{91} + A_{92} + \dots + A_{100}\),
then, given \((L, R) = (39, 61)\), we have
\[\sum_{i=39}^{61} A_i = A_{39} + A_{40} + B_5 + B_6 + A_{61},\]
which enables us to find the desired sum fast using \(A\) and \(B\).
The computational complexity for this case is:
- Updating an element is done by appropriately applying the delta to \(A\) and \(B\) in \(\mathrm{O}(1)\) time;
- Retrieving the segment-sum requires addition of at most \(\mathrm{O}(\sqrt{N})\) elements, for a total of \(\mathrm{O}(\sqrt{N})\) time.
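As a concrete illustration, here is a minimal sketch of the point-assign / range-sum structure described above (in Python; the class and method names are my own, not from the original problem):

```python
import math

class SqrtDecomposition:
    """Point assign in O(1), range sum in O(sqrt(N)),
    via square-root decomposition of the sequence."""

    def __init__(self, a):
        self.n = len(a)
        self.b = max(1, math.isqrt(self.n))   # bucket size ~ sqrt(N)
        self.a = list(a)
        # bucket_sum[k] holds the sum of the elements in bucket k
        # (the B_k of the example above, 0-indexed).
        self.bucket_sum = [0] * ((self.n + self.b - 1) // self.b)
        for i, x in enumerate(a):
            self.bucket_sum[i // self.b] += x

    def assign(self, i, x):
        """Set A_i = x: apply the delta to both A and B in O(1)."""
        delta = x - self.a[i]
        self.a[i] = x
        self.bucket_sum[i // self.b] += delta

    def range_sum(self, l, r):
        """Sum of A_l..A_r (0-indexed, inclusive), O(sqrt(N)) additions."""
        s = 0
        i = l
        while i <= r:
            if i % self.b == 0 and i + self.b - 1 <= r:
                s += self.bucket_sum[i // self.b]   # whole bucket at once
                i += self.b
            else:
                s += self.a[i]                      # single element
                i += 1
        return s
```

With \(A_i = i\) for \(i = 1, \dots, 100\), the call `range_sum(38, 60)` (the 0-indexed counterpart of \((L, R) = (39, 61)\)) adds two leading elements, two whole buckets, and one trailing element, exactly as in the formula above, yielding \(1150\).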
Some readers may suspect that a segment tree or a Fenwick tree would also work. You are right; most range queries can be handled without square-root decomposition, using only a segment tree or a Fenwick tree. Moreover, with their logarithmic-time operations, segment trees and Fenwick trees typically outperform square-root decomposition, which costs \(\mathrm{O}(\sqrt{N})\) time at worst.
However, for some kinds of queries we cannot design a monoid that can be stored in a segment tree, in which case square-root decomposition is a good option.
The original problem introduces an even trickier use case of square-root decomposition, somewhat different from the examples above. (Although we said “tricky,” it is in fact a typical optimization technique.) Let us consider how to apply square-root decomposition.
Solution
Let’s move on to the solution to the original problem.
In our problem, we may apply Mo’s algorithm and a segment tree to solve it in a total of \(\mathrm{O}((N \sqrt{Q} + Q) \log N)\) time.
- If you do not know Mo’s algorithm, please refer to the editorials of the following problems: Problem 1, Problem 2, Problem 3
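For reference, the driver part of Mo’s algorithm might be sketched as follows. This is a hypothetical, generic sketch (all names are my own): `add`, `remove`, and `answer` are caller-supplied callbacks, and the block width \(N/\sqrt{Q}\) is the standard choice behind the complexity analysis below.

```python
import math

def mo_order(queries, n):
    """Return the order in which to process half-open queries (l, r):
    left endpoints grouped into blocks of width ~ N / sqrt(Q),
    right endpoints sorted within each block (direction alternates
    per block to reduce the constant factor)."""
    q = max(1, len(queries))
    width = max(1, n // max(1, math.isqrt(q)))
    def key(item):
        _, (l, r) = item
        b = l // width
        return (b, r if b % 2 == 0 else -r)
    return [i for i, _ in sorted(enumerate(queries), key=key)]

def solve_offline(a, queries, add, remove, answer):
    """Generic Mo's driver: maintains a window [l, r) over a,
    calling add(x)/remove(x) per element moved and answer() per query."""
    res = [None] * len(queries)
    l, r = 0, 0
    for qi in mo_order(queries, len(a)):
        ql, qr = queries[qi]
        # Expand first, then shrink, so the window stays valid.
        while r < qr: add(a[r]); r += 1
        while l > ql: l -= 1; add(a[l])
        while r > qr: r -= 1; remove(a[r])
        while l < ql: remove(a[l]); l += 1
        res[qi] = answer()
    return res
```

For instance, plugging in callbacks that maintain a frequency table turns this driver into an offline “count distinct values in \([l, r)\)” solver.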
Simply put, let \(f_x\) be the number of occurrences of \(x\) within the segment. It suffices to manage the sum of the \(f_x\), as well as the values \(\frac{1}{f_x !}\). The complexity is analyzed as follows.
- Every time we shrink or extend a segment, we update the segment tree in \(\mathrm{O}(\log N)\) time. With \(N \sqrt{Q}\) updates, it costs a total of \(\mathrm{O}(N \sqrt{Q} \log N)\) time.
- Every time we respond to a query, we retrieve a segment-product in \(\mathrm{O}(\log N)\) time. With \(Q\) queries, it costs a total of \(\mathrm{O}(Q \log N)\) time.
- These sum to \(\mathrm{O}((N \sqrt{Q} + Q) \log N)\).
However, this probably will not meet the execution time limit.
Let us replace the segment tree with square-root decomposition. At a glance this seems to make things worse, but let us analyze the complexity again:
- Every time we shrink or extend a segment, we update buckets in \(\mathrm{O}(1)\) time. With \(N \sqrt{Q}\) updates, it costs a total of \(\mathrm{O}(N \sqrt{Q})\) time.
- Every time we respond to a query, we retrieve a segment-product in \(\mathrm{O}(\sqrt{N})\) time. With \(Q\) queries, it costs a total of \(\mathrm{O}(Q \sqrt{N})\) time.
- These sum to \(\mathrm{O}(N \sqrt{Q} + Q \sqrt{N})\).
If we assume \(N=Q\), the complexity is reduced from \(\mathrm{O}(N^{1.5} \log N)\) to \(\mathrm{O}(N^{1.5})\)!
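The bucket structure with \(\mathrm{O}(1)\) point updates and \(\mathrm{O}(\sqrt{N})\) range products might look as follows. This is a hypothetical sketch, not the official implementation: it assumes a prime modulus (taken here as 998244353; the actual problem’s modulus may differ) so that inverse factorials can be precomputed, and all names (`InvFactBuckets`, `MAXV`, etc.) are my own. The key point is that replacing the factor \(1/f_x!\) by \(1/(f_x \pm 1)!\) inside a bucket product is a single multiplication by a precomputed constant, so no modular inverse is ever computed during an update.

```python
import math

MOD = 998244353   # assumption: a prime modulus
MAXV = 200_000    # assumption: upper bound on values and counts

# Precompute factorials and inverse factorials modulo MOD.
fact = [1] * (MAXV + 1)
for i in range(1, MAXV + 1):
    fact[i] = fact[i - 1] * i % MOD
inv_fact = [1] * (MAXV + 1)
inv_fact[MAXV] = pow(fact[MAXV], MOD - 2, MOD)
for i in range(MAXV, 0, -1):
    inv_fact[i - 1] = inv_fact[i] * i % MOD

class InvFactBuckets:
    """Value-indexed structure conceptually holding inv_fact[f_x] at
    position x.  add/remove: O(1); range_prod: O(sqrt(n))."""

    def __init__(self, maxv):
        self.n = maxv + 1
        self.b = max(1, math.isqrt(self.n))            # bucket size
        self.f = [0] * self.n                          # counts f_x
        self.bprod = [1] * ((self.n + self.b - 1) // self.b)

    def _shift(self, x, d):
        old = self.f[x]
        self.f[x] = new = old + d
        # Replace inv_fact[old] by inv_fact[new] in x's bucket product:
        # multiply by fact[old] * inv_fact[new], both precomputed -> O(1).
        k = x // self.b
        self.bprod[k] = self.bprod[k] * fact[old] % MOD * inv_fact[new] % MOD

    def add(self, x):
        self._shift(x, +1)

    def remove(self, x):
        self._shift(x, -1)

    def range_prod(self, lo, hi):
        """Product of inv_fact[f_x] over x in [lo, hi], inclusive."""
        p = 1
        i = lo
        while i <= hi:
            if i % self.b == 0 and i + self.b - 1 <= hi:
                p = p * self.bprod[i // self.b] % MOD  # whole bucket
                i += self.b
            else:
                p = p * inv_fact[self.f[i]] % MOD      # single element
                i += 1
        return p
```

For example, after `add`-ing the multiset \(\{1, 1, 2, 3, 3, 3\}\), `fact[6] * range_prod(...)` yields the multinomial coefficient \(6!/(2!\,1!\,3!) = 60\). If each query answer has this multinomial shape (as the use of \(1/f_x!\) above suggests), Mo’s transitions call `add`/`remove`, and each query performs one `range_prod`.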
The problem can be solved by implementing this algorithm appropriately. Because the intended solution must eliminate the \(\mathrm{O}(\log N)\) factor, the execution time limit is stricter than in usual problems, but you can still obtain the correct answer in sufficiently short time, as long as you do not worsen the constant factor of the transitions in Mo’s algorithm.