E - Subarray Sum Divisibility Editorial by en_translator
Every length-\(L\) subarray of \(A\) sums to a multiple of \(M\) if and only if the following are both satisfied:
- \(A_1 + A_2 + \ldots + A_L\) is a multiple of \(M\).
- \(A_i - A_{i + L}\) is a multiple of \(M\) for all \(i\) with \(1 \leq i \leq N - L\).
Thus, when we know the number of times we apply the operation for each \(A_1, A_2, \ldots, A_L\), then for \(i = L + 1, L + 2, \ldots, N\) the number of operations required is the minimum number of operations required to make \(A_{i - L}\) congruent to \(A_i\) modulo \(M\).
Let \(f_{i, j}\) be the minimum number of operations required to make all of \(A_i, A_{L + i}, A_{2L + i}, \ldots\) congruent to \(j\) modulo \(M\). This can be naively computed in \(O(NM)\) time.
We utilize the values \(f_{i, j}\) to perform DP (Dynamic Programming).
\(dp_{i,j}\) be the number of operations required to perform against \(A_1, A_2, \ldots, A_i\), so that the remainder of \((A_1 + A_2 + \ldots, A_i)\) divided by \(M\) equals \(j\). Here, “the number of operations required” is not only for \(A_1, A_2, \ldots, A_i\), but for all elements whose indices are congruent to any of \(1, 2, \ldots, i\) modulo \(L\). By the discussion above, fixing the number of operations against \(A_i\) uniquely determines the number of operations against \(A_{L + i}, A_{2L + i}, \ldots\), and that is why this definition is computable.
We describe the transitions of the DP in detail. If we determine that \(A_i\) will be made congruent to \(k\) modulo \(M\), then the number of operations required for \(A_i, A_{L + i}, A_{2L + i}, \ldots\) is \(f_{i, k}\) (by definition). Thus, we can fill the DP by \(dp_{i, ((j + k) \bmod M)} \leftarrow \min(dp_{i, ((j + k) \bmod M}), dp_{i - 1, j} + f_{i, k})\). The final answer is \(dp_{L, 0}\).
The DP can be processed in a total of \(O(LM^2)\) time, which is fast enough under an appropriate implementation.
Bonus: it can be solved in \(O(NM)\) time.
Sample code
#include <bits/stdc++.h>
using namespace std;
const int inf = 1000000001;
template<class T, class U> inline bool chmin(T& a, const U& b) { if (a > b) { a = b; return true; } return false; }
int main() {
int n, m, l;
cin >> n >> m >> l;
vector<int> a(n);
for (int i = 0; i < n; i++) cin >> a[i];
vector<int> dp(m, inf);
dp[0] = 0;
for (int i = 0; i < l; i++) {
vector<int> ep(m, inf);
for (int j = 0; j < m; j++) {
int cost = 0;
for (int k = i; k < n; k += l) {
if (j >= a[k]) cost += j - a[k];
else cost += j - a[k] + m;
}
for (int k = 0; k < m; k++) chmin(ep[(k + j) % m], dp[k] + cost);
}
dp = move(ep);
}
cout << dp[0] << '\n';
}
posted:
last update: