G - 221 Substring 解説 by en_translator
Simplifying the problem
Let \((v_1, m_1), \cdots, (v_n, m_n)\) be the run-length encoding of \(S\) (where \((v_i, m_i)\) represents \(m_i\) copies of \(v_i\)).
For a (contiguous) subarray that is a 221-sequence, its run-length encoding \((v_l, v_l), \cdots, (v_r, v_r)\) satisfies:
- \(v_l \leq m_l\), \(v_r \leq m_r\)
- \(v_i = m_i\) (\(l < i < r\))
Let \(T\) be the sequence obtained by replacing each pair in the run-length encoding of \(S\) as follows:
- if \(v_i > m_i\): into \((0)\)
- if \(v_i = m_i\): into \((v_i)\)
- if \(v_i < m_i\): into \((v_i, 0, v_i)\)
Then there is a one-to-one correspondence as a index sequence \((l, \cdots, r)\) between:
- a subarray of \(T\) that does not contain \(0\): \((v_l, \cdots, v_r)\)
- a subarray of \(S\) that is a 221-sequence: \((v_l, v_l), \cdots, (v_r, v_r)\)
Thus, the problem is simplified to counting distinct subarrays of \(T\) that do not contain \(0\).
Considering the reduced problem
Let \(T[l\ :\ r]\) denote the subarray formed by the \(l\)-th through \(r\)-th digits of \(T\). Let \(T[p_1\ :\ |T|] < \cdots < T[p_{|T|}\ :\ |T|]\) be the suffix array of \(T\), and \(\mathrm{lcp}[i]\) be the length of the longest common prefix of \(T[p_{i-1}\ :\ |T|]\) and \(T[p_{i}\ :\ |T|]\).
Consider the number of subarrays not containing \(0\), that are not a prefix of any \(T[p_i\ :\ |T|]\) (\(i < k\)), but is a prefix of \(T[p_k\ :\ |T|]\). If we denote by \(\mathrm{zerofree}[k]\) the number of consecutive occurrences of non-\(0\) digits starting from the \(p_k\)-th digit, the count is:
- \(0\) if \(\mathrm{zerofree}[k] \leq \mathrm{lcp}[k]\);
- \(\mathrm{zerofree}[k] - \mathrm{lcp}[k]\) if \(\mathrm{zerofree}[k] > \mathrm{lcp}[k]\).
The answer is the sum of the value above over \(k = 1, \cdots, |T|\).
Implementation strategy
As a famous fact, the suffix array of a string whose alphabet size is \(\sigma\) can be found in \(O(N + \sigma)\) time, and the lengths of the longest common prefixes in \(O(N)\) time.
Also, the number of consecutive occurrences of non-\(0\) digits starting from the \(p_k\)-th digit, \(\mathrm{zerofree}[k]\), can be found in a total of \(O(N)\) time by backtracking from the end of the string.
By implementing this, the problem can be solved fast. The time complexity is \(O(N + \sigma)\).
Sample code (C++)
#include <iostream>
using std::cin;
using std::cout;
#include <vector>
using std::vector;
using std::pair;
using std::make_pair;
using std::min;
using std::max;
#include <set>
using std::set;
#include <atcoder/string>
using atcoder::suffix_array;
using atcoder::lcp_array;
typedef long long int ll;
const int SIGMA = 9;
ll n;
vector<int> a;
struct Item {
ll len;
int val;
ll idx;
};
// run-length list of (length, value)
vector<Item> runlength (const vector<int> &a) {
vector<Item> lens;
ll i = 0;
while (i < (ll)a.size()) {
ll j = i;
while (j < (ll)a.size() && a[j] == a[i]) j++;
lens.push_back({len: j-i, val: a[i], idx: i});
i = j;
}
return lens;
}
// count substrings of s that doesn't contain 0 (ignoring position difference)
ll count_zerofrees (const vector<int> &s) {
// a substring of s that doesn't contain 0 <-1to1-> original 221substr
const ll m = s.size();
const vector<int> sa = suffix_array(s, SIGMA);
const vector<int> lcp = lcp_array(s, sa);
// max_zerofree[i] := argmax_len [ s[i, i+len) doesn't contain 0 ]
vector<ll> max_zerofree(m+1);
max_zerofree[m] = 0;
for (ll i = m-1; i >= 0; i--) {
if (s[i] == 0) {
max_zerofree[i] = 0;
} else {
max_zerofree[i] = max_zerofree[i+1] + 1;
}
}
ll sum = 0;
for (ll i = 0; i < m; i++) {
ll l = ((i == 0) ? 0 : lcp[i-1]);
ll r = max_zerofree[sa[i]];
// l < len <= r
sum += max(0LL, r - l + 0);
}
return sum;
}
void solve () {
// run-length list of (length, value)
const vector<Item> lens = runlength(a);
// maximal [l, r) where [lens[l], lens[r]) satisfies either:
// 1. lens[k].val == lens[k].len, or
// 2. lens[k].val < lens[k].len AND k is in {l, r-1}
vector<pair<ll, ll> > maximals;
{
ll idxlen = 0;
while (idxlen < (ll)lens.size()) {
if (lens[idxlen].val > lens[idxlen].len) {
idxlen++;
continue;
}
ll jlen = idxlen + 1;
ll jsee = jlen;
while (jlen < (ll)lens.size()) {
if (lens[jlen].val > lens[jlen].len) {
jlen += 0;
jsee += 0;
break;
} else if (lens[jlen].val < lens[jlen].len) {
jlen += 0;
jsee += 1;
break;
} else {
jlen += 1;
jsee += 1;
continue;
}
}
maximals.push_back({idxlen, jsee});
idxlen = jlen;
}
}
// [0] ++ {[lens[li].val, lens[ri].val)}.join([0]) ++ [0]
vector<int> s = {0};
for (const auto &[li, ri] : maximals) {
for (int i = li; i < ri; i++) {
s.push_back(lens[i].val);
}
s.push_back(0);
}
ll ans = count_zerofrees(s);
cout << ans << "\n";
}
int main (void) {
std::cin.tie(nullptr);
std::ios_base::sync_with_stdio(false);
cin >> n;
a.resize(n);
for (ll i = 0; i < n; i++) {
cin >> a[i];
assert(a[i] <= SIGMA);
}
solve();
return 0;
}
投稿日時:
最終更新: