E - リピート再生と部分一致 / Repeat Playback and Partial Matching Editorial by admin
GPT 5.2 HighOverview
Given an infinitely repeated string \(S^{\infty}\), we need to determine \(Q\) times whether the string \(T_i\) obtained from the interval \([L_i, R_i]\) appears as a substring of the original \(S\). To process each query in \(O(1)\), we build a Suffix Automaton (SAM) of \(S\) and precompute over \(S+S\).
Analysis
Key Observation 1: It suffices to look at \(S+S\)
Let \(m = R_i - L_i + 1\) be the length of \(T_i\). Then:
- If \(m > N\), it cannot be a substring of \(S\) (which has length \(N\)), so the answer is always
No. - When \(m \le N\), \(T_i\) starts at some position in \(S^{\infty}\), and the starting position within \(S\) is determined by [ a = (L_i - 1)\bmod N \quad (0\text{-indexed}) ]
Since \(m \le N\), \(T_i\) fits within at most two consecutive copies, so it is always the case that [ T_i = (S+S)[a \dots a+m-1] ] In other words, each query reduces to: “Does a certain substring of \(S+S\) appear in \(S\)?”
Key Observation 2: A naive approach is too slow
If we extract \(T_i\) for each query and search within \(S\) (e.g., using KMP or find), the worst case costs: - \(O(m)\) for extraction - \(O(N+m)\) for searching
Since \(Q\) can be up to \(2\times 10^5\), this is far too slow.
Instead, we use a data structure (SAM) that can quickly determine whether something is a substring of \(S\), and additionally scan \(S+S\) in advance to reduce each query to \(O(1)\).
Algorithm
1. Build the Suffix Automaton (SAM)
A SAM is an automaton that accepts all substrings of string \(S\).
It can be constructed in \(O(N)\) by adding characters of \(S\) one by one from left to right.
2. Scan \(T = S+S\) on the SAM to create dp
For each position \(i\) (\(0\)-indexed) of \(T\), compute:
dp[i]= “the maximum length among substrings ending at position \(i\) of \(T\) that appear in \(S\)”
While reading \(T\) on the SAM, we maintain the current state v and the current match length l. When no transition exists for the next character, we follow suffix links to “shorten until a match is found.” This is a standard SAM matching technique and runs in \(O(|T|)=O(N)\) overall.
Once dp is computed, determining whether an arbitrary interval \([a, a+m-1]\) (with endpoint \(e=a+m-1\)) appears in \(S\) is equivalent to:
- “A substring of length \(m\) ending at position \(e\) of \(T\) exists and appears in \(S\)”
- This is equivalent to
dp[e] >= m
3. Query Processing (\(O(1)\))
For each query:
1. If \(m = R-L+1\) exceeds \(N\), output No
2. Compute \(a = (L-1)\bmod N\), \(e=a+m-1\)
3. If dp[e] >= m, output Yes; otherwise output No
Complexity
- Time complexity:
- SAM construction: \(O(N)\)
- Scanning \(T=S+S\): \(O(N)\)
- Each query: \(O(1)\), repeated \(Q\) times
- Overall: \(O(N+Q)\)
- Space complexity: The number of SAM states is at most around \(2N-1\), so including transitions (fixed 26 characters), the space is \(O(N)\).
Implementation Notes
- \(L_i, R_i\) can be up to \(10^{18}\), so the starting position must be computed as \((L_i-1)\bmod N\) (safe with Python’s int).
dpindices are handled in \(0\)-indexed form; be careful not to miscalculate the endpoint \(e=a+m-1\).- In Python, for speed and memory efficiency:
- Store strings as
bytes(convert withb - 97to get values 0–25) - Store SAM transitions as a 1D
array('i')(indexed bystate*26 + c)
- Store strings as
These optimizations are applied.
Source Code
import sys
from array import array
def main():
data = sys.stdin.buffer.read().split()
N = int(data[0])
Q = int(data[1])
S = data[2] # bytes
max_states = 2 * N + 5
nxt = array('i', [-1]) * (max_states * 26)
link = array('i', [-1]) * max_states
length = array('i', [0]) * max_states
size = 1
last = 0
link[0] = -1
length[0] = 0
# Build SAM for S
for b in S:
c = b - 97
cur = size
size += 1
length[cur] = length[last] + 1
p = last
pc = p * 26 + c
while p != -1 and nxt[pc] == -1:
nxt[pc] = cur
p = link[p]
if p != -1:
pc = p * 26 + c
if p == -1:
link[cur] = 0
else:
q = nxt[p * 26 + c]
if length[p] + 1 == length[q]:
link[cur] = q
else:
clone = size
size += 1
src = q * 26
dst = clone * 26
for k in range(26):
nxt[dst + k] = nxt[src + k]
length[clone] = length[p] + 1
link[clone] = link[q]
while p != -1 and nxt[p * 26 + c] == q:
nxt[p * 26 + c] = clone
p = link[p]
link[q] = clone
link[cur] = clone
last = cur
# Scan T = S+S and compute dp[i] = longest match ending at i that occurs in S
T = S + S
Tlen = 2 * N
dp = array('i', [0]) * Tlen
v = 0
l = 0
for i in range(Tlen):
c = T[i] - 97
idx = v * 26 + c
while v and nxt[idx] == -1:
v = link[v]
l = length[v]
idx = v * 26 + c
to = nxt[idx]
if to != -1:
v = to
l += 1
else:
v = 0
l = 0
dp[i] = l
it = iter(data[3:])
out_lines = []
for _ in range(Q):
L = int(next(it))
R = int(next(it))
m = R - L + 1
if m > N:
out_lines.append("No")
continue
a = (L - 1) % N
e = a + m - 1
out_lines.append("Yes" if dp[e] >= m else "No")
sys.stdout.write("\n".join(out_lines))
if __name__ == "__main__":
main()
This editorial was generated by gpt-5.2-high.
posted:
last update: