E - リピート再生と部分一致 / Repeat Playback and Partial Matching Editorial by admin

Overview

Given an infinitely repeated string \(S^{\infty}\), we need to determine \(Q\) times whether the string \(T_i\) obtained from the interval \([L_i, R_i]\) appears as a substring of the original \(S\). To process each query in \(O(1)\), we build a Suffix Automaton (SAM) of \(S\) and precompute over \(S+S\).

Analysis

Key Observation 1: It suffices to look at \(S+S\)

Let \(m = R_i - L_i + 1\) be the length of \(T_i\). Then:

  • If \(m > N\), it cannot be a substring of \(S\) (which has length \(N\)), so the answer is always No.
  • When \(m \le N\), \(T_i\) starts at some position in \(S^{\infty}\); the corresponding starting position within \(S\) is \[ a = (L_i - 1)\bmod N \quad (0\text{-indexed}). \]

Since \(m \le N\), \(T_i\) fits within at most two consecutive copies of \(S\), so it is always the case that \[ T_i = (S+S)[a \dots a+m-1]. \] In other words, each query reduces to: “Does a certain substring of \(S+S\) appear in \(S\)?”
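This reduction can be checked by brute force on a small example (the concrete values here are illustrative, not part of the solution):

```python
# Key Observation 1, checked by brute force: for m <= N, the length-m
# substring of S^inf starting at (1-indexed) position L equals the slice
# of S+S starting at a = (L-1) mod N.
N = 4
S = "abca"
prefix = S * 10  # a long enough prefix of the infinite repetition S^inf

for L in range(1, 21):
    for m in range(1, N + 1):
        a = (L - 1) % N
        assert prefix[L - 1:L - 1 + m] == (S + S)[a:a + m]
print("reduction verified")
```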

Key Observation 2: A naive approach is too slow

If we extract \(T_i\) for each query and search within \(S\) (e.g., using KMP or find), the worst case costs:

  • \(O(m)\) for extraction
  • \(O(N+m)\) for searching

Since \(Q\) can be up to \(2\times 10^5\), this is far too slow.
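For concreteness, this naive baseline might be sketched as follows (a hypothetical helper, correct but \(O(N+m)\) per query):

```python
def answer_query_naive(S: str, L: int, R: int) -> str:
    """Correct but too slow: extract T_i from S+S, then search it in S."""
    N = len(S)
    m = R - L + 1
    if m > N:
        return "No"
    a = (L - 1) % N
    t = (S + S)[a:a + m]              # O(m) extraction
    return "Yes" if t in S else "No"  # O(N + m) substring search

print(answer_query_naive("abca", 2, 4))  # "bca" occurs in "abca" → Yes
print(answer_query_naive("abca", 4, 6))  # "aab" does not occur    → No
```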

Instead, we use a data structure (SAM) that can quickly determine whether something is a substring of \(S\), and additionally scan \(S+S\) in advance to reduce each query to \(O(1)\).

Algorithm

1. Build the Suffix Automaton (SAM)

A SAM is an automaton that accepts all substrings of string \(S\).
It can be constructed in \(O(N)\) by adding characters of \(S\) one by one from left to right.

2. Scan \(T = S+S\) on the SAM to create dp

For each position \(i\) (\(0\)-indexed) of \(T\), compute:

  • dp[i] = “the maximum length among substrings ending at position \(i\) of \(T\) that appear in \(S\)”

While reading \(T\) on the SAM, we maintain the current state v and the current match length l. When no transition exists for the next character, we follow suffix links to “shorten until a match is found.” This is a standard SAM matching technique and runs in \(O(|T|)=O(N)\) overall.

Once dp is computed, determining whether an arbitrary interval \([a, a+m-1]\) (with endpoint \(e=a+m-1\)) appears in \(S\) is equivalent to:

  • “A substring of length \(m\) ending at position \(e\) of \(T\) exists and appears in \(S\)”
  • This is equivalent to dp[e] >= m
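As a sanity check of this criterion, dp can also be computed by brute force on a tiny input (quadratic, for illustration only; the SAM scan computes the same array in linear time):

```python
def dp_bruteforce(S: str) -> list:
    """dp[i] = longest suffix of T[:i+1] (with T = S+S) that occurs in S."""
    T = S + S
    dp = []
    for i in range(len(T)):
        best = 0
        for l in range(1, i + 2):
            if T[i - l + 1:i + 1] in S:  # suffixes of a match also match
                best = l
        dp.append(best)
    return dp

S = "aba"
dp = dp_bruteforce(S)   # T = "abaaba"
print(dp)               # → [1, 2, 3, 1, 2, 3]
# Interval a=3, e=5 is "aba" (m=3): dp[5] = 3 >= 3, so it occurs in S.
# Interval a=2, e=3 is "aa"  (m=2): dp[3] = 1 < 2, so it does not.
```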

3. Query Processing (\(O(1)\))

For each query:

1. If \(m = R-L+1\) exceeds \(N\), output No.
2. Compute \(a = (L-1)\bmod N\) and \(e = a+m-1\).
3. If dp[e] >= m, output Yes; otherwise output No.

Complexity

  • Time complexity:
    • SAM construction: \(O(N)\)
    • Scanning \(T=S+S\): \(O(N)\)
    • Each query: \(O(1)\), repeated \(Q\) times
    • Overall: \(O(N+Q)\)
  • Space complexity: The number of SAM states is at most \(2N-1\), so including transitions (over a fixed 26-character alphabet), the space is \(O(N)\).

Implementation Notes

  • \(L_i, R_i\) can be up to \(10^{18}\), so the starting position must be computed as \((L_i-1)\bmod N\) (safe with Python’s int).
  • dp indices are handled in \(0\)-indexed form; be careful not to miscalculate the endpoint \(e=a+m-1\).
  • In Python, for speed and memory efficiency:
    • Store strings as bytes (convert with b - 97 to get values 0–25)
    • Store SAM transitions as a 1D array('i') (indexed by state*26 + c)

Both optimizations are applied in the code below.
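The bytes trick relies on the fact that iterating over a bytes object yields integer byte values, so subtracting 97 (the ASCII code of 'a') maps lowercase letters to 0–25 without calling ord():

```python
S = b"cab"
codes = [b - 97 for b in S]  # iterating bytes yields ints directly
print(codes)  # → [2, 0, 1]
```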

Source Code

import sys
from array import array

def main():
    data = sys.stdin.buffer.read().split()
    N = int(data[0])
    Q = int(data[1])
    S = data[2]  # bytes

    max_states = 2 * N + 5
    nxt = array('i', [-1]) * (max_states * 26)
    link = array('i', [-1]) * max_states
    length = array('i', [0]) * max_states

    size = 1
    last = 0
    link[0] = -1
    length[0] = 0

    # Build SAM for S
    for b in S:
        c = b - 97
        cur = size
        size += 1
        length[cur] = length[last] + 1

        p = last
        pc = p * 26 + c
        while p != -1 and nxt[pc] == -1:
            nxt[pc] = cur
            p = link[p]
            if p != -1:
                pc = p * 26 + c

        if p == -1:
            link[cur] = 0
        else:
            q = nxt[p * 26 + c]
            if length[p] + 1 == length[q]:
                link[cur] = q
            else:
                clone = size
                size += 1
                src = q * 26
                dst = clone * 26
                for k in range(26):
                    nxt[dst + k] = nxt[src + k]
                length[clone] = length[p] + 1
                link[clone] = link[q]

                while p != -1 and nxt[p * 26 + c] == q:
                    nxt[p * 26 + c] = clone
                    p = link[p]

                link[q] = clone
                link[cur] = clone

        last = cur

    # Scan T = S+S and compute dp[i] = longest match ending at i that occurs in S
    T = S + S
    Tlen = 2 * N
    dp = array('i', [0]) * Tlen

    v = 0
    l = 0
    for i in range(Tlen):
        c = T[i] - 97
        idx = v * 26 + c
        while v and nxt[idx] == -1:
            v = link[v]
            l = length[v]
            idx = v * 26 + c
        to = nxt[idx]
        if to != -1:
            v = to
            l += 1
        else:
            v = 0
            l = 0
        dp[i] = l

    it = iter(data[3:])
    out_lines = []
    for _ in range(Q):
        L = int(next(it))
        R = int(next(it))
        m = R - L + 1
        if m > N:
            out_lines.append("No")
            continue
        a = (L - 1) % N
        e = a + m - 1
        out_lines.append("Yes" if dp[e] >= m else "No")

    sys.stdout.write("\n".join(out_lines))

if __name__ == "__main__":
    main()

This editorial was generated by gpt-5.2-high.
