Official

G - Count Substring Query Editorial by en_translator


Let ~ be a character lexicographically larger than any lowercase English letter.

The query asks the number of, among \(S[1:],S[2:],\ldots,S[N:]\), the strings with a prefix \(T\). Denoting by \(T'\) the string \(T\) followed by ~, the count equals the number of \(i\) with \(T\leq S[i:]\leq T'\). By precalculating \(S[1:],S[2:],\ldots,S[N:]\) sorted in lexicographical order (as known as the suffix array) and performing a binary search on this sequence, one can find the number of \(i\) with \(T\leq S[i:]\leq T'\) fast. The complexity for the construction of the suffix array and for each query is \(O(|S|)\) and \(O(|T|\log |S|)\), respectively.

from atcoder.string import suffix_array

S = input()
SA = suffix_array(S)

def binary_search(T):
    ng, ok = -1, len(S)
    while ok - ng > 1:
        mid = (ng + ok) // 2
        l = SA[mid]
        if T <= S[l : l + len(T)]:
            ok = mid
        else:
            ng = mid
    return ok

Q = int(input())
for _ in range(Q):
    T = input()
    print(binary_search(T + "~") - binary_search(T))

(Bonus) a standard feature of a language may help implementing it concisely.

from atcoder.string import suffix_array
import bisect

S = input()
SA = suffix_array(S)

Q = int(input())
for _ in range(Q):
    T = input()
    l = bisect.bisect_left(SA, T, key=lambda x: S[x : x + len(T)])
    r = bisect.bisect(SA, T, key=lambda x: S[x : x + len(T)])
    print(r - l)

posted:
last update: