G - Count Substring Query Editorial by en_translator
Let ~
be a character lexicographically larger than any lowercase English letter.
The query asks the number of, among \(S[1:],S[2:],\ldots,S[N:]\), the strings with a prefix \(T\). Denoting by \(T'\) the string \(T\) followed by ~
, the count equals the number of \(i\) with \(T\leq S[i:]\leq T'\). By precalculating \(S[1:],S[2:],\ldots,S[N:]\) sorted in lexicographical order (as known as the suffix array) and performing a binary search on this sequence, one can find the number of \(i\) with \(T\leq S[i:]\leq T'\) fast. The complexity for the construction of the suffix array and for each query is \(O(|S|)\) and \(O(|T|\log |S|)\), respectively.
from atcoder.string import suffix_array
S = input()
SA = suffix_array(S)
def binary_search(T):
ng, ok = -1, len(S)
while ok - ng > 1:
mid = (ng + ok) // 2
l = SA[mid]
if T <= S[l : l + len(T)]:
ok = mid
else:
ng = mid
return ok
Q = int(input())
for _ in range(Q):
T = input()
print(binary_search(T + "~") - binary_search(T))
(Bonus) a standard feature of a language may help implementing it concisely.
from atcoder.string import suffix_array
import bisect
S = input()
SA = suffix_array(S)
Q = int(input())
for _ in range(Q):
T = input()
l = bisect.bisect_left(SA, T, key=lambda x: S[x : x + len(T)])
r = bisect.bisect(SA, T, key=lambda x: S[x : x + len(T)])
print(r - l)
posted:
last update: