Official

G - Prefix Concatenation Editorial by en_translator


First, let us consider how to find the length \(f(i)\) of the longest common prefix of the substring from the \(i\)-th to the \(|S|\)-th characters of \(S\) and \(S\) itself, for each \(i\) such that \(2\leq i\leq |S|\). We can use an algorithm known as Z-algorithm to find this, which allows us to find them in a total of \(O(|S|)\) time. The following is the outline of the algorithm. We denote by \(S[i]\) the \(i\)-th character of \(S\).

  • For \(i=2\), just find it naively; i.e. for each \(j=1,2,\ldots\), compare \(S[j+1]\) and \(S[j]\), and let \(f(2)=j-1\) for the first \(j\) such that \(S[j+1]\neq S[j]\).

  • For \(3\leq i\leq N\), take \(2\leq l\leq i-1\) that maximizes \(l+f(l)\). Then, (the \(l\)-th through \((l+f(l)-1)\)-th characters of \(S\)) \(=\) (the \(1\)-st through \(f(l)\)-th characters of \(S\)), so it holds that (the \(i\)-th through \((l+f(l)-1)\)-th characters of \(S\)) equals (the \((i-l+1)\)-th through \(f(l)\)-th characters of \(S\)).

  • Thus, if \(f(i-l+1)<f(l)-(i-l+1)\), then \(f(i)=f(i-l+1)\).

  • Otherwise, for each \(j=1,2,\ldots\), compare \(S[l+f(l)-1+j]\) and \(S[l+f(l)-i+j]\), and let \(f(i)=l+f(l)-i+j-1\) for the first \(j\) such that \(S[l+f(l)-1+j]\neq S[l+f(l)-i+j]\).

A total of about \(\displaystyle\max_{1\leq j\leq i} (f(j)+j)-\displaystyle\max_{1\leq j\leq i-1} (f(j)+j)\) comparisons, or roughly \(|S|\) comparisons are required to find each \(f(i)\).

Consider solving this problem based on the algorithm above. Suppose that we have already found \(f(x)\) for \(S\).

Let \(g(i)\) be the minimum number of prefixes of \(S\) required to obtain the substring from \(1\)-st through \(i\)-th characters of \(T\) as their concatenation. What we want is \(g(|T|)\), or if it is impossible to do so, let \(g(i)=\infty\). Note that \(g(i)\) are (weakly) monotonically increasing. First, let \(g(0)=0\); then, let us update them as follows for each \(i=1,2,\ldots, |T|\),in a “distributing DP (Dynamic Programming)” manner:

  • If \(i=1\), then compare \(S[j]\) and \(T[j]\) for each \(j=1,2,\ldots\), and let \(g(j)=1\) for all \(j\) less than the minimum \(j_0\) such that \(g(j)=1\).

  • Otherwise, if \(g(i-1)=\infty\), then \(g(i-1)\) is never updated, and finally we have \(g(N)\geq g(i-1)=\infty\), so abort the program and output \(-1\).

  • Otherwise, what we want to do here is to update \(g(j)=\min(g(j),g(i-1)+1)\) for all \(i\leq j\leq |T|\) such that (the \(i\)-th through \(j\)-th characters of \(T\)) equals (the \(1\)-st through \((j-i+1)\)-th characters of \(S\)). Here, by the monotonicity of \(g(i)\), the values already set before \(i\) is at most \(g(i-1)+1\), so all we need is to update as \(g(j)=g(i-1)+1\) for those such that \(g(j)=\infty\).

  • Now, in order to do so, take the maximum \(r\) such that \(1\leq r\leq |T|\) and \(g(r)<\infty\). Also, there exists \(l\) such that \(1\leq l\leq i-1\) and (the \(l\)-th through \(r\)-th characters of \(T\)) equals (the \(1\)-st through \((r-l+1)\)-th characters of \(T\)), so take such \(l\).

  • Since (the \(i\)-th through the \(r\)-th characters of \(T\)) equals (the \((i-l+1)\)-th through \((r-l+1)\)-th characters of \(S\)), if \(f(i-l+1)<r-i+1\), then for \(j>r\) we have \(g(j)=\infty\), so we don’t have to do anything.

  • When \(f(i-l+1)\geq r-i+1\), then compare \(T[r+j]\) and \(S[r-i+1+j]\), and for all \(j\) less than the minimum \(j_0\) such that \(T[r+j]\neq S[r-i+1+j]\), let \(g(r+j)=g(i-1)+1\).

All that left is to output \(g(|T|)\) that we found. The number of comparisons in this program is also at most \(O(|T|)\) because the maximum \(r\) such that \(g(r)<\infty\) is updated every time, so the problem has been solved in a total of \(O(|S|+|T|)\) time, which is fast enough. Sample code in C++

#include <bits/stdc++.h>
using namespace std;

int main() {
	string s, t;
	int c[500010], d[500010];
	int l, r;

	cin >> s;
	cin >> t;
	int n = s.size();
	int m = t.size();

	r = 1;
	c[0] = n;
	for (int i = 1; i < n; i++) {
		if (r == n)c[i] = n - i;
		if (i == 1) {
			while (r < n) {
				if (s[r] == s[r - i])r++;
				else break;
			}
			c[i] = r - i;
			l = i;
		}
		else if (c[i - l] < (r - i)) {
			c[i] = c[i - l];
		}
		else {
			r = max(i, r);
			while (r < n) {
				if (s[r] == s[r - i])r++;
				else break;
			}
			c[i] = r - i;
			l = i;
		}
	}

	for (int i = 0; i < m; i++)d[i] = -1;
	r = 0;
	for (int i = 0; i < m; i++) {
		if (r == m)break;
		if (i == 0) {
			while ((r < m) && (r < n)) {
				if (t[r] == s[r]) {
					d[r] = 1;
					r++;
				}
				else break;
			}
			l = 0;
		}
		else if (d[i - 1] == -1)break;
		else if (c[i - l] >= r - i) {
			while ((r < m) && ((r - i) < n)) {
				if (t[r] == s[r - i]) {
					d[r] = d[i - 1] + 1;
					r++;
				}
				else break;
			}
			l = i;
		}
	}
	cout << d[m - 1] << endl;

	return 0;
}

posted:
last update: