公式

D - 1122 Substring 解説 by en_translator


Solution \(1\): Consider subarrays starting from the odd-indexed terms of \(A\) and the even-indexed terms separately

Here, we explain how to find the length of the longest 1122 sequence that is a subsequence of \(A\) starting from an odd-indexed element, one in \(A_1, A_3,\ldots\). Those starting from even-indexed one can be treated in the same way, and the answer is the longer of them.

An important fact follows:

  • for \(1\leq l<r\leq N\), if \((A_l,A_{l+1},\ldots,A_{r})\) is a 1122 sequence, then the sequence obtained by removing the last two elements, namely \((A_l,A_{l+1},\ldots,A_{r-2})\), is also a 1122 sequence. Here, notice that it yields an empty sequence if \(r-l=2\).

Therefore, the problem can be solved with the sliding window trick.

First, for an even number \(r=2,4,\ldots\), let \(f(r)\) be the smallest odd number \(l\) \((1\leq l\leq r+1)\) such that \((A_l,A_{l+1},\ldots,A_r)\) forms a 1122 sequence. Here, if \(l=r+1\), then \((A_l,A_{l+1},\ldots,A_r)\) is an empty sequence, especially a 1122 sequence, so \(f(r)\) is always defined.
Moreover, by the fact above, we can also find that \(f(r)\) is weakly monotonically increasing.

The sought value, that is, “the length of the longest subarray of \(A\) starting from an odd-indexed element of \(A\), one in \(A_1, A_3,\ldots\), that is a 1122 sequence,” can be represented using \(f(r)\) as \(\displaystyle \max_{r=2,4,\ldots} (r-f(r)+1)\).

Now, let us consider evaluating \(f(r)\) for \(r=2,4,\ldots\) in order.
For \(r=2\), we have \(l=1\) if \(A_1=A_2\), and \(l=3\) otherwise.
When we know \(f(r-2)\), we can find \(f(r)\) as follows:

  1. if \(A_{r-1}\neq A_r\), then \(f(r)=r+1\).
  2. If \(A_{r-1}=A_r\), and if no element among \(A_{f(r-2)}, A_{f(r-2)+2}, \ldots, A_{r-3}\) equals \(A_{r-1}\), then \(f(r)=f(r-2)\).
  3. If \(A_{r-1}=A_r\), and if there exists \(k\in \{ f(r-2), f(r-2)+2,\ldots, r-3\}\) such that \(A_{r-1}=A_k\), then \(f(r)=k+2\). Here, note that such \(k\) is always unique, because \((A_{f(r-2)},A_{f(r-2)+1},\ldots,A_{r-2})\) forms a 1122 sequence.

However, if we try to naively inspect each of \(A_{f(r-2)}, A_{f(r-2)+2}, \ldots, A_{r-3}\) to check if it coincides with \(A_{r-1}\), then it costs \(\Theta(r)\) time at worst, which sums to overall complexity of \(\Theta(N^2)\), not finishing within the execution time limit.
Thus, we need to compute this fast. There are several ways; for example, since the constraints guarantee that \(A_i\leq N\) \((\leq 2\times 10^5)\), we can record for each \(1\leq x\leq N\) the maximum \(i\) among \(i=1,3,\ldots, r-3\) such that \(A_i=x\) (or \(-1\) if it does not exist), in order to check whether the maximum \(i\) with \(A_i=A_{r-1}\) is greater than or equal to \(f(r-2)\), and what is that value if it exists, in \(O(1)\) time. Updating this record is also lightweight; we just need to update one element for each \(r\) (namely, for \(x=A_{r-1}\), update the maximum \(i\) with \(f(i)=x\) to \((r-1)\)), costing a total of \(O(N)\) time. This way, the problem can be solved in a total of \(O(N)\) time.

Therefore, the problem has been solved fast enough.
One can also use a data structure like std::map in C++ to solve the problem in a total of \(O(N\log N)\).

Solution \(2\): Generalize the definition of 1122 sequence to those with an odd number of elements, and solve at once

As a generalization of 1122 sequence, let us define a good sequence as a sequence \(X=(X_1,X_2,\ldots)\) such that:

  • \(X_{2i-1}\) and \(X_{2i}\) are equal for all integers \(i\) with \(1\leq i\leq \lfloor \frac{\lvert X \rvert}{2}\rfloor \).
  • The odd-indexed elements \(X_1,X_3,\ldots, X_{ 2\cdot\lfloor \frac{\lvert X \rvert+1}{2}\rfloor -1}\) of \(X\) are distinct.

Here, note that \(X\) is a 1122 sequence if and only if \(X\) is a good sequence with an even number of elements.
Also, a good sequence satisfies the following property:

  • Given a good sequence, removing the last elements still yields another good sequence.

Therefore, if the longest subarray of \((A_1,A_2,\ldots, A_N)\) that is a good sequence has a length of \(M\), then the answer is the largest even number not exceeding \(M\) \((=2\cdot \lfloor \frac{M}{2}\rfloor)\). Using this property, the problem can be solved using the sliding window trick as follows.

The solution itself is almost the same as solution \(1\).
For \(1\leq r\leq N\), let \(g(r)\) be the smallest \(l\) \((1\leq l\leq r)\) such that \((A_l,A_{l+1},\ldots,A_r)\) is a good sequence. Here, if \(l=r\), then \((A_l,A_{l+1},\ldots,A_r)\) forms a single-term sequence, which is a good sequence, so such \(g(r)\) is always defined.
The length \(M\) of the longest subarray of \((A_1,A_2,\ldots, A_N)\) that is a good sequence can be expressed using \(g(r)\) as \(M=\displaystyle \max_{1\leq r\leq N} (r-g(r)+1)\). Also, by the property above, we can show that \(g(r)\) is weakly monotonically increasing.

Consider finding \(g(r)\) for \(r=1,2,\ldots\) in order.
We always have \(g(1)=1\). When we have \(g(r-1)\), \(g(r)\) can be found as follows:

  • If \((r-g(r-1))\) is even:
  1. if none of \(A_{g(r-1)}, A_{g(r-1)+1}, \ldots, A_{r-1}\) equals \(A_r\), then \(g(r)=g(r-1)\).
  2. If some \(k\in \{ g(r-1), g(r-1)+1,\ldots, r-1\}\) satisfies \(A_{r}=A_k\), then let \(k_0\) be the largest such \(k\); if \(k_0=r-1\), then \(g(r)=k_0\), and otherwise, \(g(r)=k_0+1\).
  • If \((r-g(r-1))\) is odd:
  1. If \(A_{r-1}=A_r\), then \(g(r)=g(r-1)\).
  2. If \(A_{r-1}\neq A_r\), then \(g(r)=r\).

The justification of this formula requires a casework, but each case is obvious so we omit it here.
To find the value of \(g(r)\) for each \(r\) in \(O(1)\) time, it is sufficient to record the maximum \(i\) such that \(A_i=x\) among \(i=1,2,\ldots, r-1\) for each \(1\leq x\leq N\) (or \(-1\) if it does not exist). By this way, the problem can be solved in a total of \(O(N)\) time.

Hence, the problem has been solved fast enough.

Sample code in C++ (Solution \(1\)):

#include <bits/stdc++.h>
using namespace std;

#define rep(i, n) for(int i = 0; i < n; ++i)
#define N 200000

int main() {
	int n,l,ans=0;
	int a[N];
	int last[N+1];
	cin>>n;
	rep(i,n)cin>>a[i];

	rep(i,N)last[i+1]=-2;
	l=0;
	for(int i=0;(i+1)<n;i+=2){
		if(a[i]!=a[i+1])l=i+2;
		else l=max(l,last[a[i]]+2);
		ans=max(ans,i+2-l);
		last[a[i]]=i;
	}

	rep(i,N)last[i+1]=-2;
	l=1;
	for(int i=1;(i+1)<n;i+=2){
		if(a[i]!=a[i+1])l=i+2;
		else l=max(l,last[a[i]]+2);
		ans=max(ans,i+2-l);
		last[a[i]]=i;
	}
	cout<<ans<<endl;
	return 0;
}

Sample code in C++ (Solution \(2\)):

#include <bits/stdc++.h>
using namespace std;

#define rep(i, n) for(int i = 0; i < n; ++i)
#define N 200000

int main() {
	int n,l,ans=0;
	int a[N];
	int last[N+1];
	cin>>n;
	rep(i,n)cin>>a[i];

	rep(i,N)last[i+1]=-2;
	l=0;
	rep(i,n){
		if((i-l)%2==0){
			if(last[a[i]]==i-1)l=last[a[i]];
			else if(last[a[i]]>=l)l=last[a[i]]+1;			
		}
		else{
			if(a[i-1]!=a[i])l=i;
		}
		ans=max(ans,i-l+1);
		last[a[i]]=i;
	}
	cout<<(2*(ans/2))<<endl;
	return 0;
}

投稿日時:
最終更新: