F - Concat (2nd) Editorial by en_translator
Original proposer: physics0523
In this problem, \(st\) will denote the string obtained by concatenating \(s\) and \(t\).
If the problem is of the following form, the solution is famous:
You are given \(N\) strings \(S_i\).
Find the lexicographically smallest string when concatenating \(S_i\) in any order.
ECR9-C is exactly this problem (Editorial (English) [Reference (Japanese)(https://drken1215.hatenablog.com/entry/2022/12/16/194300)]); ABC225-F is another exercise of this problem (Editorial).
First, we will restate the solution of this problem.
The problem above can be solved as follows:
The optimal string is obtained by sorting the strings in the order defined as follows, and then concatenating them in the resulting order:
- order \(X<Y\) if and only if \(XY<YX\).
For the proof, please refer to the editorials above.
However, there is a pitfall under the constraints of this problem (\(N \le 3 \times 10^5, \Sigma |S_i| \le 10^6\)). (This was pointed out by Nyaan during the problemsetting.)
When comparing strings, if you naively construct \(XY\) and \(YX\), the comparison costs \(O(|X|+|Y|)\).
If a specific string \(S_i\) is referenced \(O(N)\) times during the sort, the computational complexity will be come \(\Omega(N |S_i|)\), leading to TLE when \(|S_i|\) is large.
(There are also solutions that circumvent this issue.)
For the same reason, we cannot spend \(O(\max(|X|,|Y|))\) time for comparison.
However, if the comparison can be done in \(O(\min(|X|,|Y|))\) time, the sorting can be done in a total of \(O(\Sigma |S_i| \log N)\) time.
How can we determine if \(XY < YX\) in \(O(\min(|X|,|Y|))\) time?
In addition to a string \(X\), we also memorize the Z-array (the result of applying Z-algorithm) \(Z_X\).
The following is the algorithm when \(|X| \ge |Y|\). (Same applies to \(|X| < |Y|\).)
- Regard \(XY\) as \(X[1,|Y|] X[|Y|+1,|X|], Y\), and \(YX\) as \(Y X[1,|X|-|Y|], X[|X|-|Y|+1,|X|]\).
- First, compare \(X[1,|Y|]\) and \(Y\) by scanning from the beginning.
- If the results are undetermined by this, we need to compare \(X[|Y|+1,|X|]\) and \(X[1,|X|-|Y|]\).
- We use \(Z_X\) to perform this comparison in \(O(1)\).
- If \(Z_X[|Y|+1] \ge |X|-|Y|\), we see that these strings are equal, so we advance to the next string.
- Otherwise, the \((Z_X[|Y|+1] +1)\)-th characters of these strings are different, so inspect these characters to determine the order.
- If the results are undetermined by the procedure so far, compare \(Y\) and \(X[|X|-|Y|+1,|X|]\) by scanning from the beginning.
Let \(S'_1,S'_2,\dots,S'_N\) be the result of this sorting. We will now explain how to find how to find the answer based on this.
- If \(N=2\), then the answer is \(S'_2S'_1\).
- If there exists an integer \(1 \le i < N\) such that \(S'_i S'_{i+1}=S'_{i+1}S'_i\), then swapping \(S'_i\) and \(S'_{i+1}\) yields the same concatenation, so the answer is the string obtained by concatenating the strings in \(S'\) in order.
- If the answer is undetermined up to this point, it holds that:
- \(S'_iS'_{i+1}\) is not equal to \(S'_{i+1}S'_i\) for any integers \(1 \le i < N\).
- In fact, the answer is the lexicographically smaller of the following strings.
- \(S'_1S'_2 \dots S'_{N}S'_{N-1}\)
- \(S'_1S'_2 \dots S'_{N-1}S'_{N-2}S'_N\)
We will prove the last claim.
First, any permutation of inversion number \(2\) or greater (compared to \(S'\)) is not applicable for the answer.
This is because reducing the inversion number always yields a lexicographically smaller string (in other words, if the inversion number is \(2\) or greater, there are at least two lexicographically smaller strings.)
Next, consider two strings \(S'_1S'_2 \dots S'_{N-1}S'_N\) and \(S'_1S'_2 \dots S'_{N}S'_{N-1}\).
Since we can obtain them, any strings whose prefix is not \(S'_1S'_2 \dots S'_{N-2}\) cannot be the answer.
Since \(S'_iS'_{i+1}\) and \(S'_{i+1}S'_i\) are different for any integer \(1 \le i < N\), among the permutations with inversion number \(1\), those swapping strings prior to \(S'_{N-2}\) cannot be the answer.
Hence, the following two candidates remain:
- \(S'_1S'_2 \dots S'_{N}S'_{N-1}\)
- \(S'_1S'_2 \dots S'_{N-1}S'_{N-2}S'_N\)
For example, for \(S'=(\) a, b, c \()\) the former is the answer; for \(S'=(\) a, aab, b \()\), the latter is.
All that left is to implement this algorithm.
One can sort the strings in \(O(\Sigma |S_i| \log N)\).
One can determine if there exists an \(i\) with \(S'_i S'_{i+1}=S'_{i+1}S'_i\) in \(S'_i S'_{i+1}=S'_{i+1}S'_i\) even with a naive implementation.
The comparison between the two candidates in the last step can also be done in \(O(\Sigma |S_i|)\) time.
Therefore, the problem has been solved in a total of \(O(\Sigma |S_i| \log N)\) time.
Sample code (C++):
#include<bits/stdc++.h>
#include<atcoder/string>
using namespace std;
using namespace atcoder;
using psv=pair<string,vector<int>>;
bool comp(const psv &x,const psv &y){
int xl=x.first.size(),yl=y.first.size();
if(xl>=yl){
for(int i=0;i<yl;i++){
if(x.first[i] != y.first[i]){
return (x.first[i] < y.first[i]);
}
}
if(xl!=yl && x.second[yl]<(xl-yl)){
return (x.first[yl+x.second[yl]]<x.first[x.second[yl]]);
}
for(int i=0;i<yl;i++){
if(y.first[i] != x.first[i+(xl-yl)]){
return (y.first[i] < x.first[i+(xl-yl)]);
}
}
}
else{
for(int i=0;i<xl;i++){
if(x.first[i] != y.first[i]){
return (x.first[i] < y.first[i]);
}
}
if(y.second[xl]<(yl-xl)){
return (y.first[y.second[xl]]<y.first[xl+y.second[xl]]);
}
for(int i=0;i<xl;i++){
if(y.first[i+(yl-xl)] != x.first[i]){
return (y.first[i+(yl-xl)] < x.first[i]);
}
}
}
return false;
}
string concat(vector<psv> &s){
string res="";
for(auto &nx : s){ res+=nx.first; }
return res;
}
int main(){
int t;
cin >> t;
while(t--){
int n;
cin >> n;
vector<psv> s(n);
for(auto &nx : s){
cin >> nx.first;
nx.second=z_algorithm(nx.first);
}
sort(s.begin(),s.end(),comp);
if(n==2){
cout << s[1].first+s[0].first << "\n";
continue;
}
bool ok=false;
for(int i=1;i<n;i++){
if(s[i-1].first+s[i].first == s[i].first+s[i-1].first){ok=true; break;}
}
if(ok){
cout << concat(s) << "\n";
continue;
}
swap(s[n-1],s[n-2]);
string c1=concat(s);
swap(s[n-1],s[n-2]);
swap(s[n-2],s[n-3]);
cout << min(c1,concat(s)) << "\n";
}
return 0;
}
posted:
last update: