D - Mismatched Parentheses Editorial by en_translator
Let \(|X|\) be the length of the string \(X\).
Since the resulting string is independent of the order of operations, we can greedily perform the operations from the beginning. Intuitively, we can inspect the character one by one from the beginning, and every time you found an )
, remove the substring starting from the last occurrence of (
. Formally, consider the following procedure:
- process each character of \(S\) in order as follows:
- if the inspected character is not
)
, do nothing; - if the inspected character is
)
, find the nearest(
before the current character:- if there is such an
(
, remove the substring starting with that(
and ending with current)
- if there is no
)
, do nothing.
- if there is such an
- if the inspected character is not
- Print \(S\).
In this procedure, (in most languages) removing a substring from a string costs \(\Theta(|S|)\) time, so if \(S\) equals ()()()...
, it costs \(\Theta(N^2)\). So, instead of actually editing \(S\), consider storing the answer to an another string and editing it. That is, consider the following procedure.
- Prepare an empty string \(T\). Process each character of \(S\) in order as follows:
- if the inspected character \(c\) is not
)
, append \(c\) to the tail of \(T\); - if the inspected character \(c\) is
)
, find the last occurrence of(
in \(T\):- if there is such
(
, remove the substring starting from that(
from \(T\); - if there is not, append \(c\) to \(T\).
- if there is such
- if the inspected character \(c\) is not
- Print \(T\).
In this procedure, (in most languages) removal of the substring can be achieved in an \(O(\text{length of substring to be removed})\) time, since the removed substring is always a suffix of \(T\). However, even with this procedure, “finding the last occurrence of (
in \(T\)” costs \(\Theta(|T|)\) time when, for example, \(S\) is ))))))...
, so it costs a total of \(\Theta(N^2)\) time. To improve this, we remember the positions of all occurrences of (
in \(T\); i.e. consider the following procedure:
- Prepare an empty array \(A\) and an empty string \(T\). Process each character of \(S\) in order as follows:
- if the inspected character \(c\) is not
(
nor)
, append \(c\) to the tail of \(T\); - if the inspected character \(c\) is
(
, append \(c\) to the tail of \(T\), and append its index to the tail of \(A\); - if the inspected character \(c\) is
)
, check if \(A\) is empty.- If it is not empty, let \(x\) be the last element, and remove the \(x\)-th and later characters of \(T\). Moreover, remove the \(x\) from \(A\).
- If it is empty, append \(c\) to \(T\).
- if the inspected character \(c\) is not
- Print \(T\).
This way, you can “find the last occurrence of (
in \(T\)” in an \(O(1)\) time. Since the removal of substrings costs a total of \(O(N)\) time, the problem has been solved in a total of \(O(N)\).
As we described so far, it is essential to identify the bottleneck of the entire process and come up with improvements step-by-step.
Actually, you do not actually need to maintain \(A\). It is sufficient to track the number of (
, and if the count is one or greater, track back from the tail of \(T\); this also costs a total of \(O(N)\) time.
posted:
last update: