Official

G - Conquest Editorial by en_translator


This problem features game theory.
Recall the notes in the <details> (expander) in the problem statement:

At each stage, each player assigns probabilities summing to \(1\) to the available choices based on the information revealed up to that point, and randomly selects one choice according to those probabilities.
Each player chooses a strategy that maximizes “the minimum possible win rate,” that is, “the win rate when the opponent chooses a strategy that minimizes the player’s win rate.” If there are multiple strategies satisfying this condition, any of them may be chosen.
When both players act according to this policy, Takahashi’s win rate is uniquely determined regardless of the specific strategy chosen. Find this value.

We will first overview the prerequisite in order to thoroughly explain the implications of the notes and the justification of the claim.

Two-Player Zero-Sum Game

A game in the following rules is defined as a two-player zero-sum game:

  • There are two players.
  • Each player chooses a move exactly once, and the out come determines each player’s benefit.
  • Each player does not know the opponent’s move before determining their move.
  • There is a finite number of moves that each player can choose.
  • The two player’s benefits resulting from the game sum to \(0\).

In case of a game where each player tries to maximize their winning probability, like in our original problem, the game is not-zero sum, because if we define their benefit functions by their winning rates, they sum to \(1\); nevertheless, the constant sum implies that we can safely reinterpret it as a two-player zero-sum game.
Also, the original problem has two stages two make moves (banning stage and deck-selection stage), but the decision is made simultaneously at each stage, so the game analysis can be backtracked from the latter stage, and we can apply the same discussion.
From now on, we will introduce properties of two-player zero-sum games (which we will simply call zero-sum games).

Mixed strategy

In most game-theory problems in competitive programming, it is optimal for players to choose a single move that maximizes the winning rate. This is called a pure strategy.
In zero-sum games however, a pure strategy does not necessarily maximize the winning rate, because the opponent’s move is unknown.
The game Rock Paper Scissors may help understanding why. For example, nobody would claim that “shooting rock \(100 \%\)” is an optimal strategy, because it is obviously not favorable against a person who prefers shooting paper.
Instead, let us think of a strategy of assigning probabilities to multiple moves, and choosing one according to them. This is called a mixed strategy.

Thus, the first line of the notes—

At each stage, each player assigns probabilities summing to \(1\) to the available choices based on the information revealed up to that point, and randomly selects one choice according to those probabilities.

—was describing a mixed strategy. From now on, we will simply refer to a mixed strategy as a “strategy.”

Maximin Strategy

Next, we will explain the second line of the notes:

Each player chooses a strategy that maximizes “the minimum possible win rate,” that is, “the win rate when the opponent chooses a strategy that minimizes the player’s win rate.”

This is an explanation of a strategy called maximin strategy.

First, let us define several mathematical concepts.
Let us call the two players Alice and Bob. Let

\[\Delta_n = \left\lbrace x=(x_1, x_2, \dots, x_n)^T \vert \sum_{i=1}^n x_i = 1, x_i \geq 0 \right\rbrace.\]

In other words, \(\Delta_n\) is a set of mixed strategies assignable to \(n\) choices.
Define a benefit matrix \(A\) as a matrix whose \((i, j)\)-component is the winning probability when Alice chooses move \(i\), and Bob chooses move \(j\).
When the benefit matrix of a zero-sum game is represented by a matrix \(A\), define the benefit function \(E(x, y)\) \((x \in \Delta_n, y \in \Delta_m)\) as the expected benefit of Alice when Alice and Bob chooses mixed strategies \(x\) and \(y\), respectively. So we have:

\[E(x, y) = x^{T} A y.\]

From now on \(x\) and \(y\) will implicitly denote a mixed strategy of Alice and Bob, respectively.

Based on this formulation, consider an Alice’s strategy that maximizes “the minimum possible win rate,” that is, “the win rate when the opponent chooses a strategy that minimizes the player’s win rate.” The winning probability when the opponent chooses a strategy that minimizes the player’s win probability is

\[\min_y E(x, y),\]

so the strategies that maximize this value, and the maximum benefit, are respectively:

\[\argmax_x \min_y E(x, y), \max_x \min_y E(x, y).\]

Such a strategy is called a maximin strategy. (Note: \(\argmax_x\) represents the whole set of the strategies that achieves the maximum. If the set has multiple elements, she may choose any one of them.)

When Bob acts in the same strategy, we have

\[\max_y \min_x -E(x, y),\]

but this value equals, by the property of a zero-sum game, \(-1\) times Alice’s benefit:

\[\min_y \max_x E(x, y).\]

With no negative sings, this form is easier to treat with, so we will use this expression to consider Bob’s strategy.

In short,

  • Alice chooses \(\argmax_x \min_y E(x, y)\), and
  • Bob chooses \(\argmin_y \max_x E(x, y)\).

This is what is claimed by the second line of the notes.

Minimax theorem

Finally, let us inspect the third line of the notes:

When both players act according to this policy, Takahashi’s win rate is uniquely determined regardless of the specific strategy chosen. Find this value.

The proposition “win rate is uniquely determined” is formulated as:

\[\max_x \min_y E(x, y) = \min_y \max_x E(x, y).\]

This is called minimax theorem.
First, let us justify the minimax theorem via linear programming (LP).

First, let us represent Alice’s strategy as a LP. What we want is:

\[\max_x \min_y x^{T} A y.\]

For a fixed \(x\), a \(y\) that minimizes \(x^{T} A y\) is always a pure strategy (because it is linear with respect to \(y\)). Thus we obtain:

\[ \begin{aligned} \max. & & \min_j \sum_{i=1}^n A_{i,j} x_i \\ \mathrm{s.t.} & & \sum_{i=1}^n x_i = 1 \\ & & x_i \geq 0. \end{aligned} \]

Write \(\min_j \sum_{i=1}^n A_{i,j} x_i = v\) to get

\[ \begin{aligned} \max. & & v \\ \mathrm{s.t.} & & v - \sum_{i=1}^n A_{i,j} x_i \leq 0 & & (\forall j) \\ & & \sum_{i=1}^n x_i = 1 \\ & & x_i \geq 0. \end{aligned} \]

Let take the dual of this LP. Assign variables \(y_j\) and \(w\) in order to obtain

\[ \begin{aligned} \min. & & w \\ \mathrm{s.t.} & & w - \sum_{j=1}^m A_{i,j} y_j \geq 0 & & (\forall i)\\ & & \sum_{j=1}^m y_j = 1\\ & & y_j \geq 0. \end{aligned} \]

Since we can set \(w = \max_i \sum_{j=1}^m A_{i,j} y_j\), we eliminate \(w\) to arrive at

\[ \begin{aligned} \min. & & \max_i \sum_{j=1}^m A_{i,j} y_j \\ \mathrm{s.t.} & & \sum_{j=1}^m y_j = 1\\ & & y_j \geq 0, \end{aligned} \]

which exactly matches \(\min_y \max_x x^{T} A y\)!

Based on competitive programming knowledge, a bounded feasible LP and its dual has the same solution; thus it follows that:

\[\max_x \min_y E(x, y) = \min_y \max_x E(x, y).\]

Based on the discussions so far, we will prove the third line of the notes:

When both players act according to this policy, Takahashi’s win rate is uniquely determined regardless of the specific strategy chosen. Find this value.

(Note that while the winning rate is indeed “unique,” optimal strategies are not necessarily are.)

(Proof) Let \(x^{\ast}\) and \(y^{\ast}\) be any one optimal solution of the main and dual problems, respectively. (The “any one” corresponds to “regardless of strategy”.) Also, let \(V\) be the optimal value of the LP. Then

  • \(\forall j, \sum_{i=1}^n A_{i,j} {x_i}^{\ast} \geq V \to A^{T}x^{\ast} \geq V \mathbf{1}\),
  • \(\forall i, \sum_{j=1}^m A_{i,j} {y_j}^{\ast} \leq V \to Ay^{\ast} \leq V \mathbf{1}\).

Multiplying each inequalities by \((y^{\ast})^T\) and \((x^{\ast})^T\) from left, we get

\[(y^{\ast})^T A^{T}x^{\ast} \geq (y^{\ast})^T V \mathbf{1} = V,\]

\[(x^{\ast})^T Ay^{\ast} \leq (x^{\ast})^T V \mathbf{1} = V.\]

Thus,

\[V \leq (x^{\ast})^T Ay^{\ast} \leq V \iff (x^{\ast})^T Ay^{\ast} = V,\]

which implies that the optimal value is unique regardless of the choice of optimal solutions. (End of proof)

As a side note, one can relatively easily show that any function \(f(x, y)\) satisfies

\[\max_x \min_y f(x, y) \leq \min_y \max_x f(x, y).\]

(This is called the minimax inequality.)

(Proof) Since \(\forall x, \min_y f(x, y) \leq f(x, y)\) and \(\forall y, f(x, y) \leq \max_x f(x, y)\), we obtain

\[\forall x, \forall y, \min_{y'} f(x,y') \leq \max_{x'} f(x',y)\]

Maximizing the right hand side by \(x\) and minimizing the left hand side by \(y\) yields the desired inequality. (End of proof)

Nash equilibrium

By the discussions so far, the notes in the problem statement have been justified, and the objective has been fulfilled. However, some people may still pose a kind of semantic questions: why do we consider mixed strategies, and maximin strategies?
We will introduce a concept called Nash equilibrium and show that the sought pair of strategies coincides with a Nash equilibrium.

A Nash equilibrium (technically, a Nash equilibrium in mixed strategies, or a mixed Nash equilibrium) is a pair of strategies \((x^{\ast}, y^{\ast})\) such that:

  • \(\forall x, E(x^{\ast}, y^{\ast}) \geq E(x, y^{\ast})\) and \(\forall y, E(x^{\ast}, y^{\ast}) \leq E(x^{\ast}, y)\).

Colloquially, it can be explained as follows:

  • When Bob’s strategy is \(y^{\ast}\), Alice does not benefit from changing her strategy from \(x^{\ast}\).
  • When Alice’s strategy is \(x^{\ast}\), Bob does not benefit from changing his strategy from \(y^{\ast}\).

A Nash equilibrium is one of the most natural definitions in a sense that each player is acting according to their optimal strategy, and not motivated to alter their strategy.

Here, we have the following fact in a zero-sum game:

For a strategy pair \((x^{\ast}, y^{\ast})\), the following are equivalent:

  1. \((x^{\ast}, y^{\ast})\) is a Nash equilibrium.
  2. \(x^{\ast}\) is an \(x\) that maximizes \(\min_y E(x, y)\), and \(y^{\ast}\) is a \(y\) that minimizes \(\max_x E(x, y)\) (so these are the strategies chosen by the policy described in the notes of the problem statement).

(Proof)

(1) \(\to\) (2): Let \((x^{\ast}, y^{\ast})\) be a Nash equilibrium, and \(v = E(x^{\ast}, y^{\ast})\).

By the criteria for a Nash equilibrium, for all \(x\) we have:

\[E(x^{\ast},y^{\ast}) \geq E(x, y^{\ast}),\]

so

\[\min_y E(x, y) \leq E(x, y^{\ast}) \leq v.\]

Maximizing both hand sides by \(x\) yields

\[\max_x \min_y E(x, y) \leq v.\]

Likewise, by the criteria for a Nash equilibrium, for all \(y\) we have

\[E(x^{\ast},y^{\ast}) \leq E(x^{\ast}, y),\]

so

\[\max_x E(x, y) \geq E(x^{\ast}, y) \geq v.\]

Minimizing both hand sides by \(y\) yields

\[\max_x \min_y E(x, y) = \min_y \max_x E(x, y) = v.\]

Thus

\[\max_x \min_y E(x, y) \leq v \leq \min_y \max_x E(x, y),\]

but by the minimax Theorem the leftmost and rightmost terms are equal, so

\[\max_x \min_y E(x, y) = \min_y \max_x E(x, y) = v.\]

When this equality is satisfied,

\[v = \min_y E(x^{\ast}, y) = \max_x E(x, y^{\ast}).\]

Hence \(x^{\ast}, y^{\ast}\) satisfy the condition (2).

(2) \(\to\) (1): Let \(V:=\max_x \min_y E(x, y)\). By definition

\[\max_x E(x, y^{\ast}) = \min_y E(x^{\ast}, y) = V,\]

and by the discussion above

\[E(x^{\ast}, y^{\ast}) = V.\]

Moreover,

\[E(x, y^{\ast}) \leq \max_x E(x, y^{\ast}), E(x^{\ast}, y) \geq \min_y E(x^{\ast}, y),\]

thus we obtain

\[\forall x, \forall y, E(x, y^{\ast}) \leq E(x^{\ast}, y^{\ast}) \leq E(x^{\ast}, y),\]

which matches the definition of Nash equilibrium. (End of proof)

This way, it has been shown that a strategy pair defined in the problem statement actually coincides with a Nash equilibrium, which is the most natural strategy pair.
We also have the following fact:

A zero-sum game that allows a mixed strategy always has a Nash equilibrium.

This follows from the facts that a Nash equilibrium of a zero-sum game can be formulated as a LP because it is equal to the strategy pair as defined in the problem statement, and any bounded feasible LP has an optimal solution, as discussed so far.
It is obvious that a zero-sum game that only allows mixed strategy does not have a Nash equilibrium: consider the rock-paper-scissors example. It is interesting that allowing a mixed strategy leads to the existence of a Nash equilibrium—this is what motivates us to consider mixed strategies.

Given all these prerequisites, let us finally start to explain the solution for the original problem.

Solution

Define \(V_{i,j}\) as the optimal winning probability when \(A_i\) and \(B_j\) are banned. \(V_{i,j}\) is the optimal solution of a zero-sum game represented by an \((N-1) \times 2\) matrix. Once we obtain the values of \(V_{i,j}\), the initial banning stage can be interpreted as a zero-sum game represented by a \(3 \times N\) matrix. Therefore, the problem can be solved by the following two steps:

  • Compute all \(V_{i,j}\).
  • Solve a zero-sum game represented by a \(3 \times N\) matrix.

First, let us compute \(V_{i,j}\).
We will solve it independently for each \(j\). We will consider the case where \(j=3\). (The same applies to the other \(j\).) Rewrite \(V_{i,j} = v_i\). What we want is \(v_1, v_2, \dots, v_N\).
Let \(x\) denote the probability that Aoki uses \(B_1\). Then

\[ \begin{aligned} v_i &= \min_{0 \leq x \leq 1} \max_{k\neq i} p_{k,1} x + p_{k,2} (1-x) \\ &= \min_{0 \leq x \leq 1} \max_{k\neq i} (p_{k,1} - p_{k,2}) x + p_{k,2}, \end{aligned} \]

so if we define a line \(L_k\) as\(y = (p_{k,1}-p_{k,2}) x + p_{k,2}\),

  • \(v_i\) \(=\) (the minimum value within \(0 \leq x \leq 1\) on the upper envelope of the \((N-1)\) lines except for line \(L_i\)).

Let us call the task of enumerating these \(v_1, v_2, \dots, v_N\) the subproblem.
If we are to find just one \(v_i\), just applying the algorithm that finds the upper envelope in \(\mathrm{O}(N \log N)\) time will do, but this time we need to find \(N\) values, so we need an improvement.
There are several ways to solve the subproblem; here, we will introduce two of them.

Solution of subproblem 1: Using geometric properties of envelope

The intended solution is a very simple solution that follows from the property of the envelope.
First, find the upper envelope formed by \(L_1, L_2, \dots, L_N\). Here, take any coordinates where the upper envelope takes the minimum within \(0 \leq x \leq 1\), and denote it by \((x_0, y_0)\). Then we have the following fact:

  • There exists a set of lines \(S\) of size at most \(2\), such that \(v_k = y_0\) if \(L_k \not \in S\).

For example, if there are only two lines \(L_i\) and \(L_j\) that passes through \((x_0, y_0)\), one can prove that \(S = \lbrace L_i, L_j \rbrace\). (We omit the proof here, but drawing figure may help understanding intuitively.) Even for the cases where three or more lines pass through \((x_0, y_0)\), the \(x\) coordinates that take the minimum value form a segment, or \(x_0=0\) or \(x_0=1\), we can still show that \(v_k = y_0\) holds except for two or less lines.

Using this fact, the following approach allows us to solve the problem in \(\mathrm{O}(N \log N)\) time:

  • Find \((x_0, y_0)\) and \(S\).
  • Compute \(v_i\) for \(i \in S\) individually, and let \(v=y_0\) for the other \(v\).

This solution runs correctly with an appropriate implementation…but, this solution is vulnerable against errors, so precise implementations require properly handling computational errors occurring at multiple steps, so forcing this solution requires a decent knowledge on computational geometry. Alternatively, in our problem all the coordinates can be computed as rational numbers, using a geometry library based on rationals may be also an option.

Solution to subproblem 2: Using persistent dynamic Li Chao Tree

As an alternative solution, we will introduce a solution that uses a powerful data structure called persistent dynamic Li Chao Tree.
First, we will define a dynamic persistent Li Chao Tree. We will not explain Li Chao Tree itself here. An ordinary Li Chao Tree has a fixed tree shape, but we can extend the algorithm to consider a dynamic Li Chao Tree where the tree shape changes according to the lines added. (This is analogous to a dynamic segment tree or a sparse segment tree.) However, depending on the lines to be added, the depth of the tree becomes dependent on the number of lines to be added, which worsen the computational complexity; thus, note that need an additional constraint on the depth, such as limiting the recursion step to \(50\).

  • This additional depth constraint causes some of the lines discarded, but the error caused by this is extremely tiny. Especially, in our problem, the domain and slopes are within \([0,1]\) and \([-1,1]\), respectively, so imposing a recursion depth limit of \(O\) guarantees an upper error bound of \(2^{-D}\).

A dynamic Li Chao Tree defined this way is easy to make persistent; the persistent version is called a persistent dynamic Li Chao Tree.
A persistent dynamic Li Chao Tree supports the following operations. (\(D\) denotes the maximum depth.)

  • add_line(a, b): add a line \(y = ax + b\). \(\mathrm{O}(D)\)
  • get_max(x): Find \(\displaystyle \max_{f : \text{added lines}} f(x)\). \(\mathrm{O}(D)\)
  • copy(): clone the Li Chao Tree. \(\mathrm{O}(1)\).

With a persistent dynamic Li Chao Tree, we can easily compute \(v_i\). Let

  • \(s_i\): a Li Chao Tree containing \(L_1, L_2, \dots, L_i\),
  • \(t_i\): a Li Chao Tree containing \(L_i, L_{i+1}, \dots, L_N\).

\(s_i\) and \(t_i\) can be efficiently obtained by repeating cloning and line additions in the manner of cumulative sums. By enumerating \(s\) and \(t\), “the \(y\) coordinate at \(x=x'\) of the upper envelope of the \((N-1)\) lines except for line \(L_i\)” can be immediately found by querying \(s_{i-1}\) and \(t_{i+1}\). Thus, by the convexity of the upper envelope, the minimum value on the upper envelope can be found by ternary searching. Hence, \(v_1, v_2, \dots, v_N\) can be found in about \(\mathrm{O}(N S D)\) time, where \(S\) is the number of steps in a ternary search, and \(D\) is the maximum depth.

There are several possible other ways to enumerate \(V_{i,j}\). We omit the details here, but they include:

  • using a Li Chao Tree that can manage the second max
  • using a persistent version of a data structure called Line Container
  • enumerating the basis using Seidel’s LP algorithm (the author does not the details though…).

Now that we have obtained \(V_{i,j}\), all that left is to solve a zero-sum game represented by a \(3 \times N\) matrix.
This part also has various solutions. Unlike the subproblem, this time we may spend \(\mathrm{O}(N)\) or \(\mathrm{O}(N \log N)\) time, so for example we may:

  • use some algorithm to solve LP, or
  • binary search by the solution and boil down to halfplane intersection.

Here, we will introduce an easy solution that utilizes the property of the problem.
First, notice that \((V_{i,1}, V_{i,2}, V_{i,3})\) take only \(\mathrm{O}(1)\) distinct values, due to the discussion in “solution of the subproblem 1.” (This is because there are only a constant number of \(i\) such that \(v_i \neq y_0\) for each \(j\).) Removing duplicate columns do not change the solution of a zero-sum game, so we remove them. Now the problem is reduced to a zero-sum with \(3\) rows and \(\mathrm{O}(1)\) columns.
This is equivalent to an LP with \(3\) variables and \(\mathrm{O}(1)\) inequalities. By the convexity of the LP, we can prove that doubly-nested ternary search yields the optimal value.

(Proof) Consider an LP that maximizes \(\mathrm{LP}(x)\) on variables \(x = (\lambda, v_1, \dots, v_n) \in R^{n+1}\). Let \(f(\lambda)\) be the optimal value of \(\mathrm{LP}(x)\) when the value of \(\lambda\) is fixed. It suffices to show the convexity of \(f(\lambda)\).
Let \(x_1,x_2\) be the optimal solution of the LP for \(\lambda=\lambda_1, \lambda_2\). Also, let \(0 \leq \alpha \leq 1\).
Let \(x_3 = \alpha x_1 + (1-\alpha) x_2\). Since the constraints are all linear, one can assert that \(x_3\) is a feasible solution. Also, in that case \(\lambda\) takes \(\alpha \lambda_1 + (1-\alpha) \lambda_2\). Since

\[\mathrm{LP}(x_3) = \alpha \mathrm{LP}(x_1) + (1-\alpha) \mathrm{LP}(x_2),\]

we have

\[ \begin{aligned} f(\alpha \lambda_1 + (1-\alpha) \lambda_2) &\geq \mathrm{LP}(x_3)\\ &= \alpha \mathrm{LP}(x_1) + (1-\alpha) \mathrm{LP}(x_2) \\ &= \alpha f(\lambda_1) + (1-\alpha) f(\lambda_2), \end{aligned} \]

which implies that \(f(\lambda)\) is concave, concluding the proof. (End of proof)

The problem can be solved by appropriately implementing the algorithms above. The complexity depends on the solution, but the writer’s solution runs in \(\mathrm{O}(N \log N + S^2)\) time, where \(S\) is the number of steps in a ternary search. (Considering the solutions with relatively heavy complexity such as rational-number-based geometry or persistent dynamic Li Chao Tree, the constraints are set fairly loosely.)

posted:
last update: