公式

E - Roulettes 解説 by en_translator


First, we will show that there exists a strategy of Takahashi’s in which “if his current total point is \(i\), then he plays the \(A_i\)-th wheel” for some sequence \((A _ i) _ {i\in\mathbb N}\).

Proof In an optimal strategy, let \(e _ i\) be the expected amount of money that he pays until he reaches \(M\) points if he initially has \(i\) points. (For \(i\geq M\), let \(e _ i=0\).)
Now, suppose that he may choose \(x _ 1\)-th, \(x _ 2\)-th, \(\ldots\), or \(x _ k\)-th wheel if he initially has \(i\) points. Then, the expected amount of money until he reaches \(M\) points if he chooses the \(x _ i\)-th wheel is \[E _ {x _ i}=C _ {x _ i}+\dfrac 1{P _ {x _ i}}\sum _ {j=1} ^ {P _ {x _ i}}e _ {x _ i + S _ {x _ i,j}}.\] Without loss of generality, we can assume that \(x\) is sorted in ascending order of \(E _ x\). Then he can change his mind that he always chooses \(x _ 1\) when he initially has \(i\) points without increasing the expected value. Hence, the proposition is shown.

Hereinafter, we will consider the optimal strategy that satisfies the condition above.

Consider computing \(e _ i\coloneqq\) the expected amount of money to pay to reach \(M\) points if he initially has \(i\) points (where \(e _ i=0\) for \(i\geq M\)).

If \(Z _ i\) out of \(P _ i\) integers written on the \(i\)-th wheel are \(0\), we can remove the integers \(0\) and replace the cost with \(C _ i\dfrac{P _ i}{P _ i-Z _ i}\) yen.

Proof When he wants to play the \(i\)-th wheel in his strategy, he repeatedly plays the same \(i\)-th wheel until he gets a non-zero integer. The expected number of times that he plays this wheel without increasing his points is \(\dfrac{P _ i}{P _ i-Z _ i}\), after which he gains points equal to one of the non-zero integers written on the wheel uniformly at random, so the expected cost and the distribution of the points that can be obtained do not change after modifying the wheel as described above.

Therefore, after a proper preprocess we can assume that \(S _ {i,j}\) are all positive. Then, we can find \(e _ i\) for \(i=M-1,M-2,\ldots,1,0\) in this order. Specifically, we can find \(e _ i\) as follows: \[e _ i=\min _ {j=1,2,\ldots,N}C _ j+\dfrac 1{P _ j}\sum _ {k=1} ^ {P _ j}e _ {i+S _ {j,k}}.\] Since we can compute this in an \(O(\sum _ jP _ j)\) time for an \(i\), the answer \(e _ 0\) can be found in a total of \(O(M\sum _ jP _ j)\) time.

The following is a sample code.

Regarding errors

We first evaluate the errors when \(C _ i\) have no errors. Let \(\varepsilon _ i\) be the maximum relative error in the result of \(e _ j\ (i\leq j\lt M)\). (For convenience, let \(\varepsilon _ M=0\).)

expected is the sum of at most \(P\) \(e _ j\)s. The absolute error in this value is at most \(E(\varepsilon _ {i+1}+2 ^ {-52}(P-1))\), where \(E\) is the true value. (If one use double to do arithmetic operations resulting in \(k\ (k\geq1)\), there is an error of at most \(2 ^ {-52}\).) As we are dividing this by \(P\) and add it to \(C\), the result of \(e _ i\) has a relative error of at most \(\varepsilon _ {i+1}(1+2 ^ {-52})+2 ^ {-52}(P+2)\). Obviously, \(\varepsilon _ {i+1}\leq\varepsilon _ {i+1}(1+2 ^ {-52})+2 ^ {-52}(P+2)\), so \(\varepsilon _ i\leq\varepsilon _ {i+1}(1+2 ^ {-52})+2 ^ {-52}(P+2)\).

Therefore, \(\varepsilon _ 0\leq(1+2 ^ {-52})^M(P+2)\leq2.25\times10^{-12}\).

To take into account \(0\) from the wheels, we modify \(C _ i\), that causes an error of \(2 ^ {-52}\), so the error between e[0] in the sample code and the true value is at most \(2.5\times10^{-12}\).

Since there is an error of at most \(5\times10^{-6}\) in the output, this program is guaranteed to be accepted.

#include <iostream>
#include <vector>
#include <tuple>
#include <ranges>

int main() {
    using namespace std;

    unsigned N, M;
    cin >> N >> M;

    vector<tuple<double, unsigned, vector<unsigned>>> roulette(N);
    for (auto&&[C, P, S] : roulette) {
        cin >> C >> P;
        S = vector<unsigned>(P);
        for (auto &&s : S)cin >> s;
    }

    // Tweak to avoid 0 points
    for (auto&&[C, P, S] : roulette) {
        C *= P;
        erase(S, 0);
        P = size(S);
        C /= P;
    }

    vector<double> e(M, 10000 * M); // e[i] := expected amount of money starting from i points
    for (const auto i : ranges::views::iota(0U, M) | ranges::views::reverse) // compute in descending order of i
        for (const auto&[C, P, S] : roulette) {
            double expected = 0;
            for (const auto s : S | ranges::views::filter([i, M](auto s) { return i + s < M; }))
                expected += e[i + s]; // expected amount of money after playing the wheel

            e[i] = min(e[i], C + expected / P); // the minimum C + 1 / P ∑ e[i + s] is the sought e[i]
        }

    cout << e[0] << endl; // e[0] is the answer
    return 0;
}

投稿日時:
最終更新: