公式

C - 荷物運びの報酬 / Reward for Carrying Luggage 解説 by admin

gpt-5.3-codex

Overview

This problem asks you to find the maximum total reward while satisfying the conditions (number of items \(\le K\), ending with \(0\) items) using dynamic programming (DP) where the state consists of “which room we have considered so far” and “the current number of items held.”

Analysis

The key insight is that the only information affecting future choices is the current number of items held.
When deciding “accept / decline” at each room, we don’t need the details of which jobs were chosen in the past — knowing only “how many items we are currently carrying” is sufficient to determine the next transition.

Why brute force is infeasible

Since there are 2 choices at each room, a naive approach considers \(2^N\) possibilities.
With \(N\) up to \(10^5\), this is far too slow.

DP formulation

Define dp[c] = the maximum reward obtainable up to this point with c items currently held.
Then, for each job, we can transition as follows:

  • Decline: The number of items stays the same (reward does not increase).
  • Accept a loading job \((T_i=1)\): \(c \to c+1\) (only if \(c+1 \le K\)), reward \(+A_i\).
  • Accept an unloading job \((T_i=0)\):
    • If \(c=0\): \(0 \to 0\) (items do not decrease), reward \(+A_i\).
    • If \(c\ge1\): \(c \to c-1\), reward \(+A_i\).

By performing this update sequentially from room 1 onward, dp[0] at the end gives the answer (due to the condition that the number of items must be \(0\) at the end).

Algorithm

  1. Prepare an array dp of length \(K+1\) with initial values:
    • dp[0] = 0 (initially \(0\) items, \(0\) reward)
    • All other entries set to a very small value NEG representing unreachable states.
  2. Process each room’s job \((t, a)\) in order.
  3. Set ndp = dp[:] to first represent the “decline” transition.
  4. Add the transitions for accepting the job:
    • t == 1 (loading):
      • For each c = 0..K-1, update ndp[c+1] from dp[c] with dp[c] + a.
    • t == 0 (unloading):
      • From c = 0 to 0 with +a.
      • For each c = 1..K, update ndp[c-1] from dp[c] with dp[c] + a.
  5. Replace dp = ndp.
  6. After processing all rooms, output dp[0].

This implementation is efficient because we only need to consider the states for item counts \(0..K\) at each room.

Complexity

  • Time complexity: \(O(NK)\)
  • Space complexity: \(O(K)\)

(This is practical because of the constraint \(NK \le 10^7\).)

Implementation Notes

  • Manage unreachable states with NEG, and ignore transitions from such states.

  • To avoid the effect of multiple updates within the same room, create ndp as a copy of dp (next-step array).

  • Don’t forget the special transition for unloading jobs when c = 0 (items don’t decrease, but reward is still earned).

  • Only output dp[0] at the end (due to the “ending with \(0\) items” condition).

    Source Code

import sys

def main():
    input = sys.stdin.readline
    N, K = map(int, input().split())
    jobs = [tuple(map(int, input().split())) for _ in range(N)]

    NEG = -10**30
    dp = [NEG] * (K + 1)
    dp[0] = 0

    for t, a in jobs:
        ndp = dp[:]  # skip this job
        if t == 1:
            # take load job: c -> c+1 (if <= K)
            for c in range(K):
                v = dp[c]
                if v != NEG:
                    nv = v + a
                    if nv > ndp[c + 1]:
                        ndp[c + 1] = nv
        else:
            # take unload job:
            # c=0 -> 0 (reward gained)
            if dp[0] != NEG:
                v0 = dp[0] + a
                if v0 > ndp[0]:
                    ndp[0] = v0
            # c>=1 -> c-1 (reward gained)
            for c in range(1, K + 1):
                v = dp[c]
                if v != NEG:
                    nv = v + a
                    if nv > ndp[c - 1]:
                        ndp[c - 1] = nv
        dp = ndp

    print(dp[0])

if __name__ == "__main__":
    main()

This editorial was generated by gpt-5.3-codex.

投稿日時:
最終更新: