Overall Editorial
by
arr28
Rubbish sorting
This problem broke down into three separable parts:
- Placing the conveyor belts
- Choosing the sorter types
- Choosing the processor types
Placing the conveyor belts
I found this to be the trickiest part. My final solution was as follows.
- Compute the Delaunay triangulation of all locations (inlet, sorters & processors). Restricting the solution to edges from the triangulation guaranteed that no conveyor belts would cross. (A sketch of this step and the next appears after this list.)
- Orient the edges: all edges incident on a processor were oriented towards the processor. (Edges between two processors would never be reached, so no tie-breaking was necessary.) All edges between two sorters were oriented away from the inlet at (0, 5000). These rules ensured that any subsequently calculated graph would be cycle-free, i.e. a DAG.
- Choose a layout by random walk from the inlet, using directed edges from the DAG only. Repeat up to 100 times or until one of the random walks visits all the processors (typically within single-digit iterations).
- Refine the output by removing redundant edges (e.g. simplifying complex sub-graphs where all output went to a single processor, so that there was no longer any branching).
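To make the first two steps concrete, here is a minimal sketch of the edge-extraction and orientation logic. It assumes locations are (x, y) coordinates and that each node's kind ("inlet", "sorter" or "processor") is known; names such as node_kind and delaunay_edges are illustrative, not taken from my actual code.

```python
import numpy as np
from scipy.spatial import Delaunay

INLET = np.array([0.0, 5000.0])  # inlet position, as stated above


def delaunay_edges(points):
    """Return the undirected edge set of the Delaunay triangulation."""
    tri = Delaunay(points)
    edges = set()
    for a, b, c in tri.simplices:
        for u, v in ((a, b), (b, c), (a, c)):
            edges.add((min(u, v), max(u, v)))
    return edges


def orient_edges(points, edges, node_kind):
    """Orient edges so that the resulting graph is a DAG:
    edges touching a processor point towards it; all other edges
    point away from the inlet (in order of increasing distance)."""
    pts = np.asarray(points, dtype=float)
    dist = np.linalg.norm(pts - INLET, axis=1)
    directed = []
    for u, v in edges:
        if node_kind[v] == "processor":
            directed.append((u, v))
        elif node_kind[u] == "processor":
            directed.append((v, u))
        else:
            # both endpoints are sorters (or the inlet): orient away from the inlet
            directed.append((u, v) if dist[u] <= dist[v] else (v, u))
    return directed
```

The random-walk and refinement steps then only ever follow these directed edges, so the resulting layout stays acyclic and crossing-free.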
Choosing the sorter types
I ended up solving this with hill climbing (sketched after the list below). (I also implemented simulated annealing, but it performed slightly worse, even after trying multiple sets of hyperparameters.)
The hill-climbing neighbourhood was…
- Change the sorter type of a single sorter.
- Swap the outputs of a single sorter.
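A minimal sketch of the hill-climbing loop, assuming a problem-specific score(assignment) function and a list of legal sorter types; both are hypothetical placeholders for the real evaluation code.

```python
import random
import time


def hill_climb(assignment, score, sorter_types, time_budget=1.5):
    """assignment maps sorter_id -> (sorter_type, outputs_swapped).
    Repeatedly tries one neighbourhood move and keeps it only if the score improves."""
    best = dict(assignment)          # work on a copy
    best_score = score(best)
    deadline = time.time() + time_budget
    while time.time() < deadline:
        sorter = random.choice(list(best))
        kind, swapped = best[sorter]
        if random.random() < 0.5:
            candidate = (random.choice(sorter_types), swapped)  # change the sorter type
        else:
            candidate = (kind, not swapped)                     # swap this sorter's outputs
        previous = best[sorter]
        best[sorter] = candidate
        new_score = score(best)
        if new_score > best_score:
            best_score = new_score       # keep the improving move
        else:
            best[sorter] = previous      # revert anything that doesn't help
    return best, best_score
```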
Choosing the processor types
This was the most straightforward section. Having chosen the DAG and the sorters, the amount of rubbish of each type arriving at each processor location is known (as an NxN matrix). Because the maximum value of N was small, the optimal assignment of processors to processor locations could be computed quickly by calling SciPy's linear_sum_assignment.
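For illustration, a sketch of this step, assuming arrivals[i][j] holds the amount of rubbish of type j arriving at processor location i (so placing processor type j at location i processes exactly that amount). The helper name is hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def assign_processors(arrivals):
    """Pick one processor type per location, maximising the total rubbish processed."""
    gain = np.asarray(arrivals, dtype=float)              # N x N: location x processor type
    locations, types = linear_sum_assignment(gain, maximize=True)
    return dict(zip(locations, types)), gain[locations, types].sum()
```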
Scoring
The hill-climbing step above required calculating the score for as many sorter-type & processor-type assignments as possible. I increased this from single digits in 2 seconds with a naive Python implementation to ~10^5 by a mixture of caching intermediate values and using Numba's JIT compilation.
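As an illustration of the kind of inner loop that benefits from this, here is a sketch of a Numba-jitted flow-propagation routine; the array layout (order, targets0/targets1, fraction0) is an assumption, not my actual data structures.

```python
import numpy as np
from numba import njit


@njit(cache=True)
def propagate(order, targets0, targets1, fraction0, flow):
    """Push rubbish counts through the sorters in topological order.
    flow[i, t] is the amount of rubbish type t currently at node i;
    sorter i sends fraction0[i, t] of type t to targets0[i] and the rest to targets1[i]."""
    for i in order:
        for t in range(flow.shape[1]):
            amount = flow[i, t]
            flow[targets0[i], t] += amount * fraction0[i, t]
            flow[targets1[i], t] += amount * (1.0 - fraction0[i, t])
    return flow
```

Because the DAG is fixed during the hill climb, everything that depends only on the layout (the topological order and the edge arrays) can be computed once and cached, leaving only this jitted loop on the hot path.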
Results
The raw score on the provisional test set (with 50 samples) was 27,011,777,237. On the final test set (with 2,000 samples), this left me 175th overall.