Before we dive in, a quick shout-out to January’s winners: Donatas Gudauskas, for solving the main puzzle, and the mysteriously named qwe, for solving the bonus using dynamic programming. Our solution below is built from first principles, but it illuminates the intuition behind the dynamic programming technique.

Main puzzle

In January's newsletter, we asked:

Feeling lucky? A prize wheel features prizes worth $1, $3, and $10. The probability of spinning a $1 prize is 0.8, a $3 prize is 0.15, and a $10 prize is 0.05. With each spin, you can either take the prize you land on, or give it up and spin again. You get 8 spins total, and can collect only 1 prize. If you play the game optimally, what’s the expected value of your prize?

Let's start by examining two extreme scenarios:

In the first case, the expected prize value is $10$, and in the second it's ${V_1 = (1\times 0.8 + 3\times 0.15 + 10 \times 0.05) = 1.75}$, a pretty big difference. Between these extremes, we have to weigh the chance that we get a better spin in the future against the possibility that we never see a better spin.

Building up the logic

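Let's analyze the case where we have two spins left. For each possible outcome, we compare the prize with $V_1 = 1.75$, the expected value of the single spin that would remain. A $10$ or a $3$ beats $1.75$, so we take it; a $1$ does not, so we spin again.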

So, the expected value of having two spins left is

$$ \begin{align*}V_2 &= 0.05\times 10 + 0.15\times 3 + 0.8 V_1 \\ &= 2.35\end{align*} $$

If we have three spins, it's the same analysis. $10$ and $3$ are greater than $V_2=2.35$, so we should take them if we get them, and re-spin if we get a $1$. This makes the expected value of three spins

$$ \begin{align*}V_3 &= 0.05\times 10 + 0.15\times 3 + 0.8 V_2 \\ &= 2.83\end{align*} $$
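The step-by-step reasoning above can be checked numerically. Below is a minimal Python sketch, assuming the rule used so far: keep a prize only when it beats the expected value of the spins that would remain, otherwise spin again. The names `PRIZES` and `expected_value` are illustrative choices, not part of the original puzzle.

```python
# Prize values and their probabilities, taken from the puzzle statement.
# PRIZES and expected_value are hypothetical names used for illustration.
PRIZES = [(1, 0.8), (3, 0.15), (10, 0.05)]


def expected_value(spins_left):
    """Expected prize with `spins_left` spins remaining.

    A prize is kept only if it is worth more than the expected value
    of continuing with one fewer spin; otherwise we spin again.
    """
    value = 0.0  # with no spins left there is nothing to win
    for _ in range(spins_left):
        # Here `value` holds V_{k-1}; each outcome contributes the
        # better of "take the prize" and "spin again".
        value = sum(p * max(prize, value) for prize, p in PRIZES)
    return value


for n in (1, 2, 3):
    print(f"V_{n} = {expected_value(n):.2f}")
```

For one, two, and three spins this reproduces the values $1.75$, $2.35$, and $2.83$ derived above. Taking the `max` inside the sum is what encodes the take-or-respin decision for each outcome.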