The Racetrack Problem

Off-Policy Monte Carlo Control with Importance Sampling
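To make the method named above concrete, here is a minimal sketch of one episode's backward update for off-policy Monte Carlo control with weighted importance sampling. It is an illustration under stated assumptions rather than the implementation used for the results in this document: the function name `off_policy_mc_update`, the episode layout `(state, action, reward, behavior_prob)`, and the greedy target policy are all hypothetical choices made for the example.

```python
from collections import defaultdict


def off_policy_mc_update(episode, Q, C, n_actions, gamma=1.0):
    """Apply one weighted-importance-sampling update pass over an episode.

    episode: list of (state, action, reward, behavior_prob) tuples, where
             behavior_prob is the probability the behavior policy assigned
             to the action it actually took (assumed layout).
    Q, C:    mappings from state to a list of per-action values and
             cumulative importance weights respectively.
    """
    G = 0.0  # return accumulated from the end of the episode
    W = 1.0  # importance-sampling ratio for the tail of the episode
    for state, action, reward, behavior_prob in reversed(episode):
        G = gamma * G + reward
        C[state][action] += W
        Q[state][action] += (W / C[state][action]) * (G - Q[state][action])
        # The target policy is greedy with respect to Q (ties go to the lowest index).
        greedy = max(range(n_actions), key=lambda a: Q[state][a])
        if action != greedy:
            # The target policy would not have taken this action, so the
            # ratio for all earlier steps is zero and the pass can stop.
            break
        W /= behavior_prob  # target probability is 1 for the greedy action
    return Q, C


if __name__ == "__main__":
    n_actions = 2
    Q = defaultdict(lambda: [-10.0] * n_actions)  # pessimistic initial values
    C = defaultdict(lambda: [0.0] * n_actions)
    # Two-step toy episode with a reward of -1 per step, as in a time-penalty task.
    episode = [("s0", 0, -1.0, 0.5), ("s1", 1, -1.0, 0.5)]
    off_policy_mc_update(episode, Q, C, n_actions)
    print(dict(Q), dict(C))
```

Processing the episode backwards lets the return and the importance ratio be built up incrementally, and stopping at the first non-greedy action avoids updating earlier steps with a zero weight.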

Optimal Trajectory

Optimal Value Function