Practice Notebooks
Work through each notebook sequentially. Complete the exercises to unlock the next one.
RL Foundations: The Four Elements
Compare RL to supervised and unsupervised learning, understand the four elements of RL (policy, reward, value function, model), and build a Tic-Tac-Toe agent that learns through self-play.
45 min · 2 exercises · Narrated
MDPs, Rewards, and the Markov Property
Implement a complete MDP from scratch (the recycling robot), compute returns with and without discounting, verify the Markov property computationally, and find optimal policies through exhaustive search.
50 min · 2 exercises · Narrated
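As a rough preview of the "returns with and without discounting" step, the sketch below computes the return of a finite reward sequence using the recursion G_t = r_{t+1} + γ·G_{t+1}. The function name `discounted_return` and the reward values are illustrative assumptions, not taken from the notebook.

```python
def discounted_return(rewards, gamma=1.0):
    """Return sum_k gamma**k * rewards[k] for a finite reward sequence.

    `discounted_return` is a hypothetical helper name used for illustration.
    With gamma=1.0 this is the plain (undiscounted) sum of rewards.
    """
    g = 0.0
    # Walk backwards so each step applies G_t = r_{t+1} + gamma * G_{t+1}.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


rewards = [1.0, 1.0, 1.0]  # made-up rewards for the example

undiscounted = discounted_return(rewards)            # 1 + 1 + 1 = 3.0
discounted = discounted_return(rewards, gamma=0.5)   # 1 + 0.5 + 0.25 = 1.75
```

With γ < 1, later rewards count for less, which is why the discounted value is smaller than the plain sum.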
Your First RL Agent with Gymnasium
Use Gymnasium (the maintained successor to OpenAI Gym) to interact with the CartPole environment, compare random agents to heuristic agents, and implement a Q-learning agent from scratch that learns to balance the pole.
55 min · 2 exercises · Narrated
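For orientation, here is a minimal sketch of the tabular Q-learning update such an agent relies on. It is not the notebook's implementation: the helper name `q_learning_update`, the hyperparameter values, and the tuple-valued states are assumptions. CartPole observations are continuous, so a tabular agent would first have to discretize them into bins; the tuples below stand in for such bin indices.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, done,
                      n_actions, alpha=0.1, gamma=0.99):
    """Apply one Q-learning step and return the updated Q(state, action).

    Hypothetical helper for illustration. `q` maps (state, action) pairs
    to value estimates; unseen pairs default to 0.0.
    """
    # Terminal transitions have no future value to bootstrap from.
    best_next = 0.0 if done else max(q[(next_state, a)] for a in range(n_actions))
    td_target = reward + gamma * best_next
    # Move the estimate a fraction alpha toward the TD target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


q = defaultdict(float)

# One made-up transition: states are tuples of (assumed) discretized bin indices.
new_value = q_learning_update(
    q, state=(0, 1), action=1, reward=1.0,
    next_state=(0, 2), done=False, n_actions=2,
)
# Starting from an all-zero table: 0 + 0.1 * (1.0 + 0.99 * 0 - 0) = 0.1
```

In a full agent this update would run once per environment step, with actions chosen by an exploration rule such as epsilon-greedy.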