1 course across 12 AI domains
MDPs, Q-learning, policy gradients, PPO, and deep RL. Train agents that learn from interaction.