I started out with the Intro to TF-Agents lecture, which is where I first saw how RL generally works: there is an environment and an agent, and they interact in a loop. Three things happen on each step: 1. the environment sends an observation to the agent, 2. the agent sends back an action, 3. the environment gives a reward for that action.
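To make the loop concrete, here is a minimal sketch in plain Python. It does not use the TF-Agents API: `CountdownEnv`, `TowardZeroAgent`, and `run_episode` are hypothetical names I made up for illustration, and the reward values are arbitrary. The environment is a toy where the agent tries to bring a counter down to zero.

```python
class CountdownEnv:
    """Toy environment (hypothetical): the state is an integer to drive to zero."""

    def __init__(self, start=3):
        self.start = start
        self.state = start

    def reset(self):
        # Begin a new episode and return the first observation.
        self.state = self.start
        return self.state

    def step(self, action):
        # Apply the action (+1 or -1), then report what happened.
        self.state += action
        done = self.state == 0
        reward = 10 if done else -1  # arbitrary values chosen for this sketch
        return self.state, reward, done


class TowardZeroAgent:
    """Hand-written policy (hypothetical): always move the counter toward zero."""

    def act(self, observation):
        return -1 if observation > 0 else 1


def run_episode(env, agent, max_steps=20):
    observation = env.reset()
    total_reward = 0
    for _ in range(max_steps):
        # 1. The agent has the latest observation from the environment.
        # 2. The agent sends back an action.
        action = agent.act(observation)
        # 3. The environment returns a reward (plus the next observation).
        observation, reward, done = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward


if __name__ == "__main__":
    print(run_episode(CountdownEnv(start=3), TowardZeroAgent()))
```

Starting from 3, the agent takes three steps, collecting -1, -1, and then 10, so the episode total is 8. A learning agent would replace the hand-written `act` rule with a policy that is updated from the rewards it receives.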
Designing an RL Chess Bot
June 17, 2026