This is an informal introduction to the Battling Track aimed at readers who are familiar with machine learning but new to Pokémon Showdown. We'll introduce key terminology and offer some (slightly opinionated) perspectives on how Pokémon concepts shape the challenges and opportunities for AI research.
Competitive Pokémon turns the Pokémon franchise's turn-based combat mechanic into a standalone two-player strategy game. Players design teams of Pokémon and battle against an opponent. On each turn, each player chooses either to use one of the moves of their Pokémon already on the field or to switch to another member of their team. Moves can deal damage to the opposing Pokémon, eventually causing it to faint. The battle continues until only one player has Pokémon left standing, and that player wins.
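For ML readers, it can help to see the per-turn decision as a small discrete action set. The sketch below is a deliberately simplified illustration, not how Pokémon Showdown represents battles: the `Pokemon` class and the `legal_actions` and `has_lost` helpers are hypothetical names, and real battles add complications (trapping effects, moves that run out of uses, and so on) that are ignored here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Pokemon:
    # Hypothetical, minimal stand-in for one team member.
    name: str
    moves: List[str]
    fainted: bool = False


def legal_actions(team: List[Pokemon], active_index: int) -> List[Tuple[str, str]]:
    """List the (kind, target) choices for one player on one turn.

    Simplifying assumption: every move of the active Pokémon is usable and
    every healthy benched Pokémon can be switched in.
    """
    active = team[active_index]
    actions: List[Tuple[str, str]] = []
    if not active.fainted:
        # Option 1: use a move from the Pokémon already on the field.
        actions += [("move", move) for move in active.moves]
    # Option 2: switch to another team member that has not fainted.
    actions += [
        ("switch", member.name)
        for i, member in enumerate(team)
        if i != active_index and not member.fainted
    ]
    return actions


def has_lost(team: List[Pokemon]) -> bool:
    # A player loses once every Pokémon on their team has fainted.
    return all(member.fainted for member in team)


if __name__ == "__main__":
    team = [
        Pokemon("Tauros", ["Body Slam", "Earthquake", "Hyper Beam", "Blizzard"]),
        Pokemon("Snorlax", ["Body Slam"], fainted=True),
        Pokemon("Starmie", ["Psychic", "Recover"]),
    ]
    for action in legal_actions(team, active_index=0):
        print(action)
```

With a full team of six and four moves each, this gives at most nine choices per turn (four moves plus five switches), so the per-turn branching factor is small. The difficulty comes from everything else discussed on this page.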
As an AI benchmark, Pokémon is most defined by:
The best way to get a feel for the problem is to play a battle yourself! It takes under a minute to get into a match against bots on the PokéAgent Ladder. It's a fun way to play low-stakes battles against opponents who won't keep you waiting or talk trash when you lose :)
Alakazam
EVs: 252 HP / 252 Def / 252 SpA / 252 SpD / 252 Spe
IVs: 2 Atk
- Thunder Wave
- Seismic Toss
- Psychic
- Recover

Chansey
EVs: 252 HP / 252 Def / 252 SpA / 252 SpD / 252 Spe
IVs: 2 Atk
- Thunder Wave
- Ice Beam
- Thunderbolt
- Soft-Boiled

Gengar
- Hypnosis
- Thunderbolt
- Seismic Toss
- Explosion

Snorlax
- Body Slam
- Earthquake
- Hyper Beam
- Self-Destruct

Tauros
- Body Slam
- Earthquake
- Hyper Beam
- Blizzard

Starmie
EVs: 252 HP / 252 Def / 252 SpA / 252 SpD / 252 Spe
IVs: 2 Atk
- Thunder Wave
- Blizzard
- Psychic
- Recover
How to start a battle on the PokéAgent ladder
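If you want to load a team like the sample above into your own code, here is a minimal sketch of a parser for the text format. It only handles the line types that appear in that sample (a species name, optional `EVs:` and `IVs:` lines, and `- Move` lines). Real Showdown exports can include other lines, such as items, abilities, or nicknames, which this sketch does not handle. The `parse_team` function and its output layout are assumptions made for illustration, not part of any starter kit.

```python
from typing import Dict, List


def parse_team(text: str) -> List[Dict]:
    """Parse a simplified Showdown-style team listing into a list of dicts."""
    team: List[Dict] = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # blank lines only separate Pokémon
        if line.startswith("- "):
            if not team:
                raise ValueError("move listed before any Pokémon name")
            team[-1]["moves"].append(line[2:].strip())
        elif line.startswith(("EVs:", "IVs:")):
            if not team:
                raise ValueError("stat line listed before any Pokémon name")
            key, _, spec = line.partition(":")
            spread = {}
            for part in spec.split("/"):
                value, stat = part.split()  # e.g. "252 HP" -> ("252", "HP")
                spread[stat] = int(value)
            team[-1][key.lower()] = spread
        else:
            # Any other line is treated as the start of a new Pokémon.
            team.append({"species": line, "evs": {}, "ivs": {}, "moves": []})
    return team


SAMPLE = """
Alakazam
EVs: 252 HP / 252 Def / 252 SpA / 252 SpD / 252 Spe
IVs: 2 Atk
- Thunder Wave
- Seismic Toss
- Psychic
- Recover

Gengar
- Hypnosis
- Thunderbolt
- Seismic Toss
- Explosion
"""

if __name__ == "__main__":
    for member in parse_team(SAMPLE):
        print(member["species"], member["moves"])
```

Because a new Pokémon starts at any line that is neither a move nor a stat line, the parser works whether or not the sets are separated by blank lines.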
Competitive Pokémon might be the most vocabulary-intensive game ever made. There are a lot of Named Things™ to know about (there are more than 1,000 Pokémon, just for starters). The terminology can be a bit overwhelming, but the starter resources use few (if any) Pokémon-specific heuristics and are aimed at an ML audience. Still, you'll need a few vocabulary terms to follow their instructions and the conversations on Discord:
Gen1OU and Gen9OU cover both ends of a few important trends:
Trend: Every generation adds Pokémon, moves, and other team design choices. The number of available team compositions dramatically increases over the generations.
AI Takeaways: Agents must generalize over more diverse team choices in later generations. Expect later generations to require more data and stronger representations to reach the same performance.
Trend: The latest generation (currently: Gen 9) is by far the most popular. There are more Gen 9 battles played per day than every other generation combined.
AI Takeaways: Available replay data conveniently increases alongside the previously mentioned demand for more data.
Trend: Offensive power has increased over time. The average length of a battle drops sharply over the early generations, then mostly levels out.
AI Takeaways: Planning horizons decrease over generations, but it becomes harder to recover from mistakes (or bad luck). Search may be more useful in later gens.
Trend: Before Gen 5, you begin with zero information about your opponent's team. From Gen 5 onward, Showdown reveals the opponent's Pokémon before the battle begins ("Team Preview").
AI Takeaways: Gens 1–4 emphasize opponent team prediction. Team Preview weakens the otherwise obvious trend that more team combinations lead to more imperfect information.
Trend: Turning Pokémon into a balanced competitive strategy game is hard and requires frequent rule changes, especially in the first few months after a new generation is released.
AI Takeaways: Manual rule changes and evolving strategies create non-stationary datasets. If you are imitating a replay from 2015, you are imitating the decisions of a player who thought they were up against a different set of teams and strategies than you'd see on the ladder today.
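One simple way to account for the non-stationarity described in the last takeaway is to down-weight older replays when sampling training data. The sketch below uses exponential decay with a configurable half-life. This is just one possible heuristic, not something the challenge organizers prescribe. The `recency_weight` and `sampling_weights` functions, the one-year default half-life, and the example dates are all assumptions made for illustration.

```python
from datetime import date
from typing import List


def recency_weight(replay_date: date, reference_date: date,
                   half_life_days: float = 365.0) -> float:
    """Weight that halves for every `half_life_days` of replay age."""
    # Replays dated after the reference date are treated as age zero.
    age_days = max((reference_date - replay_date).days, 0)
    return 0.5 ** (age_days / half_life_days)


def sampling_weights(replay_dates: List[date], reference_date: date,
                     half_life_days: float = 365.0) -> List[float]:
    """Normalize recency weights into sampling probabilities."""
    raw = [recency_weight(d, reference_date, half_life_days) for d in replay_dates]
    total = sum(raw)
    return [w / total for w in raw]


if __name__ == "__main__":
    today = date(2025, 1, 1)  # hypothetical reference date
    dates = [date(2015, 6, 1), date(2022, 1, 1), date(2024, 12, 1)]
    for d, p in zip(dates, sampling_weights(dates, today)):
        print(d.isoformat(), round(p, 4))
```

A shorter half-life tracks the current ladder more closely but discards more data, which matters most in the older generations where replays are scarcer. Filtering by date around known rule changes is an alternative worth comparing against.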