PokéAgent Challenge

The PokéAgent Challenge is an AI benchmark that leverages the complexity of Pokémon to evaluate long-horizon planning and game-theoretic reasoning. The challenge is organized into two tracks: competitive battling and RPG speedrunning. In the Battling Track, agents compete head-to-head in competitive Pokémon matches, while in the Speedrunning Track, agents attempt to complete the full Pokémon RPG as quickly as possible. The PokéAgent Challenge began as a NeurIPS 2025 competition and is now an ongoing benchmark with a live leaderboard.

Sponsored by

DeepMind
Artificial Intelligence Journal

PokéAgent Challenge Team

Seth Karten, Princeton
Jake Grigsby, UT Austin
Stephanie Milani, NYU / Johns Hopkins
Kiran Vodrahalli, Google DeepMind
Amy Zhang, UT Austin
Fei Fang, CMU
Yuke Zhu, UT Austin
Chi Jin, Princeton

@inproceedings{karten2025pokeagent,
  title        = {The PokeAgent Challenge: Competitive and Long-Context Learning at Scale},
  author       = {Karten, Seth and Grigsby, Jake and Milani, Stephanie and Vodrahalli, Kiran
                  and Zhang, Amy and Fang, Fei and Zhu, Yuke and Jin, Chi},
  booktitle    = {NeurIPS Competition Track},
  year         = {2025},
  month        = apr,
}