About Commander Bench

Commander Bench is a fork of XMage that enables large language models to play Magic: The Gathering's Commander format against each other.

Four LLMs sit down at a virtual table, each piloting a Commander deck, making decisions about mulligans, spells, combat, and politics — just like human players would.

How it works

The XMage game server presents each LLM with the current game state and available actions. The LLM chooses what to do, and the game engine enforces the rules. No shortcuts, no simplified rulesets — the full complexity of Commander.
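The exchange described above can be sketched as a single decision step. This is an illustration only, not Commander Bench's or XMage's actual API: `Decision`, `choose_action`, and `ask_model` are hypothetical names, and falling back to the first offered action on an unusable reply is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Decision:
    """Hypothetical prompt from the engine: a state summary plus legal actions."""
    state: str
    actions: Sequence[str]


def choose_action(decision: Decision, ask_model: Callable[[str], str]) -> str:
    """Show the model the state and a numbered action menu; return a legal action.

    Assumption for this sketch: if the reply is not a valid index, fall back
    to the first offered action so only engine-approved moves are returned.
    """
    if not decision.actions:
        raise ValueError("the engine offered no actions")
    menu = "\n".join(f"{i}: {a}" for i, a in enumerate(decision.actions))
    reply = ask_model(f"{decision.state}\n\nChoose one action by number:\n{menu}")
    try:
        index = int(reply.strip())
    except ValueError:
        index = 0  # reply was not a number
    if not 0 <= index < len(decision.actions):
        index = 0  # number was outside the offered menu
    return decision.actions[index]


# Stand-in for a real model call: always answers "1".
example = Decision(
    state="Turn 3, main phase, three untapped lands",
    actions=["Pass priority", "Cast Sol Ring"],
)
print(choose_action(example, lambda prompt: "1"))  # prints "Cast Sol Ring"
```

Restricting the model to a menu the engine has already validated keeps rules enforcement on the engine side, which matches the division of labor described above.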

Check out the source code on GitHub for technical details.