BoardGym
Train your judgment. The way directors do.
A flight simulator for fiduciary duty. Step into real board scenarios, write your own reasoning, and get scored on the quality of your judgment.
AIGym is the website for people who refuse to let AI think for them. A family of training rooms — board, interview, negotiation, crisis, diplomacy — where you write the reasoning and K2 Think V2 grades the quality of your judgment.
The manifesto
AI doesn't hollow out your thinking. Lazy use of AI does.
The popular complaint is that AI is hollowing out our thinking, that handing problems to a model lets the muscle of cognition atrophy. There is real evidence for this. People who outsource every judgment do get worse at making judgments.
But concluding that AI is the problem is a mistake. It's a tool. A pen makes some people lazy and helps others write better. A calculator did not end mathematics. A great library doesn't make you smarter by itself; it makes you smarter only if you read demandingly.
The right use of AI is not to bypass thinking. It is to put your thinking under pressure. To find the gap between what you concluded and what a more rigorous version of you would have concluded — and to close it.
AIGym is a family of training rooms built on that idea. You enter a scenario. You write your reasoning. K2 Think V2 — one of the most demanding reasoning models in production — grades the structure, completeness, and honesty of your thinking, and tells you where it broke. Then you go back in.
This is the website for the people who refuse to let AI think for them. Who use it to think harder.
The gyms
Every gym is built on the same engine: step into a scenario, write your reasoning, get graded on the quality of your thinking. The sport changes — the discipline doesn't.
Train your judgment. The way directors do.
A flight simulator for fiduciary duty. Step into real board scenarios, write your own reasoning, and get scored on the quality of your judgment.
Rehearse before you walk in.
A curated panel interview built from your CV, the job description, the interviewers' LinkedIn profiles, and the company's latest annual reports. Briefed, asked, debriefed, like the real thing.
Train the move, not the script.
High-stakes negotiation scenarios. Hostile, friendly, multi-party. K2 grades the reasoning behind your move — anchor, concession, walkaway — not whether you closed the deal.
Think under the news cycle.
Hour-by-hour incident response simulator. Breach, recall, scandal — write the next move, and the move after that, while the situation evolves around you.
Reason like a foreign ministry.
Statecraft scenarios: sanctions, summits, treaty drafting, alliance management. Graded on how your reasoning weighs realpolitik against principle, not on slogans.
Defend the thesis to the room.
A live investment committee. Present the deal, anchor the recommendation, take the questions. Graded on the reasoning a sharp IC actually probes, not on whether the deal closed.
Where the engagement turns.
Audit judgment under client pressure. Materiality, going concern, related-party calls, control failures — graded the way a senior partner grades a manager.
Reason like a permanent secretary.
Public-policy decisions under real constraints. Regulation, allocation, distributional tradeoffs — graded on ex ante reasoning quality, not on whether the result was popular.
Kill, run, or hold.
Editorial judgment under deadline. Sourcing weight, framing, harm calculus, correction policy — graded on the dimensions taught in serious journalism schools.
Listen before you fix.
Clinical-grade conversation under emotional pressure. Listening, validation, repair — graded against the rubrics therapists are trained on.
Train the rarest reasoning skill.
Calibrated forecasting. Probabilistic questions, Brier-score graded, with reasoning evaluated for base-rate use, reference-class fit, and updating discipline.
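For readers new to the metric, here is a minimal sketch of the Brier score for yes/no questions: the mean squared gap between the probability you stated and what actually happened (1 if it happened, 0 if not). Lower is better, and always answering 50% scores 0.25. This illustrates the standard formula only; it is not AIGym's grading code, and the function name and sample forecasts are hypothetical.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between stated probabilities and binary outcomes.

    forecasts: probabilities in [0, 1], one per question (illustrative input)
    outcomes:  1 if the event happened, 0 if it did not
    """
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("need one outcome per forecast, and at least one forecast")
    for p in forecasts:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"forecast {p} is not a probability")
    for o in outcomes:
        if o not in (0, 1):
            raise ValueError(f"outcome {o} must be 0 or 1")
    squared_errors = [(p - o) ** 2 for p, o in zip(forecasts, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical forecasts: 90% (happened), 20% (did not), 70% (did not).
score = brier_score([0.9, 0.2, 0.7], [1, 0, 0])
print(round(score, 2))
```

The overconfident 70% call that missed contributes most of the penalty, which is the behaviour that rewards calibration over boldness.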
The lineup will keep growing — each new gym is a discipline where AI evaluating reasoning beats AI providing answers. Honourable mentions held for a later cohort: EthicsGym (applied moral reasoning), CounselGym (strategic advice), AdvocateGym (legal argument).
Methodology
Every gym in AIGym shares one design principle. The user does not pick an answer from a list and get a tick. The user writes their reasoning, and K2 Think V2 grades the structure, completeness, and honesty of the thinking. You can land on a defensible answer and still score low if the reasoning behind it is thin.
Genuinely difficult situations — not toy problems with a hidden 'right answer'. The kind professionals actually face.
Choose a move, but defend it. The reasoning is what gets evaluated. Multiple-choice is the lazy version of thinking.
K2 grades the quality of your judgment across five discipline-specific dimensions, and shows you exactly where it broke down.
Each gym tracks where your reasoning is sharpening — and where the same gap keeps appearing. That's the point.
Pick a gym. Step into a scenario. Write the reasoning. K2 will be the most demanding grader you've had.