18 Sep 2025 - OpenAI’s ensemble reasoning models, including GPT‑5, solved all 12 ICPC World Finals problems under contest rules, beating top human teams; Google’s Gemini 2.5 solved 10/12.
OpenAI’s reasoning models scored a perfect 12/12 at the ICPC World Finals, outperforming every human team under the same five‑hour contest rules. According to the writeup, OpenAI ran an ensemble of general‑purpose models — including GPT‑5 — with no contest‑specific training; 11 of the 12 problems were solved on the first attempt. The best human team solved 11 problems, underscoring the gap between top collegiate coders and current frontier AI systems.
Google’s Gemini 2.5 (Deep Think) also excelled, reportedly solving 10/12 problems, cracking an especially hard problem (Problem C) that stumped all human teams, and solving eight problems within 45 minutes. Organizers and researchers note the entries used ensemble/scaffolding approaches rather than a single model. Mostafa Rohaninejad said the team “competed with an ensemble of general‑purpose reasoning models; we did not train any model specifically for the ICPC,” and Borys Minaiev — an ICPC champion turned OpenAI researcher — highlighted how rapidly capability has advanced.
The result is being framed as the first clear, measurable instance of AI outperforming the world’s best programmers in a standard competitive setting, and it raises questions about next steps: longer time horizons and open‑ended scientific work.