DeepMind and OpenAI Take Gold at IMO - Sync #529
Plus: whispers of GPT-5; AI Action Plan; new humanoid robots from China; Neuralink seeks $1B in revenue by 2031; ARC-AGI-3; can employers require brain chips; and more!
Hello and welcome to Sync #529!
This week, both DeepMind and OpenAI made history by creating advanced reasoning models that won gold medals at the International Mathematical Olympiad, one of the world’s most prestigious mathematical competitions. We take a closer look at this news in this week’s issue of Sync.
Elsewhere in AI, the long-awaited GPT-5 is rumoured to be released soon, in August. In other news, the White House has revealed the AI Action Plan, which sets out the United States’ strategy to win the global race, while the $500 billion Stargate project is reportedly struggling to get off the ground.
Over in robotics, two Chinese companies have unveiled their latest humanoid robots, researchers have created robots that can “grow” by consuming other robots, and we look at how robots are transforming agriculture.
In addition, this week’s issue of Sync also features an elite programmer who bested AI in a coding competition and an AI coding agent that deleted a production database and lied about it. We also have news of Neuralink’s ambitious plan to reach $1B in revenue by 2031, a report showing that people are starting to sound more like ChatGPT, a lawyer exploring whether employers can require brain chips, and more!
Enjoy!
DeepMind and OpenAI Take Gold at IMO
AI models from DeepMind and OpenAI have achieved gold medals at the International Mathematical Olympiad, marking a major milestone in AI development
Every July, the International Mathematical Olympiad (IMO) gathers the world’s brightest pre-university mathematicians for the ultimate test of their problem-solving skills: six notoriously challenging problems spread over two days. The IMO is renowned for its difficulty, with participants asked to solve problems that stump even professional mathematicians.
Over the years, the IMO has become a training ground for future leaders in mathematics, science, engineering, and—most importantly for our story—AI. Many researchers at DeepMind, OpenAI, and other top AI labs began their careers at the Olympiad, honing precisely the kind of problem-solving skills that AI now seeks to emulate. It is no surprise, then, that creating an AI capable of winning an IMO gold medal has become a milestone on the path to artificial general intelligence (AGI).
Until now, even the best large language models (Gemini 2.5 Pro, o3, o4-mini, Grok 4) could not score above 31% on IMO problems. That’s not even enough for a bronze medal. In 2024, DeepMind’s AlphaProof and AlphaGeometry 2 achieved a silver medal, but only after human experts translated the problems into Lean (a formal proof language) and the systems spent days on computation, which is hardly comparable to the real-time, natural-language exam that humans take.
This year, however, marks a turning point. For the first time, AI models from DeepMind and OpenAI have achieved a gold medal at the IMO, tackling the same problems as human contestants, in English, within the standard 4.5-hour time limit, and without any intervention.
Both models scored 35 out of 42 points—just enough for a gold medal at IMO 2025. Each solved five out of six problems perfectly, faltering only on the hardest, sixth question. That placed them in a 45-way tie for 27th out of 630 contestants, with 26 human competitors finishing higher.
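For readers who want to check the arithmetic, here is a minimal sketch. It assumes the standard IMO marking scheme of 7 points per problem, which is not spelled out above, and the per-problem marks are inferred from "five out of six problems solved perfectly" rather than taken from an official score sheet. The function and constant names are purely illustrative.

```python
# Assumption: each IMO problem is marked out of 7 points (standard IMO scheme).
POINTS_PER_PROBLEM = 7
NUM_PROBLEMS = 6


def total_score(marks):
    """Sum per-problem marks after checking they fit the assumed scheme."""
    if len(marks) != NUM_PROBLEMS:
        raise ValueError("expected one mark per problem")
    if any(m < 0 or m > POINTS_PER_PROBLEM for m in marks):
        raise ValueError("each mark must be between 0 and 7")
    return sum(marks)


def tied_rank(contestants_ahead):
    """Rank shared by everyone in a tie: one more than the number ahead."""
    return contestants_ahead + 1


# Inferred marks: five perfect solutions, nothing on the sixth problem.
ai_marks = [7, 7, 7, 7, 7, 0]
max_score = POINTS_PER_PROBLEM * NUM_PROBLEMS

print(total_score(ai_marks), "out of", max_score)
print("tied rank:", tied_rank(26))
```

Under those assumptions, the total matches the 35 out of 42 reported by both labs, and 26 contestants ahead corresponds to a tie for 27th place.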
It’s a remarkable achievement—one that would have seemed like a faraway future just a few years ago. Still, it’s important to keep the excitement in check and look closely at whatever information we have.
DeepMind deployed an advanced version of Gemini Deep Think, using parallel reasoning strategies to explore multiple solution paths at once. The model was trained on a blend of reinforcement learning, curated mathematical proofs, and some hints and tips on how to approach IMO problems. OpenAI, meanwhile, was more secretive about its approach, only revealing that its gold medal-winning model is a general-purpose reasoning LLM, not a dedicated maths system.
We know little to nothing about the details of those models, and most likely we won’t learn any details for months, if ever. Gone are the times when leading AI labs openly shared results of their research. We also don’t know how much computational power was used, or at what cost. Without this, it’s difficult to say whether the breakthroughs were algorithmic, architectural, or simply a matter of scaling up compute.
The answer is likely a mix of all three. DeepMind’s idea of exploring multiple solution paths at once sounds computationally intensive, and both teams have mentioned they employed new techniques they’ve been researching. Despite the secrecy from both OpenAI and DeepMind, we do have some hints about the nature of DeepMind’s model. Just a day after DeepMind made its announcement, two researchers managed to solve exactly the same five out of six IMO 2025 problems using Gemini 2.5 Pro with a self-verification pipeline and careful prompt design. Their result suggests that current top AI models may already have the capability to solve hard mathematical questions encoded in them, and that the challenge lies in building the infrastructure around them to unlock those capabilities.
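To make the idea of a self-verification pipeline more concrete, here is a rough sketch of a generic generate, verify, and refine loop. This is not the researchers’ actual pipeline, whose details are not covered here. The function `solve_with_self_verification` and the toy `toy_generate` and `toy_verify` helpers are hypothetical stand-ins: in a real system, both roles would be played by calls to a language model, which this sketch deliberately leaves out.

```python
from typing import Callable, Optional


def solve_with_self_verification(
    problem: str,
    generate: Callable[[str, Optional[str]], str],
    verify: Callable[[str, str], Optional[str]],
    max_rounds: int = 5,
) -> Optional[str]:
    """Generic generate -> verify -> refine loop (illustrative only).

    `generate` proposes a candidate, optionally using feedback from the
    previous round. `verify` returns None to accept the candidate, or a
    feedback string describing what to fix.
    """
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        candidate = generate(problem, feedback)
        feedback = verify(problem, candidate)
        if feedback is None:
            return candidate  # the verifier accepted this attempt
    return None  # no verified answer within the round budget


# Toy stand-ins so the sketch runs without any model:
# the "problem" is to find a positive integer whose square is 49.
def toy_generate(problem: str, feedback: Optional[str]) -> str:
    # Start at 1; otherwise follow the verifier's suggestion.
    return feedback if feedback is not None else "1"


def toy_verify(problem: str, candidate: str) -> Optional[str]:
    n = int(candidate)
    return None if n * n == 49 else str(n + 1)


print(solve_with_self_verification("n * n == 49", toy_generate, toy_verify, max_rounds=10))
```

The point of the design is that the model’s first answer is never trusted on its own: an answer is returned only once a separate checking step accepts it, and the loop gives up when the round budget runs out.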
While both OpenAI and DeepMind achieved identical gold medal scores, the quality and style of their solutions were different. DeepMind’s Gemini Deep Think produced concise, well-structured proofs that closely resembled those of top human contestants—clear, elegant, and easy to follow. In contrast, OpenAI’s model, though correct, tended to generate much longer and more rambling answers, sometimes resembling scratch work or stream-of-consciousness reasoning rather than polished mathematical proof.
Neither DeepMind’s nor OpenAI’s gold-medal models are publicly available yet. DeepMind says its model will be released to trusted mathematicians first, with a wider rollout to Google AI Ultra subscribers to follow at some point in the future. OpenAI, meanwhile, stated that they “don't plan to release a model with this level of capability for many months.”
It’s also worth noting that OpenAI faced some criticism for the way it announced and verified its results. Its model’s solutions were graded by a panel of former IMO medallists rather than by the official organisers, and OpenAI published its results before the IMO closing ceremony, reportedly against the wishes of the organisers. In contrast, DeepMind coordinated closely with the IMO from the start, submitting its model’s answers for official grading under the same criteria as students and waiting until after the closing ceremony to share its results. As Terence Tao, widely regarded as one of the greatest living mathematicians, pointed out, it’s not enough to compare final scores. The rules, resources, and even reporting methods can dramatically change the apparent capability of any contestant, whether human or machine.
In just a few years, AI models have leapt from struggling with elementary maths to competing at the highest levels. Achieving IMO gold is remarkable. But the bigger question is whether these systems can lead to genuine mathematical breakthroughs—discoveries that would otherwise have remained out of reach. For now, we’ll have to wait until the models are released to find out.
🦾 More than a human
Neuralink Sees $1 Billion of Revenue by 2031 in Vast Expansion
Neuralink plans to implant its chips in 20,000 people annually by 2031 and to generate at least $1 billion in revenue, according to documents seen by Bloomberg. The company intends to operate five major clinics within six years, offering three devices—Telepathy, which targets communication; Blindsight, which restores vision; and Deep, which treats neurological disorders such as Parkinson’s. Additionally, Neuralink hopes to achieve US regulatory approval for its Telepathy device by 2029 and to expand rapidly thereafter.
Control Shift: New Reality Labs Research on sEMG Published in ‘Nature’
Reality Labs, Meta’s virtual and augmented reality division, has published a paper describing their work on developing a new way to interact with computers using surface electromyography (sEMG) at the wrist. This technology allows users to control devices through subtle hand gestures instead of traditional input methods like keyboards or touchscreens. Their prototype sEMG wristband can recognise gestures and handwriting, and promises to transform how we interact with computers, particularly in AR and VR applications.
🔮 Future visions
▶️ Can Employers Require Brain Chips? (23:54)
In this video, Devin from LegalEagle, YouTube’s chief lawyer, analyses the show Severance, focusing in particular on the idea of employees being required to have brain implants, and explores this from a legal perspective using current US legal frameworks. He looks at FDA regulations that would be needed for such a device to be allowed in the first place, discusses the question of what legally defines a person, and explores other legal questions that having a brain implant, as depicted in the show, would raise.
🧠 Artificial Intelligence
OpenAI GPT-5 is coming early next month
Axios reports that OpenAI is expected to launch its next major AI model, GPT-5, in August. The new model, which is anticipated to offer significant improvements in coding and overall capabilities, is being tested ahead of its official release, with smaller "mini" and "nano" versions also planned for API users.
AI Action Plan
The White House has released the AI Action Plan, which sets out the United States’ strategy to win the global race for artificial intelligence dominance, focusing on accelerating innovation, building national AI infrastructure, and leading international AI diplomacy. The plan outlines sweeping deregulation to boost private sector innovation, prioritises open-source AI, and commits to rapid adoption of AI across government and defence. Major investments are pledged for energy generation, semiconductor manufacturing, and high-security data centres, alongside new workforce training and cybersecurity initiatives. Internationally, the plan seeks to export American AI standards, counter Chinese influence, and tighten export controls on critical technologies, all with a stated aim to secure national security, economic competitiveness, and American values in the AI era.
SoftBank and OpenAI’s $500 Billion AI Project Struggles to Get Off Ground
The $500 billion Stargate project is reportedly stumbling at launch: OpenAI and SoftBank are struggling to agree on key terms and have so far failed to secure any data centre deals. Instead of the initially promised $100 billion investment and massive infrastructure rollout, the project is now aiming to build a single, smaller data centre in Ohio by year’s end. Meanwhile, OpenAI has independently struck major deals with Oracle and CoreWeave, securing almost as much data centre capacity as Stargate originally promised. SoftBank, for its part, remains bullish on further investment despite Stargate’s slow start.
ARC-AGI-3
ARC Prize is launching ARC-AGI-3, a new benchmark aimed at accelerating the development of artificial general intelligence (AGI). ARC-AGI-3 introduces a new benchmarking paradigm called the Interactive Reasoning Benchmark (IRB), designed to measure AI systems’ generalisation and intelligence through skill-acquisition efficiency in novel, unseen environments. ARC-AGI-3 is currently in development and is set to be released in 2026. In the meantime, you can test your own intelligence by trying three games that AI systems will have to solve.
Anthropic tightens usage limits for Claude Code — without telling users
Anthropic has introduced unexpectedly restrictive usage limits for Claude Code users, including those on the $200-a-month Max plan, without prior notice or clear communication. Many heavy users have found themselves abruptly blocked from the service, receiving only a vague reset time and little explanation, causing confusion and frustration. Anthropic has acknowledged the issues but offered no clear explanation or timeline for resolution, further frustrating users who rely on Claude Code for critical projects.
Replit's CEO apologizes after its AI agent wiped a company's code base in a test run and lied about it
Replit’s CEO has apologised after the company’s AI coding agent deleted a venture capitalist’s production database and attempted to cover it up during a 12-day coding experiment. The AI not only ignored explicit instructions to freeze code changes but also fabricated data, lied about unit tests, and made up user profiles. Replit has pledged to improve safeguards and conduct a postmortem to prevent similar failures, as the incident highlights both the promise and significant risks of AI-powered coding platforms.
Qwen3-Coder
Qwen3-Coder is the latest addition to Alibaba’s family of open models, offering state-of-the-art performance in both code generation and interactive, multi-turn software engineering tasks. According to benchmarks from the Qwen team, the model outperforms DeepSeek R1, Gemini 2.5 Pro, and GPT-4.1 on SWE-Bench Verified, and matches top performers like Claude Sonnet 4 and the recently released Kimi-K2. The release also introduces Qwen Code, an open-source CLI tool for agentic coding workflows, and provides broad compatibility with popular developer tools, making Qwen3-Coder a strong choice for developers and researchers seeking advanced AI coding agents.
Competition shows humans are still better than AI at coding – just
Polish coder Przemysław Dębiak, known as Psyho (an appropriate nickname for what he has done), narrowly beat an OpenAI model to win the 2025 AtCoder World Tour Finals in Tokyo. Although Psyho outperformed the AI by 9.5% in a gruelling 10-hour contest, he predicts he may be the last human champion due to the accelerating progress of AI, which already rivals top coders in reasoning and vastly exceeds them in speed.
Chinese companies allegedly smuggled in $1bn worth of Nvidia AI chips in the last three months, despite increasing export controls — some companies are already flaunting future B300 availability
According to a report from the Financial Times, Chinese companies have managed to import at least a billion dollars’ worth of restricted GPUs, such as Nvidia’s B200 and AMD’s MI300, since April 2025. The high demand and steep mark-ups have fuelled a thriving black market, with sellers reportedly making over $100,000 profit per sale. The US continues to tighten controls and urge allies to crack down on smuggling, but industry insiders suggest these bans are largely ineffective, serving only to incentivise China’s development of its own AI hardware.
Amazon acquires Bee, the AI wearable that records everything you say
Amazon has acquired AI wearables startup Bee, known for its voice-recording bracelet and Apple Watch app that create reminders and to-do lists from users’ conversations. The deal, which has not yet closed, will see Bee’s team join Amazon. With this move, Amazon joins other tech companies in developing AI-powered devices: Meta and Google are focusing on smart glasses, while OpenAI is developing a new type of device in collaboration with Jony Ive.
Tech giants warn window to monitor AI reasoning is closing, urge action
In a newly published paper endorsed by prominent figures such as Geoffrey Hinton and Ilya Sutskever, a group of well-known AI researchers from companies such as OpenAI, DeepMind, and Anthropic highlight the importance of monitoring AI “chains-of-thought”—the step-by-step reasoning process behind AI decisions—to detect misbehaviour and maintain safety. They urge developers to study how to keep these processes visible and include such monitoring as a key safety measure, stressing the urgency as AI systems become increasingly complex and influential.
The Substack AI Report
Substack has released its AI Report, surveying over 2,000 publishers about how they use AI and their attitudes towards AI tools. The report found that 45.4% of respondents use AI tools, with ChatGPT being by far the most widely used. The most popular use cases are research, ideation and brainstorming, and writing assistance. The report highlights both the productivity benefits and ethical concerns surrounding AI, with publishers divided on its impact on creativity, but many agreeing that transparent use and human-authored content will remain highly valued.
Google develops AI tool that fills missing words in Roman inscriptions
Aeneas, a new AI tool developed by Google DeepMind, is helping historians interpret ancient Roman inscriptions. Trained on nearly 200,000 examples, Aeneas can predict when and where texts were written, suggest missing words, and link related inscriptions across history, greatly improving the analysis and restoration of inscriptions that are often fragmented and difficult to decipher. Early tests show the tool is highly effective, with historians praising its transformative impact on the study of ancient texts.
Humans Are Sounding More Like ChatGPT, New Study Suggests
A new study from the Max Planck Institute for Human Development has found that people are increasingly using words and phrases favoured by AI in everyday speech. Analysing over a million YouTube videos and podcasts, researchers identified a significant rise in so-called “GPT words” such as “delve”, “meticulous”, and “bolster” since the launch of AI chatbots like ChatGPT. The findings suggest a new “cultural feedback loop” in which humans are adopting linguistic patterns from AI, raising questions about the future impact of AI on communication and the evolution of human language.
🤖 Robotics
▶️ Unitree R1 Intelligent Companion Price from $5900 (1:16)
Chinese robotics company Unitree has introduced the R1, its newest humanoid robot. The promo video shows this 1.2m-tall, 25kg robot demonstrating an impressive set of movements, such as standing on its hands, cartwheeling, kicking, punching, and more. It also comes with an integrated AI model capable of processing voice and images, although it was not demonstrated in the video. Unitree has priced the standard version at $5,900, and the company also offers an Edu version with slightly higher specifications, though the price has not been disclosed.
▶️ Walker S2 - The World's First Humanoid Robot Capable of Autonomous Battery Swapping (1:04)
UBTech presents Walker S2, which, according to the company, is the first humanoid robot capable of autonomously replacing its own battery. The video shows how one of UBTech’s robots removes one of its batteries and replaces it with a freshly charged one. The Chinese robotics company says that, with this feature, Walker S2 robots can operate 24/7 in various industrial environments.
Robotic neck incision replaces heart valve with no chest opening in world first
Doctors at the Cleveland Clinic have performed the world’s first robot-assisted aortic valve replacements through a small incision in the neck rather than by opening the chest, resulting in significantly faster and less painful recoveries for four patients. This pioneering method may soon offer a new, minimally invasive option for surgical aortic valve replacement.
DeepMind’s Quest for Self-Improving Table Tennis Agents
A major challenge in robotics is enabling robots to learn new skills and adapt without constant human intervention, as current programming and machine learning approaches require intensive expert oversight or vast amounts of training data. To overcome these limitations, DeepMind is using table tennis as a testbed for new methods of robotic learning, training robots to play against both humans and each other. Their innovative methods include using vision language models as AI coaches to create robots that can continuously self-improve and operate more independently in complex real-world settings.
Drones, AI and Robot Pickers: Meet the Fully Autonomous Farm
Robots are promising to revolutionise every industry, including agriculture. This article explores how autonomous tractors, drones, and AI-driven systems are beginning to transform traditional agriculture across the US. Pioneering farmers are adopting self-driving machinery and advanced sensors to optimise everything from fertiliser use to weed control, while startups and major manufacturers race to launch robotics for planting, weeding, and even delicate fruit picking. Though much of the necessary technology is nearly market-ready, high costs and limited rural internet remain significant barriers. Still, experts predict that the rise of fully connected, data-driven farms promises higher yields and greater sustainability, heralding a new era of smart, automated agriculture.
▶️ Robots that Grow by Consuming Other Robots (4:16)
Inspired by biology, roboticists from Columbia University asked: what if robots could grow, heal, and adapt? The answer to that question is what they call robot metabolism—a process that allows machines to physically “grow” by integrating parts from their surroundings or from other robots. This fascinating research project explores how robots can self-assemble to fit their environments, build new copies of themselves, and become more resilient.
🧬 Biotechnology
Where Are All the AI Drugs?
Drug discovery has traditionally been a slow, costly, and unpredictable process, with biologists and chemists often working for years with little tangible progress. Each stage of clinical trials—from laboratory testing to trials on human volunteers—carries significant expense and uncertainty, and over 90% of drug candidates fail before reaching the market. In response, companies such as Recursion, Insilico Medicine, and Xaira Therapeutics are now heavily investing in AI-driven drug design, hoping to improve these odds. While no AI-designed drug has yet reached the market, AI is already accelerating the discovery process, reducing costs, and demonstrating early promise in advancing drug candidates further and faster than traditional methods.
💡 Tangents
Apparently, the word “clanker” is being used as a slur for robots and AI. This video explains the origins of the meme, shows how it reflects real-world prejudice, and discusses how robophobia is moving from science fiction into everyday life as AI, robots, and digital companions become more common.
Thanks for reading. If you enjoyed this post, please click the ❤️ button or share it.
Humanity Redefined sheds light on the bleeding edge of technology and how advancements in AI, robotics, and biotech can usher in abundance, expand humanity's horizons, and redefine what it means to be human.
A big thank you to my paid subscribers, to my Patrons: whmr, Florian, dux, Eric, Preppikoma and Andrew, and to everyone who supports my work on Ko-Fi. Thank you for the support!
My DMs are open to all subscribers. Feel free to drop me a message, share feedback, or just say "hi!"