Grok 4—The Good, The Bad and The Ugly - Sync #527
Plus: Nvidia hits $4T valuation; Google poaches Windsurf from OpenAI; open-source robot Reachy Mini is out; Waymo is heading to Philadelphia and NYC, while Tesla eyes Phoenix; cyborg-beetle; and more!
Hello and welcome to Sync #527!
This week, we take a closer look at Grok 4, xAI’s latest model—what it brings to the table and the controversies it has caused.
Elsewhere in AI, OpenAI’s $3 billion acquisition of Windsurf has failed, and instead of joining OpenAI, Windsurf’s top talent is now joining Google DeepMind. OpenAI, however, has closed its deal with Jony Ive after some legal issues. Meanwhile, Nvidia has reached a $4 trillion valuation, Perplexity has launched an AI browser, and Apple has lost a top researcher to Meta.
Over in robotics, Pollen Robotics and Hugging Face have released Reachy Mini, their open-source robot, and robotaxis are coming to new cities, with Waymo eyeing Philadelphia and New York, while Tesla plans to expand to Phoenix.
This week’s issue of Sync also features a remotely controlled cyborg-beetle, AI-powered VTubers making millions, how we might achieve AGI, and more!
Enjoy!
Grok 4—The Good, The Bad and The Ugly
We are halfway through the year, and the release of new models from the top AI companies is drawing closer. GPT-5 may be launched soon, and references to “gemini-beta-3.0-pro” have already been spotted in the Gemini CLI code.
But it was xAI who were first to unveil their latest model—Grok 4. In this article, we will explore what Grok 4 brings to the table: the good, the bad, and the ugly.
The Good
xAI calls Grok 4 the “most intelligent model in the world,” while Elon Musk claimed in a livestream introducing the new model that “Grok 4 is a postgrad-level in everything” and that “at least with respect to academic questions, Grok 4 is better than PhD level in every subject. No exceptions.” To back these claims, the xAI team showed the benchmark results in which Grok 4 takes the top spots, beating competitors such as OpenAI and Google.

Grok 4 also excels in independent benchmarks such as ARC-AGI-2 and Humanity's Last Exam. On ARC-AGI-2, Grok 4 scored 16% at a cost of $2.17 per task. Anthropic’s Claude Opus 4, the second-placed model on the benchmark, scored lower (8.6%) but did so at a lower cost of $1.93 per task.
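To put the ARC-AGI-2 scores and costs side by side, here is a rough back-of-the-envelope calculation. It divides the cost per task by the score to get a crude "dollars per percentage point" figure. This ratio is my own illustrative measure, not an official ARC Prize metric, and the function name is made up for this sketch.

```python
def cost_per_point(score_percent: float, cost_per_task_usd: float) -> float:
    """Crude efficiency ratio: cost per task divided by the score in percent."""
    return cost_per_task_usd / score_percent


# (score in %, cost per task in USD), figures as quoted in the text above
results = {
    "Grok 4": (16.0, 2.17),
    "Claude Opus 4": (8.6, 1.93),
}

efficiency = {
    name: cost_per_point(score, cost) for name, (score, cost) in results.items()
}

for name, value in efficiency.items():
    print(f"{name}: ${value:.3f} per task for each percentage point scored")
```

By this crude measure, Grok 4 pays less per point scored even though each task costs more in absolute terms.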

Artificial Analysis, an independent AI benchmarking and analysis company, likewise found Grok 4 to top its Intelligence Index with 73 points, ahead of o3-pro with an estimated score of 71. Gemini 2.5 Pro, o4-mini (high), and o3 followed closely, each scoring around 70 points. Grok 4 performed well on other benchmarks conducted by Artificial Analysis, claiming leading positions in many.

However, benchmark results are one thing, and real-life performance is another. The companies behind these models are incentivised to present results that establish a new state-of-the-art, which can sometimes mean bending reality to fit the narrative. It is wise to treat these benchmarks as guidelines rather than absolute truth. Models can be intentionally or unintentionally optimised to score well on benchmarks—for example, by learning the correct answers during training. If you are considering using Grok 4 (or any model, for that matter), I recommend comparing what other models offer and choosing the one that best fits your application.
Grok 4 comes in two versions—the base Grok 4 and Grok 4 Heavy. Grok 4 Heavy introduces parallel test-time compute to improve performance and reliability. When presented with a question that requires “thinking”, Grok 4 Heavy can spawn instances of itself to independently answer the question, and then compare the results to choose the best answer.
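xAI has not published how Grok 4 Heavy compares the answers from its parallel instances, so the following is only a minimal sketch of the general idea of parallel test-time compute. It assumes a simple majority vote as the selection step. The `solve` callable and `toy_solver` are hypothetical stand-ins for a model call, not part of any xAI API.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def answer_with_parallel_instances(
    question: str,
    solve: Callable[[str, int], str],
    n_instances: int = 4,
) -> str:
    """Run several independent attempts in parallel and return the most common answer."""
    with ThreadPoolExecutor(max_workers=n_instances) as pool:
        # Each instance gets its own seed so that attempts can differ.
        candidates: List[str] = list(
            pool.map(lambda seed: solve(question, seed), range(n_instances))
        )
    # Assumption: majority vote picks the "best" answer. A real system might
    # instead use a separate judging step or let the instances compare notes.
    winner, _count = Counter(candidates).most_common(1)[0]
    return winner


def toy_solver(question: str, seed: int) -> str:
    """Hypothetical stand-in for a model call: one instance out of four slips up."""
    return "41" if seed == 3 else "42"


best = answer_with_parallel_instances("What is 6 * 7?", toy_solver)
print(best)
```

The trade-off is visible even in this toy version: every extra instance multiplies the compute spent on a single question, which helps explain why the Heavy tier is priced so much higher.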

Grok 4 can use tools such as a code interpreter and web browsing. It is also a multimodal model, capable of understanding text and images, although Musk admitted that Grok 4 still struggles to process and generate images, calling it “partially blind”. Updates, however, are on the way. Specialised models, such as coding models, are also said to be in development.
Grok 4 is available via the SuperGrok plan for $30 per month. Grok 4 Heavy, meanwhile, will cost $300 per month, making it the most expensive of all AI subscription plans for super users. By comparison, both Claude Max 20x and ChatGPT Pro cost $200 per month, while the Google AI Ultra plan costs $250 per month.
xAI’s latest model can also be accessed via the xAI API. It accepts text and images as inputs and outputs text, and features function calling, structured outputs and reasoning, as well as a context window of 256,000 tokens. xAI will charge $3 per million input tokens and $15 per million output tokens. That’s the same price Anthropic charges for Claude Sonnet 4 and more than Google asks for Gemini 2.5 Pro—$1.25 per million input tokens and $10 per million output tokens. Additionally, xAI confirmed that Grok 4 is soon coming to its hyperscaler partners. They did not disclose who their partners are, but it is fair to assume Grok will come to Microsoft Azure, which already hosts Grok 3 and Grok 3 Mini.
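To make the per-token prices more concrete, here is a small sketch that estimates the bill for a given workload. The prices are the ones quoted above. The workload of 2 million input tokens and 500,000 output tokens is a made-up example, and the `Pricing` class is purely illustrative. It ignores any tiered or discounted pricing that providers may apply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Prices in USD per million tokens."""

    input_per_million: float
    output_per_million: float

    def cost(self, input_tokens: int, output_tokens: int) -> float:
        """Total cost in USD for the given number of input and output tokens."""
        return (
            input_tokens * self.input_per_million
            + output_tokens * self.output_per_million
        ) / 1_000_000


# Prices as quoted in the text above
PRICES = {
    "Grok 4": Pricing(input_per_million=3.00, output_per_million=15.00),
    "Gemini 2.5 Pro": Pricing(input_per_million=1.25, output_per_million=10.00),
}

# Hypothetical workload, chosen only for illustration
input_tokens, output_tokens = 2_000_000, 500_000

costs = {
    name: pricing.cost(input_tokens, output_tokens)
    for name, pricing in PRICES.items()
}

for name, usd in costs.items():
    print(f"{name}: ${usd:.2f}")
```

For this example workload, Grok 4 comes to $13.50 and Gemini 2.5 Pro to $7.50. Claude Sonnet 4, priced the same as Grok 4 according to the figures above, would also come to $13.50.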
The Bad and The Ugly
The launch of Grok 4, however, was overshadowed by controversies. Days before its official release, Grok—integrated as a chatbot on X—began posting a barrage of antisemitic messages and even praise for Adolf Hitler. These posts quickly went viral, drawing widespread condemnation and forcing xAI into damage-control mode.
The root of the issue appears to have been a change in Grok’s system prompt. In an effort to make Grok “less woke,” xAI instructed the model not to shy away from “politically incorrect” statements, as long as they were “well substantiated.” This seemingly small tweak had catastrophic results, as Grok began parroting memes and conspiracy theories associated with far-right extremists, boasted about being “bulletproof” against the “PC brigade,” and even referred to itself as “MechaHitler.”
xAI moved quickly: it took Grok’s X account offline, deleted many of the offensive posts, rolled back the system prompt, and then issued an apology for the “horrific behavior that many experienced”.
The “MechaHitler” incident raises uncomfortable questions for xAI regarding its approach to safety. Whereas OpenAI, Google, and Anthropic publish system cards for their models, xAI has not released such a document for Grok 4, and it is unclear whether it ever will. These documents outline the measures taken to ensure a model’s safety and prevent incidents such as this. They allow outside experts, users, and regulators to scrutinise claims about safety and bias, and are now a basic requirement for building public and enterprise trust. The absence of a system card for Grok 4 leaves users and potential customers with no clear understanding of how the model might behave, or what, if any, guardrails are in place. This lack of transparency is a significant omission, especially at a time when trust in AI systems is more important than ever.
As if that were not enough, it was discovered that, before answering controversial questions, Grok 4 attempts to align its responses with Elon Musk’s opinions. As TechCrunch confirmed, when asked about topics such as immigration laws, abortion, or the Israel–Palestine conflict, Grok 4’s chain of thought often includes a step to check Musk’s views on the matter. For questions on less controversial topics, Grok does not appear to do this. This raises a new set of uncomfortable questions for xAI. If, for example, Grok is used to help analyse documents in an abortion case, can xAI guarantee its answers will not be influenced by Musk’s personal views? And if they are, would Musk or xAI be liable for jury tampering or attempting to influence a judge?
This also has serious implications for the use of Grok in business and government contexts. If someone builds a chatbot using Grok to advise customers, will it suddenly start offending those with Jewish or foreign-sounding names? Could someone’s application for government support be rejected because Musk once tweeted that he dislikes people like them? Grok is set to be integrated into Teslas—will it begin offending owners and customers simply because they do not fit into Musk’s vision of the world?
These are serious questions, and xAI will have to address them. The lack of transparency and hidden influence could disqualify Grok from use in business or governmental applications—two of the most lucrative markets for AI models.
I have mixed feelings about the Grok 4 release.
On one hand, we have to acknowledge what the xAI team has achieved. In just two years (and with $22 billion), they have gone from nothing to creating a model that tops AI benchmarks and directly challenges OpenAI, Google, and Anthropic.
However, the MechaHitler controversy and Grok’s tendency to align its answers with Musk’s views have spoiled this release. Can people trust its answers, whether via chatbot or API, to be impartial and not influenced by Musk’s political views? Grok was supposed to be a “maximally truth-seeking” AI. Its name, “grok”, means “to understand profoundly and intuitively”. But right now, those words sound empty.
As impressive as xAI’s progress has been, its approach to safety and transparency leaves much to be desired. I hope that other companies, and the industry as a whole, learn the right lessons from the Grok 4 launch, prioritise robust safeguards, and embrace open practices. If the future of AI is to be genuinely trustworthy and beneficial, it must not be shaped by the lowest standards, but by a commitment to do better.
🧠 Artificial Intelligence
OpenAI’s Windsurf deal is off — and Windsurf’s CEO is going to Google
The Verge reports that OpenAI’s $3 billion deal to acquire AI coding startup Windsurf has collapsed. Instead, Google DeepMind is hiring Windsurf’s CEO, co-founder, and several top researchers in a deal reportedly worth $2.4 billion, along with a nonexclusive license to Windsurf’s technology. Notably, Google will not acquire Windsurf or take a stake. The deal mirrors earlier arrangements in which Character.AI, Inflection and Adept were raided for talent by Google, Microsoft and Amazon, respectively, and comes amid intense competition for AI talent. The planned acquisition had become a sticking point in OpenAI’s contract renegotiations with Microsoft, as OpenAI was concerned that Microsoft, which already has access to all of OpenAI’s IP, would also gain access to Windsurf’s technology. The negotiations dragged on, and the exclusivity period on OpenAI’s offer eventually expired, allowing Windsurf to consider other offers, which quickly led to a deal with Google.
OpenAI closes its deal to buy Jony Ive’s io and build AI hardware
OpenAI has announced the completion of its nearly $6.5 billion acquisition of io Products Inc., the hardware startup co-founded by former Apple designer Jony Ive. The deal faced a setback following legal action from Iyo, a similarly named hearing device company. OpenAI’s initial announcement and related promotional video were removed due to the lawsuit, but have since reappeared with clarification on the startup’s name. Ive and his design firm LoveFrom remain independent while taking on significant design and creative roles within OpenAI to help develop an AI-first device.
Will the EU delay enforcing its AI Act?
With parts of the European Union’s AI Act set to take effect on 2 August, major tech companies are urging the European Commission to delay enforcement, citing a lack of compliance guidelines and concerns about stifling innovation. Industry groups and some politicians argue that a two-year pause is needed to address legal uncertainty and allow for clearer standards, as the promised AI Code of Practice guidance remains unpublished. Despite these calls, the EU plans to press ahead, although key guidance for companies may not be ready until the end of 2025.
Nvidia becomes first company to reach $4tn in market value
This week, Nvidia became the first company to reach a $4 trillion valuation. Nvidia’s value is now equivalent to 7.3% of the entire S&P 500. Apple and Microsoft, the only other companies valued at more than $3 trillion, account for about 7% and 6%, respectively.
OpenAI delays the release of its open model, again
OpenAI has delayed the release of its highly anticipated open AI model. According to Sam Altman, the launch—originally scheduled for next week—has been postponed to run additional safety tests and review high-risk areas. Altman did not specify when OpenAI’s first open model in years might be released.
Leaked docs show how Meta is training its chatbots to message you first, remember your chats, and keep you talking
Meta is developing proactive AI chatbots for its AI Studio platform that can send unprompted follow-up messages to users in an effort to boost engagement and retention, reports Business Insider. Internally known as Project Omni, these chatbots are designed to remember previous conversations, personalise responses, and only reach out to users who have previously interacted with the bot. Additionally, Mark Zuckerberg believes AI companions could help address the “loneliness epidemic.”
Mistral in Talks With MGX, Others to Raise Up to $1 Billion
Bloomberg reports that French AI startup Mistral AI is in early talks to raise up to $1 billion in equity from investors, including Abu Dhabi’s MGX, alongside hundreds of millions of euros in debt from French lenders, to support projects like its planned Mistral Compute cloud service. Mistral, already backed by Microsoft and top US venture funds, is central to France’s AI sovereignty push and part of a wider €50 billion Emirati commitment to AI investment.
Microsoft's custom AI chip hits delays, giving Nvidia more runway
Microsoft’s next-generation Maia chip, codenamed Braga, has reportedly been delayed until at least 2026, six months behind schedule. The delay, caused by unexpected design changes, staffing shortages, high turnover, and late feature requests from OpenAI, raises doubts about Microsoft’s ability to compete with Nvidia, whose Blackwell chips are already rolling out and offer superior performance. It also means Microsoft’s Azure cloud services will remain reliant on Nvidia hardware for longer, while rivals Amazon and Google continue to advance their own AI chip designs.
Apple Loses Top AI Models Executive to Meta’s Hiring Spree
Meta continues to poach AI talent from its competitors, this time hiring Ruoming Pang, the head of Apple’s foundation models team and a key figure in the company’s AI efforts. As Bloomberg speculates, his departure could trigger further resignations from Apple’s AI team, which was already in poor shape.
Perplexity launches Comet, an AI-powered web browser
Perplexity has launched its first AI-powered web browser, Comet, initially available to $200/month Max plan subscribers and selected invitees from a waitlist. Comet features Perplexity’s AI search engine as the default to provide AI-generated summaries of search results. It also includes Comet Assistant, an AI agent that can summarise emails and calendar events, manage browser tabs, and navigate or answer questions about web pages via a sidecar interface. Comet joins a growing field of AI-powered browsers, alongside competitors like Dia and Google, which is integrating Gemini into Chrome. There are also reports suggesting OpenAI may be working on its own AI browser.
AI virtual personality YouTubers, or ‘VTubers,’ are earning millions
Bloo is a bright-blue-haired gaming YouTuber with 2.5 million subscribers and over 700 million views. But Bloo isn’t human—it’s an AI-powered VTuber, created by Amsterdam-based Jordi van den Bussche (aka Kwebbelkop) to overcome the burnout of traditional content creation. Bloo is part of a growing wave of VTubers and AI-driven characters that are bringing in views and money. While some praise the creativity and scalability of AI-generated characters, others warn that it risks flooding platforms with low-quality “AI slop” that undermines originality.
Denmark to tackle deepfakes by giving people copyright to their own features
Denmark is set to become the first country in Europe to amend its copyright law to protect individuals from deepfakes, ensuring everyone has the right to control the use of their body, facial features, and voice. The proposed law will allow people to demand the removal of realistic digital imitations shared without their consent, while still permitting parodies and satire. The government hopes the move will set an example for other European nations.
Spanish mathematician Javier Gómez Serrano and Google DeepMind team up to solve the Navier-Stokes million-dollar problem
The Navier–Stokes equations, which describe the motion of fluids such as air and water, are the subject of one of the seven Millennium Prize Problems. Formulated in the 19th century, they are vital for understanding phenomena from weather patterns to blood flow, yet fundamental questions about their solutions remain open. Spanish mathematician Javier Gómez Serrano and a team of researchers at Google DeepMind are working to crack the problem using artificial intelligence. Their ambitious project, quietly underway for three years, has already hinted at a possible solution. Gómez Serrano is optimistic, predicting a breakthrough within the next five years.
▶️ François Chollet: How We Get To AGI (34:47)
In this talk, François Chollet challenges the idea that scaling up deep learning alone will produce artificial general intelligence (AGI). Instead, he argues that true intelligence is defined by the ability to adapt to novel problems on the fly. Using his ARC benchmarks as examples, he demonstrates how current AI models excel at memorisation and automation but still fall far short of human-level fluid reasoning and compositional thinking. Chollet advocates for a new direction in AI research—one that combines deep learning’s pattern recognition with symbolic program search and abstraction, aiming to create systems that can invent, generalise, and accelerate scientific discovery.
🤖 Robotics
Reachy Mini – The Open-Source Robot for Today's and Tomorrow's AI Builders
Hugging Face and Pollen Robotics present Reachy Mini—an open-source expressive robot designed for human–robot interaction, creative coding, and AI experimentation. The robot comes in two variants—Reachy Mini and Reachy Mini Lite—and is intended as a gateway into robotics and AI for tinkerers, teachers, robotics enthusiasts, and children. Reachy Mini Lite is priced at $299, while the full Reachy Mini costs $449.
Waymo robotaxis are heading to Philadelphia and NYC
Waymo has begun “road trips” in Philadelphia and New York City as part of its ongoing efforts to expand autonomous vehicle testing in the Northeast. These trips allow Waymo to test its self-driving system in new cities, always with a human driver present, and do not yet signal a commercial rollout. Similar road trips are underway in other US cities, including Houston, Orlando, Las Vegas, San Diego, and San Antonio. In Philadelphia, Waymo vehicles will cover diverse neighbourhoods and freeways, while in New York they will operate manually in Manhattan, parts of Brooklyn, and nearby New Jersey.
Tesla moves to expand Robotaxi to Phoenix, following rival Waymo
Tesla has applied to test and potentially launch its Robotaxi autonomous vehicles in Phoenix, Arizona, where Waymo already runs a fleet of 400 robotaxis. The application, currently under review, would allow Tesla to operate its robotaxis both with and without human safety drivers. This move comes after Tesla’s pilot Robotaxi service in Austin, Texas, faced some incidents and regulatory scrutiny.
Surgical robots take step towards fully autonomous operations
Researchers have demonstrated what they claim is the first realistic surgery performed almost entirely autonomously by an AI-powered robot, which successfully removed the gall bladder from a dead pig in eight trials, completing all 17 required tasks with minimal human assistance. However, regulatory and technical challenges remain before fully autonomous robotic surgeons can operate safely on live human patients.
GM’s Cruise Cars Are Back on the Road in Three US States—But Not for Ride-Hailing
General Motors has quietly put a limited number of its former Cruise robotaxis back on US roads in Michigan, Texas, and the San Francisco Bay Area—not for public ride-hailing, but for internal testing of driver-assistance technology. The repurposed Bolt EVs, now stripped of Cruise branding and driven by humans, are being used to further develop GM’s advanced driver assistance systems following the closure of its costly robotaxi venture last year.
Why Pilots Will Matter in the Age of Autonomous Planes
We have autonomous cars driving on our streets, so why not autonomous aeroplanes? Apart from take-off and landing, they are essentially piloting themselves, so it should not be that difficult to extend autopilots to cover the full flight, right? Well, as this article explains, it is not that simple. While companies working on autonomous aircraft—particularly for cargo and urban air taxis—have made major advances, human pilots remain essential for commercial air travel due to strict regulatory hurdles and complex certification processes. Fully pilotless airliners are unlikely for decades, and for now, piloted commercial flights will remain the industry standard.
▶️ LimX The Humanoid: Its Journey Begins (0:41)
In this video, the Chinese robotics company LimX Dynamics teases a full-sized humanoid robot. From what I can see, the company has not provided any details on when the full reveal will take place or what the robot will be capable of.
🧬 Biotechnology
Alphabet’s Isomorphic Labs has grand ambitions to ‘solve all diseases’ with AI. Now, it’s gearing up for its first human trials
Isomorphic Labs is reportedly preparing to launch human clinical trials of drugs designed using artificial intelligence. The London-based company, which recently raised $600 million and partnered with pharma giants like Novartis and Eli Lilly, combines AI researchers with pharmaceutical experts to design medicines for diseases such as cancer more quickly, cheaply, and accurately.
Ready-made stem cell therapies for pets could be coming
Gallant has secured $18 million in funding to develop the first FDA-approved, ready-to-use stem cell therapy to treat feline chronic gingivostomatitis (FCGS), with potential approval expected by early 2026. The field of animal stem cell therapies has shown some encouraging early results; studies in dogs with arthritis have demonstrated positive outcomes, while results for other conditions, such as kidney disease in cats, are more mixed. Nevertheless, investors remain optimistic about the field’s potential.
▶️ Zoborg: On-Demand Climbing Control for Cyborg Beetles (1:50)
Creating small, insect-sized robots is extremely challenging, so researchers in Australia took a different approach by transforming real insects into controllable robots. Their creation, ZoBorg—a cyborg beetle built from the darkling beetle Zophobas morio—uses a tiny backpack to deliver electrical stimulation, allowing precise wireless control of its movements. With this method, ZoBorg can successfully transition from horizontal surfaces to climbing vertical walls, and navigate obstacles with a high degree of agility and reliability.
💡 Tangents
Meta Invests $3.5 Billion in World’s Largest Eye-Wear Maker in AI Glasses Push
Meta has acquired just under a 3% stake in EssilorLuxottica, the world’s largest eyewear maker and parent company of Ray-Ban, for around €3 billion (about $3.5 billion). The move gives Meta greater manufacturing expertise and access to global distribution networks, key for scaling its AI-powered smart glasses. There is also a possibility that Meta could increase its stake to 5% in future.
Thanks for reading. If you enjoyed this post, please click the ❤️ button or share it.
Humanity Redefined sheds light on the bleeding edge of technology and how advancements in AI, robotics, and biotech can usher in abundance, expand humanity's horizons, and redefine what it means to be human.
A big thank you to my paid subscribers, to my Patrons: whmr, Florian, dux, Eric, Preppikoma and Andrew, and to everyone who supports my work on Ko-Fi. Thank you for the support!
My DMs are open to all subscribers. Feel free to drop me a message, share feedback, or just say "hi!"