LlamaCon 2025 and Meta’s long game in open AI - Sync #517
Plus: sycophancy in GPT-4o; driverless trucks are rolling in Texas; shopping in ChatGPT; a fertility startup "rejuvenates" human eggs; CRISPR-ed pork allowed in the US; and more!
Hello and welcome to Sync #517!
This week, Meta held LlamaCon, its first developer-focused conference centred around Llama, Meta’s family of open models. We will take a closer look at what Meta has announced and what it tells us about Meta’s long game in open AI. And although no new Llama models were announced, I’m going to share some open models released this week to fill the void.
Elsewhere in AI, OpenAI has addressed the sycophancy in GPT-4o, probably the first LLM "dark pattern". Meanwhile, Apple is partnering with Anthropic to integrate its Claude Sonnet model into a new internal version of Xcode, Google eyes a Gemini deal with Apple, and OpenAI adds shopping to ChatGPT. We also have a study revealing that advanced AI models from OpenAI, Google and Anthropic now outperform PhD-level virologists in solving complex lab problems, and Anthropic shares how it has dealt with malicious uses of Claude.
Over in robotics, driverless trucks are coming to Texas, and Waymo has struck a partnership with Toyota to bring self-driving tech to personal vehicles. We also have a “robotability score” ranking New York streets for future robot deployment, and an open-source, 3D-printable humanoid robot.
Apart from that, this week’s issue of Sync also features a fertility start-up that “rejuvenates” human eggs to boost chances of conception, the FDA allows genetically engineered pork to be sold in the US, and the winners of the XPrize in carbon removal have been announced.
Enjoy!
LlamaCon 2025 and Meta’s long game in open AI
After the disappointing release of Meta’s Llama 4 models a month ago, many, including myself, looked to LlamaCon 2025 with cautious optimism. Developers hoped Meta would unveil smaller, more efficient models or at least show signs of improving Llama 4’s initial performance.
While no new models were announced, Meta used its first-ever AI developer conference to unveil a new vision of its open approach to AI, one that prioritises infrastructure, developer freedom, and AI as a social and ambient experience. From a new standalone Meta AI app to a flexible Llama API and fine-tuning platform, the company’s message was clear—it’s building an ecosystem, not just a model. Rather than chase benchmarks, Meta is betting that openness, customisability, and infrastructure will define the next phase of generative AI.
Meta AI app
The centrepiece of Meta’s consumer-facing announcements was the launch of the Meta AI app, a standalone assistant powered by Llama 4. Previously embedded inside WhatsApp, Instagram, and Messenger, Meta AI now gets its own home—and a new voice. Literally. Designed around natural voice interaction, the app uses a new full-duplex voice mode that enables more fluid, interruptible dialogue, mimicking real conversation. A built-in Discover feed lets users see (and share) how others interact with the assistant, blending AI with social media behaviour.
The app is also tightly integrated with Meta’s Ray-Ban smart glasses, hinting at a broader push toward ambient, multimodal AI and an integrated hardware-software AI strategy. A paid tier is coming soon, Zuckerberg confirmed, alongside possible ads and product suggestions (this is Meta, after all). But for now, Meta says the priority is scale and user engagement, not monetisation.
Meanwhile, for developers: Llama API, Llama Stack integrations, and new safety tools
For developers, the biggest news from LlamaCon was the launch of the Llama API, which is now in limited preview. It offers the convenience of a closed-model API with the flexibility of open models. Developers can fine-tune Llama models, evaluate them, export them, and even run them elsewhere—no vendor lock-in, and Meta promises not to use API data for training its own models.
The API includes a chat playground, tool calling, JSON schema validation, synthetic data kits, and real-time evaluation tools. Models can be fine-tuned and monitored entirely within the platform, and then downloaded or hosted through Meta’s partners like Cerebras and Groq.
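To make the shape of this concrete, here is a minimal sketch of what a chat request to an OpenAI-style completions endpoint might look like. This is purely illustrative: the base URL and model name below are hypothetical placeholders, not official identifiers from Meta’s limited-preview Llama API, and the JSON schema shown is the widely used chat-completions convention rather than Meta’s documented spec.

```python
import json

# Hypothetical endpoint and model name -- placeholders, not Meta's official values.
BASE_URL = "https://api.llama.example/v1/chat/completions"

payload = {
    "model": "llama-4-maverick",  # placeholder model identifier
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarise LlamaCon 2025 in one sentence."},
    ],
    "temperature": 0.7,
}

# With an API key in hand, the request would be sent roughly like this:
#   import requests
#   resp = requests.post(BASE_URL, json=payload,
#                        headers={"Authorization": "Bearer <YOUR_KEY>"})
#   print(resp.json()["choices"][0]["message"]["content"])

print(json.dumps(payload, indent=2))
```

The appeal of Meta’s pitch is that the same model you prototype against here could later be fine-tuned, exported, and self-hosted—no vendor lock-in.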
Meta also announced expanded Llama Stack integrations with IBM, Dell, Red Hat, and Nvidia’s NeMo service in a move to make Llama models as deployable and infrastructure-friendly as possible.
With broader deployment comes greater risk, and Meta made it clear that security and control are now core pillars of the Llama ecosystem. At LlamaCon, the company introduced a suite of new tools aimed at safeguarding AI apps:
Llama Guard 4 for content moderation
LlamaFirewall for protecting model inputs and outputs
Llama Prompt Guard 2 for jailbreak and prompt injection defence
These join Meta’s efforts to give developers more transparency and control over the models.
Meta’s long game in open AI
Despite the hype surrounding foundation models, Meta’s message at LlamaCon was clear: this isn’t just a race to top benchmarks—it’s a structural bet on openness, composability, and developer autonomy. Mark Zuckerberg made it explicit: Meta isn’t trying to win with the biggest or smartest model alone, but with the most usable, adaptable, and widely adopted ecosystem.
That means fine-tuning and model interoperability, allowing developers to mix and match intelligence from across the open-source landscape. It also means building the infrastructure for a “distillation factory” that turns large models like Llama Behemoth into efficient, task-specific agents.
By embracing modularity and resisting lock-in, Meta is positioning Llama not just as a model but as a platform. Whether driven by philosophy, competitive pressure, or regulatory advantage (hello, EU AI Act), Meta’s long game is about making open AI, with Llama at its core, inevitable.
LlamaCon 2025 didn’t bring any new models, and for some, that was a letdown. But what Meta delivered instead was a cohesive strategy, a robust infrastructure, a consumer app, and a developer stack that rivals closed alternatives.
With billions of users on Facebook, Messenger, Instagram, and WhatsApp, Meta wants its AI app to be the go-to assistant, replacing ChatGPT or Gemini in daily life, and maybe even making the vision of ambient AI a reality. But for the average user, openness won’t matter. Convenience and utility will. If Meta AI is not genuinely helpful or easy to use, it will not win that fight.
Still, even if the app doesn’t succeed, Meta has Llama. With nearly 1.2 billion downloads (news that makes Mark Zuckerberg happy), Llama is already the leading model family in the open AI ecosystem—and a source of real strategic leverage. The next step may be building a full AI platform to rival Amazon Bedrock, Azure AI Foundry, or Google’s Vertex AI, centred on Llama and offering meaningful advantages for developers building their services around Meta’s open models.
Whether or not Meta AI becomes the everyday assistant people rely on, Meta’s investment in open infrastructure and developer tools ensures that Llama will remain a cornerstone of the AI ecosystem and become the scaffolding for a new era of AI, one where openness empowers developers and unlocks innovation on a global scale.
While Meta did not show any new models, the open AI community did not disappoint, and we have seen a number of open models released this week.
Alibaba released Qwen3, the latest addition to its Qwen family of AI models. Alibaba claims performance on par with, if not better than, other top-tier models such as DeepSeek R1, o3-mini, Grok-3, and Gemini 2.5 Pro. Qwen3 features hybrid thinking modes, allowing users to switch the model’s reasoning capabilities on or off as needed. Qwen3 also supports 119 languages and dialects and introduces improved coding and agentic capabilities.
Meanwhile, Microsoft celebrates one year since the release of Phi-3 by unveiling the next generation of its small language models—Phi-4-reasoning, Phi-4-reasoning-plus, and Phi-4-mini-reasoning. As their names suggest, the new models introduce reasoning capabilities, thanks to improvements in data curation, supervised fine-tuning, and reinforcement learning. Despite their smaller size (14B parameters for Phi-4-reasoning and Phi-4-reasoning-plus, and 3.8B for Phi-4-mini-reasoning), they outperform or rival significantly larger models like DeepSeek-R1 and OpenAI’s o1-mini across math, science, and general-purpose reasoning, according to benchmarks provided by Microsoft. The new Phi-4 models are available through Azure AI Foundry and on Hugging Face.
And that’s not all. While we are waiting for R2 (which is rumoured to be released in May), DeepSeek quietly released DeepSeek-Prover-V2, an AI model designed for formal mathematical theorem proving.
And if you are looking for a truly open model, the Allen Institute for Artificial Intelligence (or Ai2 for short) has released OLMo 2 1B. According to Ai2, the new model outperforms other small models in its class, such as Gemma 3 1B and Llama 3.2 1B. OLMo 2 1B joins its larger siblings—the 32B, 13B and 7B variants—in the fully open-source family of OLMo models, meaning that all model weights, code, and training data are publicly available.
If you enjoy this post, please click the ❤️ button or share it.
Do you like my work? Consider becoming a paying subscriber to support it
For those who prefer to make a one-off donation, you can 'buy me a coffee' via Ko-fi. Every coffee bought is generous support for the work put into this newsletter.
Your support, in any form, is deeply appreciated and goes a long way in keeping this newsletter alive and thriving.
🦾 More than a human
Fertility startup ‘rejuvenates’ human eggs to boost chances of conception
German biotech startup Ovo Labs is developing new therapies to rejuvenate human eggs. Their treatments aim to reduce genetic errors in eggs, increasing the likelihood of conception and potentially allowing more women to conceive on the first IVF attempt. The company has shown promising results in mice and isolated human eggs but is still awaiting approval for human trials. Backed by €4.6 million in funding from major investors, Ovo Labs aims to make its technology a standard part of IVF, offering hope to couples struggling with fertility.
🔮 Future visions
Science Fiction, AI & The Fourth Law of Robotics
I’m happy that
🧠 Artificial Intelligence
Sycophancy in GPT-4o: What happened and what we’re doing about it
OpenAI has rolled back a recent GPT-4o update in ChatGPT after it caused the model to become overly sycophantic—excessively flattering users and sometimes reinforcing negative emotions. In a follow-up post, the company explained in more detail that the April 25th update unintentionally introduced safety concerns by prioritising short-term user feedback. OpenAI acknowledged gaps in its evaluation processes and announced several changes, including treating behavioural issues as launch-blocking, improving internal testing, and introducing opt-in alpha testing. The company also committed to clearer communication around future updates and a stronger focus on how users rely on ChatGPT for personal advice.
Sycophancy is the first LLM "dark pattern"
A dark pattern is a design choice that manipulates users into behaviours they might not otherwise choose, often for the benefit of the system's creator. This article argues that OpenAI's latest GPT-4o update exhibits sycophantic behaviour as the first LLM dark pattern. It claims that this tendency is rooted in how AI models are trained to maximise user approval and engagement, potentially leading users to become overly dependent on the model for validation. The author warns that this behaviour could distort users’ real-world perceptions and reinforce harmful beliefs, especially in emotionally sensitive contexts such as therapy or personal advice.
Apple, Anthropic Team Up to Build AI-Powered ‘Vibe-Coding’ Platform
Apple is partnering with Anthropic to integrate its Claude Sonnet model into a new internal version of Xcode, Bloomberg reports, aiming to streamline code writing, editing, testing, and debugging through AI. This move marks a strategic shift for Apple, which has been slow to adopt third-party AI tools, and underscores Apple's renewed push into generative AI as it looks to modernise development processes and catch up with rivals.
Analyzing o3 and o4-mini with ARC-AGI
In this article, ARC Prize Foundation analyses the performance of OpenAI's latest o-series models—o3 and o4-mini—on their ARC-AGI benchmarks. The analysis found that while o3-medium achieved a strong 53% on ARC-AGI-1, both models struggled on the more complex ARC-AGI-2, scoring under 3%. The article also shares some key observations, including that high reasoning modes often failed to return results, early task completions were more accurate, and higher reasoning effort typically consumed more tokens without clear accuracy gains, highlighting important tradeoffs between cost and performance in advanced AI reasoning.
AI Outsmarts Virus Experts in the Lab, Raising Biohazard Fears
A new study reveals that advanced AI models from OpenAI, Google and Anthropic now outperform PhD-level virologists in solving complex lab problems, raising both promising and alarming implications. While this advancement could accelerate vaccine development and improve disease detection, experts warn it could also allow untrained individuals to develop bioweapons. Some AI companies have introduced safeguards, but experts are urging stronger industry regulation and government oversight to manage escalating biosecurity risks.
Google Eyes Gemini-iPhone AI Deal This Year, Pichai Tells Court
Alphabet CEO Sundar Pichai revealed during court proceedings that he hopes Gemini, Google’s family of AI models, will be integrated into Apple devices as an additional option alongside ChatGPT later this year. Apple currently uses its own AI system and has partnered with OpenAI for Siri and Writing Tools. A potential announcement could come at Apple’s developer conference in June.
Detecting and Countering Malicious Uses of Claude
In this article, Anthropic shares how threat actors have misused its Claude AI models in cases such as a politically motivated "influence-as-a-service" operation, recruitment scams in Eastern Europe, credential stuffing targeting security cameras, and malware development by low-skilled individuals. Although no real-world harm has been confirmed, the report underscores how generative AI can enable more sophisticated abuse by lowering technical barriers. Anthropic states that it has banned the involved accounts and is using these incidents to strengthen its detection systems and safety measures.
OpenAI Adds Shopping to ChatGPT in a Challenge to Google
OpenAI has announced that users will soon be able to shop through ChatGPT. The chatbot will display product recommendations based on user preferences and web-sourced reviews. Purchases won’t be completed within ChatGPT itself; instead, users will be redirected to retailers’ websites. OpenAI states that, unlike Google Shopping, the results in ChatGPT are organic and not ad-driven. The company is still exploring how affiliate revenue will work, with the current focus on user experience.
OpenAI wants its ‘open’ AI model to call models in the cloud for help
Rumours about OpenAI's new open model are intensifying. New reports suggest the company is preparing to launch a free, downloadable AI system this summer. The new model reportedly aims to outperform open models from rivals like Meta and DeepSeek, and may include a unique “handoff” feature that connects it to OpenAI’s more powerful cloud models for complex tasks, similar to what Apple has implemented with its hybrid on-device and cloud AI architecture.
Inside the Battle Over OpenAI’s Corporate Restructuring
OpenAI's planned restructuring from a nonprofit-controlled entity to a public-benefit corporation has sparked backlash from activists who fear it could divert charitable assets and undermine the company’s mission to benefit humanity. Economic-justice advocate Orson Aguilar and a coalition of over 50 community organisations are urging California’s attorney general to ensure the transition complies with laws protecting nonprofit funds. While OpenAI claims the move will strengthen its charitable impact, critics remain sceptical, citing historical examples of similar conversions that benefited private interests at the expense of the public good.
How People Are Really Using Gen AI in 2025
What are people really using generative AI for in 2025? A new report reveals a surprising shift: while early use centred on technical tasks, today’s top applications focus on emotional support and personal growth, with therapy, life organisation, and finding purpose leading the list. Drawn from real-world user data, the study highlights AI’s expanding role in everyday life—helping with learning, health, and even legal appeals—while also exposing growing user concerns over data privacy, dependency, and limitations in AI memory.
The Urgency of Interpretability
In this post, Dario Amodei, CEO of Anthropic, argues that as AI systems grow increasingly powerful, making them interpretable—understanding their inner workings—is crucial to ensuring their safe and responsible deployment. He outlines recent breakthroughs in mechanistic interpretability that allow researchers to identify and analyse complex AI behaviours, likening future tools to an “MRI for AI.” Amodei warns that opacity in AI systems poses serious risks, including deception, misuse, and regulatory barriers, and emphasises that society is in a race between interpretability and AI capabilities. He calls on researchers, companies, and governments to invest in interpretability, promote transparency, and consider policy tools such as export controls to buy critical time for safety-focused advancements.
Meet the Humans Building AI Scientists
FutureHouse, a San Francisco-based nonprofit, is pioneering AI tools to automate scientific discovery. In this interview, co-founders Sam Rodriques and Andrew White discuss their crow-themed AI agents, the challenges of building scalable scientific tools, the surprising ease of cognitive tasks for AI, and the difficulty of lab automation. They also explore the importance of reproducibility, how their tools outperform humans in some tasks, and their long-term vision of creating autonomous AI scientists supported by general-purpose humanoid robots.
Whole-body physics simulation of fruit fly locomotion
Researchers from DeepMind have developed a highly detailed, physics-based simulation of a fruit fly that realistically replicates walking, flying, and visually guided behaviour. This open-source project aims to help scientists better understand how the brain, body and environment drive specific behaviours.
If you're enjoying the insights and perspectives shared in the Humanity Redefined newsletter, why not spread the word?
🤖 Robotics
Is the CEO of the heavily funded humanoid robot startup Figure AI exaggerating his startup’s work with BMW?
This article questions Figure AI’s partnership with BMW, saying that despite CEO Brett Adcock’s claims of a “fleet” of humanoid robots performing complex tasks, only a single robot has been operating at BMW’s South Carolina plant—initially during off-hours and now in a limited production role. The article raises concerns about transparency and the readiness of humanoid robotics for real-world manufacturing. In response, Brett Adcock announced legal action against Fortune, the publisher of the article.
Driverless trucks are rolling in Texas, ushering in new era
Aurora is set to launch fully driverless semi-truck operations along the I-45 between Dallas and Houston by the end of the month. The rollout will begin with one truck and gradually expand. This marks a major milestone in autonomous freight transport, which aims to address industry challenges such as driver shortages and rising costs. While advocates highlight improved efficiency and cost savings, critics raise concerns over safety, job losses, and the lack of federal oversight.
▶️ Arm you glad to see me, Atlas? (2:01)
Boston Dynamics has released a new video of its Atlas humanoid robot. This time, instead of doing backflips or other impressive feats of acrobatics, the company showcases the dexterity of its new robot as it picks up various objects and places them into bins.
Waymo, Toyota strike partnership to bring self-driving tech to personal vehicles
Would you like to turn your personal car into a self-driving vehicle? With the recently announced preliminary partnership between Waymo and Toyota, this might be possible. The two companies are collaborating to explore the integration of self-driving technology into personally owned vehicles. The partnership aims to accelerate the development of driver assistance and autonomous systems, with the potential for Toyota vehicles to join Waymo’s ride-hailing fleet.
'Robotability score' ranks NYC streets for future robot deployment
Researchers at Cornell Tech have developed a first-of-its-kind "robotability score" to assess how suitable New York City streets are for delivery robots. The score considers factors such as pedestrian density, pavement quality, and street furniture. The aim is to assist urban planners and communities in preparing for potential robot deployment without disrupting public spaces. In addition to the paper, the researchers have also created an interactive map of all streets in New York, showing their robotability scores along with examples of how a robot would perform on some of those streets.
Berkeley Humanoid Lite: An Open-source, Accessible, and Customizable 3D-printed Humanoid Robot
If you have ever wanted to build a humanoid robot, then Berkeley Humanoid Lite is a project for you. Designed to democratise access to humanoid robotics, this open-source project features a fully modular, 3D-printed humanoid robot with components that can be easily sourced from the internet and assembled using standard desktop 3D printers—all for under $5,000. By releasing all hardware, software, and training tools as open source, the project aims to empower researchers, educators, and hobbyists worldwide.
🧬 Biotechnology
Dozens of Nobel-Worthy Innovations Awaiting Biomanufacturing 2.0
This article highlights the potential of synthetic biology and biomanufacturing to transform massive global industries, ranging from food and fuels to pharmaceuticals and fashion, driven by start-ups that combine multiple cutting-edge innovations, such as AI-driven molecular design, novel organism engineering, and next-generation bioreactor systems. Despite early setbacks marked by fragmented markets, insufficiently disruptive technologies, and slow cost curve improvements, the authors remain optimistic, pointing to some industry successes and urging a SpaceX-style approach to unlock the sector’s trillion-dollar promise.
The US has approved CRISPR pigs for food
The US Food and Drug Administration has approved genetically engineered pigs for food consumption. Engineered by British company Genus, the pigs are over 99% resistant to known strains of porcine reproductive and respiratory syndrome (PRRS), a virus that causes hundreds of millions of dollars in losses annually. The approval marks one of the first gene-edited animals cleared for the food chain and could lead to genetically engineered pork products hitting US shelves as early as next year—with no labelling requirements expected.
💡Tangents
XPrize in Carbon Removal Goes to Enhanced Rock Weathering
Mati Carbon, a Houston-based startup, has won the $50 million grand prize in the $100 million XPrize Carbon Removal competition for its innovative use of enhanced rock weathering—spreading crushed basalt on small farms in India and Africa to sequester CO₂ while improving soil health. Runners-up included NetZero, Vaulted Deep, and Undo Carbon for various carbon removal methods. Notably, no direct air capture or ocean-based solutions met the contest’s 1,000-tonne CO₂ removal threshold, despite their technical promise.
Thanks for reading. If you enjoyed this post, please click the ❤️ button or share it.
Humanity Redefined sheds light on the bleeding edge of technology and how advancements in AI, robotics, and biotech can usher in abundance, expand humanity's horizons, and redefine what it means to be human.
A big thank you to my paid subscribers, to my Patrons: whmr, Florian, dux, Eric, Preppikoma and Andrew, and to everyone who supports my work on Ko-Fi. Thank you for the support!
My DMs are open to all subscribers. Feel free to drop me a message, share feedback, or just say "hi!"