It's Strawberry Summer at OpenAI - Weekly News Roundup - Issue #476
Plus: GPT-4o-mini; first Miss AI contest sparks controversy; lab-grown meat for pets approved in the UK; Tesla delays robotaxi reveal until October; 'Supermodel granny' drug extends life in animals
Hello and welcome to Weekly News Roundup Issue #476. This week, we will focus on recent leaks from OpenAI hinting at a new and powerful AI model known internally as Strawberry.
In other news, scientists have extended the lifespan of mice by nearly 25% and given them a youthful appearance.
Over in AI, OpenAI released a new model—GPT-4o-mini, while Apple, Nvidia, and Anthropic have been caught using captions from videos of popular YouTubers to train their models. We also have the story of Graphcore, one of the first AI hardware startups to reach unicorn status, and what caused its downfall.
In robotics, the reveal of Tesla’s robotaxi service has been delayed until October, and we will learn how much it would cost to hire a Digit humanoid.
We will finish this week’s news roundup with UK regulators approving lab-grown meat for pets and with a startup that makes butter using CO2 and water.
Enjoy!
It's Strawberry Summer at OpenAI
Creating artificial general intelligence (AGI) is the explicit goal of many AI companies. OpenAI is one of them, describing its mission as “to ensure that artificial general intelligence (AGI)—by which we mean highly autonomous systems that outperform humans at most economically valuable work—benefits all of humanity.”
However, to know whether AGI has been achieved, we first need to be clear on what AGI actually is. Modern AI systems would have been considered AGI five or ten years ago, yet no one calls them that today, which creates a need to clearly define what AGI is and what it is not. Google DeepMind has already proposed its definition of AGI, and now it is OpenAI’s turn to do the same.
OpenAI introduces Five Stages of AI
In November 2023, researchers from Google DeepMind published a paper titled Levels of AGI, in which they attempted to rigorously define what AGI is. They analysed nine existing definitions of AGI, weighing their strengths and weaknesses, and then proposed a tenth of their own. Their definition combines performance (the depth of an AI system’s capabilities) with generality (the breadth of those capabilities). To better communicate the capabilities of an AI system, the paper also introduces Levels of AGI, a six-tier matrixed levelling system for classifying AI on the path to AGI, similar to the levels the car industry uses to describe the extent of cars’ self-driving capabilities.
According to that system, we have already created narrow AI systems at the highest, Superhuman level, where AI outperforms all humans. This category includes systems such as AlphaFold, AlphaZero and Stockfish. The picture looks different for general AI, however, where we have only reached Level 1, Emerging AGI, with systems such as ChatGPT or Gemini.
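To make the matrixed structure concrete, here is a small illustrative encoding in Python. The level names come from the DeepMind paper and the example systems are the ones mentioned above; the encoding itself is just my own sketch, not anything from the paper.

```python
# DeepMind's "Levels of AGI" rates systems along two axes:
# performance (how capable) and generality (how broad).
PERFORMANCE_LEVELS = [
    "No AI",       # Level 0
    "Emerging",    # Level 1
    "Competent",   # Level 2
    "Expert",      # Level 3
    "Virtuoso",    # Level 4
    "Superhuman",  # Level 5: outperforms all humans
]
GENERALITY = ["Narrow", "General"]

# Where some well-known systems land, per the examples cited above.
examples = {
    ("Superhuman", "Narrow"): ["AlphaFold", "AlphaZero", "Stockfish"],
    ("Emerging", "General"): ["ChatGPT", "Gemini"],  # i.e. "Emerging AGI"
}
```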
Last week, during an all-hands meeting with employees, OpenAI introduced its own framework to track the progress towards building AGI using a five-level tier system. At Level 1 we have Chatbots—AI systems that are good at understanding and using conversational language but lack human-level reasoning and problem-solving skills. The next level, Reasoners, defines AI systems with enhanced reasoning capabilities. After Reasoners, we have Agents (systems that can take actions), then Innovators (systems that can aid in invention), and finally Organisations, where AI systems can perform the work of an entire organisation.
According to this framework, current AI systems are on the first level. However, OpenAI executives believe we are on the cusp of reaching the second level, Reasoners, Bloomberg reports. Reuters later followed up with its own report, saying that OpenAI is already working on and testing a Level 2 system, a system displaying human-level reasoning skills.
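Expressed as code, the ladder is straightforward. A minimal sketch, where the level numbers and names follow the Bloomberg report and everything else is my own illustration:

```python
from enum import IntEnum

class OpenAIStage(IntEnum):
    """OpenAI's reported five-stage framework on the path to AGI."""
    CHATBOTS = 1       # conversational language, limited reasoning
    REASONERS = 2      # human-level problem solving
    AGENTS = 3         # systems that can take actions
    INNOVATORS = 4     # systems that can aid in invention
    ORGANISATIONS = 5  # systems that can do the work of an entire organisation

current = OpenAIStage.CHATBOTS      # where OpenAI places today's systems
next_up = OpenAIStage(current + 1)
print(f"{current.name} -> {next_up.name}")  # CHATBOTS -> REASONERS
```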
Strawberry—OpenAI’s new mysterious AI model
This new AI system is known internally under the codename Strawberry and was formerly known as Q*. We first heard about Q* in November last year, when it was rumoured to be the model that prompted some OpenAI board members to remove Sam Altman from his position as CEO of OpenAI. According to two sources quoted by Reuters, some OpenAI employees have seen Q* in action and say it was capable of answering tricky science and math questions out of reach of today’s commercially available AI models. Another source said one of OpenAI’s internal projects produced an AI system that scored over 90% on the MATH dataset, a benchmark of championship math problems. Additionally, during the same meeting that introduced OpenAI’s stages of AI framework, employees saw a demo of a new AI system displaying human-like reasoning skills.
However, it is unclear whether these reports describe different models or the same one, and it is also unknown how close this model, or models, might be to a public release. Previous reports pointed to the second half of 2024 as a possible release date for GPT-5, which might incorporate Project Strawberry or some parts of it.
In either case, OpenAI is cooking something, and the recent releases of GPT-4o and GPT-4o-mini (more on that one later) might play a role in creating reliable and powerful AI models capable of reasoning like a human.
OpenAI’s questionable safety practices
While OpenAI is developing a new generation of AI models, it is also dealing with whistleblowers accusing the company of placing illegal restrictions on how employees can communicate with government regulators. The whistleblowers have filed a complaint with the U.S. Securities and Exchange Commission, calling for an investigation into the company's allegedly restrictive non-disclosure agreements, according to Reuters.
Questions about restrictive non-disclosure agreements at OpenAI were already being asked a couple of weeks ago, when the company was dealing with the implosion of its Superalignment team and the departure of people associated with it, including OpenAI’s co-founder, Ilya Sutskever. Vox reported in May that OpenAI could take back vested equity from departing employees if they did not sign non-disparagement agreements. Sam Altman responded in a tweet, saying “we have never clawed back anyone's vested equity, nor will we do that if people do not sign a separation agreement (or don't agree to a non-disparagement agreement).”
Additionally, OpenAI has recently been plagued by safety concerns. The implosion of the Superalignment team and Sutskever’s departure were the highlights that gathered public attention, but other things were happening, too. Earlier this year, two other people working on safety and governance left OpenAI. One of them wrote on his LessWrong profile that he quit OpenAI "due to losing confidence that it would behave responsibly around the time of AGI." Jan Leike, a key OpenAI researcher and the leader of the Superalignment team, announced he was leaving shortly after the news of Sutskever’s departure broke.
There are also reports of OpenAI prioritising quick product releases over proper safety testing. The Washington Post shares a story of the company celebrating the launch of GPT-4o in May before safety tests were complete. “They planned the launch after-party prior to knowing if it was safe to launch,” one OpenAI employee told The Washington Post. “We basically failed at the process.”
I hope that by the time GPT-5 is released, with Strawberry or without it, OpenAI will have sorted out its safety practices and protocols. I don’t think anyone wants to experience an AI meltdown on the scale of the Google AI Overviews disaster, but with an AI system far more capable than the best models available today.
🦾 More than a human
'Supermodel granny' drug extends life in animals
Researchers have developed a drug targeting a protein called interleukin-11, which, when administered to mice, increased their lifespans by nearly 25%. The treated mice were healthier, stronger, and developed fewer cancers than their unmedicated peers. Additionally, the treated mice gained a “youthful appearance,” earning them the nickname “supermodel grannies.” The drug is already being tested in people, but whether it will have the same anti-ageing effect in humans remains unknown.
Tiny implant fights cancer with light
A new implantable device, the size of a grain of rice and activated remotely by an external antenna, offers a novel way of treating deep-seated cancers with light. When combined with a light-sensitive dye, the implant not only destroys cancer cells but also mobilises the immune system’s cancer-targeting response. In future studies, the device will be used in mice to determine whether the cancer-killing response initiated in one tumour will prompt the immune system to identify and attack another cancerous tumour on its own.
🧠 Artificial Intelligence
GPT-4o mini: advancing cost-efficient intelligence
OpenAI has released a new small model named GPT-4o-mini. According to OpenAI, GPT-4o-mini matches the performance level of the original GPT-4 model released over a year ago while also outperforming similar models in its class, such as Google’s Gemini Flash and Anthropic’s Claude 3 Haiku. OpenAI says that the new model is the “most cost-efficient small model” and offers “superior textual intelligence and multimodal reasoning.” GPT-4o-mini is available through the OpenAI API and in ChatGPT, where it replaces GPT-3.5 Turbo as the model suited for everyday tasks.
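For developers, switching to the new model is essentially a one-line change. A minimal sketch using the OpenAI Python SDK (v1.x), assuming an OPENAI_API_KEY is set in your environment:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# "gpt-4o-mini" is the model identifier in the API
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": "Explain what makes a small model cost-efficient."}
    ],
)
print(response.choices[0].message.content)
```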
Meta won't offer future multimodal AI models in EU
Meta won’t be releasing its next multimodal AI model, or any future ones, in the EU, citing regulatory concerns as the main reason. "We will release a multimodal Llama model over the coming months, but not in the EU due to the unpredictable nature of the European regulatory environment," Meta said in a statement to Axios. With this statement, Meta joins Apple as another big tech company withholding its most powerful AI models from the EU due to the regulatory landscape in Europe.
Apple, Nvidia, Anthropic Used Thousands of Swiped YouTube Videos to Train AI
An investigation by Proof News has found that subtitles from 173,536 YouTube videos, drawn from more than 48,000 channels, have been used to train AI models from Apple, Nvidia, Anthropic, and Salesforce. The subtitles were part of the Pile dataset, which AI companies use to train their language models. Many of the creators whose content was included in the dataset were unaware of its use, with some calling it outright theft. Additionally, YouTube forbids harvesting materials from the platform without permission.
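For the curious: the Pile ships as JSONL files in which each record is tagged with the sub-dataset it came from, so checking a local copy for YouTube-derived documents takes only a few lines of Python. A hedged sketch; the file path is hypothetical, and I'm assuming the Pile's standard meta.pile_set_name tagging, where the subtitle component is labelled "YoutubeSubtitles":

```python
import json

def count_youtube_docs(path: str) -> int:
    """Count documents in a Pile JSONL shard that came from YouTube subtitles."""
    count = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)
            if record.get("meta", {}).get("pile_set_name") == "YoutubeSubtitles":
                count += 1
    return count

# Hypothetical shard path, for illustration only:
# print(count_youtube_docs("pile/train/00.jsonl"))
```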
▶️ The Downfall of AI Unicorns: Graphcore exits to Softbank (15:27)
In this video, Dr Ian Cutress tells the story of Graphcore, one of the first AI hardware startups to reach unicorn status, and analyses what caused its downfall and its acquisition by SoftBank a week ago. The video also surveys the wider AI hardware startup scene, considers how Graphcore’s fall could affect it, and asks whether this is the first sign of market consolidation.
Text to Image AI Model & Provider Leaderboard
There is a new leaderboard for AI models, this time focusing on text-to-image models like Midjourney or DALL·E. It works similarly to the LMSYS Leaderboard for large language models—two images are presented to a human, who is then asked to pick the better one. The models are judged based on quality, generation time, and price.
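Arena-style leaderboards like this typically aggregate thousands of such pairwise votes into Elo-style ratings. I'm assuming this leaderboard does something similar; the sketch below is illustrative, not its actual code:

```python
def elo_update(r_winner: float, r_loser: float, k: float = 32.0) -> tuple[float, float]:
    """One Elo update after a single 'this image is better' vote."""
    # Probability the winner was expected to win, given current ratings
    expected = 1.0 / (1.0 + 10 ** ((r_loser - r_winner) / 400.0))
    delta = k * (1.0 - expected)  # upsets move ratings more than expected wins
    return r_winner + delta, r_loser - delta

# Two models start level; one vote separates them
print(elo_update(1000.0, 1000.0))  # (1016.0, 984.0)
```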
Andrej Karpathy is starting Eureka Labs - an AI+Education company
Andrej Karpathy, a well-respected figure in the AI community who previously worked with OpenAI and Tesla, has launched Eureka Labs—a company that is “building a new kind of school that is AI native.” The company's current focus is its first product, LLM101n, an AI course designed with an AI Teaching Assistant.
First “Miss AI” contest sparks ire for pushing unrealistic beauty standards
The world’s first Miss AI contest has taken place and crowned its first winner: Kenza Layli, a Moroccan lifestyle influencer who has amassed 200,000 followers on Instagram and a further 45,000 on TikTok. The contest aimed to “celebrate the technical skill and work behind digital influencer personas from across the world.” However, it also drew criticism from the AI community for promoting unrealistic beauty standards.
“Superhuman” Go AIs still have trouble defending against these simple exploits
Researchers at MIT and FAR AI have identified weaknesses in top-level Go-playing AIs that allow even novice players to exploit them using simple "cyclic" strategies. Their study explored three methods of strengthening the KataGo algorithm's defences, and in all three cases the defences could still be broken. These findings highlight the importance of addressing "worst-case" performance in AI systems, where weak adversaries can expose significant vulnerabilities despite strong average performance. The research suggests that while current defences aren't foolproof, ongoing training against a diverse range of attacks could eventually produce more robust AI.
Google DeepMind’s AI Rat Brains Could Make Robots Scurry Like the Real Thing
A team of researchers from Google DeepMind and Harvard University has built a realistic virtual rat, trained on tens of hours of neural recordings from actual rats running around in an open arena. This virtual rat brain can be used by neuroscientists to better understand how brains work and can also be used to create a new generation of embodied agents to control real-world robots.
🤖 Robotics
Here’s what it could cost to hire a Digit humanoid
Recently, Agility Robotics’ Digit became the first humanoid robot to land a commercial job, at a Spanx facility in Flowery Branch, Georgia. This article from The Robot Report asks how much this “small fleet” of humanoid robots costs to run and does some number crunching. Based on a statement from Agility Robotics’ CEO that the company currently charges a “fully loaded $30 per hour,” the article estimates that it would cost $62,400 per year to run a Digit, assuming a 40-hour work week.
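The arithmetic is easy to verify; a quick check of the article's numbers, assuming 52 working weeks per year:

```python
hourly_rate = 30      # USD, the "fully loaded" rate quoted by Agility's CEO
hours_per_week = 40   # the article's assumption
weeks_per_year = 52   # my assumption; the article doesn't state this explicitly

annual_cost = hourly_rate * hours_per_week * weeks_per_year
print(f"${annual_cost:,} per year")  # $62,400 per year, matching the article
```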
Tesla reportedly delaying its robotaxi reveal until October
Those waiting for the long-promised fully autonomous robotaxi service from Tesla will have to wait a bit longer. Bloomberg reports that the reveal of the robotaxi service, originally scheduled for August 8th, has been moved to October.
Google says Gemini AI is making its robots smarter
In a recently published paper, researchers from Google DeepMind describe how they improved robot navigation and task completion using Gemini AI, achieving a 90% success rate with over 50 user instructions in a 9,000-square-foot area. Additionally, Gemini allows users to more easily interact with its RT-2 robots using natural language instructions, though processing commands currently takes 10-30 seconds.
U.S. Marine Corps testing autonomy system for helicopters
The Naval Air Systems Command (NAVAIR) has chosen Near Earth Autonomy to demonstrate an advanced autonomy system for helicopters, targeting the U.S. Marine Corps’ Aerial Logistics Connection (ALC) program. Over the next 20 months, a Leonardo AW139 helicopter will be fitted with an advanced autopilot designed by Honeywell, enabling it to autonomously take off and land, and to transport up to 3,000 lbs (1,360 kg) of cargo over a 200 NM (370 km) radius.
Xiaomi's self-optimizing autonomous factory will make 10M+ phones a year
Xiaomi unveiled its new, 100% automated factory, which will work 24/7 to produce Xiaomi's upcoming foldable phones at a rate of about one every three seconds. The company says that this 80,000-square-meter (860,000-sq-ft) facility is smart enough to diagnose and fix problems, as well as optimize its own processes to "evolve by itself."
Robot Dog Cleans Up Beaches With Foot-Mounted Vacuums
Researchers from the Italian Institute of Technology in Genoa have presented VERO, a robot dog with a vacuum mounted on its back, designed and trained to spot and vacuum up cigarette butts. What’s unique about this robot is that the vacuum cleaner hoses are attached to the robot’s legs, allowing it to suck up cigarette butts with its feet. The researchers see potential in this kind of design and suggest a variety of other use cases, including spraying weeds in crop fields, inspecting cracks in infrastructure, and placing nails and rivets during construction.
🧬 Biotechnology
Lab-Grown Meat for Pets Was Just Approved in the UK
London-based startup Meatly has received approval from UK regulators to sell its lab-grown chicken as an ingredient in pet food. This marks the first approval of a lab-grown pet food ingredient anywhere in the world and allows the company to sell its product to approved pet food manufacturers.
Fats from thin air: Startup makes butter using CO2 and water
Savor is a startup that has developed a thermochemical process to form animal-like fat without the environmental impact associated with dairy or plant-based alternatives. The company is backed by Bill Gates, who highlights the potential of these lab-made fats to reduce carbon footprints, as they produce no greenhouse gases and use minimal water.
Where's the Synthetic Blood?
The creation of synthetic blood is one of the holy grails of biomedical research, as it would alleviate shortages and overcome current barriers to obtaining and storing blood products. In this article from
Thanks for reading. If you enjoyed this post, please click the ❤️ button or share it.
Humanity Redefined sheds light on the bleeding edge of technology and how advancements in AI, robotics, and biotech can usher in abundance, expand humanity's horizons, and redefine what it means to be human.
A big thank you to my paid subscribers, to my Patrons: whmr, Florian, dux, Eric, Preppikoma and Andrew, and to everyone who supports my work on Ko-Fi. Thank you for the support!
My DMs are open to all subscribers. Feel free to drop me a message, share feedback, or just say "hi!"