Nvidia aims to become the world's first AI foundry - Weekly News Roundup - Issue #459
Plus: Microsoft + Inflection AI; Apple MM1; Grok is open source; a human with a Neuralink implant plays chess and Civilization; employees at top AI labs fear safety is an afterthought; and more!
Hello and welcome to Weekly News Roundup Issue #459. This week was dominated by Nvidia’s GTC event, during which Nvidia showed and announced numerous new AI products and services.
However, that was not the only notable development this week. In the world of AI, the founders of Inflection joined Microsoft, Elon Musk open-sourced Grok, and employees at top AI labs fear safety is an afterthought. Apple demonstrated their multimodal language model and discussed with Google the potential use of Gemini. Meanwhile, Neuralink introduced the first human with their neural implant, who demonstrated how he can play games using the Neuralink device.
Before we dive into this week’s news roundup, I want to acknowledge that this issue is being released later than usual. A combination of events has prevented me from delivering not only the Weekly News Roundup on time but also the in-depth articles. However, that is always an opportunity to review my processes and make changes to minimise the chances of this happening again.
I hope you'll enjoy this week's issue.
2023 was a great year for Nvidia. Every AI company, from large players like OpenAI and Microsoft to newly founded startups, was vying for Nvidia’s top H100 or at least A100 GPUs. Riding the AI wave, the company exceeded expectations quarter after quarter in 2023, became a key supplier of highly sought-after GPUs for training top AI models, and joined the elite club of companies valued at over one trillion dollars. All just in time for the company to celebrate the 30th anniversary of its founding.
Against this backdrop, Nvidia held GTC this week, inviting everyone to join and see what new products and services the company is releasing this year. The most anticipated release was the new generation of high-end GPUs based on the new Blackwell architecture, but Nvidia also surprised with some other interesting products and services.
I noticed two themes after watching Jensen Huang’s two-hour-long, densely packed keynote (which also contained Jensen’s bad jokes and the occasional cringe). First, on the hardware side, Nvidia decided to go bigger. Bigger chips, bigger computers, more memory and more computing power to train even bigger AI models. Secondly, Nvidia is refocusing on combining hardware and software to become what Jensen Huang called the world’s first AI foundry.
“We need bigger GPUs”
The star of the show was Blackwell, Nvidia’s newest GPU architecture. Unlike previous generations of Nvidia GPUs, a Blackwell GPU consists of two dies connected to form one massive chip, making it Nvidia’s first multi-die GPU. Each die has 104 billion transistors, for a total of 208 billion. For comparison, the previous Hopper-based H100 has 80 billion transistors. According to Huang, the two dies are connected in such a way that they behave as one big chip, not two separate chips. In total, a Blackwell GPU offers 20 petaFLOPS of AI performance and 192GB of HBM3e memory with 8 TB/s of memory bandwidth.
Next, two Blackwell GPUs can be combined with one Grace CPU to create the GB200 superchip. Two GB200 superchips go inside a Blackwell compute node, and these nodes, together with Nvidia’s NVLink switches, are the basic building blocks of the Nvidia DGX SuperPOD, which offers a total of 11.5 exaFLOPS of AI performance (at FP4 precision). These DGX SuperPODs can then be scaled up even further to create the “full datacenter” with 32,000 Blackwell GPUs, offering 645 exaFLOPS of AI performance, 13 petabytes of memory and superfast networking connecting all these chips. According to Nvidia, the DGX SuperPOD is ready to process trillion-parameter models.
That’s a lot of numbers and names thrown at you. Tera this, peta that. This part of the keynote gives a nice visual explanation of how Nvidia scales from a single Blackwell chip up to the full datacenter.
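The scaling figures above hold up to simple back-of-the-envelope arithmetic (all numbers as quoted in the keynote, FP4 “AI FLOPS” throughout):

```python
# Back-of-the-envelope check of Nvidia's scaling figures (FP4 "AI FLOPS").
PFLOPS_PER_GPU = 20  # one Blackwell GPU: 20 petaFLOPS

# Full datacenter: 32,000 Blackwell GPUs.
datacenter_eflops = 32_000 * PFLOPS_PER_GPU / 1_000  # petaFLOPS -> exaFLOPS
print(datacenter_eflops)  # 640.0 -- in line with the quoted ~645 exaFLOPS

# Working backwards, a DGX SuperPOD at 11.5 exaFLOPS implies roughly
# 11,500 petaFLOPS / 20 petaFLOPS per GPU = 575 Blackwell GPUs.
superpod_gpus = 11_500 / PFLOPS_PER_GPU
print(superpod_gpus)  # 575.0
```

The small gap between 640 and the quoted 645 exaFLOPS simply reflects a per-GPU rate slightly above a round 20 petaFLOPS.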
To better illustrate what kind of jump in performance Blackwell is bringing to the table, Jensen Huang shared that training what was labelled as GPT-MoE-1.8T on one of the slides—which is almost certainly GPT-4, thus confirming the rumoured structure of OpenAI’s current top model—required 90 days and 8,000 H100 GPUs. With Blackwell, the same task would only need 2,000 GPUs in the same timeframe. Moreover, a Blackwell-based supercomputer would also be more energy-efficient; instead of the 15MW consumed by the Hopper-based system, a Blackwell-based system would require just 4MW.
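Expressed as plain arithmetic, the quoted comparison amounts to a 4x reduction in GPU count and a 3.75x reduction in power draw for the same 90-day run:

```python
# The quoted GPT-MoE-1.8T training comparison as simple arithmetic.
h100_gpus, h100_mw = 8_000, 15            # Hopper-based run: 8,000 GPUs, 15 MW
blackwell_gpus, blackwell_mw = 2_000, 4   # Blackwell run: 2,000 GPUs, 4 MW

gpu_reduction = h100_gpus / blackwell_gpus   # 4.0x fewer GPUs
power_reduction = h100_mw / blackwell_mw     # 3.75x less power

# Energy over the full 90-day run (MW x hours = MWh)
days = 90
h100_mwh = h100_mw * days * 24          # 32,400 MWh
blackwell_mwh = blackwell_mw * days * 24  # 8,640 MWh
```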
The Blackwell system introduces several interesting features. With the RAS Engine, the GB200 superchip can monitor the system, perform self-tests and diagnostics, and notify users if something is about to go wrong, thus improving the utilization of the entire supercomputer. The GB200 superchip also introduces a second-generation Nvidia Transformer Engine, which can automatically recast numbers into the lower-precision 4-bit FP4 format, speeding up calculations. Additionally, it includes built-in security features to encrypt data at rest, in transit, and during computation.
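The idea behind recasting to lower precision can be illustrated with a toy quantiser. This is a sketch of the general technique only: it uses simple 4-bit signed integers with one scale factor, whereas Nvidia’s Transformer Engine works with FP4 floating-point values and its own scaling scheme.

```python
# Toy illustration of low-precision recasting: map floats to 4-bit signed
# integers (range -8..7) plus a single scale factor. NOT Nvidia's actual
# FP4 Transformer Engine -- just the general quantisation idea.

def quantise_int4(values):
    """Map floats to 4-bit ints plus one shared scale factor."""
    scale = max(abs(v) for v in values) / 7  # 7 = largest positive int4
    q = [max(-8, min(7, round(v / scale))) for v in values]
    return q, scale

def dequantise(q, scale):
    """Recover approximate floats from the 4-bit values."""
    return [x * scale for x in q]

weights = [0.82, -0.41, 0.05, -0.77]
q, s = quantise_int4(weights)
approx = dequantise(q, s)
# Each 4-bit value needs a quarter of the memory of FP16,
# at the cost of some rounding error in the recovered values.
```

The pay-off is that every value fits in 4 bits instead of 16 or 32, which is where the memory and throughput gains come from.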
Even though Blackwell was the star of the show, Nvidia also showcased interesting developments in chip-to-chip communication. Modern AI models are too large to fit into a single GPU and need to be distributed across multiple GPUs. However, these GPUs must communicate with each other and exchange vast amounts of data, creating a bottleneck that limits the entire system's performance.
To solve this bottleneck, Nvidia built the NVLink Switch Chip which allows every GPU in the system to communicate with any other GPU, effectively turning the entire system into one big GPU.
For more details on Nvidia’s new Blackwell architecture and how it compares to previous generations, I recommend checking out this deep dive from AnandTech, which compares all the numbers alongside excellent in-depth technical analysis.
NIMs
In addition to accelerating the AI industry with hardware, Nvidia also aims to support the industry with software. Setting up AI models can be challenging, especially for companies lacking in-house AI expertise. To address this issue and simplify the deployment of AI models, Nvidia introduced NIMs.
NIMs, short for Nvidia Inference Microservices, are pre-trained models packaged and ready to use for various tasks. The goal is to deliver easy-to-use AI models as containers that enterprises can deploy straight away or use as a base for custom AI workflows.
NIM currently supports models from Nvidia, AI21, Adept, Cohere, Getty Images, and Shutterstock, as well as open models from Google, Hugging Face, Meta, Microsoft, Mistral AI, and Stability AI. Nvidia is collaborating with Amazon, Google, and Microsoft to make these NIM microservices available on their platforms. Additionally, NIMs will be integrated into frameworks like Deepset, LangChain, and LlamaIndex.
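NIM containers are reported to expose an OpenAI-compatible HTTP API once running. A minimal sketch of what calling one might look like follows; the endpoint URL, port, and model name are illustrative assumptions, not confirmed details from the announcement:

```python
# Hypothetical sketch of calling a locally running NIM container.
# The URL, port, and model name are illustrative assumptions.
import json
import urllib.request

def build_chat_request(model, prompt):
    """Build an OpenAI-style chat completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 64,
    }

payload = build_chat_request("meta/llama-2-70b", "Summarise this week's AI news.")
request = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",  # assumed local NIM endpoint
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(request) would send the call once a NIM is running.
```

The appeal of the OpenAI-compatible shape is that existing tooling built around chat-completion endpoints could point at a local container with only a URL change.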
No one expected humanoid robots
One development I didn't anticipate was Nvidia jumping onto the humanoid robotics hype train. Humanoid robots are just around the corner, with some already being tested by companies like Amazon, BMW, and Mercedes-Benz. Nvidia sensed an opportunity here and introduced Project GR00T, a foundation model for humanoid robots. GR00T (short for "Generalist Robot 00 Technology" and spelt with zeroes to avoid any copyright issues with Disney) is intended as a starting point for these robots. Robots equipped with GR00T are designed to understand natural language and emulate human movements by observation—quickly acquiring coordination, dexterity, and other skills necessary to navigate, adapt, and interact with the real world, according to Nvidia.
In addition to GR00T, Nvidia announced a new computing platform named Jetson Thor, designed specifically for humanoid robots. The company states that this new platform, based on the Nvidia Blackwell architecture, can perform complex tasks and interact safely and naturally with both people and machines. It features a modular architecture optimized for performance, power, and size.
The Robot Report has a very good in-depth article looking at all things robotics, including GR00T, announced at GTC 2024. If you're interested in learning more about what Nvidia has to offer for robotics, this is a great resource.
Nvidia as an AI Foundry
The generative AI revolution (or bubble, depending on your perspective) propelled Nvidia to become the third most valuable company in the world, valued at $2.357 trillion, trailing only Apple and Microsoft.
Nvidia sees its future in AI. The company has effectively become the sole provider of hardware used to train the world's most advanced models and continues to deliver even more powerful hardware for training and deploying these sophisticated models. However, Nvidia does not want to stop at hardware. It also aims to introduce its own proprietary software solutions to help businesses fully leverage AI. By combining both AI hardware and software, Nvidia aspires to become the world’s first AI foundry.
One thing is certain: the new hardware presented at GTC 2024 will enable the training and deployment of even more powerful AI models. Who knows, maybe there are already a couple of thousand Blackwell GPUs in Azure data centres crunching numbers for the upcoming GPT-5.
If you enjoy this post, please click the ❤️ button or share it.
Do you like my work? Consider becoming a paying subscriber to support it
For those who prefer to make a one-off donation, you can 'buy me a coffee' via Ko-fi. Every coffee bought is a generous support towards the work put into this newsletter.
Your support, in any form, is deeply appreciated and goes a long way in keeping this newsletter alive and thriving.
🦾 More than a human
Watch Neuralink’s First Human Subject Demonstrate His Brain-Computer Interface
In a brief livestream on X, Neuralink introduced the first human with their neural implant, who demonstrated playing online chess and the video game Civilization using the Neuralink device. It is the first time Neuralink has shown its implant working in a human since Elon Musk announced the successful implantation of the device at the beginning of the year.
How one streamer learned to play video games with only her mind
Meet Perrikaryal, a Twitch streamer who plays various games using only her mind. Using a non-invasive BCI headset, she has successfully played Minecraft, Elden Ring, Trackmania, Palworld, and Tetris using only her thoughts, without touching any controllers.
The quest to legitimize longevity medicine
The field of longevity has come a long way from the fringes of science to where it is now. However, much remains to be done for longevity to be recognized as a credible medical field. There is disagreement on how ageing should be assessed and treated. When treatments are available, they are expensive and only accessible to the wealthy. Moreover, a number of longevity clinics have recently opened, ranging from high-end spas offering beauty treatments to offshore clinics providing unproven stem cell therapies.
First Genetically Engineered Pig Kidney Transplanted into Living Patient
eGenesis successfully transplanted the first genetically engineered pig kidney into a human. According to Massachusetts General Hospital, where the operation took place, the patient is recovering well. This milestone, enabled by advanced genome editing, could significantly reduce waitlist mortality and address the critical organ supply gap. The operation heralds a new era in transplant medicine, offering hope to the millions suffering from kidney failure worldwide.
🧠 Artificial Intelligence
Mustafa Suleyman, DeepMind and Inflection Co-founder, joins Microsoft to lead Copilot
Two out of three founders of Inflection, Mustafa Suleyman and Karén Simonyan, are joining Microsoft to form a new organization called Microsoft AI, focused on advancing Copilot and other consumer AI products and research. Inflection is building Pi, a personal AI assistant, which recently received an update. The news that Suleyman and Simonyan are joining Microsoft might be the first sign of the upcoming consolidation in the AI industry.
Grok is open-source now
Last week, Elon Musk promised to open-source Grok, xAI’s large language model, as part of his ongoing feud with OpenAI. Musk kept his word, and Grok’s source code is now publicly available on GitHub, with the weights available on Hugging Face. However, the training data used to train this 314-billion-parameter model has not been released, which undercuts the claim that Grok is fully open source.
Apple’s MM1 AI Model Shows a Sleeping Giant Is Waking Up
Researchers from Apple have published a paper describing MM1, Apple’s first generative AI model which can work with text and images. The model can answer questions about photos and displays the kind of general knowledge skills shown by chatbots like ChatGPT. The model’s name is not explained but could stand for MultiModal 1. MM1 is another signal that Apple is gearing up to join the generative AI race later this year.
Apple Is in Talks to Let Google Gemini Power iPhone AI Features
According to Mark Gurman, a well-respected journalist focusing on Apple, Apple is in talks with Google to use Gemini, Google's latest AI model, to power generative AI features on Apple devices. Additionally, the company has also held discussions with OpenAI to potentially use GPT models, Bloomberg reported.
Employees at Top AI Labs Fear Safety Is an Afterthought, Report Says
A report surveying over 200 experts from leading AI labs such as OpenAI, Google DeepMind, Meta, and Anthropic reveals that workers at these companies have significant concerns about the safety of their work and the incentives driving their leadership. The report highlights a "lax approach to safety," stemming from a desire not to slow down the labs' efforts to build more powerful systems. Other respondents shared concerns about insufficient containment measures to prevent an AGI from escaping their control, as well as cybersecurity issues, despite these companies publicly claiming to take security seriously.
India asks tech firms to seek approval before releasing 'unreliable' AI tools
India now requires tech companies to get government approval before releasing "unreliable" or trial-phase AI tools, emphasizing the need for clear labelling about their potential inaccuracies. This move, reflecting wider efforts to regulate AI and social media, came after Google's AI tool controversially depicted Prime Minister Modi. The directive, aiming to safeguard electoral integrity, coincides with upcoming elections where the ruling party is expected to dominate.
Google DeepMind’s new AI assistant helps elite soccer coaches get even better
After conquering protein folding and weather prediction, Google DeepMind has released a new AI model named TacticAI. TacticAI is an AI football coach assistant designed to predict the outcome of corner kicks and provide realistic and accurate tactical suggestions in football matches. It also recommends where to position players during a corner to maximize their effectiveness, as well as the best combination of players to put up front.
‘A landmark moment’: scientists use AI to design antibodies from scratch
Researchers have used generative AI to create entirely new antibodies for the first time. The team from the University of Washington adapted the AI tool RFdiffusion to design antibodies targeting specific pathogens, such as SARS-CoV-2. Despite a low success rate (one in 100), this proof-of-concept represents a promising step toward AI-assisted pharmaceuticals, demonstrating the potential of AI in generating new therapeutic antibodies, a market worth hundreds of billions of dollars.
If you're enjoying the insights and perspectives shared in the Humanity Redefined newsletter, why not spread the word?
🤖 Robotics
Autonomous auto racing promises safer driverless cars on the road
Nothing advances technology like good competition. This is the idea that led teams from multiple universities to compete against each other in racing with autonomous IndyCars at speeds over 140 mph (around 225 km/h). Apart from being a fun challenge for the students, this competition can also help advance the research and development of self-driving cars, making them safer for everyone on the road.
A snake-like robot designed to look for life on Saturn's moon
The Exobiology Extant Life Surveyor, or EELS for short, is a 4-meter-long, snake-like robot designed to explore Saturn's moon, Enceladus. Designed for icy terrains and capable of autonomous navigation, robots like EELS could one day search for signs of life beneath the moon's surface. The robot features a corkscrew design for mobility and has been successfully tested under extreme conditions, including temperatures of –198°C.
Thanks for reading. If you enjoyed this post, please click the ❤️ button or share it.
Humanity Redefined sheds light on the bleeding edge of technology and how advancements in AI, robotics, and biotech can usher in abundance, expand humanity's horizons, and redefine what it means to be human.
A big thank you to my paid subscribers, to my Patrons: whmr, Florian, dux, Eric, Preppikoma and Andrew, and to everyone who supports my work on Ko-Fi. Thank you for the support!
My DMs are open to all subscribers. Feel free to drop me a message, share feedback, or just say "hi!"