Small language models are coming - Weekly News Roundup - Issue #446
Plus: UK court rules that AI cannot be an inventor; the age of Crispr medicine is here; what if someone steals GPT-4?; three Spots making art; and more!
Welcome to Weekly News Roundup Issue #446. Today, we will take a closer look at the emerging trend of small language models. In other news, a UK court declared that AI cannot be listed as an inventor and Meta announced Purple Llama, a new initiative aimed at promoting open trust and safety tools in generative AI development. Other stories featured in this week’s news roundup include: the age of Crispr medicine is here; a brain-computer interface that is already on the market and is easy to insert; three Spots make art; and more!
Generally speaking, larger models tend to perform better. A 2020 paper from OpenAI demonstrated that the performance of large language models depends on three factors: the number of model parameters (excluding embeddings), the size of the dataset, and the amount of computing power used for training. This led to the development of enormous models with over 100 billion parameters. For instance, GPT-3 has 175 billion parameters, while Google's first PaLM model boasted 540 billion parameters, reportedly reduced to 340 billion in its second version, PaLM 2 (Google has not confirmed that figure). The exact parameter counts for top models like GPT-4, Claude, or the recently released Gemini from Google are unknown, but it's reasonable to assume they are in the hundreds of billions, if not over a trillion.
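To get a feel for why scaling pays off but with diminishing returns, here is a minimal sketch of the power-law relationship described in that 2020 paper. It assumes the loss falls as a power of the parameter count when data and compute are not the bottleneck. The exponent value and the function name are illustrative assumptions, not figures taken from this newsletter.

```python
# Toy illustration of power-law scaling: loss is proportional to N ** (-alpha).
# ALPHA_N is an assumed, approximate exponent for parameter count;
# treat it as illustrative rather than authoritative.
ALPHA_N = 0.076


def loss_ratio(scale_factor: float, alpha: float = ALPHA_N) -> float:
    """Return new_loss / old_loss when a resource grows by scale_factor."""
    return scale_factor ** (-alpha)


# How much does the loss shrink when the model gets 10x, 100x, 1000x bigger?
for factor in (10, 100, 1000):
    print(f"{factor:>5}x parameters -> loss multiplied by {loss_ratio(factor):.3f}")
```

Under these assumptions, a tenfold increase in parameters lowers the loss by only about 16%, which is why each further gain from scale alone gets more expensive.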
However, simply increasing the number of parameters, training data, or computing power is not always the best approach. Larger models require more resources and time for training, increasing the cost. Additionally, these huge models are more resource-intensive to operate.
But in 2024, we might witness a reversal of this trend. Recent developments have shown that fine-tuning can lead to smaller models outperforming larger ones in specific areas. We have already seen early examples of what could be a significant shift in 2024: improved training methods and more efficient architectures. The recently released Phi-2 model from Microsoft is a good example of the former. This 2.7 billion-parameter open-source model matches or outperforms models up to 25 times larger, thanks to more efficient training methods. Mixtral, on the other hand, shows how a well-implemented architecture can lead to improved performance. Mixtral is a sparse Mixture of Experts model built on the Mistral 7B architecture (Mistral 7B on its own is a very good and popular open-source model). Rather than eight complete models running side by side, each layer contains eight expert blocks, and a router sends every token to only two of them, so only a fraction of the model's roughly 47 billion parameters is used for any given token. According to Hugging Face benchmarks, Mixtral currently ranks as the best open-source model, on par with GPT-3.5.
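To make the Mixture of Experts idea more concrete, here is a toy sketch of a single sparse layer with top-2 routing. It is not Mixtral's actual implementation. The tiny dimensions, the random linear "experts", and all names (`make_expert`, `moe_layer`, `router`) are hypothetical and chosen only for illustration.

```python
import math
import random

NUM_EXPERTS = 8   # experts available in the layer
TOP_K = 2         # experts actually run for each token
DIM = 4           # toy hidden size (real models use thousands)

random.seed(0)


def make_expert(dim):
    """Build a toy expert: a random linear map standing in for a feed-forward block."""
    weights = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]

    def expert(x):
        return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

    return expert


experts = [make_expert(DIM) for _ in range(NUM_EXPERTS)]
# The router holds one weight vector per expert and scores each token against it.
router = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]


def softmax(values):
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def moe_layer(x):
    """Route one token vector to the TOP_K best-scoring experts and mix their outputs."""
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in router]
    top = sorted(range(NUM_EXPERTS), key=lambda i: scores[i], reverse=True)[:TOP_K]
    gates = softmax([scores[i] for i in top])  # mixing weights over the chosen experts
    output = [0.0] * DIM
    for gate, idx in zip(gates, top):
        expert_out = experts[idx](x)           # only the chosen experts are evaluated
        output = [o + gate * e for o, e in zip(output, expert_out)]
    return output, top


token = [0.5, -1.0, 0.25, 2.0]
output, chosen = moe_layer(token)
print("experts used for this token:", chosen)
```

The point of the design is that the six unselected experts are never evaluated, so the cost per token stays close to that of a much smaller dense model even though the total parameter count is large.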
These promising results could mark the beginning of a broader trend: moving away from training ever-larger models towards more efficient, smaller models or architectures. The advantages of this approach include smaller, easier-to-operate, and more cost-effective models. This also improves accessibility and opens opportunities for wider use, particularly benefiting the open-source community. It demonstrates that with the right architecture, open-source models can rival top proprietary models.
While it's theoretically possible to create bigger and more capable models, there comes a point where making larger models is not practical. 2023 was the year of large language models. 2024 might well be the year of small but very capable language models.
If you enjoy this post, please click the ❤️ button or share it.
I warmly welcome all new subscribers to the newsletter this week. I’m happy to have you here and I hope you’ll enjoy my work. A heartfelt thank you goes to everyone who joined as paid subscribers this week.
The best way to support the Humanity Redefined newsletter is by becoming a paid subscriber.
If you enjoy and find value in my writing, please hit the like button and share your thoughts in the comments. Please consider sharing this newsletter with others who might also find it valuable.
For those who prefer to make a one-off donation, you can 'buy me a coffee' via Ko-fi. Every coffee bought is a generous support towards the work put into this newsletter.
Your support, in any form, is deeply appreciated and goes a long way in keeping this newsletter alive and thriving.
🦾 More than a human
The Brain-Implant Company Going for Neuralink’s Jugular
This article from IEEE Spectrum introduces Synchron, a company creating brain-computer interfaces. Synchron's implants, known as Stentrode, are already available on the market and are used by patients suffering from locked-in syndrome to communicate with the world. What’s interesting about Stentrode is how it is inserted. A 16-electrode array is inserted into the jugular vein in the neck and then snaked up to a blood vessel near the brain's motor cortex. Upon reaching its destination, the Stentrode unfolds and begins reading electrical signals from nearby neurons. Compared to its competitors, such as Neuralink, Stentrode offers a lower resolution (16 electrodes versus 1024 in Neuralink's device). However, it has one big advantage: it does not require open brain surgery and is easier to implant.
MIT designs robotic heart chamber
MIT engineers have developed a robotic replica of the heart's right ventricle, designed to mimic the heart's beating and blood-pumping actions. This "robo-ventricle" combines real heart tissue with synthetic, balloon-like muscles, allowing precise control over contractions and observation of natural valve functions. The team's study offers a promising avenue for advancing cardiac care and device development. Their long-term vision includes pairing it with a similar model of the left ventricle to create a fully tunable, artificial heart.
🧠 Artificial Intelligence
AI cannot be patent 'inventor', UK Supreme Court rules in landmark case
Stephen Thaler, a US computer scientist, has lost his bid to register patents for inventions created by his artificial intelligence system in a landmark case in Britain about whether AI can own patent rights. The UK's Intellectual Property Office refused Thaler's patent applications because the rules stipulate that an inventor must be a human or a company, not a machine. Thaler had previously lost a similar case in the United States. The article notes that courts in Europe, Australia, and the US have made similar rulings: an inventor must be a natural person. However, Justice David Kitchin clarified that the ruling did not address whether technical advances made autonomously by AI should be patentable, focusing only on the definition of an "inventor."
Announcing Purple Llama: Towards open trust and safety in the new world of generative AI
Meta has announced Purple Llama, a new initiative aimed at promoting open trust and safety tools in generative AI development. The initiative includes the release of CyberSec Eval, a set of cybersecurity benchmarks for LLMs, and Llama Guard, a safety classifier for input/output filtering. The project collaborates with major industry players like AI Alliance, AMD, AWS, Google Cloud, and others, to enhance and share these tools with the open-source community.
Offering a good and useful AI model can provide an enormous advantage for a company or a state. One way for others to catch up is to invest in developing their own models. Another option is to simply steal the model. In this video, Asianometry explores how cybercriminals could steal an AI model and what companies can do to prevent this from happening.
Artificial intelligence makes gripping of prosthetic hands more intuitive
Researchers from the Technical University of Munich have combined a network of 128 sensors and AI-based techniques to interpret muscle activations, enabling more fluid hand and wrist movements in advanced prosthetic hands. Previously, prosthetic hand control was limited to simpler movements using just a couple of sensors. With the help of new learning algorithms, researchers can better interpret muscle activities and translate them into seamless, intuitive movements of prosthetics.
Meet the artist training Spot robots to make their own art
Agnieszka Pilat, a Polish-born artist based in the US, uses three Boston Dynamics Spots to create art. The robots, Basia, Omuzana, and Bunny, have distinct personalities and roles. Basia, living with Pilat in New York, is a painter. Omuzana acts as a protector, observing and overseeing the exhibit. Bunny, the youngest and lightest, enjoys posing for selfies with attendees. In the current exhibit, named “Heterobota”, the robots autonomously compose paintings using 16 characters designed by Pilat, blending software, robotics, machine learning, and generative AI (but avoiding pure generative AI due to controversy in the art world).
New robotic system assesses mobility after stroke
A team of computer science and biokinesiology experts has developed a robotic tool to assist stroke survivors in accurately tracking their recovery progress. The new technology utilizes a robotic arm to gather precise 3D spatial data on arm usage. Machine learning processes this data to create an "arm nonuse" metric, aiding clinicians in assessing rehabilitation progress. The system, tested with stroke survivors, involves a socially assistive robot (SAR) that provides instructions and encouragement. This novel approach, combining quantitative data collection with a motivating SAR, offers a more accurate and engaging method for stroke patient assessment. The technology's potential for personalized rehabilitation could significantly enhance the recovery process for stroke survivors.
XtalPi and ABB Robotics to automate laboratory workstations with GoFa cobots
XtalPi and ABB Robotics have announced a strategic partnership to develop automated laboratory workstations in China, combining ABB’s GoFa collaborative robots with XtalPi's software. This collaboration aims to enhance research and development productivity in fields like biopharmaceuticals, chemical engineering, and new-energy materials. The global market for laboratory robotics is expected to grow significantly, with XtalPi and ABB playing a key role in this advancement.
The Age of Crispr Medicine Is Here
The recent approvals of Crispr-based therapies by authorities in the UK, EU and the US mark the beginning of a new era in Crispr medicine. The treatment, now approved under the brand name Casgevy, is the first publicly available Crispr-based medical treatment. It represents a significant breakthrough in treating sickle cell disease and opens the possibility for more Crispr-based gene therapies to come soon. However, the long-term effects of Crispr treatments are still unknown, and further research and patient monitoring are necessary to fully understand their impact.
Thanks for reading. If you enjoyed this post, please click the ❤️ button or share it.
Humanity Redefined sheds light on the bleeding edge of technology and how advancements in AI, robotics, and biotech can usher in abundance, expand humanity's horizons, and redefine what it means to be human.
Humanity Redefined is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.
A big thank you to my paid subscribers, to my Patrons: whmr, Florian, dux, Eric, Preppikoma and Andrew, and to everyone who supports my work on Ko-Fi. Thank you for the support!