AI Week In Review 24.02.03
CodeLlama 70B, Bard image gen & ImageFX, Amazon Rufus, Hugging Chat Assistant, AppleGPT, AI2's OLMo, Meta-prompting, SymbolicAI, Neuralink implanted!
AI Tech and Product Releases
After a few slow weeks on the AI product release front, we have a full stack of AI releases and progress.
Meta released Code Llama 70B, the highest-performing open AI coding model yet. They released 3 distinct models under the same license as previous Code Llama models, supporting both research and commercial use:
CodeLlama-70B, the most performant base model for fine-tuning code generation models. A state-of-the-art SQL-coding fine-tune of CodeLlama-70B has already been released on HuggingFace; more will follow.
CodeLlama-70B-Python, a Python-specific model.
CodeLlama-70B-Instruct achieves 67.8 on HumanEval, close to GPT-4’s score, making it one of the highest performing coding models available today.
VentureBeat calls Code Llama 70B “an open-source behemoth to rival private AI development”, and notes that its large 100,000 token context window enables it to process and generate longer and more complex code.
Bard generates images now. Bard, now powered by Gemini Pro, has integrated the new Imagen2 image generation AI. It’s easy to try at bard.google.com. Reviews on X say: it’s good at photo-realistic images, close to Midjourney v6; it won’t follow prompts as faithfully as DALL-E but can do lettering; and it will often refuse to draw images (Pixar-style Trump and Biden cartoons are right out).
Google announced a new way to discover places with generative AI in Google Maps. This new feature suggests places based on user preferences, using generative AI to glean relevant information about 250 million locations.
Amazon's AI shopping assistant Rufus launched into beta this week. Rufus is an AI chatbot trained on Amazon's product catalog, customer reviews, and web information to offer personalized shopping recommendations to customers. Think of it as Copilot or ChatGPT for shopping.
HuggingFace released Hugging Chat Assistant, announcing “Build your own personal Assistant in Hugging Face Chat in 2 clicks!” These are customized chatbots, similar to OpenAI GPTs or the customized AI bots in Meta, with initial Assistants like “Website Designer.”
MacRumors exposes Apple GPT: What We Know About Apple's Work on Generative AI. They are working on LLMs, including an internal chatbot dubbed “Apple GPT”. Apple is aiming to integrate AI with Siri, but Siri’s “cumbersome design” is an impediment. Expect generative AI releases from Apple in 2024, including a big Siri refresh and OS updates.
“I think there’s a huge opportunity for Apple with generative AI and with AI.” - Apple CEO Tim Cook
In the meantime, Apple has released the Vision Pro AR/VR headset, and Apple announced over 600 apps for it. What’s it actually like? Marques Brownlee reviews.
Allen Institute’s AI2 released a new large language model called OLMo 7B, and shared all the software components and their 3 trillion token training dataset, called Dolma, on GitHub and Hugging Face. The OLMo model itself is not better than prior open models like Llama2 7B, but AI2’s new open-source LLM may reset the definition of ‘open AI’.
A powerful open AI model, Miqu-70B, leaked this week, prompting mystery and speculation that was resolved when Mistral’s CEO confirmed the ‘leak’ was one of their internal AI models.
Elon Musk announced the first human received a Neuralink implant, a brain-computer-interface (BCI) chip to allow people to control devices and communicate through thought. Nature discussed what scientists think of the first human trial, including concerns over a lack of transparency.
Top Tools & Hacks
Google’s AI Test Kitchen is Google’s experimental sandbox for pre-released AI tools. They have MusicFX music generation AI there, and recently added ImageFX, based on Imagen2. It’s a good place to play around with these new tools.
AI Research News
China creates world's first AI child which shows human emotion. Developed by the Beijing Institute for General Artificial Intelligence (BIGAI), Tong Tong is a virtual AI agent model designed as a step towards a general artificial intelligence (AGI) agent that can think and reason like a human being:
The aim is to craft an entity endowed with autonomous abilities in perception, cognition, decision-making, learning, execution, and social collaboration.
Simultaneously, it seeks alignment with human emotions, ethics, and moral concepts, embodying a multifaceted and ethically resonant approach to AI.
When Tong Tong was showcased, she displayed behavior and capabilities similar to those of a three- or four-year-old child, as reported by the South China Morning Post.
Materials science work at Berkeley that used DeepMind’s GNoME AI model, and which we wrote about in November, has been called into question here: DeepMind AI GNoME helps cook up 'novel' compounds – with sides of controversy. The work claimed to have made 41 novel materials, but others say they weren’t actually made, based on the diffraction results presented.
New work on Transforming and Combining Rewards for Aligning Large Language Models helps align models across multiple dimensions and properties. The authors transform the reward in a probabilistic way that emphasizes improving poorly-performing outputs, and they aggregate the transformed rewards to favor outputs that are “good” in all of the measured desired properties.
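Here’s a rough sketch of the idea (illustrative only, not the paper’s code; the log-sigmoid transform and the zero reference values are assumptions made for the example):

```python
import numpy as np

def transform_reward(reward: float, reference: float) -> float:
    # Log-sigmoid transform centered at a reference value: once an output is
    # already well above the reference, extra reward adds little, so the
    # objective emphasizes fixing poorly-performing outputs.
    return float(-np.logaddexp(0.0, -(reward - reference)))  # log sigmoid(r - ref)

def combine_rewards(rewards, references):
    # Summing transformed per-property rewards favors outputs that are
    # acceptable on every property over ones that excel on one and fail another.
    return sum(transform_reward(r, ref) for r, ref in zip(rewards, references))

# Illustrative numbers: two properties (say helpfulness and harmlessness), reference 0.
print(combine_rewards([3.0, -1.0], [0.0, 0.0]))  # ~ -1.36: strong on one, weak on the other
print(combine_rewards([1.0, 1.0], [0.0, 0.0]))   # ~ -0.63: decent on both, so preferred
```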
The paper SymbolicAI: A framework for logic-based approaches combining generative models and solvers brings together ideas from formal AI systems and LLMs to enhance the reasoning capabilities of LLMs.
“SymbolicAI enables the seamless integration of generative models with a diverse range of solvers by treating large language models (LLMs) as semantic parsers that execute tasks based on both natural and formal language instructions, thus bridging the gap between symbolic reasoning and generative AI.”
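To make the “LLM as semantic parser” idea concrete, here’s a rough sketch (hypothetical helper names, not the SymbolicAI library’s actual API): the LLM translates a natural-language request into a formal expression, and an exact solver does the reasoning.

```python
import sympy

def call_llm(prompt: str) -> str:
    """Placeholder for an LLM chat-completion call (hypothetical helper)."""
    raise NotImplementedError

def solve_task(natural_language_task: str):
    # 1. LLM as semantic parser: natural language -> formal language.
    formal_expr = call_llm(
        "Translate the following into a single SymPy expression, no prose: "
        + natural_language_task
    )
    # 2. Hand the formal expression to an exact symbolic solver.
    x = sympy.symbols("x")
    return sympy.solve(sympy.sympify(formal_expr), x)

# If the LLM returns "x**2 - 4" for "x squared minus four equals zero",
# the solver yields [-2, 2] -- exact reasoning the LLM alone might fumble.
```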
The paper Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding enhances the output of LLMs like GPT-4 from a single high-level prompt through ‘meta-prompting.’ Meta-prompting solves complex tasks with an LLM as follows (a rough sketch follows the list below):
The LLM breaks down complex tasks into smaller sub-tasks.
Subtasks are then handled by distinct "expert" instances of the same LLM, each given specific, tailored instructions. Experts can include the use of tools, such as a Python interpreter.
The LLM itself acts as the conductor, communicating with these expert models and integrating their outputs.
The LLM employs critical thinking and verification to review, refine and authenticate the final result.
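Here’s that conductor/expert loop in miniature (illustrative only, assuming a generic chat() helper; the prompts and message formats are not the paper’s exact scaffolding):

```python
def chat(system: str, user: str) -> str:
    """Placeholder for a single LLM chat-completion call (hypothetical helper)."""
    raise NotImplementedError

def meta_prompt(task: str, max_rounds: int = 5) -> str:
    history = f"Task: {task}"
    for _ in range(max_rounds):
        # The same LLM acts as conductor: it decides which "expert" to consult
        # next and writes that expert's tailored instructions, or finalizes.
        plan = chat(
            system=("You are the conductor. Reply either 'EXPERT: <name>\\n<instructions>' "
                    "to consult an expert, or 'FINAL: <answer>' once verified."),
            user=history,
        )
        if plan.startswith("FINAL:"):
            return plan[len("FINAL:"):].strip()
        # A fresh instance of the same LLM plays the expert, seeing only the
        # conductor's instructions, not the whole conversation history.
        expert_output = chat(system="You are the expert described in the message.", user=plan)
        history += f"\n\n{plan}\n\nExpert output: {expert_output}"
    return history  # no verified answer within the round budget
```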
This approach builds on and is similar to prior collaborative LLM techniques such as HuggingGPT and AI agent systems such as Autogen. The authors show “the superiority of meta-prompting over conventional scaffolding methods,” and also find that while this works with GPT-4, there are limitations with GPT-3.5, indicating collaborative problem-solving and iterative refinement require GPT-4-level capability.
AI Business and Policy
Microsoft, Amazon, and Alphabet earnings reports this week showed the power of AI. These companies showcased strong financial performance driven by AI adoption and highlighted AI initiatives. Microsoft's Azure Cloud, powered by AI, saw 30% growth, while Google Cloud benefited from AI investments and innovations.
Nvidia is the biggest winner in AI, thanks to their near-monopoly on AI chips. But AMD is challenging Nvidia with new MI300X AI GPUs, a competitor to Nvidia’s H100, and AMD is seeing big demand for the MI300X, forecasting around $2 billion in sales for the AI chip over the coming year.
Bloomberg reports Humanoid Robot Startup Figure AI in Funding Talks With Microsoft, OpenAI. Figure AI hopes to raise up to $500 million.
As these tech giants ramp up spending on AI, both to sell to other companies and to help run and simplify their own internal work, they are, at best, slowing hiring in non-AI areas and, at worst, cutting jobs in those divisions.
EU’s AI Act passes last big hurdle on the way to adoption. This has been expected since the ‘deal’ announced in December. Some thoughts on this below.
AI Opinions and Articles
I couldn’t leave the news above about the EU’s AI Act without a little push-back, and the X peanut gallery has generously provided it. First, here’s the self-congratulatory comment from EU Commissioner Thierry Breton:
“Today all 27 Member States endorsed the political agreement reached in December — recognizing the perfect balance found by the negotiators between innovation & safety.”
- EU Commissioner Thierry Breton
Riposte from Preston Byrne:
Perfect balance? A technology is brand new and you really think you got the regulatory balance right on the first try, under circumstances where the EU has no homegrown tech giants because it over-regulated the Internet the first time around?
From Based Alf, a Belgian:
AI is new and its potential is not even fully known.... On what values or data did you base this decision ? What study did you do to get this conclusion?
ZigaTurk says:
If Europe wanted a world class car industry it would not develop it by focusing on speed limits and other traffic signs. Proposed legislation is harmful.
And as another X user put it “open source/science is all you need.”
A Look Back …
The Neuralink implant completed this week is not the first brain-computer interface. According to this History Of Brain-Computer Interface Technology, electroencephalography (EEG) was invented and used to record brain activity for the first time in 1924. The first use of brain waves to control devices came in 1965:
In 1965, American composer Alvin Lucier used electroencephalogram (EEG) data and analog signal processing techniques to control acoustic instruments, demonstrating the first usage of what would later be called a “Brain-Computer Interface.” This was the first piece of evidence that brain activity could be used to control machines.