AI Week In Review 24.04.27

SenseNova 5.0, Snowflake Arctic, Meta SmartGlass update, custom models for Bedrock, Firefly Image 3, Phoenix Gen 7, Rabbit R1 reviewed, Phi-3. Funding for xAI, FlexAI, Cognition Labs, and Augment.

Apr 28, 2024

Figure 1. Vidu AI video generation screenshot.

AI Tech and Product Releases

Chinese tech firm SenseTime announced SenseNova 5.0. SenseNova 5.0 is a mixture-of-experts LLM with a 200k context window, and it was trained on more than 10 trillion tokens of data. It deserves attention for its claims of beating GPT-4 Turbo across nearly all key benchmarks, including MMLU (84.8), HumanEval (78.0), and GSM8K (92.5).

SenseTime’s model is for the China market, so it won’t be relevant to US consumers, but it is yet another example of an AI competitor achieving a GPT-4-class AI model.

Figure 2. SenseNova 5.0 performance on benchmarks shows it outperforming GPT-4.

Snowflake released Arctic, an enterprise-grade open LLM for complex business tasks. Snowflake’s Arctic delivers top-tier performance in SQL generation, coding, and instruction-following benchmarks. Snowflake touts its efficiency in both training and inference, which comes from a differentiated MoE design with 128 experts, specifically:

Arctic combines a 10B dense transformer model with a residual 128 x 3.66B MoE MLP resulting in 480B total and 17B active parameters chosen using a top-2 gating.

A mixture-of-experts architecture with 128 experts is more extreme than other MoE models (such as Mixtral 8x22B or 8x7B), but Snowflake shows it’s a win for efficiency, achieving the same performance on their desired benchmarks with a fraction of the total training compute. Arctic is an open AI model available on HuggingFace.
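As a quick sanity check on the quoted Arctic numbers, here is a back-of-the-envelope sketch in Python. It assumes the figures are in billions of parameters and that each token uses the dense model plus 2 of the 128 experts (top-2 gating). The variable names are ours, not Snowflake’s, and the router and any other parameters not named in the quote are ignored.

```python
# Back-of-the-envelope check of the Arctic parameter counts quoted above.
# All values are in billions of parameters; names are illustrative only.
dense_b = 10.0        # dense transformer
num_experts = 128     # MoE experts
expert_b = 3.66       # size of each expert MLP
top_k = 2             # experts selected per token (top-2 gating)

# Total parameters: dense model plus every expert.
total_b = dense_b + num_experts * expert_b

# Active parameters per token: dense model plus only the selected experts.
active_b = dense_b + top_k * expert_b

print(f"total  ~ {total_b:.2f}B")
print(f"active ~ {active_b:.2f}B")
print(f"active share ~ {active_b / total_b:.1%}")
```

The totals come out to roughly 478B and 17.3B, consistent with the rounded "480B total and 17B active" in the quote, so under these assumptions only about 3.6% of the model’s parameters are used per token.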

Figure 3. Arctic achieves high enterprise intelligence with efficient training.

Meta adds Styles and AI feature updates for their Ray-Ban Smart Glasses. Meta has added new styles to the Ray-Ban smart glasses collection, including Skyler and Headliner models. In tech and AI features, they are adding video calling with WhatsApp and Messenger, enabling AI instructions like “play music”, and are rolling out Meta AI with Vision which can understand and process visual surroundings.

Amazon Bedrock adds Custom Model Import for Generative AI, in order to host companies’ custom generative AI models. Amazon Web Services (AWS) launched the Custom Model Import feature in Amazon Bedrock to enable organizations to import and access their custom generative AI models as fully managed APIs. This lets proprietary models benefit from the same Bedrock infrastructure as leading generative AI models, such as Llama 3 or Claude 3.

Canadian robot maker Sanctuary AI launched Phoenix Gen 7, the seventh generation of its humanoid robot. It has improvements in its physical design, AI capabilities, and how it's trained:

The time it takes for new tasks to be automated has gone from weeks to less than 24 hours, marking a major inflection point in task automation speed and autonomous system capability

Perplexity launched Enterprise Pro, a B2B solution that promises to revolutionize enterprise research with verifiable answers and multimedia. The company also announced raising $62.7M and expanding strategic partnerships.

Adobe announced Firefly Image 3 Foundation Model, a new version of their image generation AI model that delivers high-quality images with improved detail and variety and better understanding of prompts. Firefly Image 3 is available in beta in Adobe’s Firefly web app.

Cohere releases toolkit to accelerate generative AI app development in the enterprise. The Cohere Toolkit is a collection of pre-built components that make it easy to create and deploy RAG applications, available on GitHub.

ShengShu-AI and Tsinghua University announced Vidu AI, a video generation AI model that can create 16-second HD videos at 1080p resolution. Here are 10 wild examples on X of what Vidu AI can do. See our cover art for a still from one video.

Top Tools & Hacks

The AI gadget Rabbit R1 has arrived and gotten early tests and reviews. Reviewers have complimented the $199 Rabbit R1 device for its design, done by Teenage Engineering, and its basic AI functionality and connectivity. However, it just can’t do much; the main complaint is “where are all the features we were promised?”

Since it’s not quite ready to live up to its own hype, this week’s top tool is not the Rabbit R1, but instead the tiny but mighty Phi-3, which is small enough to fit on your smartphone. I run Phi-3 locally via ollama.
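For readers who want to try the same setup from a script, here is a minimal sketch. It assumes an Ollama server is already running locally on its default port (11434) and that the model has been pulled under the tag "phi3"; both the tag and the helper names below are our assumptions, not details from the article.

```python
import json
import urllib.request

# Assumed default endpoint of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="phi3"):
    """Build a non-streaming generate request (model tag is an assumption)."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_phi3(prompt, model="phi3"):
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=120) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, a call such as ask_phi3("Summarize mixture-of-experts in one sentence.") should return the model’s reply as a string. From the command line, "ollama run phi3" gives an interactive version of the same thing.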

AI Research News

AI research highlights from this week’s AI Research Roundup:

Apple’s OpenELM: An Efficient Language Model Family
Phi-3 Technical Report: A Highly Capable Language Model on Your Phone
AutoCrawler: A Web Agent for Web Crawler Generation
MHMoE: Multi-head mixture of experts
AI and the Problem of Knowledge Collapse
From r to Q∗: Your Language Model is Secretly a Q-Function
CT-Agent: Clinical Trial Multi-Agent with LLM-based Reasoning

The biggest contribution to open AI model development this week was HuggingFace’s release of the 15-trillion-token FineWeb dataset, which we covered in our “Data Is All You Need” article.

AI Business and Policy

Apple could use its own on-device LLM for AI features in iOS 18, says one report; Apple Intensifies Talks With OpenAI for iPhone Generative AI Features, says another. Apple rumors keep swirling, but given Apple’s moves on small AI models (REALM, OpenELM, DarwinAI) and its lack of an in-house GPT-4-class LLM, both moves make sense:

AI features in iOS 18 will run on-device, ensuring data privacy and faster response times.
Apple may partner with cloud-based services for AI features that require more data processing.
iOS 18 will also offer new features like customizable app icons, RCS support, and upgrades to Apple Maps and Notes.

The real story will be known in June at Apple’s WWDC.

Startup funding news:

Elon Musk is reportedly close to securing $6B in funding for xAI, valuing the year-old AI startup at $18B. Notably, on a recent X space, Elon said they'd need 100,000 NVIDIA H100s to train Grok 3.0 (they currently have roughly 20,000).
AI startup Cognition announced a new funding round valuing the 6-month-old company at over $2B. This is despite a backlash over the ‘Devin’ AI coding agent demos.
French startup FlexAI has raised $30 million in funding to “rebuild computing infrastructure” and facilitate AI application development and training.
Nvidia is acquiring AI infrastructure firm Run:ai, which does GPU orchestration for AI workloads, for approximately $700 million to enhance its DGX Cloud AI platform.
AI coding assistant startup Augment, a GitHub Copilot rival, launches out of stealth with $252 million in funding.

Greg Brockman announced on X: “First Nvidia DGX H200 in the world, hand-delivered to OpenAI and dedicated by Jensen.”

Coca-Cola and Microsoft announced a strategic partnership to accelerate cloud and generative AI initiatives. This five-year, $1.1 billion strategic partnership aims to transform Coca-Cola's technology, using Microsoft Cloud.

Moderna and OpenAI promote their collaboration, with Moderna touting how OpenAI's generative AI improves Moderna's operations. OpenAI has a video showing how Moderna’s use of OpenAI’s models accelerates the development of life-saving treatments.

In AI crime news, High School Athletic Director Uses AI to Frame Principal for Making Racist, Antisemitic Remarks. An athletic director used AI to make a deep-fake audio recording of the principal, then spread it to others so it went viral. The fakery was uncovered and the perp arrested, but only after much damage was done and the principal was temporarily removed from his post.

Catholic Answers Renames AI Chatbot After Backlash. An AI bot that took the persona of a Catholic Priest had to be ‘defrocked’ of his AI-imagined post after hallucinating answers and hearing confession. In one case, "The AI priest also told one user that it was okay to baptize a baby in Gatorade." The AI bot is now a lay theologian, so it can give bad answers without it reflecting on the priesthood.

AI Opinions and Articles

The New York Times tells us Meta’s A.I. Assistant Is Fun to Use, but It Can’t Be Trusted:

Despite Mark Zuckerberg’s hope for the chatbot to be the smartest, it struggles with facts, numbers and web search.

It’s true, but that’s not unique to Llama 3; no AI chatbot is a search engine.

Reid Hoffman created a digital replica of himself called REID AI that can look, talk, and sound just like him. Then he interviewed his AI digital twin. The interview is here and covers several topics in and around AI. It’s an impressive demonstration.

AI Changes Everything
