AI Week in Review 23.09.16
Apple's iPhone 15 and Watch 9 are AI-powered, EvoDiff protein gen, NExT-GPT, MADLAD-400 Dataset, Databricks raises funds, Zuck on open source AI
Cover Art is a snapshot from a cool AI-generated video from SandyToes2211.
Top Tools
HeyGen enables AI-powered video generation from scripts, with avatars that lip-sync very well. One user used HeyGen to generate a translated video of themselves speaking French, with convincingly good lip-sync. It is another useful tool for content creators.
AI Tech and Product Releases
Apple 2023 launch announcements for the iPhone 15 and Apple Watch 9 included a lot of improved AI under the hood:
Apple Watch Series 9: It has a powerful new chip with a twice-as-fast, 35 trillion OPS neural engine and a beefier GPU. The Apple Watch can run AI models on its on-chip neural engine, so more Siri requests are processed on the Watch instead of in the cloud, resulting in faster responses.
iPhone 15: Without highlighting the AI that powers them, Apple touts many AI-driven features: voice isolation, live voicemail transcription, personalized voices, more accurate predictive typing, and camera features that require AI.
One takeaway: "Apple doesn't like mentioning AI on conference calls or product events, which has led to speculation that the company is falling behind when it comes to capitalizing on the new paradigm. The reality is Apple is aggressively pursuing AI." - Gene Munster, Deepwater Asset Management.
Adobe announced its AI application Firefly is out of beta and is fully available in the company’s Creative Cloud apps.
More pre-release progress on Google’s Gemini: “Google has given a small group of companies access to an early version of Gemini,” The Information reported.
Microsoft open sources EvoDiff, a novel protein-generating AI. EvoDiff is based on a 640 million parameter diffusion model, similar to modern image-generation AI models. Kevin Yang, co-creator of EvoDiff, said, “We envision that EvoDiff will expand capabilities in protein engineering beyond the structure-function paradigm towards programmable, sequence-first design.”
AI Research News
Würstchen is a fast diffusion model for text-to-image generation that needs only a fraction of the compute Stable Diffusion uses for training and image generation. The novelty of the method is using a VQGAN to compress images into a very small latent space, achieving a high compression ratio (40x), which in turn makes the process much more efficient.
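To get a feel for why that compression matters, here is a rough back-of-the-envelope sketch. It rests on assumptions that are not from the paper summary above: that the 40x figure is a per-side spatial factor, that the input is a 1024x1024 image, and that the comparison point is an 8x per-side latent like the one Stable Diffusion's autoencoder uses. The helper `latent_grid` is a hypothetical name used only for illustration.

```python
import math


def latent_grid(image_side, compression):
    """Return (side, positions) of a square latent grid, rounding the side up."""
    side = math.ceil(image_side / compression)
    return side, side * side


# Assumed 8x per-side factor for a Stable Diffusion-style latent.
sd_side, sd_positions = latent_grid(1024, 8)

# The roughly 40x figure quoted above, read here as a per-side factor (assumption).
w_side, w_positions = latent_grid(1024, 40)

# How many times fewer latent positions the diffusion model has to process.
ratio = sd_positions / w_positions

if __name__ == "__main__":
    print(f"8x latent:  {sd_side}x{sd_side} = {sd_positions} positions")
    print(f"40x latent: {w_side}x{w_side} = {w_positions} positions")
    print(f"reduction:  about {ratio:.0f}x fewer positions")
```

Under those assumptions the diffusion model works on a far smaller grid, which is where the savings in training and generation compute would come from.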
Shared via arankomatsuzaki on X (Twitter), the paper Generative Image Dynamics presents a way to create natural movement of objects in images. The resulting AI-generated motion is stunningly realistic. It’s worth checking the video to see.
NExT-GPT: Any-to-Any Multimodal LLM presents an any-to-any multi-modal LLM system that combines other LLMs and other AI models to create an end-to-end multi-modal AI model that can “perceive inputs and generate outputs in arbitrary combinations of text, images, videos, and audio.” The additional tuning for the combined system is low-cost (about 1% of the total parameters) and shows a path to building an “AI agent capable of modeling universal modalities.”
If nothing else, the NExT-GPT example suggests that a multi-modal AI model could be developed as a federated AI model combining smaller sub-models, and need not be a singular unified foundational AI model.
Google researchers have created MADLAD-400: A Multilingual And Document-Level Large Audited Dataset, a manually audited text dataset built from CommonCrawl that spans 419 languages. In addition, they trained and released “a 10.7B-parameter multilingual machine translation model on 250 billion tokens covering over 450 languages using publicly available data, and find that it is competitive with models that are significantly larger.”
This working paper about AI’s impact on work is a mouthful: “Navigating the Jagged Technological Frontier: Field Experimental Evidence of the Effects of AI on Knowledge Worker Productivity and Quality.” Their results show that AI improves worker productivity across an expanding but uneven set of tasks, and that it can enhance productivity “without substantial organizational or technological investment.” Bottom line: AI helps productivity.
We suggest that the capabilities of AI create a “jagged technological frontier” where some tasks are easily done by AI, while others, though seemingly similar in difficulty level, are outside the current capability of AI. For each one of a set of 18 realistic consulting tasks within the frontier of AI capabilities, consultants using AI were significantly more productive (they completed 12.2% more tasks on average, and completed tasks 25.1% more quickly), and produced significantly higher quality results (more than 40% higher quality compared to a control group).
In “Large Language Models as Optimizers,” researchers show how LLMs can solve optimization problems via Optimization by PROmpting (OPRO), where the optimization task is described in natural language and solved iteratively. This is used to solve multiple optimization tasks such as linear regression, and it can also be applied to prompt optimization itself, so an LLM can bootstrap better performance out of an LLM:
With a variety of LLMs, we demonstrate that the best prompts optimized by OPRO outperform human-designed prompts by up to 8% on GSM8K, and by up to 50% on Big-Bench Hard tasks.
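The iterative loop described above can be sketched on the linear regression example. This is a minimal illustration under stated assumptions, not the paper's implementation: `stand_in_llm` is a hypothetical placeholder that randomly perturbs the best past attempt instead of calling a real model, and the prompt wording, function names, and parameters are all made up for this sketch. What it does show is the shape of the loop: describe the task and past scored attempts in text, ask for a new candidate, score it, and add it to the history.

```python
import random


def score(w, b, data):
    """Negative mean squared error of y = w*x + b on the data (higher is better)."""
    return -sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def build_meta_prompt(trajectory, top_k=5):
    """Render the task description and the best past attempts as plain text."""
    best = sorted(trajectory, key=lambda t: t[1])[-top_k:]  # worst to best
    lines = ["Find (w, b) that fits the data. Past attempts, worst to best:"]
    lines += [f"w={w:.3f}, b={b:.3f}, score={s:.3f}" for (w, b), s in best]
    lines.append("Propose a new (w, b) with a higher score.")
    return "\n".join(lines)


def stand_in_llm(meta_prompt, rng, step):
    """Hypothetical stand-in for an LLM call (assumption, not a real API).

    It reads the best attempt from the prompt text and perturbs it randomly.
    """
    best_line = meta_prompt.splitlines()[-2]  # best attempt is listed last
    parts = best_line.split(", ")
    w = float(parts[0].split("=")[1])
    b = float(parts[1].split("=")[1])
    return w + rng.gauss(0, step), b + rng.gauss(0, step)


def opro_sketch(data, steps=200, proposals_per_step=4, seed=0):
    """Run the propose, score, and record loop; return the best (solution, score)."""
    rng = random.Random(seed)
    start = (0.0, 0.0)
    trajectory = [(start, score(*start, data))]
    for _ in range(steps):
        prompt = build_meta_prompt(trajectory)
        for _ in range(proposals_per_step):
            candidate = stand_in_llm(prompt, rng, step=0.5)
            trajectory.append((candidate, score(*candidate, data)))
    return max(trajectory, key=lambda t: t[1])


if __name__ == "__main__":
    points = [(x, 3 * x + 2) for x in range(-5, 6)]  # toy data from y = 3x + 2
    (w, b), s = opro_sketch(points)
    print(f"best w={w:.2f}, b={b:.2f}, score={s:.4f}")
```

Swapping `stand_in_llm` for a real model call is where the actual method differs: the LLM reads the same kind of scored history and proposes the next candidate itself. For prompt optimization, the candidates would be instruction strings and the score would be task accuracy.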
AI Business and Policy
Databricks raises $500M more, at a valuation of $43B. Maintaining this high valuation is a testament to the data analytics company’s moves to embrace AI as well as its growing sales, with revenue in the latest quarter surpassing a $1.5 billion run rate.
In another big funding round, defense AI startup Helsing breaks the record for European AI, raising $223M:
The investment could potentially make Helsing the largest European AI company and also the largest European defense tech unicorn. In June, France’s Mistral AI raised a $113 million seed round at a $260 million valuation. The pre-money valuation for Helsing was said to be €1.5 billion, but post-money would put that figure at over €1.7 billion.
LastMile AI, a platform to develop AI applications and integrate generative AI models into them, has raised $10 million in a seed funding round.
Coca-Cola claims their latest flavor “Y3000” had help from AI to determine the flavor and packaging. I don’t think it will help sales to say that.
Replace our employees with AI? Let's replace our CEOs with AI. This op-ed argues that the CEO role is broken, and AI could fix that.
This week was Congress’ week of AI. An AI Summit hosted by Senator Schumer brought several tech moguls to Capitol Hill, including Nvidia co-founder Jensen Huang and former Google CEO Eric Schmidt, and they brought a variety of perspectives:
Elon Musk, famously an AI doomer, “reiterated his stance that AI threatens humanity.” He called for a "referee" to "ensure that companies take actions that are safe and in the interest of the general public."
Bill Gates “went full tech evangelist reportedly saying that generative AI systems will—somehow—end world hunger.”
Mark Zuckerberg, in his AI Forum remarks, pointed to two primary concerns, AI Safety and access to AI, and defended Meta’s approach to open AI tools, advocating for access to AI technology and US leadership in open AI development.
“Having access to state-of-the-art AI is going to be an increasingly important driver of opportunity in the future, and I think that’s going to be true for individual people, for companies and for economies as a whole. … I think it’s important that America continue to lead in this area and define the technical standard that the world uses. … It’s generally accepted that open source software is safer and more secure, because more people can scrutinize it to identify issues and then share and propagate solutions that can then be used to harden systems.” - Mark Zuckerberg
AI Opinions and Articles
A high school student asks: Why aren’t more teachers embracing AI?
“But as a senior in high school who has been using ChatGPT since January of this year, I view it as an essential tool in education that must be incorporated into curriculums. AI chatbots have already made me a stronger student, and I have had no formal training with them.”
DeepMind co-founder Mustafa Suleyman in another interview: “Generative AI is just a phase. What’s next is interactive AI. … conversation is the future interface … instead of just clicking on buttons and typing, you’re going to talk to your AI.”
A Look Back …
The news on Microsoft’s EvoDiff marks an advance in AI for biology that builds on prior work applying AI to the field, in particular DeepMind’s ground-breaking accomplishment of cracking the protein folding problem with AlphaFold.
DeepMind’s effort to solve the protein folding problem was a five-year quest that began in earnest in 2016. After beating the world’s human Go champion with AI, they were ready for this challenge and formed a team.
In 2018, DeepMind entered AlphaFold for benchmarking in the 13th Critical Assessment of Protein Structure Prediction (CASP13), and it placed first.
DeepMind continued to improve their models, and in 2020 AlphaFold2 was evaluated in CASP14. It won the benchmark evaluation and, more importantly, was “recognised as a solution to the 50-year-old protein-folding problem” for its high accuracy, which was comparable to experimental methods.
In 2021, DeepMind published the paper Highly accurate protein structure prediction with AlphaFold and released the code for AlphaFold2.
In 2022, DeepMind announced in AlphaFold reveals the structure of the protein universe that they were releasing a database of 200 million protein structures:
In partnership with EMBL’s European Bioinformatics Institute (EMBL-EBI), we’re now releasing predicted structures for nearly all catalogued proteins known to science, which will expand the AlphaFold DB by over 200x - from nearly 1 million structures to over 200 million structures - with the potential to dramatically increase our understanding of biology.
As a big step forward in our understanding of biology, AlphaFold has been the most important scientific advance credited to artificial intelligence so far. It suggests there is a huge role for AI to play in scientific progress yet to come.