Predictions for 2024
In late 2022, ChatGPT’s release created a viral sensation that sent AI hype to new heights in 2023. Once the novelty of ChatGPT wore off, we put it and other AI models and applications to work for us in both mundane and creative ways. 2024 is the year when AI really makes our lives easier and better in profound ways.
In 2024, we will see more AI models, more advances and features, more quality and capability, more tools and applications than ever. We will see more AI chips, from Nvidia and others, powering the training of new models. We will see AI touch more parts of our lives than ever before. AI truly arrived in 2023, and in 2024, AI will be here to stay, being a part of our daily lives.
Here’s a dive into what advances we can expect in AI in the coming year.
1. AI image generation nears perfection
AI image generation made huge strides in 2023 in new models, new research and new products that got faster, more accurate, higher-quality and higher resolution.
Generative AI has become a standard feature in image editing tools: infill or outfill, resolution upscaling, erasing and regenerating a region or object, and mixing-and-matching elements (style, characters, etc.) from different images. Attention to correct lettering, easier prompting (using the DALL-E approach of internally expanding the user's prompt into a detailed image-generation prompt), and more precision in results will permeate these tools.
In 2024, we will see all these advances consolidate into powerful on-demand AI image generation tools that will be ‘perfect’ in image quality and magically responsive to prompts and commands, combining AI image generation and editing into a single powerful system, obsoleting pre-AI image editors.
Photo-realistic AI image generation is getting so good that it passes the visual Turing test: you cannot tell AI images from real photos. AI images are displacing stock photos in advertising and marketing copy. In 2024, convincing deep-fake images will become more of a problem, abused in personal harassment, political misinformation, and more.
Generating consistent characters across multiple images will become the norm, and will be used for storyboarding, illustrated stories, creation of AI ‘influencers’ such as AI Instagram models. This will also aid in AI video generation in movies.
3D image and model generation will gain from AI image generation and AI 3D model generation, impacting various fields, such as CAD product design, 3D printing, architecture, and more.
2. Generative AI for music and video gets real
If 2023 was the year AI image generation got really good, 2024 will see the same in video and movies - they will get much better.
Runway Gen-2 kicked off AI video generation as a product just nine months ago, and Pika 1.0 and Stable Video Diffusion joined recently, with others like Midjourney soon to join the fray. These AI tools can generate short (around five-second), jerky, low-resolution video clips, which is impressive as a novelty, but not ready for prime time.
By the end of 2024, AI video generation will make leaps-and-bounds improvements in video quality, speed, length, attention to detail, and adherence to prompts. Within two years, expect Hollywood animation-level quality generations, and AI features permeating professional studio productions in the same way computer graphics dominate special effects and animation now.
AI music generation is also on the cusp of going vertical. The recent Suno AI release that can generate whole songs has been a game-changer. We went from interesting but low-quality sound clips to whole songs that sound like real music - consistent melodies, lyrics, chords, bridges, and more.
This will only get better. With tighter controls, sound editing, the ability to generate from specific detailed prompts, and the use of other modalities, the remixing and creative options for music are endless.
3. More competition on multi-modal frontier AI models
Google Gemini Ultra++: Google had a chance to beat GPT-4 with Gemini Ultra, but an honest comparison suggests it is a GPT-4 equivalent, not a GPT-4-beater. Announced in December, Gemini Ultra is expected to arrive later in January. Google won't stop there; it needs an AI model that finally outpaces GPT-4. Look for the next iteration from Google - Gemini Ultra++ - in 6-9 months.
OpenAI GPT-4.5: OpenAI will continue to outpace others, and to do so, it will have to release an improvement to GPT-4 this year. OpenAI will release something big in 2024, but it won't be GPT-5. OpenAI will release GPT-4.5 in the first half of 2024, and it will be a fully native multi-modal frontier AI model.
The two leading frontier AI models, GPT-4 Turbo and Gemini Ultra, are now multi-modal, and you can expect all future frontier AI models to be natively multi-modal with text, vision, and audio.
Multi-modality and reasoning over images will get much more powerful and yield interesting new use cases for large multi-modal models (LMMs) far beyond the text prompt: developing a prompt from one image to use on another image; answering a question posed in one image with an answer from another image; etc. Google will live down its infamous staged Gemini demo by making real-time audio-video AI interaction an actual capability of its AI models in 2024.
Anthropic’s Claude 2 managed to get as close to GPT-4 as any competitor, while other players like X’s Grok have managed to match a GPT-3.5 level equivalent. None has matched GPT-4 yet, but Anthropic and others will this year, as “fast followers”.
Now that Nvidia’s new H100s are in the hands of many of the top AI tech companies, expect a flood of new training runs to make ever more capable models. The smart players are absorbing the latest research to optimize training (with FlashAttention, FlashAttention2, etc.), and some will try out new efficient architectures beyond transformers such as Mamba.
The combined effect will be a flood of new very capable models approaching GPT-4 levels, with several leaders exceeding where GPT-4 is today. GPT-4 level multi-modal AI will be a cheap commodity available from multiple AI model providers by the end of 2024.
4. Open-source GPT-4 equivalents will be released in 2024
This is less of a prediction than a matter of roadmaps:
Meta is working on Llama 3. The rumor back in August 2023 was that Meta was targeting Llama 3 to be as good as GPT-4. Meta will release Llama 3 soon.
Mistral announced they will release a GPT-4-level open-source model in 2024. Mistral's "medium" AI model, better than their open-source 8x7B MoE model, is already available via API. So far, the team has moved fast.
Other players: The UAE team behind Falcon 40B and 180B will continue to release better models, and 01.AI, which released the highly capable Yi-34B model, will continue to release the best open-source AI models out of China.
Apple quietly embraced open source with the release of multi-modal Ferret this fall. Although they have been slow to move forward with generative AI, Apple will surprise us in 2024 with excellent open-source AI models that run on their edge devices as well as major upgrades to Siri.
Open-source goes MoE: Thanks to Mistral's 8x7B Mixture-of-Experts AI model, called Mixtral, MoE-based LLMs have been proven effective. Expect future larger AI models to go MoE, activating only a fraction of their parameters per token so they can run inference more efficiently and more easily serve smaller devices.
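To make the MoE idea concrete, here is a toy sketch, not Mixtral's actual implementation: a gate scores every expert for an input, but only the top-k experts actually compute, so per-token cost stays low even as total parameters grow (Mixtral uses 8 experts with top-2 routing). The scalar "experts" and gate weights here are illustrative stand-ins for full feed-forward blocks.

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, experts, gate_weights, top_k=2):
    # The gate scores each expert for this input; only the top-k run.
    probs = softmax([w * x for w in gate_weights])
    top = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in top)  # renormalize over the selected experts
    return sum((probs[i] / norm) * experts[i](x) for i in top)

# Eight toy "experts"; each is a stand-in for a full feed-forward block.
experts = [lambda x, k=k: k * x for k in range(1, 9)]
gate = [0.1, 0.5, 0.9, 0.2, 0.3, 0.7, 0.4, 0.6]
y = moe_forward(2.0, experts, gate, top_k=2)  # only 2 of 8 experts compute
```

The key design point: total parameter count (all eight experts) determines capability, while per-token compute (two experts) determines inference cost.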
Open-source data expands: The RedPajama v2 30-trillion-token dataset released a few months ago provides the huge data quantity needed for open-source AI models to match and surpass GPT-4. This is ten times the data used to train prior open-source AI models like Llama 2. Quality and multi-modality also matter, so large-scale open-source multi-modal datasets (audio, video, images) will be released to help build open-source multi-modal AI models.
Synthetic data helps improve AI models: Since the release of Alpaca, we have seen how outputs from a larger AI model can be distilled and used to improve other AI models in a teacher-student fashion. Synthetic data from AI models themselves can be tuned to improve data quality, used to expand training sets, and used to bootstrap other AI models to get better. Synthetic data will play a more prominent role in AI model training in 2024 and beyond.
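The teacher-student pipeline above boils down to three steps: prompt the teacher, filter for quality, and keep the survivors as a fine-tuning set for the student. A toy sketch, with a trivial lookup standing in for the teacher model and an emptiness check standing in for a real quality filter:

```python
def teacher(prompt):
    # Stand-in for a call to a stronger "teacher" model (e.g., GPT-4 in
    # the Alpaca-style recipe); returns "" when it has no answer.
    return {"capital of France?": "Paris", "2 + 2?": "4"}.get(prompt, "")

def build_synthetic_set(prompts, quality_ok=lambda ans: bool(ans)):
    # Label every prompt with the teacher, then keep only the pairs that
    # pass the quality filter - filtering is what makes synthetic data useful.
    pairs = [(p, teacher(p)) for p in prompts]
    return [(p, a) for p, a in pairs if quality_ok(a)]

dataset = build_synthetic_set(["capital of France?", "2 + 2?", "unknown?"])
# dataset now holds two clean (prompt, answer) pairs for student fine-tuning
```

In practice the quality filter is the hard part - deduplication, answer verification, and diversity checks - but the shape of the pipeline is the same.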
5. Small models become a Big Deal as AI models get more efficient
Efficiency matters. In 2023, we saw multiple smaller models, like Mistral 7B, Phi-2 (2.7B), and Yi-34B, outperform prior larger models. Model efficiency was under-appreciated when "all you need is scale," but it will get more attention in 2024. We will see more surprisingly great smaller models in 2024 that train longer on more high-quality data but have fewer parameters.
Small AI models like Phi-2 outperformed models 10-50x their size, thanks to very close attention to the quality of their training data, showing us that "textbooks are all you need." Quality beats quantity, but in 2024 we will see AI models that scale both, to reach GPT-4-level capability in surprisingly small packages.
Smaller, efficient AI models are motivated by inference costs and by running AI models on-device. Quantized AI models can cut parameter memory usage significantly with only small performance degradation, and have been the key to getting more AI models onto edge devices, from PCs to smartphones. More highly capable quantized AI models will be ready for edge devices. TheBloke will keep delivering!
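Why quantization cuts memory so much can be shown in a few lines. This is a minimal sketch of symmetric int8 quantization (real schemes like GPTQ or AWQ are more sophisticated): each float32 weight becomes one 8-bit integer plus a shared scale, roughly a 4x memory reduction, at the cost of a small rounding error.

```python
def quantize_int8(weights):
    # One shared scale maps the largest weight magnitude onto 127.
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]  # 8-bit ints in [-127, 127]
    return q, scale

def dequantize(q, scale):
    # Recover approximate float weights at inference time.
    return [qi * scale for qi in q]

w = [0.52, -1.27, 0.003, 0.9]          # original float32 weights
q, s = quantize_int8(w)                 # stored as int8 + one float scale
w_hat = dequantize(q, s)
max_err = max(abs(a - b) for a, b in zip(w, w_hat))
# max_err is bounded by half a quantization step (scale / 2)
```

Per-channel scales and 4-bit variants push the savings further, which is exactly what makes 7B-class models practical on laptops and phones.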
A desire for lower inference costs will drive top-tier AI model companies like OpenAI towards efficient AI models based on mixture-of-experts and improved inference techniques. This may include AI model routing: channeling prompt requests to different AI models based on the specific query type, e.g., complex versus simple, topic, etc. Expect to see AI model routing adopted in AI chat interfaces and applications.
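A router of this kind can be very simple. The sketch below is purely illustrative - the model names are placeholders and the keyword heuristic is a stand-in for whatever classifier a production router would use - but it shows the shape of the idea: cheap queries go to a small model, hard ones to a frontier model.

```python
def classify(prompt):
    # Crude complexity heuristic; a real router might use a small
    # classifier model or an embedding lookup instead.
    hard_markers = ("prove", "derive", "step by step", "explain why")
    text = prompt.lower()
    if len(text.split()) > 40 or any(m in text for m in hard_markers):
        return "complex"
    return "simple"

def route(prompt):
    # Placeholder model names; a real system would call that model's API.
    return "frontier-model" if classify(prompt) == "complex" else "small-fast-model"

cheap = route("What time is it in Tokyo?")
costly = route("Explain why quantization reduces memory usage, step by step.")
```

If most traffic is simple, even a crude router like this cuts average inference cost substantially while keeping quality on the hard queries.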
The real magic will happen when you combine open-source AI models with expanded, higher-quality large datasets (imagine the RedPajama 30-trillion-token dataset curated and synthesized down to a high-quality multi-trillion-token dataset), more compute, and architectural refinements. Open-source AI models will have GPT-4-level capabilities in a more efficient package. With mixture-of-experts and fast low-memory inference, we will get a small, efficient, GPT-4-equivalent AI model running on your local PC or laptop by the end of 2024.
6. Agents, assistants and custom AIs go from toy to tool
2024 will be the year of AI agents. AI agents rode a lot of hype in 2023, but they were challenged by the limits of AI model reliability. Research has been solving some of these challenges to make AI models and agent frameworks more robust and reliable: chain-of-thought, re-prompting, better reasoning, etc. This will greatly improve in 2024, so that agents and agent swarms become much better at solving useful tasks and become part of workflows.
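One of the reliability techniques named above, re-prompting, is easy to sketch: instead of trusting a single generation, the agent validates the model's output and, on failure, asks again with the error fed back. Here `flaky_model` is a deliberately unreliable stand-in for an LLM call.

```python
def flaky_model(task, feedback=None, _attempt=[0]):
    # Stand-in for an LLM API call. The mutable default is just a call
    # counter so the demo fails once, then succeeds on the retry.
    _attempt[0] += 1
    return "42" if _attempt[0] >= 2 else "forty-two"

def run_with_retries(task, validate, max_tries=3):
    feedback = None
    for _ in range(max_tries):
        out = flaky_model(task, feedback)   # feedback is re-prompted on retry
        ok, feedback = validate(out)
        if ok:
            return out                      # validated answer
    raise RuntimeError(f"agent failed after {max_tries} tries: {feedback}")

def must_be_number(out):
    ok = out.isdigit()
    return ok, None if ok else "answer must be digits only; try again"

result = run_with_retries("what is 6 * 7?", must_be_number)
```

The validator can be anything checkable - a unit test for generated code, a schema for extracted data - which is what turns an unreliable model into a usable agent step.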
There are many applications for AI agents in business workflows. Thus, enterprise adoption and integration of AI, automating more complex workflows, will accelerate in 2024. Some will say “AI will take our jobs” when they see this, but our view is it will automate the repetitive portion of many jobs, boosting productivity.
Enterprises need custom AIs that understand their corporate data, business goals, internal code base, workflows, etc. Fine-tuned open-source custom AI models combined with retrieval-augmented generation (RAG) are gaining framework support. These will have a huge impact on business workflows.
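The RAG pattern itself is simple to sketch: retrieve the most relevant internal documents, then prepend them to the prompt so the model answers grounded in company data. The word-overlap scorer below is a deliberate simplification - production systems use vector embeddings and a vector database - and the sample documents are invented.

```python
import re

def tokens(text):
    return set(re.findall(r"\w+", text.lower()))

def retrieve(query, docs, k=1):
    # Rank documents by word overlap with the query; real systems use
    # embeddings and approximate nearest-neighbor search instead.
    return sorted(docs, key=lambda d: len(tokens(query) & tokens(d)), reverse=True)[:k]

def build_prompt(query, docs):
    # Stuff the retrieved context into the prompt sent to the LLM.
    context = "\n".join(retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

docs = [
    "Refund policy: customers may return items within 30 days.",
    "Shipping policy: orders ship within 2 business days.",
]
prompt = build_prompt("What is the refund policy?", docs)
```

Because the knowledge lives in the retrieved documents rather than the model weights, updating the corpus updates the AI's answers - no retraining needed, which is exactly why enterprises favor RAG.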
Custom GPTs: The buzz around the GPT Store was stolen by the OpenAI drama in November. Custom GPTs serve a useful role both for personalized custom AI models and for specialized and character-based AIs. Some Custom GPTs, likely specialized AI tools or 'character' GPTs, will become 'hits' in 2024.
Character-based AIs: Custom GPTs may have some hits around ‘character AI’ but there is a lot of competition in this space from Character AI, Meta and others. Character AIs are useful for education and entertainment, and they will become more prevalent as people use them for advice, self-help, and enjoyment. Expect to see a rise of many “Tutor AIs” in 2024 to act as teaching assistants.
7. AI-based robotics will make major strides
“2024 is the year of robotics,” says Jim Fan, sharing a video showcasing ALOHA, an open-source robotics platform that stir-fried shrimp and made a meal. In 2024, robots will ride a lot of well-deserved hype as they make major strides in flexibility and generality, thanks to AI.
Robotics has for many years struggled to get robots to reason in ways that go beyond pre-programmed behaviors, a barrier that limits the utility of robots. Now, multi-modal AI is infusing robots with the AI brain they need to perceive their environment visually and make higher-level decisions, overcoming those barriers. Robots combined with the latest multi-modal AI will show stunning improvements in 2024.
Humanoid robots are still a few years out from becoming useful, but specialized robots are already making it into warehouses and factories, and robotics will expand every year this decade. Expect AGI-level humanoid robots by the end of the decade.
8. AI Hardware
AI isn’t just about software, but also about hardware. AI will be the talk of CES this month, providing a good preview of what to expect from AI hardware and electronics this year. AI PCs are coming out, and AI-enabled devices, from the Humane Pin and Rewind to smart glasses from Meta and others, will be showcased. Expect the unexpected in 2024 - AI as close to the user as possible.
9. AI applications for science and healthcare
AI will accelerate science and technology progress in 2024 in measurable and fundamental ways. Just as DeepMind’s GNoME expanded our materials-science knowledge many-fold in one fell swoop, there will be at least one big AI breakthrough, where AI cracks at least one great problem in science or math in 2024.
AI will also accelerate progress in many incremental ways: assisting with research, automating data analysis, reducing the need for physical simulations, and acting as a ‘search engine’ for all sorts of questions or problems. AI is particularly suited to help in areas of complex data and understanding, like genetics, biotechnology, and physics. Thus, both fundamental science and medicine will benefit.
Adopting AI will boost productivity for researchers, writers, educators, and professionals in many fields. Thanks to AI, the pace of technology evolution and advance is accelerating - we are in the pre-singularity era.
Conclusion
We still don’t know where the ceiling is on scaling of LLMs or the benefits of more data. However, we will not hit that ceiling or run out of data anytime soon. Nor will we see AGI in 2024. I still predict that we will get to AGI before the end of the decade.
So my final prediction is that 2024 will be an amazing year for AI, but we are still in the early innings of the AI revolution. There will be more to come beyond 2024. Buckle up; the ride won’t end soon.