With new AI models and fine-tunes being released almost every day, it’s becoming more difficult to keep track of them all, and even more difficult to determine which are worth looking into or using. So I thought a cheat-sheet of the AI models worth your attention right now would be helpful.
Here’s a listing of the main AI chatbots and LLMs you can use today, with links to access them.
LLMs and AI Chatbots
For a consumer or business user wanting a leading-edge AI chatbot or LLM, GPT-4 is the clear current leader, but there are a number of other strong options:
OpenAI: ChatGPT and GPT-4 are available via OpenAI’s chat interface, and subscribing to ChatGPT Plus gives you all of the OpenAI features, including plugins and data analysis combined with GPT-4. Business users can also get the Enterprise version. Scoring 87 on MMLU, GPT-4’s prowess is thus far unmatched.
Microsoft: Bing Chat (in Microsoft Edge) is still the easiest free way to use GPT-4 and combine it with a search engine.
Google: The Bard chatbot is available directly. Google also offers Duet AI within Google Workspace, powering many Google applications with AI.
Anthropic: Claude and Claude 2, with a 100K-token context window, are competitive alternatives to ChatGPT, and that context window, far larger than GPT-4’s current offering, makes Claude good for long-text summarization and other tasks ChatGPT cannot handle. Anthropic also offers a Claude Pro version.
Poe.com: Launched earlier this year, Poe.com offers access to a number of AI models via a thin-layer interface: ChatGPT, GPT-4, Claude-instant, Claude-2-100k, Llama 2, Code Llama, other AI models, and custom models. You can create your own customized bot, save history, and more. This is useful for trying out various models.
You.com: An AI assistant that you can use on the web or as a Chrome extension, with YouChat, YouWrite, YouImagine, and YouCode. YouChat is now a GPT-4 interface.
FastGPT is a fast ‘answer engine’ with search under the hood, a useful AI/search hybrid, reviewed here.
Open AI Models and LLMs
The HuggingFace Open LLM leaderboard provides benchmark metrics for open LLMs, giving a good sense of which open AI models are most capable. It is a good resource for tracking open LLM development activity and model quality.
Currently (Sept 2023) the top of the open LLM leaderboard is dominated by various fine-tunes of the Llama 2 70B model. For example:
ORCA QLoRA fine-tune of Llama-70B gets 70.2 on MMLU.
Platypus QLoRA fine-tune of Llama-70B gets 71.0 on MMLU.
Platypus2-70B-Instruct-GPTQ by TheBloke gets 69.9 on MMLU.
The LLaMA2 Wizard 70B QLoRA gets 69.0 on MMLU.
StabilityAI’s StableBeluga2, a Llama 2 70B model fine-tuned on an Orca-style dataset, gets 68.8 on MMLU.
The leading open base AI models are Falcon 180B and Falcon-180B chat as well as Llama2 70B chat and Llama2 70B.
Other open LLMs of note available via HuggingFace: MPT-30B, Falcon-40B, Llama2-13B chat, Koala, Vicuna 13B.
Leading coding models: Phind’s Code Llama 34B, “fine-tuned on an additional 1.5B tokens of high-quality programming-related data, achieving 73.8% pass@1 on HumanEval.”
To get chatbot access to a number of open LLMs you can go to chat.lmsys.org. It has a web chatbot interface and can connect you to:
Llama 2 and Code Llama by Meta.
Vicuna: a chat assistant fine-tuned from LLaMA.
WizardLM: an instruction-following LLM using evol-instruct by Microsoft.
ChatGLM: an open bilingual Chinese-English dialogue language model.
In addition, https://chat.lmsys.org/?arena allows you to chat with two anonymous models side-by-side and vote for which one is better.
Running Open LLMs on your local CPU
Aside from running an AI model via a cloud-hosted API or chatbot interface, you can run an AI model locally. If you have a laptop (e.g., a MacBook) or desktop computer with sufficient power and memory, you can run some of the available open LLMs on your machine, and you don’t even need a GPU. The Llama.cpp project enables inference of many models in pure C/C++ on CPUs by quantizing models to 4 bits, up to even the 180B Falcon model.
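To see why 4-bit quantization is what makes local CPU inference feasible, here is a back-of-envelope estimate of the memory needed just to hold the weights of the model sizes mentioned above (a sketch only; real usage is higher once you add the KV cache and runtime overhead):

```python
# Back-of-envelope: memory to hold model weights, comparing
# fp16 (2 bytes per parameter) with 4-bit quantization (0.5 bytes
# per parameter). Actual usage is higher (KV cache, activations).

def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Approximate weight memory in GB (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9

for name, n_params in [("Llama 2 70B", 70e9), ("Falcon 180B", 180e9)]:
    fp16 = weight_memory_gb(n_params, 2.0)  # 16-bit floats
    q4 = weight_memory_gb(n_params, 0.5)    # 4-bit quantized
    print(f"{name}: ~{fp16:.0f} GB fp16 vs ~{q4:.0f} GB at 4-bit")
```

On this estimate, a 4-bit 70B model needs roughly 35 GB for its weights, within reach of a well-equipped desktop or MacBook, while the same model in fp16 (~140 GB) would not fit on consumer hardware.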
For example, you can take a model like Platypus2-70B-Instruct-GGUF, download the GGUF-format model parameters, and plug it into text-generation-webui, “a Gradio web UI for Large Language Models” that supports GGUF, then run it locally and interface with it from your browser. We will go into more detail on this in a follow-up.
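As a concrete sketch of the command-line route via Llama.cpp (the repo URL is real; the model filename follows TheBloke’s usual GGUF naming convention but is an assumption here, so check the HuggingFace repo for the exact file and quantization level you want):

```shell
# Build llama.cpp for CPU inference (no GPU required).
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp && make

# Download a 4-bit quantized GGUF file (tens of GB for a 70B model).
# Filename is an assumed example; verify it in the model repo.
wget https://huggingface.co/TheBloke/Platypus2-70B-Instruct-GGUF/resolve/main/platypus2-70b-instruct.Q4_K_M.gguf

# Run a prompt against the model, entirely on CPU.
./main -m platypus2-70b-instruct.Q4_K_M.gguf -p "Summarize the following text: ..." -n 256
```

The same GGUF file can instead be loaded into text-generation-webui if you prefer a browser interface over the command line.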
Running a specific model as a local AI assistant over your own data is very useful. Even if the model is less general or less powerful, this flexibility is a major benefit of open LLMs.
AI Models in Development
I labelled this the Q3 2023 Edition at the outset, recognizing that AI models and applications are evolving at a fast pace. The best chatbots and AI coding models right now include both proprietary models like Claude 2 and open models like Code Llama and Llama 2, all released in the past two months, and the best models next quarter will likely be new ones.
As a reference, our earlier article from April, Accessing AI models: Answer engines & LLM chatbots, covered the then-available LLMs and AI models and how to access them. The article Opera AI Browser Features and the Rise of Answer Engines covered the new way of interacting with search engines, now answer engines.
The AI landscape will continue to evolve quickly. We can anticipate new foundation AI model announcements from Meta, Google’s Gemini, and possibly an AI model from Apple called Ajax, all by the end of the year. OpenAI will hold its first developer conference in November; will it finally make GPT-4’s multi-modal capabilities generally available, or go beyond 32K with an extended-context version of GPT-4?
On the open source front, expect a continued flurry of fine-tuned models from many developers, because the cost is minimal. AI2 Labs announced Dolma, a 3-trillion-token dataset to support their open LLM called OLMo. At 70B parameters, it might not beat Llama 2, but it will be a more truly open AI model. Other open LLM projects and organizations, such as MosaicML, Together (Red Pajama), and Abu Dhabi’s Technology Innovation Institute (TII), the group behind the Falcon models, are capable of spinning out more models in the coming months as well.
So, as with everything in AI, stay tuned - it’s only getting better.