Using AI Agents pt 1: BabyAGI, AgentGPT and Beyond
Checking in on the rapidly evolving space of AI Agents
Some of the most interesting, viral, and eye-popping developments to come out of the AI explosion in recent months have been in AI agents.
The hype is understandable. Getting an AI agent to automatically order something online, write a technical paper, create a website, offer a marketing plan, write code and then fix its bugs iteratively, or perform other complex real-world tasks opens up a world of possible productivity revolutions. It excites the imagination to see the power of AI to change our daily work and daily life in ways that go far beyond chatting with an AI.
This has been an area of rapid development and change since March, when some of the first popular GPT-4-based AI agents came on the scene. There have been many different developments and iterations of releases. We have seen the release of BabyAGI, AutoGPT, AgentGPT, GoalGPT, Microsoft Jarvis/HuggingGPT, and most recently SuperAGI.
These are just some of the more popular and noteworthy ones. There are many others, many of them open source and available on GitHub for other developers to try out, modify, and build their own versions from. It’s early days, and the hobbyists, startup geeks and developers are still working out many kinks.
Are these intelligent agents ready for real use by end-users? While there have been many enticing stories of agents performing incredible feats, these tools have been known to rack up API costs, stray from the user’s intent, get stuck in infinite loops, and be generally unreliable. It’s a sorcerer’s apprentice scenario: ready for “let’s give it a try,” nothing-to-lose use, but not for anything mission-critical yet.
Here’s a rundown on some of the AI Agent tools, how to access them, and how well they perform:
BabyAGI
BabyAGI was created by non-programmer investor Yohei Nakajima, who in a remarkable inception-like manner used GPT-4 iteratively to do all the work of coding the AI Agent framework. BabyAGI’s remarkable story, which we covered in an April article on AI Agents, kicked off the AI Agent hype.
The BabyAGI system itself is simple: give the system a large task; the LLM (GPT-4) acts as a planner (task creation Agent) and breaks it down into sub-tasks; the system then prioritizes those sub-tasks (with a task prioritization Agent); finally, it calls upon the LLM (as task execution Agent) to execute each sub-task one by one.
All current AI Agents including AutoGPT use similar constructs: Planning; storing partial results and history (typically using a vector database); executing using the LLM as a subroutine; reviewing, checking, and revising those results.
The beauty of it all is that GPT-4 is powerful and general enough to be the execution engine for all three Agents, and all it needs is to be prompted appropriately. One powerful-enough Foundation AI model is All You Need.
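To make that loop concrete, here is a minimal Python sketch of the three roles sharing one model. It is not the actual BabyAGI code: the names run_agent, split_lines, and fake_llm, the PLAN/PRIORITIZE/EXECUTE prompt prefixes, and the max_steps cap are all assumptions made for illustration. fake_llm is an offline stand-in where a real GPT-4 API call would go, and a plain list stands in for the vector database of past results.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple

# Hypothetical interface: any function that maps a prompt to a text reply.
LLM = Callable[[str], str]


def split_lines(text: str) -> List[str]:
    """Turn an LLM reply into a list of non-empty, stripped lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def run_agent(objective: str, llm: LLM, max_steps: int = 5) -> List[Tuple[str, str]]:
    """Create, prioritize, and execute sub-tasks; return (task, result) pairs."""
    # Task creation agent: ask the model to break the objective into sub-tasks.
    plan = llm(f"PLAN: list sub-tasks, one per line, for: {objective}")
    tasks: Deque[str] = deque(split_lines(plan))
    results: List[Tuple[str, str]] = []  # stand-in for a vector database

    for _ in range(max_steps):  # hard cap so the loop cannot run forever
        if not tasks:
            break

        # Task prioritization agent: same model, different prompt.
        reply = llm("PRIORITIZE: reorder these tasks, one per line:\n" + "\n".join(tasks))
        known = set(tasks)
        # Ignore tasks the model invented, and re-append any it dropped.
        ordered = list(dict.fromkeys(t for t in split_lines(reply) if t in known))
        ordered += [t for t in tasks if t not in ordered]
        tasks = deque(ordered)

        # Task execution agent: run the top task with earlier results as context.
        task = tasks.popleft()
        context = "; ".join(result for _, result in results)
        result = llm(f"EXECUTE: objective={objective}\ncontext={context}\ntask={task}")
        results.append((task, result))

    return results


def fake_llm(prompt: str) -> str:
    """Offline stand-in for a real model call, for demonstration only."""
    if prompt.startswith("PLAN:"):
        return "research destinations\nbook flights\ndraft itinerary"
    if prompt.startswith("PRIORITIZE:"):
        # Pretend the model prioritizes alphabetically.
        return "\n".join(sorted(prompt.splitlines()[1:]))
    return "done: " + prompt.rsplit("task=", 1)[-1]


if __name__ == "__main__":
    for task, result in run_agent("plan a trip", fake_llm):
        print(f"{task} -> {result}")
```

Swapping fake_llm for a function that calls a real model is the only change needed to try this against an API. The max_steps cap is one simple guard against the runaway loops and API bills mentioned earlier.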
To set up and use BabyAGI, you’ll need to be comfortable with software development tools:
The code is in the BabyAGI repo on GitHub. You download it using git clone.
You can run the Python script directly or set up a Docker image to run it.
You need to bring your own OpenAI (or open-source LLM) API keys so that it can call the LLMs.
You can roll your own by forking the code or by building on LangChain’s BabyAGI. The script is fairly lightweight, so it’s within reach of Python programmers, which is why so many variations have spawned.
While it’s worth checking out as the first step on the AI Agent journey, it’s neither the easiest nor most capable AI Agent to try, so if you are looking for a useful tool, move on.
AgentGPT
AgentGPT is AutoGPT directly in the browser, with the pitch “Assemble, configure, and deploy autonomous AI Agents in your browser.”
One of the nice things about AgentGPT is how easy it is to use, thanks to the browser interface. You sign in, give it a prompt and go. On the free tier, it connects to GPT-3.5, but the paid tier ($40/mo) gives you access to GPT-4.
You can give it a fairly complex task, and it will break it up into bite-sized chunks. It works through them in linear order, following a fixed plan of prompts, so in terms of capabilities, it is a bit like BabyAGI.
AgentGPT has connections to a few tools: search, code, and image generation. It seems to be good for asking various multi-part questions or detailed planning support tasks: market trends, resource factors, budgets, and writing and research tasks. As for real-world interactions or powerful plug-ins, it’s not there yet.
When I asked for a two-week vacation travel plan, it was able to do the research and then report back with a plausible plan.
So far, I have only been able to get it to do in-browser things. When I compared its results to Bing Chat for my “travel plan” query, the Bing Chat answer found a link to a page via a search engine result that answered the query just as well. The results didn’t knock my socks off, but perhaps that’s because I was using only the free tier, so I was getting GPT-3.5 iterative results.
In conclusion, AgentGPT is interesting and helpful enough to be worth a try, especially since trying it out is easy and free.
Conclusion
These AI Agent tools are still in the ‘bleeding edge’ early-adopter stage. They are semi-useful toys now, and are rapidly getting better and more reliable; each week, there is a new iteration, new research or a new tool in this space. For example, last week’s progress report noted the AI research on Voyager, a learning AI Agent in Minecraft.
As they improve, these generative AI Agents will drive AI’s impact into all types of work and every industry. Already, AI tools have been shown to boost worker productivity by 14% at a Fortune 500 company, according to Stanford research. This is just the start. AI Agents are the paradigm for how we will leverage AI to automate how we work, and as such, will become the most important aspect of the rise of foundation AI models.
There are many more AI Agents and projects to cover and review than we can do in one article. So I’ll discuss several of them, including AutoGPT, GoalGPT, and SuperAGI, in a follow-up article. Stay tuned for Part 2!