Agent Intelligence
After the release of GPT-4 in March, there was a lot of hype and hope around agents built on GPT-4. BabyAGI, AutoGPT, and others showed some promise in putting GPT-4 in a loop to perform autonomous complex actions. Since then, a number of frameworks such as AutoGen have been released, moving towards capable AI Agents.
OpenAI’s launch of custom GPTs changes everything about AI agents. This is the next “viral moment” for OpenAI since the ChatGPT Moment almost a year ago, and the reason for it is similar: They made building GPTs really easy. OpenAI has now lowered the barrier to custom Agents and bots, thanks to the natural language interface of ChatGPT itself.
As we’ll explain further, and as you can and should try out yourself, practically anyone can create a custom GPT. GPTs can be built in a ‘no code’ way, not because the framework is limited, but because the AI itself handles the building process as a customization of GPT-4.
Trying GPTs
The Explore tab in ChatGPT brings up your “My GPTs” list along with a number of GPTs made by OpenAI. The OpenAI-published GPTs are eclectic, with a heavy emphasis on helpful and playful ones. Trying some of them out:
Game time GPT - Rules for games; it was able to play tic-tac-toe, using markdown.
Coloring Book Hero - Creates coloring-book style drawings based on a prompt; you can think of this as an image generation prompt or style.
Math Mentor & Creative Writing Coach - Sam Altman mentioned a tutoring chatbot in his presentation, and it does seem that coaching, mentoring, tutoring and education are good use cases. The reason is that you can add specific information to reduce hallucination and improve clarity on a topic.
Data Analysis GPT - It had trouble reading both an XLS and a CSV file. I went back to GPT-4’s own data analysis mode to try the same files. It struggled at the same point, but along the way it recognized the data as tab-separated, found the mistake in its analysis code, and reported:
“I should have used the StringIO class from the io module to read the string as if it were a file. Let's correct this and try parsing the sections again. The 'Combined accounts' section has been successfully parsed into a DataFrame.”
The bottom line here is that many custom GPTs will have just a subset of what fully loaded GPT-4 has under the hood. Data Analysis GPT, for one, seems to have been crafted to have just that one capability.
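The fix described in the quoted message, wrapping an in-memory string so it can be read like a file, can be sketched roughly as follows. This is a minimal illustration, not the code the GPT actually ran: it uses the standard-library csv module rather than pandas, and the sample text and the function name parse_tsv_section are made up for the example.

```python
import csv
from io import StringIO

# Hypothetical tab-separated text, standing in for one section of an uploaded file.
raw_text = "account\tbalance\nChecking\t1200.50\nSavings\t8300.00\n"


def parse_tsv_section(text: str) -> list[dict[str, str]]:
    """Parse tab-separated text that is already in memory as a string."""
    # StringIO wraps the string in a file-like object, so readers that
    # expect an open file can consume it without writing anything to disk.
    buffer = StringIO(text)
    reader = csv.DictReader(buffer, delimiter="\t")
    return list(reader)


rows = parse_tsv_section(raw_text)
total = sum(float(row["balance"]) for row in rows)
print(rows[0]["account"], total)
```

With pandas installed, the same StringIO buffer could instead be passed to pandas.read_csv with sep="\t" to get a DataFrame, which is closer to what the quoted message describes.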
What can custom GPTs do?
Are these just customized chatbots? On one level, they remain like chatbots. They aren’t fully autonomous, they don’t iterate automatically, and so they are not a full ‘fire and forget’ Agent. They retain the chatbot text-based and image-based interface.
If a custom GPT were just a subset of what GPT-4 can do, that would be only marginally helpful. Customizing the personality does add precision in behavior, so customization alone is enough to be useful. But there is a lot more.
GPTs have all the capabilities of GPT-4 Turbo at their disposal and then some. First, a GPT can add specialized knowledge and data useful for specific use cases, and set up the system prompt to leverage that knowledge. Beyond GPT-4 itself, further features are on hand: Code Interpreter for executing code, image reading and understanding, and reading various file types such as PDF. DALL-E is on hand to generate images. GPTs can also connect to the outside world via web browsing and via Actions, which are API interfaces that let the LLM call external services.
GPTs therefore have more than a language interface; they have memory and execution capability that can make them workflow tools. Call them General Information Flow or Workflow Bots:
Information in (language text, voice, image, data files) —> Information / Execution out (images, text, actions)
So custom GPTs can do far more than just act as custom chatbots. They can encompass anything GPT-4 Turbo with vision can do, which is quite powerful on its own. They are customized GPT-4 Turbos, but with even more. Add in the power of connecting knowledge, web browsing, code execution, and other actions, and you have the makings of a basic yet powerful Agent.
What GPTs To Build
we were planning to go live with GPTs for all subscribers Monday but still haven’t been able to. we are hoping to soon. - Sam Altman on X, 7/7/23
After DevDay, OpenAI has had challenges with outages and capacity, including a DDoS attack last week. It took many of us several days to get access to the GPT builder, but now it’s generally available. After just about a week of custom GPT building being available, there are already thousands of custom GPTs popping up.
What would you want to build with GPTs? Some ideas come to mind:
Financial Tracker Bots - If a bot could take care of going through my bank statements and credit card statements and come out with a financial report of expenses, that would be nice. Wes Roth, in a YouTube video, builds a receipt-tracking bot that uses image recognition to read receipts from a picture and then report expenses. Threads, the long-term storage of memory, can keep information persistent, so you can track your expenses and enter receipts one by one.
What is this thing/plant/car part/widget Bot - This combines image recognition with a special-purpose twist. OpenAI said at DevDay that custom GPTs will be available on mobile devices, so you can imagine that any custom GPT that relies on image recognition can simply be fed a picture taken on your phone. In the past, when I saw a Texas plant or snake I wasn’t sure about, I’d share my picture on social media. Now a custom GPT can be our botanist, car mechanic, or other expert, recognizing an issue or item from a snapshot.
Research Assistant Bot - Some of my uses of Bing Chat and some of the “PDF reader” bots built as a thin layer on GPT-4 are most useful for tasks such as science paper summarization. It would help me to read in a paper and get out: a 3-sentence summary and longer summaries in jargon-free layman’s terms; an extended summary or full abstract; key terms; an ACM citation (to add to a bibliography); science context; and images of key figures and tables. The paper could also be added to a knowledge base.
Business Assistant Bot - This kind of GPT would help send emails, set up meetings, organize your to-do list, send Slack or Discord messages, etc. The Zapier API connector to many services can act as a Swiss Army knife that will make many GPTs very useful; using Zapier in a custom GPT was presented at OpenAI’s DevDay.
Informational Bot - Whether it’s your car mechanic, your therapist, your business onboarding and HR advisor, customer service, or your legal advisor, an informational bot could be the simplest of the ‘design patterns’ for bots, but also a very general one. There could literally be thousands of these.
This informational bot is perhaps the real ‘low-hanging fruit’ use case, because such chatbots have existed and been useful for decades, but they had to be hand-coded. OpenAI has just made prior chatbot technology obsolete.
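Going back to the Financial Tracker idea above, the reporting step such a bot would carry out once receipts have been read might look roughly like this. It is only a sketch of the arithmetic: the Receipt fields, the expense_report function, and the sample entries are all made up for illustration, and the image-recognition step is not shown.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Receipt:
    """One expense, as it might be extracted from a receipt photo (hypothetical fields)."""
    merchant: str
    category: str
    amount_cents: int  # integer cents avoid floating-point rounding


def expense_report(receipts: list[Receipt]) -> dict[str, int]:
    """Total spending per category, in cents."""
    totals: dict[str, int] = defaultdict(int)
    for receipt in receipts:
        totals[receipt.category] += receipt.amount_cents
    return dict(totals)


# Made-up sample entries, added one by one as a user might upload them.
receipts = [
    Receipt("Corner Grocery", "groceries", 4250),
    Receipt("Gas Stop", "fuel", 3899),
    Receipt("Corner Grocery", "groceries", 1575),
]

report = expense_report(receipts)
for category, cents in sorted(report.items()):
    print(f"{category}: ${cents / 100:.2f}")
```

In a custom GPT, this kind of aggregation would most likely be done by Code Interpreter on the fly rather than written by hand; the point is only that the report is a simple roll-up once the receipts are structured.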
Creating GPTs
The GPT creation process is easy:
Go to OpenAI’s ChatGPT interface and click the “Explore” link in the upper left to get to the custom GPTs page.
At the top, click on “Create GPTs” and you will get the Create GPT interface.
GPT Builder will talk you through the ‘create’ process. You enter your information in the chat by describing in English what you want.
You can also click on “Configure” to update or change any of the GPT’s characteristics: name, icon, pre-set prompt questions, etc.
You can add knowledge by uploading informational files. Some features, like Actions, may require knowing API specifics, depending on the use case, but beyond that, everything can be done in natural language by talking to the GPT Builder bot itself.
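The “API specifics” for an Action are typically supplied as an OpenAPI description of the service the GPT should call. Here is a minimal sketch of what such a description might contain, built as a Python dictionary and printed as JSON. The server URL, the /plants path, and the findPlant operation are hypothetical placeholders, not a real service, and a real Action would usually need more detail, such as authentication.

```python
import json


def build_action_schema(server_url: str) -> dict:
    """Return a minimal OpenAPI-style description of one GET endpoint."""
    return {
        "openapi": "3.1.0",
        "info": {"title": "Plant Lookup (hypothetical)", "version": "1.0.0"},
        "servers": [{"url": server_url}],
        "paths": {
            "/plants": {
                "get": {
                    # The operationId gives the model a name for this call.
                    "operationId": "findPlant",
                    "summary": "Look up a plant by common name",
                    "parameters": [
                        {
                            "name": "name",
                            "in": "query",
                            "required": True,
                            "schema": {"type": "string"},
                        }
                    ],
                    "responses": {"200": {"description": "Matching plant record"}},
                }
            }
        },
    }


schema = build_action_schema("https://example.com/api")
print(json.dumps(schema, indent=2))
```

The printed JSON is the kind of text you would paste into the Actions section of the Configure tab; the summary and parameter descriptions are what let the model decide when and how to call the endpoint.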
I have now built a few GPTs. For example, I built an informational GPT on DeSantis, with custom instructions and some uploaded DeSantis information. While it got off to a promising start, OpenAI’s Usage Policies forbid political campaign uses for GPTs, so you can’t see it. You will be spared a profusion of campaign GPTs.
The policies also limit GPTs in other ways. They restrict medical, legal, and financial advice, for legal liability reasons, and won’t allow NSFW or privacy-violating apps. Check the Usage Policies if you have questions.
This is still early days, but already there are many creative and useful GPTs out there in the wild. We will write a run-down on all the GPTs out there in a follow-up, as well as a more complete GPT builder how-to, as both topics deserve their own full report.
One More Thing - Assistants API
Custom GPTs lower the barrier for building custom AI bots, but that’s not the only big OpenAI announcement from DevDay. The path for developers will be the Assistants API. This takes the same capabilities that custom GPTs have but allows them to be embedded in larger flows that have yet more capability.
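To give a flavor of what working at the API level involves, here is a minimal sketch that only builds the request body for defining an assistant and prints it; nothing is sent to OpenAI. The field names reflect my understanding of the Assistants API as announced at DevDay and should be checked against the current API reference. The assistant name, instructions, and model string are illustrative assumptions.

```python
import json


def assistant_request(name: str, instructions: str, model: str) -> dict:
    """Build the JSON body for creating an assistant (not sent anywhere here)."""
    return {
        "name": name,
        "instructions": instructions,
        "model": model,
        # Code Interpreter is one of the built-in tools custom GPTs also use.
        "tools": [{"type": "code_interpreter"}],
    }


body = assistant_request(
    name="Receipt Tracker",  # hypothetical assistant
    instructions="Read receipts and report expenses by category.",
    model="gpt-4-1106-preview",  # assumed model name; check what your account offers
)
print(json.dumps(body, indent=2))
```

The difference from a custom GPT is that a developer's own program, not the ChatGPT interface, creates the assistant and feeds it messages, which is what allows it to sit inside a larger application flow.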