The Problem with Prompting
When I started “AI Changes Everything”, one of my earliest articles declared: “The prompt is the interface.” This has been the cornerstone of generative AI, both for LLMs and for text-to-image and text-to-video generation models.
The AI-enabled natural language interface, like the smartphone touchscreen and the GUI (graphical user interface) before it, allows for new interactions and applications.
Natural language as an interface is liberating and massively expands what is possible, allowing a truly general application: You can ask it any question, have it write on any topic, or analyze any article or document. However, the natural language prompt’s expansiveness creates its own set of problems as an interface.
It’s tedious: Typing is tedious. When anything is possible, the specifics need to be explicit. It’s a lot more work to choose your own words than to pick from a radio button list.
It lacks guidance: There are no clues or indicators to tell you whether your prompt is right. Unlike a slider, pull-down, or other GUI control, you have to work out what the right input is; only running it through the generative AI model tells you if you are on the right track. Iterating on prompts by trial and error multiplies the tedium.
It’s vague: Language has some level of precision, but expressiveness can also create vagueness, especially if you are not sure what you are looking for. AI models themselves are black boxes and cannot read minds (yet).
Prompt Engineering
From a user experience perspective, the prompt interface is both a joy and a trap. What are some solutions? The answer lies both in what we can do as users, and what AI can do.
We are still learning how to best interact with AI models. Any new tool or interface takes some getting used to, but our prior experience with search engines and dumb chatbots leads us astray. I’m guilty of this, writing short keyword-laden prompts and hoping the LLM will pick up the topic hints, catch my drift, and spit out what I’m looking for.
The vast range of possible inputs and the unreliability of AI outputs led early on to the concept of prompt engineering: crafting input prompts to guide AI models like GPT and DALL-E to produce desired outputs efficiently. Then, when some good prompts seemed to hit the mark, the ideas of prompt templates, prompt libraries, and prompt sharing took off.
An example of this is applying prompt engineering to a document writing workflow for tech writers, breaking a complex problem down into steps and guiding the AI to assist on each step - outlines, note-taking, research, word-smithing, etc.
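To make that concrete, here is a minimal sketch of such a step-wise workflow. The step prompts and the ask_llm callable are hypothetical placeholders for whatever model and wording you actually use, not any particular tool’s implementation:

```python
# A minimal sketch of a step-wise writing workflow (illustrative, not any tool's code).
# `ask_llm` is a hypothetical callable that sends a prompt to whatever LLM you use
# and returns its text response.
from typing import Callable

def draft_document(ask_llm: Callable[[str], str], topic: str, notes: str) -> str:
    # Step 1: turn the topic and raw notes into an outline.
    outline = ask_llm(
        f"Create a detailed outline for a technical article on '{topic}', "
        f"based on these notes:\n{notes}"
    )
    # Step 2: expand the outline into a rough draft, section by section.
    draft = ask_llm(
        f"Write a first draft that follows this outline, one section per heading:\n{outline}"
    )
    # Step 3: word-smith the draft for clarity and tone.
    return ask_llm(
        f"Edit this draft for clarity, concision, and consistent tone:\n{draft}"
    )
```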
There’s been a whole cottage industry built around prompt engineering, and it will continue, but we may be entering a new phase of the natural language interface. An essay last June noted that “future generations of AI systems will get more intuitive and adept at understanding natural language, reducing the need for meticulously engineered prompts.”
Getting the right prompts to elicit a desired answer from AI is a language and text task like any other, and with a sophisticated enough AI model we can apply AI itself to the task. There are a few ways AI can help:
AI prompt re-writing and magic prompts: Reformulating human input prompts into better, more effective prompts (see the sketch after this list).
AI prompt generation: The AI generates the text details needed for a complex task prompt.
RLHF and fine-tuning: Adapting the AI model to various styles of queries, in effect teaching the AI to understand whatever the human inputs.
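As a sketch of the first of these, prompt re-writing is itself just another LLM call: hand the model a terse prompt and ask for a richer one. The rewrite instruction below is my own illustrative wording, and ask_llm is again a hypothetical placeholder:

```python
# A conceptual sketch of "magic prompt"-style re-writing (my wording, not any product's).
# `ask_llm` is a hypothetical callable wrapping your LLM API of choice.
from typing import Callable

REWRITE_INSTRUCTION = (
    "Rewrite the following image-generation prompt to be more specific and vivid. "
    "Add concrete details about subject, style, lighting, and composition, "
    "but preserve the user's original intent.\n\nPrompt: {user_prompt}"
)

def magic_prompt(ask_llm: Callable[[str], str], user_prompt: str) -> str:
    return ask_llm(REWRITE_INSTRUCTION.format(user_prompt=user_prompt))

# e.g. magic_prompt(ask_llm, "a dog in a manor, oil painting") -> a vivid, detailed rewrite
```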
Overemphasizing the crafting of the perfect combination of words can even be counterproductive, as it may detract from the exploration of the problem itself and diminish one’s sense of control over the creative process. Instead, mastering problem formulation could be the key to navigating the uncertain future alongside sophisticated AI systems. - Oguz A. Acar
Magic Prompts
The Ideogram interface for AI image generation combines several elements to make the user experience (UX) better than a simple prompt input box: a GUI that presents prompt options (as tags / labels) and required parameters (aspect ratio, etc.); a ‘re-mix’ feature that builds a new image from a prior one; and magic prompts, which use AI to rewrite prompts.
Consider this example: You have a picture in your mind of a dog image you want AI to generate. A simple “generate a dog picture” prompt won’t reproduce your mental image; you need specific and vivid details and parameters. So you describe the dog’s breed, pose, setting, the image type (stylized oil painting or photo), and so on.
I did that in Ideogram, asking:
Generate a picture of a Great Dane, white with black and dark brown spots, in an English manor. 18th-century. oil painting.
Ideogram turned that specific prompt into a more vibrant, compelling, and descriptive one - its “magic prompt”:
A stunning oil painting of a majestic Great Dane, white with black and dark brown spots, sitting regally in an 18th-century English manor's library. The room is filled with wooden bookshelves, some reaching the ceiling, and a large fireplace with a crackling fire. The dog's eyes gaze intently at the viewer, while its ears are perked up, creating a sense of alertness and loyalty.
What’s great about this is that it helps the user along but doesn’t hide the prompt from the user. It allows further editing of the prompt and the image, and remixing and re-generation from both using AI. The final generated image is shown in Figure One.
I have become a fan of this combined AI interface that uses GUI and natural language prompts together, and I expect it to become a standard interface for generative AI. The ‘magic prompt’ concept leads inevitably to AI-based prompt engineering displacing human prompt engineering.
Claude 3, AI Prompt Engineer
The magic prompt is all about AI helping you craft the best prompt to produce the correct or ideal output. When AI gets good enough to write your prompts for you, it can make hand-crafted prompt engineering obsolete.
Human prompt engineering is a tedious trial-and-error process of finding the right incantations for a specific outcome. Needing magic incantations is a sign of poor UX; we should have better AI and better UX.
But to get good AI-generated prompts, you need a good AI model. Enter Claude 3. Some users trying out the recently released Claude 3 are finding it adept at creating good prompts. One user on X put it this way:
if I ask claude 3 for the perfect prompt for task X it usually comes up with much better prompts then what I would have thought of (it is usually much more detailed and to the point)
Lots of LLMs are good at code, but Claude 3 Opus is the first model I've used that’s very good at prompt engineering as well.
If you are a developer frustrated with how tedious prompt engineering can be, try using Opus to help you out.
He shared a meta-prompt Colab Notebook that uses Claude 3 Opus to create new prompts. He also shared a process to test and evaluate those prompts, using test questions based on a system prompt.
The nice thing about an LLM generating prompts is that the process can feed on itself. You can get Claude 3’s advice on creating effective prompts, then run and test the generated prompts. The learnings you share become more good prompts for AI to learn from; AI helps bootstrap the best prompts.
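The shared notebook’s meta-prompt isn’t reproduced here, but the basic pattern is easy to sketch with the Anthropic Python SDK; the meta-prompt wording and the example task below are illustrative assumptions, not the notebook’s contents:

```python
# A minimal sketch of "AI as prompt engineer": ask Claude to write a prompt for a
# task, then use that generated prompt. The meta-prompt wording is illustrative.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def generate_prompt(task: str) -> str:
    """Ask Claude 3 Opus to draft a detailed prompt for the given task."""
    response = client.messages.create(
        model="claude-3-opus-20240229",
        max_tokens=1024,
        messages=[{
            "role": "user",
            "content": "Write a detailed, effective prompt that instructs an LLM to do "
                       f"the following task. Return only the prompt itself.\n\nTask: {task}",
        }],
    )
    return response.content[0].text

def run_prompt(prompt: str, user_input: str) -> str:
    """Run the generated prompt (as a system prompt) against concrete input."""
    response = client.messages.create(
        model="claude-3-opus-20240229",
        max_tokens=1024,
        system=prompt,
        messages=[{"role": "user", "content": user_input}],
    )
    return response.content[0].text

# Example: generate a summarization prompt, then apply it to some text.
summarize_prompt = generate_prompt("Summarize a technical article for a general audience")
print(run_prompt(summarize_prompt, "<article text here>"))
```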
Anthropic has a Claude prompt library, with useful shared prompts for their models.
Simon Willison observed that Claude 3 should be better at creating prompts than GPT-4, because a newer model will have more material on prompt design for it to learn from. It’s not unique to Claude; any newer AI model should learn from prior prompting techniques.
Beyond-the-Prompt - LLM Workflows
When faced with the challenge of a workflow for AI coding assistance on a GitHub code repo, Pietro Schirano developed a simple but ingenious solution for getting from repo to LLM input: a GitHub repo-to-text converter for LLMs. It turns a GitHub repo into a single text file to feed as a prompt. RepoToTextForLLMs is itself a GitHub repo, automating the whole process:
Not only does it return the full repo in one file, but it also appends a super prompt for analysis and understanding.
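RepoToTextForLLMs’ actual code isn’t reproduced here, but the core idea is simple enough to sketch: walk a local checkout, concatenate readable files under path headers, and append an analysis “super prompt” at the end. The file filter and prompt wording below are illustrative assumptions:

```python
# A rough sketch of the repo-to-text idea (not RepoToTextForLLMs' actual code):
# walk a local checkout, concatenate text files with path headers, and append an
# analysis "super prompt". The extensions and prompt wording are illustrative.
from pathlib import Path

ANALYSIS_PROMPT = (
    "You have been given the full contents of a repository above. "
    "Summarize its purpose, describe the main modules, and point out anything unusual."
)

def repo_to_text(repo_dir: str, extensions=(".py", ".md", ".toml", ".txt")) -> str:
    parts = []
    for path in sorted(Path(repo_dir).rglob("*")):
        if path.is_file() and path.suffix in extensions:
            parts.append(
                f"--- {path.relative_to(repo_dir)} ---\n"
                f"{path.read_text(encoding='utf-8', errors='ignore')}"
            )
    parts.append(ANALYSIS_PROMPT)
    return "\n\n".join(parts)

# The resulting string can be pasted (or sent via API) as a single prompt.
print(repo_to_text("path/to/local/checkout")[:500])
```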
The repo-to-text approach is a good example of thinking outside the prompt. Whether it is creating a work document, debugging a coding application, or researching a topic, a real workflow goes far beyond one prompt. For AI to solve real problems, AI and its interfaces will have to go outside the prompt as well.
Current LLM interactions are discrete prompt-response iterations, but a real workflow for a specific task goes beyond one prompt to a whole conversation. LLMs often do better when they ask for and get more specific guidance iteratively. Some AI applications take steps toward maintaining state (RAG, memory, etc.) and building solutions beyond a single inference, but there’s more to do.
One solution may be something like the Artificial Intelligence Controller Interface (AICI), a way to control and improve LLM inference more directly in real time:
Controllers are flexible programs capable of implementing constrained decoding, dynamic editing of prompts and generated text, and coordinating execution across multiple, parallel generations. Controllers incorporate custom logic during the token-by-token decoding and maintain state during an LLM request. This allows diverse Controller strategies, from programmatic or query-based decoding to multi-agent conversations to execute efficiently in tight integration with the LLM itself.
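AICI controllers run inside the serving engine itself, so nothing here reproduces its API. As a rough open-source analogy for “custom logic during token-by-token decoding”, though, a Hugging Face LogitsProcessor lets you intervene at every decoding step; the banned-token constraint below is a toy example of constrained decoding, not AICI:

```python
# A toy analogy for token-by-token control (not AICI's actual API): a Hugging Face
# LogitsProcessor that blocks a set of token ids at every decoding step.
import torch
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          LogitsProcessor, LogitsProcessorList)

class BlockTokensProcessor(LogitsProcessor):
    def __init__(self, banned_token_ids):
        self.banned = list(banned_token_ids)

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor) -> torch.FloatTensor:
        # Runs once per generated token: push banned tokens to -inf so they are never sampled.
        scores[:, self.banned] = float("-inf")
        return scores

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

banned_ids = tokenizer.encode(" magic", add_special_tokens=False)  # toy constraint
inputs = tokenizer("Prompt engineering is", return_tensors="pt")
output = model.generate(
    **inputs,
    max_new_tokens=20,
    logits_processor=LogitsProcessorList([BlockTokensProcessor(banned_ids)]),
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```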
The AICI approach is a reminder that controlling LLM outputs is not just an AI interface concern; it’s a control issue. We are limited in how we can control LLM outputs not so much by the limitations of language as by the limits of our understanding of which prompts elicit the best outputs.
Summary
Natural language is a powerful mechanism for conveying user intent; our most natural way of interacting would be to use our own voice to control tools. While natural language is an important part of directing generative AI, it’s not all of it. AI UX is about using natural language, GUIs, and other interfaces to convert user intent into one smooth flow; these elements all work together.
A lack of clarity about how and what LLMs output has led to prompt engineering as an art form: creating the right incantations to obtain the best outputs. It has great utility for current LLMs, but it also has limits.
A recent IEEE Spectrum article, which we mentioned in our most recent AI Weekly, declared prompt engineering dead. Prompt engineering is deemed to be a dead end because newer LLMs go beyond human capability in prompting. AI itself is a better prompt engineer.
We see that trend now, with the latest AI models generating prompts and outputting “magic prompts” to improve upon human input.
AI prompt generation improves AI interaction, but it won’t end there. AI interfaces continue to evolve to make AI user experience better, but real improvement requires better LLM control.
To be effective, AI applications must support specific workflows from start to finish, automating as much as possible to reduce the friction for the whole flow. Ultimately, it’s about making AI intelligence better, having better control over the LLM, and making the interface better integrated throughout the flow.
Thus, future AI models will improve their capabilities not just in reasoning and intelligence, but also in interfaces and controllability. This might include mechanisms for control along the lines of the AI Controller Interface (AICI). This will make for more useful LLMs and allow for smoother AI interfaces.
AI will only get better.