Seven months ago, when Jason Allen used a new kind of generative artificial intelligence (AI) program to win an art competition in Colorado, he was immediately accused of cheating.

At the time, few had used these tools, and few understood what he had done. Text-to-image programs like Midjourney or Stable Diffusion were weeks old. ChatGPT hadn’t been launched.

In the period since, a remarkable change has taken place. The tools have become widely popular, and an industry has sprung up around how to best use them.

Rather than “cheating”, this skill goes by another name: “prompt engineering.”

Already, job sites feature ads for prompt engineers, while Australian marketing companies are asking for “know-how on prompting ChatGPT”. Many experts say knowing how to talk to an AI program to get the best results will be a basic skill, like email or googling.

This new-found appreciation of prompt engineering skills is colliding with interpretations of copyright law. US authorities recently rejected Mr Allen’s application to copyright his prize-winning image, saying “it does not contain any human authorship”.

“I thought we were going to be basking in the celebration of AI,” said Mr Allen, who is based in Colorado, about the image’s creation.

“Instead, I’m kind of like the guy that opened Pandora’s Box.”

So, what is prompt engineering?

And how in-demand will this new skill become?

Learning the words of power

Most AI tools available now either generate text, such as ChatGPT, or images, such as Midjourney.

A few others do text-to-song, text-to-video, and so on.

Bing Chat AI is an exception to this. It is multi-modal, meaning it can generate both text and images from a text prompt.

First up, let’s focus on text-to-image generation. 

To understand how to communicate with the AI program, we need to know a little bit about how it works.

Text-to-image AI programs have been trained on a vast number of images scraped from the internet, along with any associated text.

Images scraped from, for instance, a forum for the Unreal gaming engine, will be automatically tagged with “Unreal” (along with other words, such as their caption).

Because many images have this tag (Unreal is a popular engine, so there are lots of images), the word “Unreal” has “weight” within the AI-training dataset. 

This weight makes the word a powerful prompt for the AI program, generating a certain predictable aesthetic.

Including the prompt “Unreal” generates an image that looks like game concept art.

The same is true for any words related to cinematography, art direction, graphic design and art criticism.

However, there’s no definitive guide or manual to what prompts work, or have what effect. Instead, learning how to speak to the AI is a process of trial and error.

In the months since programs like Midjourney became available, communities devoted to unpicking and sharing the secrets of AI prompts have sprung up online.

One of the largest of these is PromptHero. It started in September and has 150,000 users, of which 10,000 are active.

“I spotted this problem that when you first try to get something done with this, the first thing you usually do is pretty bad,” said its co-founder Javier Ramirez, based in Portugal.

“You need to prompt in the right way to get a high-quality output.”

He referred us to one of the community’s members, a man who lives in the US Midwest and prefers to be identified by his PromptHero profile, JHawkk.

The image above, for instance, was made in Stable Diffusion with a string of 15 prompt phrases, from “analog style” and “Canon EF 50mm f/1.8 STM Lens” to “cyberpunk”.

JHawkk also used 31 negative prompts, describing what the image should not contain, from “disgusting” to “poorly drawn feet”.

The trick to AI art was knowing the right words, JHawkk said. Like an engineer translating a design into mathematical figures, he converts the discrete aesthetic elements of an image (“ray tracing”, “rim lighting”) into the peculiar language of the model.

“Sometimes you see an image, and so you start to break apart the image into smaller phrases,” he said.

“Essentially, it’s how you might describe that image and especially in a way that the actual model itself could interpret that.”
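The process JHawkk describes can be sketched in code. The following Python snippet is purely illustrative: the phrase lists are invented stand-ins (his actual prompt for the image above has not been published in full), but they are assembled the way text-to-image tools like Stable Diffusion expect them, as comma-separated strings of positive and negative phrases:

```python
# Illustrative sketch of assembling a Stable Diffusion prompt from
# discrete aesthetic elements, as JHawkk describes. The phrases here
# are invented examples standing in for his unpublished prompt.

# Positive phrases: the aesthetic elements the image SHOULD have.
style_phrases = ["analog style", "cyberpunk", "rim lighting", "ray tracing"]

# Negative phrases: what the image should NOT contain.
negative_phrases = ["disgusting", "poorly drawn feet", "blurry"]

# Most text-to-image front ends take each list as one comma-separated string.
prompt = ", ".join(style_phrases)
negative_prompt = ", ".join(negative_phrases)

print(prompt)           # analog style, cyberpunk, rim lighting, ray tracing
print(negative_prompt)  # disgusting, poorly drawn feet, blurry
```

In a library such as Hugging Face’s diffusers, these two strings would be passed to the image-generation pipeline as its `prompt` and `negative_prompt` arguments.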

The making of Théâtre D’opéra Spatial

Jason Allen said he would “never” share the string of prompts that generated his prize-winning image, Théâtre D’opéra Spatial, but was willing to talk about the process of creation.

As with JHawkk, he started by learning the right words.

“I wanted to create a cinematic scene like you might see out of a movie,” he said.

“So I’m going online and finding a keyword dictionary of everything related to cinematography. I’m essentially learning to be a cinematographer.”

He spent “weeks” testing different aesthetic elements in Midjourney, until he was confident he could accurately reproduce the images he imagined.

“We’re looking for results coming out of your mind and seeing it in the final piece,” he said.

One of Jason Allen’s other entries at the Colorado State Fair Fine Arts Competition.

Then it was time to introduce the subject.  

“I was in a hypnagogic state, I was half asleep, half dreaming these women in Victorian dresses wearing space helmets.”

He wanted to combine the fashion of 19th-century England’s Victorian era and the romance and melodrama of a Star Wars-like space opera.

He then made many variations on this theme by tweaking the prompts.

All up, he spent roughly 80 hours working on his entry.

Another of Jason Allen’s “space opera” entries in the fine arts competition.

Mr Allen has hired a lawyer and appealed the US Copyright Office’s decision not to award copyright, arguing that it didn’t understand AI is “just a tool”.

“What are you saying? Is it a person? Because it’s not a person. I’m the person,” he said.

“Stop throwing the users under the bus.

“We all have our creative dreams. We all have our ideas. Without that, the AI is nothing.”

AI-whisperer, salary $US335,000

While prompting for image generators involves mixing together thematic elements in an unpredictable way, prompting for text generation is very different. The focus is more on giving a very clear set of instructions.
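The difference shows in the shape of the prompts themselves. Below is a hypothetical sketch of the kind of instruction-style template a prompt engineer might iterate on for a text model; the wording, field names and contract text are all invented for illustration:

```python
# Hypothetical instruction-style prompt template for a text model,
# illustrating how text prompting centres on a clear set of
# instructions rather than a mix of aesthetic keywords.
# All wording and field names here are invented for illustration.

TEMPLATE = """You are a contract analyst.
Task: {task}
Document:
{document}
Answer in plain English, citing the relevant clause numbers."""

def build_prompt(task: str, document: str) -> str:
    """Fill the template with a specific task and document text."""
    return TEMPLATE.format(task=task, document=document)

prompt = build_prompt(
    task="List any termination clauses and their notice periods.",
    document="Clause 12: Either party may terminate with 30 days' notice.",
)
print(prompt)
```

Refining the template’s instructions and testing how the model responds to each variation is, in essence, the trial-and-error work the job ads are describing.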

In London, the law firm Mishcon de Reya is after a prompt engineer, able to “design and develop high-quality prompts for a range of legal and non-legal use cases”.

Nick West, its chief strategy officer, said one “use case” might be analysing contracts.

“The combination of the breakthroughs of GPT-4 and the chat bit of ChatGPT suddenly make this a very attractive space.

“I do believe we will be able to do stuff that we wouldn’t otherwise have been able to do.”

But first, the firm needs a prompt engineer.

“It turns out there are better ways of writing the prompt and less good ways of writing that prompt.

“That’s what prompt engineering is.”