Before you sign up for that prompt engineering course, check out our guide on AI image generation.
If you’re curious about AI art, you’ve probably seen the iconic AI pictures of Donald Trump getting arrested and Pope Francis strutting in a white puffer jacket. You might have also seen Théâtre d’Opéra Spatial, an AI-generated artwork that won a digital art competition in 2022. All these images look so flawless, it’s nearly impossible to tell AI made them. But what if your prompts, no matter how hard you try, never seem to deliver similar results? Well, if that’s the case, this guide can take your prompting skills (and your AI generations) to the next level.
As painfully boring as it might sound, we’re going to start with a bit of theory behind AI image generation. What you need to know is that most image generators available today are based on diffusion models. Diffusion models are trained on large datasets like COCO and LAION that pair images with text descriptions.
During training, diffusion models gradually add noise to these images and then learn to remove it, trying to restore the original picture. Once the training is complete, they can build images up from noise guided by a text query. They first create a canvas full of noise and clear it step by step to produce an image that closely fits the text description. Essentially, they are trained to understand what, say, ‘bunnies in a flower garden in front of a French country house, sunset in the background’ looks like and synthesize a similar image once given a text query with matching words.
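The training-and-generation loop above can be sketched as a toy example. This is purely illustrative: the four-“pixel” image and the hand-made noise “predictor” stand in for the neural network a real diffusion model uses.

```python
import random

def add_noise(image, steps, rng):
    """Forward process: gradually corrupt the image with small amounts of noise."""
    noisy = list(image)
    for _ in range(steps):
        noisy = [p + rng.gauss(0, 0.1) for p in noisy]
    return noisy

def denoise(noisy, target, steps):
    """Reverse process: clear the noise step by step. A real model predicts
    the noise with a neural network conditioned on your text prompt; here
    the 'prediction' is simply the gap to a known target image."""
    image = list(noisy)
    for _ in range(steps):
        image = [p + (t - p) / steps for p, t in zip(image, target)]
    return image

rng = random.Random(42)
original = [0.2, 0.8, 0.5, 0.1]                  # a tiny 4-"pixel" image
noisy = add_noise(original, steps=20, rng=rng)   # training corrupts images...
restored = denoise(noisy, original, steps=20)    # ...and learns to restore them
```

After enough of these restore-the-image exercises on millions of captioned pictures, the model no longer needs the original: it can start from pure noise and denoise toward whatever the text query describes.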
Every comma-separated piece of the text query you feed the image generator is called a prompt.
Before you get to experimenting with text-to-image generation, it’s important to make sense of some of the most important terms. While some AI image generators are incredibly straightforward and only have the prompt field for you to interact with, others have additional settings you can tweak.
Seed — a number that identifies a generation. Every time you generate a new image, the AI picks a new random seed. You can save the seed and reuse it in further generations so that the AI treats it as a starting point.
Model — for AI generators that offer multiple generation models to experiment with, this parameter lets you choose the ‘brain’ behind your generation. Some models are more advanced and great for crisp, accurate results, but may only be available in the Pro versions. These often include Absolute Reality v1.6, Stable Diffusion v2.1, and SDXL 1.0. Other models are usually older and less precise, but most of the time come totally free.
Negative prompts — basically, this is reverse prompting where you tell the AI what you don’t want to see in your image at all. When you create an image with prompts, there is a certain degree of randomization in every generation. The AI adds the missing details on its own if you never described them in your prompt. By using negative prompts, you can limit the randomization and make sure a certain element never pops up in the final image.
Guidance scale/prompt strength — how free you allow the AI’s creativity to flow when working with your prompt. Along with negative prompts, this is a way to get more precise results when generating images with AI. The higher you set this value, the more religiously the image generator sticks to your prompt; lower values let it run a little wild and add a bit of randomization magic.
Rerun — creating an entirely new image with a different seed based on the same prompt and generation settings.
Remix — creating a new version of an image based on that same seed. The AI saves the seed and the settings you used earlier, allowing you to tweak the prompt and the generation parameters. If you don’t change the setup entirely, you will get a close match to the image you choose to remix.
Generate variations — creating more versions of the generation result you found closest to your original idea. Variations shouldn’t be confused with a remix: they will only slightly differ from the image you pick, so if that image is not too close to what you prompted, a rerun could be a smarter option.
Reference image — an existing image that is close to what you want the result of your generation to look like. It serves as an additional prompt and can be immensely helpful when you have a complex, highly specific query.
Steps — in diffusion models, the number of denoising iterations the AI performs while refining the image. In simpler terms, steps define how many passes the AI takes before it gives you the result of a generation. The more steps you tell it to take, the more refined the image will likely be. On the downside, more steps slow down the generation process.
Sampler — an algorithm that tells the AI how to remove the noise from the image and produce the final version.
Prompt weight — how much emphasis you want the AI to put on each keyword. By default, the AI model gives all keywords the same weight, with a slight prevalence of the words that come first in your text. The more weight a prompt has, the more prominent it will be in the final image.
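Seed, rerun, and remix are easiest to see in code. Below is a toy stand-in for a generator (the `generate` function and its fake “pixels” are our own illustration, not any real tool’s API): the seed fixes the starting noise, so the same prompt, seed, and settings always reproduce the same image.

```python
import random

def generate(prompt, seed=None, steps=20):
    """Toy generator: output is fully determined by prompt + seed + settings."""
    if seed is None:
        seed = random.randrange(2**32)            # no seed given: a fresh rerun
    rng = random.Random(f"{prompt}|{seed}|{steps}")
    image = [round(rng.random(), 3) for _ in range(4)]  # stand-in "pixels"
    return image, seed

img, seed = generate("silver unicorn in a purple desert")
same, _ = generate("silver unicorn in a purple desert", seed=seed)
remix, _ = generate("silver unicorn in a purple desert, rainbow", seed=seed)
```

Here `same` matches `img` exactly, while `remix` keeps the saved seed but tweaks the prompt, which is what the remix workflow does in real tools.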
Though prompting slightly differs between various image generators, the core principles of creating commands for AI are universal. In short, your prompt has to be clear, descriptive, grammatically correct, and as concise as possible. While most AI tools suggest that a highly specific prompt is the best option, it doesn’t mean you should overload your prompt with unnecessary details.
Here is an example of a prompt that will probably get you nowhere: “Two young Japanese ballerinas with black lace masks on, with a low bun hairdo, with red glitter roses on their necks, wearing red ballet dresses with embroidered black roses on the skirts, with black ballet shoes on dancing on a theater stage, bright red bottom lighting on their faces, thick black and red rose bush decorations in the background behind the ballerinas, dark castle prop in the background, professional 8K photography”.
A text this lengthy is too confusing for the AI, and it will inevitably drop, distort, or misinterpret some of the details. There are too many distinct objects and too many inter-object relations described within one query.
That said, here are some more things to keep in mind when prompting if you want to get an accurate result.
If you have a clear idea in mind, you need to make sure your prompt is not vague or misleading. The best strategy is to be precise without obsessing over the tiniest details. “Blue butterfly on a sunflower in the cornfield on a sunny summer day” is specific enough and not too elaborate.
Adjectives and adverbs can help the AI figure out how you envision the final image. If you don’t use descriptors and define the characteristics of your objects, the generator will use randomization to fill in those gaps. Add adjectives and adverbs to define the color, size, positioning, and even vibe of the objects in your image to get better results.
The default style for most generators is photorealistic. To get a different style, you can choose one of the presets the AI platform offers. Those often include watercolor, hyperrealism, 3D render, cartoon, pencil drawing, and other styles. Plus, if the list of presets doesn’t have what you need, you can describe the style you want in your prompt. Though not always 100% accurate, AI knows how to imitate different styles, so you can try ‘gta style’, ‘pixar animation style’, ‘paul gauguin style’, and everything in between.
As we explained earlier, every word in your query has its weight. Unless you are using a generator like Midjourney where you can define the exact weight of each prompt, the AI treats the first words as the more important ones. This is why it’s better to describe the main objects in a composition first and save the secondary details for later.
Look how different the results can be, depending on the word order. For the first image, the query was the following: ‘wolf pack on a cliff with a dense green wood in the background at night, stars shining’. For the second one, we modified it and went with ‘cliff in a dense green wood at night, stars shining, wolf pack’.
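For generators that do support explicit weights, Midjourney uses the `::` multi-prompt syntax: `::` splits the query into concepts, and a number right after it sets the weight of the part before it (no number means weight 1). Here is a simplified parser sketch; the function name and the edge-case handling are our own assumptions, not Midjourney code.

```python
import re

def parse_weights(prompt):
    """Split a Midjourney-style multi-prompt into (text, weight) pairs,
    e.g. 'wolf pack::2 cliff' -> [('wolf pack', 2.0), ('cliff', 1.0)]."""
    parts = prompt.split("::")
    result = []
    for i, part in enumerate(parts):
        if i == 0:
            result.append([part.strip(), 1.0])
            continue
        # a leading number is the weight of the *previous* concept
        m = re.match(r"\s*(-?\d+(?:\.\d+)?)\s*(.*)", part, re.DOTALL)
        if m:
            result[-1][1] = float(m.group(1))
            rest = m.group(2).strip()
        else:
            rest = part.strip()
        if rest:
            result.append([rest, 1.0])
    return [(text, weight) for text, weight in result]

parse_weights("wolf pack::2 cliff in a dense green wood")
# -> [('wolf pack', 2.0), ('cliff in a dense green wood', 1.0)]
```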
While AI can be merciful and let a minor typo slip without distorting the results, your choice of words and grammar matter a lot when prompting. Avoid ambiguous word combinations and confusing word order. If you use a prompt like ‘joyful boy walking a dog holding a bone’, the AI won’t know who exactly is supposed to be holding a bone and might give you the result you didn’t expect. Instead, try something like ‘dog carrying a bone, on a walk with a joyful boy’ or ‘joyful boy walking a dog that is carrying a bone in its mouth’. Clarification and compartmentalization can help the AI figure out the composition better.
It’s great if you’re open to whatever the AI might produce for you, but single-word prompts will probably give you nothing exciting. Most AI image generators need at least around 3–7 words to create a coherent composition. Look what you can get by using a 6-word prompt instead of writing a single word.
You can change the vibe of the whole composition by using adjectives describing the ambiance of a scene. If you want to make it look bubbly, fun, and uplifting, you can use prompts like ‘playful mood’, ‘dynamic composition’, ‘joyful atmosphere’, ‘vibrant’, and others. In case you’re in a mood for something gloomy or mysterious, you can add characteristics like ‘dark atmosphere’, ‘melancholic vibe’, ‘horror scene’, ‘apocalyptic’.
Do you want a landscape scene? A wide-angle shot? A closeup macro photograph? Or maybe a 3/4 portrait? You can tell the AI all this via prompts. If you don’t describe the desired composition in your prompts, the AI will figure out the most fitting one based on the rest of the text. Most of the time, the default setting is a front-facing photo for humans and animals, a closeup shot for insects and plants, and a traditional triangular composition for landscapes, interiors, and other non-portrait images.
By telling the AI what viewing angle you need for your image, you can create true pieces of art. You can experiment with keywords like ‘gopro shot’, ‘wide-angle picture’, ‘captured from above’, ‘view from below’, ‘profile portrait’, and others.
This is one of those priceless pieces of advice you won’t find in any course on prompt engineering. If you have a complex prompt that might be a hit or a miss, try it on a free AI image generator first. You will see which parts of your query the AI gets right and which ones need a bit of clarification and fine-tuning. With a free generator, you can adjust your prompt without worrying about wasting your precious 1.5 welcome tokens on poor results.
Experiment with compartmentalization using commas, try a different prompt order, add an adjective or two for clarification, drop the unnecessary details, etc. Once you are happy with the outcome, you can take your prompt to a different generator.
If you really want to make high-quality AI art that you can use for your projects, get ready to nerd out. Because each AI image generator has a slightly different toolset, it’s important to understand how to use it wisely. They might run on similar generation models, but it doesn’t mean they all follow some universal image generation formula.
Study the FAQ page to make sense of a generator’s custom terminology and generation settings. Check out the knowledge base with examples of prompts from the user community. Go through the gallery on the home page to get inspiration and take a glimpse into how other people prompt. This might be surprisingly insightful and help you create otherworldly images like it’s nothing.
Even though the name artificial intelligence suggests that every AI tool should be smart enough to perform any task flawlessly, this is not the case. Each image generation platform has its strong suits and weak points. Some of them are great at imitating famous artists’ techniques and creating stylized 2D artworks. Others are impeccable for hyperrealistic portraits and sceneries. As a result, the same prompt will give you drastically different results across various AI image generators, even those based on the same model.
The good news is that you don’t have to go through dozens of AI image generators just to see which one suits your needs. We tried out 8 popular image generation tools to see how they perform with 3 different queries. Most of these are either completely free or offer a generous amount of free tokens.
We wanted to see how good different generators are at creating photorealistic images, stylized pictures, and artworks that imitate particular styles. We also aimed to find out how well they perform when creating humans, animals, and inanimate objects. Finally, we decided to see how well AI tools work with atypically elaborate prompts that include abstract ideas.
With that in mind, we came up with the following queries:
Bing Image Creator is based on DALL-E, which is currently one of the strongest diffusion models. Bing is one of the most powerful image generators in terms of getting every tiny detail right. The accuracy of the output is so high that most of the time it’s not necessary to adjust the prompt. Where other generators struggled to make sense of the prompt, Bing got the composition right on the first try. Bing Image Creator is especially great at imitating art created with various media like soft pastels, watercolors, and others.
With multiple Stable Diffusion models available to its users, Nightcafé is a great AI image generation tool. If you are not familiar with AI image generators yet, you can use the basic mode where you only need to choose a model and a style preset. In case you are an advanced AI user, you can switch to the advanced mode to fine-tune your query. The unicorn we got from Nightcafé is arguably the best one of all. The ghost, however, got the AI confused at first so we had to add ‘white ghost’ for better results.
Unlike other AI tools that are based on advanced generation models, Craiyon is powered by DALL-E Mini. This model uses a training and generation algorithm similar to DALL-E’s. However, OpenAI did not develop DALL-E Mini, so the difference between the generation results is huge. Though Craiyon is great at understanding prompts and combining them to create a coherent image, the results lack precision and cleanliness. Even after you hit Upscale, the quality stays mediocre at best. But hey, it’s open-source and totally free, so you win some — you lose some.
One of our favorite tools so far, Mage is a true all-in-one. It’s free, versatile, customizable, and quite fast when it comes to image generation. In the free version, you can choose between SDXL (the default model), Stable Diffusion v1.5, and Stable Diffusion v2.1. You can tweak numerous advanced generation settings like diffusion steps and scheduler for more precision. Mage crushed every single query on the first attempt and only had trouble with freckles. Like other Stable Diffusion-based models, it either ignored the freckles entirely or made them look like dirt on the girl’s face.
Based on Stable Diffusion like many other AI tools on this list, DreamStudio is perfect both for photorealistic and stylized images. It needed a few prompt adjustments, however, to deliver the desired results. The unicorn was the hardest for the AI tool to get right. Though it nailed the silver body and the purple fantasy desert, it refused to add a rainbow in the background. Plus, some of the unicorns ended up having no horn or black eyes instead of red ones.
Leonardo is one of the most talked-about AI tools for a reason. While not entirely free, it gives you 150 free tokens that recharge every 24 hours, which is a fairly generous offer. Aside from the usual Stable Diffusion v1.5 and v2.1, it offers other models that are awesome both for photorealism and stylization. We tried Absolute Reality for the girl and the unicorn, and Leonardo Diffusion for the ghost. Like many others, it couldn’t quite get the ghost right at first and forgot to add a rainbow behind the unicorn. Other than that, prompting was a breeze.
Clipdrop from Stability AI is an amazing free AI tool based on Stable Diffusion. The processing time can be frustratingly long, but you get 4 versions of an image in each run. You can control the style of your generation both through style presets and prompting. Because it’s Stable Diffusion, we ran into the very same problems when generating our images: 1) no freckles or dirt-like freckles, 2) no rainbow behind the unicorn or black eyes instead of red ones, 3) no ghost or an awkward ghost-like blob when simply prompting ‘ghost’.
The famous AI powerhouse that took over the AI image generation scene was not as easy to work with as we hoped. We had no trouble generating the ghost and ended up playing around with the prompt a little just for fun. To make it different, we prompted ‘cute ghost’ and Midjourney hit the spot on the first run. The girl was a bit of a puzzle because apparently, freckles are a great mystery of human anatomy. Plus, the flower field turned into a wheat field. The unicorn, however, was the hardest to get right and the most exhausting to reprompt. A detail or two was always lacking, and there was not a single generation where everything was in place. Frankly, at a certain point, we gave up on a prompt this complex. To be fair, Midjourney never fails to deliver crisp, high-quality images.
Whether you are looking to create AI art for fun or for further use in a design project, it can be hard to put your ideas into prompts. If you don’t know where to begin, start by browsing other users’ latest creations to get inspired. Nearly every AI tool has a huge gallery of images generated by the users. You can also go through AI-related hashtags on social media to take a peek at what others make. Type in #AIart, #Midjourney, #generativeart, or other hashtags. Finally, you can try some of the prompts we came up with to warm up your imagination:
Don’t forget to tag us in your AI arts with the #Icons8 hashtag if you use these prompts!