How to Use AI to Create Images of Anything You Can Imagine
Mere months ago, if you wanted to create a picture of something, you had to be able to sketch, paint, or use one of the photoshopping tools others keep talking about. In 2022, though, everything changed, all thanks to AI. Yes, as in “artificial intelligence.”
Instead of trying to dominate the world, artistically inclined AI tools can turn anything you describe to them into an image.
Come with us as we enter the world of AI-powered text visualization, and see how you can use such tools to convert your thoughts into actual pictures by merely typing what you have in mind.
DALL-E: The Artistic Side of OpenAI’s GPT-3
The first AI-powered tools that became popular were based on OpenAI’s GPT-3. One of the reasons was the project’s openness to external access, which led to some suggestions that GPT-3 is the future of creative work.
Today, you can use the official tools on OpenAI’s beta site or third-party solutions that take advantage of its linguistic superpowers. For example, you can ask GPT-3 to come up with a draft for a post, answer simple questions, or even revise or translate some text.
In 2021, OpenAI revealed that GPT-3 could be equally good at crafting images. The DALL-E project, whose name is a play on Pixar’s WALL-E movie and Salvador Dalí’s name, uses a version of GPT-3 not for working with text but as an image-making engine.
Just like with GPT-3 and text, DALL-E isn’t really a creative genius, materializing images out of thin air. Instead, it’s been “trained” on millions of images that already exist online. Its AI powers lie in analyzing those images, taking elements from them, tweaking, morphing, adjusting, and finally combining them into new imagery.
At least, that’s a simplified version of what happens in the background. Most people will only care for what they see in front of them, and that’s a text box where you can type something and see it turned into an image after a few minutes.
Google’s Imagen Answer
Google is one of the top three “players” in AI research. Still, its progress isn’t easily perceivable, nor are its implementations in products as accessible as OpenAI’s offerings.
One of Google AI’s first widely available implementations was in Google Docs and Gmail, in the form of more intelligent auto-complete and suggestions, known as Smart Compose. We won’t dive into details since we’ve previously covered Smart Compose (and how you can use it).
When those features are active, Google’s web apps compare what the user is typing to what millions of others wrote in the past. Then, they suggest what those people typed afterward.
It’s proof that despite what we like to believe, we’re not that different. If 99 out of 100 people type “later” after “see you,” that’s probably what we’d go on typing, too.
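The idea behind that kind of suggestion can be sketched in a few lines of Python. This is a toy illustration only, not how Smart Compose actually works: the sample phrases and the build_model and suggest_next helpers are made up for this example, and real systems rely on far larger and smarter models.

```python
from collections import Counter, defaultdict


def build_model(sentences, context_size=2):
    """Count which word follows each run of `context_size` words."""
    model = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for i in range(len(words) - context_size):
            context = tuple(words[i:i + context_size])
            model[context][words[i + context_size]] += 1
    return model


def suggest_next(model, typed, context_size=2):
    """Return the most common continuation of the last words typed, or None."""
    context = tuple(typed.lower().split()[-context_size:])
    followers = model.get(context)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical "history" of what other people typed (an assumption for the demo).
history = ["see you later"] * 99 + ["see you tomorrow"]
model = build_model(history)
print(suggest_next(model, "Okay, see you"))  # prints "later"
```

With 99 of the 100 sample phrases ending in “later,” that is the word the sketch suggests after “see you.”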
We’ve all used some form of autocomplete, going back to the T9 predictive text system of the “dumbphone” era. That’s why Google’s AI tools didn’t seem as intelligent as OpenAI’s GPT-3. In use, they felt like little more than a better T9 system improved for the 21st century. And that’s also why Imagen’s reveal was a bit of a shock.
Like a DALL-E on steroids, Imagen is a text visualization tool. Based on what’s available today, Imagen can produce “cleaner” and more vivid imagery while also knowing how to deal with advanced features like diffusion and transparency.
Unfortunately, at the time of writing, access to Imagen remains restricted, so we couldn’t try it out.
DALL-E Mini and Friends: Open for Business
You can’t freely access DALL-E and Imagen—yet. Still, many alternatives are already available if you want to fool around with AI-powered textual image generation.
These are early days, and the results or user experience on offer might be far from optimal, but some of the following are worth checking out.
Making Memes With DALL-E Mini
Thanks to a combination of more-than-adequate results and a user-friendly interface, but more importantly, its wide availability, DALL-E mini became one of the most popular AI text visualizers.
DALL-E mini is far from perfect, and its results can sometimes be more abstract than intended.
Other times it might fail to create what you had in mind but can get pretty close.
After its explosion in popularity, DALL-E mini’s creators moved it into a new home under new branding. Now you can find DALL-E mini’s latest version as Craiyon on its own site.
Using Craiyon today is as easy as searching online for an existing image. You can visit its site, type a description of your picture in its text field, and hit Enter. After a while, you’ll see the results on your screen.
What’s striking is how good Craiyon and similar tools are at mimicking visual styles. For example, we’ve asked it to conjure images of a puppy on a skateboard:
Then, we used the exact phrase but added a “Pixar style” after it. After a while, Craiyon showed a grid of more “cartoony” images, closer to what we perceive as Pixar’s ray-traced graphics in their beloved movies.
Craiyon gave us even better results when we replaced “Pixar style” with “anime style” in the same prompt.
Anime is more stylized in its appearance than Pixar’s more realistic imagery, which seems to have helped Craiyon produce some almost ready-to-use images.
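If you want to compare styles the way we did, it helps to keep the base description identical and change only the style suffix. Here is a minimal sketch of that approach; the styled_prompts helper is our own invention, and joining the style with a comma is an assumption rather than a Craiyon requirement. You would still paste each line into the text field yourself.

```python
def styled_prompts(base, styles):
    """Return the base prompt plus one variant per style suffix."""
    return [base] + [f"{base}, {style}" for style in styles]


# The same description, once as-is and once per style.
for prompt in styled_prompts("a puppy on a skateboard", ["Pixar style", "anime style"]):
    print(prompt)
```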
Fooling Around With Latent Diffusion
The Latent Diffusion model trained on the LAION-400M dataset is another interesting AI text visualizer. However, it’s also more complicated to use. You must run it online in a virtual machine and play with its various parameters instead of merely typing in a text field. Still, it’s easier than it sounds.
- Visit the Latent Diffusion notebook on Google Colab that’s currently its home.
- Scroll a bit down and notice the Prompt field under Parameters. Replace the default prompt with what you want your image to depict.
- Choose Run All from the Runtime menu, or press CTRL + F9.
- If you want to be able to export the produced images directly from within the tool, answer positively when asked if you want to link it with your Google Drive account. The tool takes a while to complete its configuration and needs to download some files during the process.
Increasing the values for Steps, Iterations, and Samples_in_parallel may lead to more detailed results. However, the tool is extremely demanding on the resources of Google’s servers. As a result, if you increase those values too much, it may crash, or the process of creating a particular image may become more complicated than expected.
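To get a feel for why raising Steps, Iterations, and Samples_in_parallel quickly becomes expensive, here is a rough back-of-the-envelope sketch. It assumes, as is common for diffusion-based generators, that a run produces Iterations times Samples_in_parallel images and that the work grows with the number of steps per image. The estimate_workload function and the sample values are ours, not the notebook’s, so treat the numbers as relative rather than exact.

```python
def estimate_workload(steps, iterations, samples_in_parallel):
    """Roughly estimate how many images a run makes and how much work it takes.

    Assumption: every image needs `steps` refinement passes, and a run
    produces `iterations * samples_in_parallel` images in total.
    """
    images = iterations * samples_in_parallel
    refinement_passes = images * steps
    return images, refinement_passes


# Hypothetical settings: a modest run versus a much heavier one.
print(estimate_workload(steps=50, iterations=1, samples_in_parallel=3))
print(estimate_workload(steps=200, iterations=2, samples_in_parallel=6))
```

Under these assumptions, the second run makes four times as many images but needs sixteen times as many passes, which is the kind of jump that can push a free virtual machine past its limits.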
We’ve spent a significant amount of time testing DALL-E mini and Latent Diffusion. Our scientific method consisted of two distinct parts. First, we came up with concepts that could be accurately described as bonkers. Then, we asked those AI visualizers to turn them into images. More often than expected, they succeeded, coming close to the general setup we had envisioned.
We’ve also tried some of the available alternatives for this article. We’re still waiting for access to others. Some of the ones worth checking out are (in no particular order):
Will AI-Generated Art Replace Visual Arts?
The abundance and continuously increasing popularity of image-generating AI-powered tools lead many to conclude that visual arts will soon die. What’s the point in investing the time and energy to learn how to draw or use complicated software to visualize things when an AI can do it quicker (and soon better) than you?
As you may have noticed, those tools are all “trained on datasets.” In plain English, this means that they do what they do thanks to humans already having done the same thing before.
That’s the hint as to why those tools can’t replace human artistry, creativity, and ingenuity. They’re mimics, smart replicators. Without the human-made originals on which they’re trained, they wouldn’t be able to produce any output.
Still, that’s the present, and we admit we don’t know what the future holds. For now, visual artists can sleep soundly. At the rate AI is evolving, though, many specialists on the topic agree it’s not a matter of if it will replace the work of people like yours truly. It’s only a matter of when.
But hey, it’s not all doom and gloom. While Skynet prepares to take our jobs, at least we can brighten our mood by effortlessly creating images of puppies on skateboards!