Introduction
This post was inspired by this YouTube video from Matt Wolfe:
Matt is one of my favourite YouTubers for looking at some of the more commercial end-user applications of AI. In this round-up of current AI image-generation tools, he works through each one and makes some comparisons.
The video is really good, but the thing that surprised me was that even though I’ve been following generative AI since before it blew up online, there were quite a few tools I hadn’t heard of. There is fierce competition in this space at the moment and it’s pushing the competitors to innovate to try and give themselves an edge.
So, in no particular order, here is a list of the tools and models that Matt covers in the video, and where you can find them online.
Midjourney
Description
Midjourney is one of my favourite AI tools. I’ve had a subscription since soon after it launched and I’ve had a lot of fun with it. It’s been fascinating to watch it evolve and become a more practical tool, but it still generates plenty of hilarious results as well.
Pros for Midjourney would be that it can do a wide range of image styles and has good options for things like image resolution and aspect ratio. It also has a good range of image manipulation options, and you can upload your images to use as a starting point.
Cons would be that its interface is still embedded in Discord, which is pretty annoying, it can’t do text in images yet, and the free version only gives you a few image credits before putting you in the slow lane.
They have recently released an interesting new headline feature called ‘Style Tuner’. You go through a process where you compare images and select the ones that you feel better capture the style of the result you wanted. At the end of the Style Tuner process, it gives you a code that you can then use with other prompts to get more results in a similar style. This is a great idea and makes it much easier to work with Midjourney if you’re trying to illustrate a document, presentation or other extended work where it would be good to have some consistency in the style of the images.
Link
Dall-E 3
Description
So Dall-E 3 is the third generation of OpenAI’s Dall-E image generation model. It’s one of the most powerful image generators and benefits from integration with their language models. This means you can put in a fairly simple prompt and it will be embellished by the LLM into something more ornate that will get a better result.
The model is currently hosted in two different tools - it’s available in ChatGPT for users with a paid subscription, and it’s also available in Bing Image Creator from Microsoft, which makes a generous number of free image generation credits available to users.
Performance in the two hosting apps may be a little different as they use different system prompts.
They’re both great tools, and Dall-E 3 is top tier as far as the quality of the images it can generate goes, but the tools do lack some of the functions that longer-established tools offer to control the output or tweak the results.
Links
Leonardo AI
Description
Leonardo AI is a powerful AI image generation tool based on a fine-tuned version of SDXL, the open-source Stable Diffusion model from Stability AI. In this way, it’s similar to Midjourney, and possibly because of this, they both have some of the same functions, such as being able to use guide images to control the output rather than just a text prompt.
Leonardo also offers a tool for prompt generation and several alternative ‘image pipelines’ which seem to be either slightly different fine-tuned models or possibly a combination of that with some different post-processing steps to create different types of images.
Link
Firefly 2
Description
Adobe is still a dominant force in the creative software space, and it makes complete sense for them both to have image generation tools and to integrate generative technology into their existing products. They’ve been using AI for years in features like content-aware fill, noise reduction and sharpening.
Their Firefly 2 image generation is powerful, and the features it’s able to offer in packages like Photoshop are impressive. I’m not sure if it’s the absolute best model for image generation at the moment - it’s very good, but perhaps not quite as interesting as some of the others. I don’t think that’s going to matter though, as it has to be the professional tools that are going to differentiate them. This is stuff that is designed for professional graphic designers, illustrators and photographers to include in their workflow.
Link
Google
Description
Google has added an image generation function to Google Search, but the results are not comparable to the other tools available at the moment. Google is doing very well in terms of lab research, but they’re not managing to turn any of that into compelling products for consumers right now.
Ideogram AI
Description
I don’t know much about Ideogram AI, apart from the fact that it got some hype going for a while. The example images on their homepage are pretty good, but when I tried a few prompts the image quality wasn’t great. Images were a little fuzzy and often had strange elements that seemed badly formed. A year ago this would have been incredibly impressive, but it doesn’t hold up against the other tools on this list.