
How to Create Nearly Anything Using Midjourney

October 2024

First off, everything I’m about to describe is already discussed in the Midjourney Docs and all over the internet. I highly recommend reading through the Docs because they are short and extremely helpful.

There are two main ways to use Midjourney: the Midjourney website and Discord. I’ve been using MJ since the beginning, so I’ve had a private Discord server set up for Midjourney for a while now. If this sounds complicated, it’s actually incredibly simple: you sign up for Discord, create your own server, sign up for Midjourney, and message Midjourney on Discord. Now you have a private Discord server set up with Midjourney. If you have any issues with this, there are a ton of tutorials online to walk you through it.

Once you’re in Discord, you message the Midjourney app: start your message with /imagine, type your prompt, hit enter, and it will give you four images to choose from. You pick which ones you want to Upscale or make Versions of. This is all laid out online as well.

In general, I use Discord to create everything because it’s what I’m used to and I’m quick at it. I use the website to look through my outputs and search for images because it’s more convenient than Discord. I’m pretty sure all of the necessary “Midjourney code”, which we’ll get to in a second, works just the same on the website as it does in Discord, but I haven’t really experimented. So going forward, all tips and tricks were done in Discord.

Also, “authentic” images refer to any kind of image that comes from a real life source: a photo of a human being, a scan of an illustration, a screenshot of an animated movie. “Synthetic” images refer to any kind of image created using AI.

Prompts & Resources

There’s a lot of focus on prompts when it comes to using AI and there are many online resources dedicated to great prompts. Simply Google “Midjourney prompts” and you’ll find endless options.

I also highly recommend checking out the work of two people: Nick St. Pierre and Tatiana Tsiguleva. A quick scroll through their Twitter feeds will teach you more about Midjourney than a month of playing with the app.

Nick breaks down prompt structures, code, and is constantly experimenting with new MJ features so he keeps you on the cutting edge. Tatiana is a master of style references and has created some of the most breathtaking AI imagery to date.

Now that we’ve got all that covered, let’s get into some of the fun stuff.

Midjourney Code

Beyond prompts, you can use some very basic “code” to refine your images in very powerful ways. Again, the Docs and Nick St. Pierre go over all of this, but I’ll give you a quick summary below for the ones I use most often.

First off, the way you add “code” to your prompt is by simply typing your prompt and, at the end, adding a space followed by “--” (two hyphens) and then the code.

For example:

A beautiful sunset mountain view in the winter time. --ar 16:9

As you can see, the prompt is followed by “--ar 16:9”, which gives the image a 16:9 aspect ratio. You can stack these “codes” up as well.

A beautiful sunset mountain view in the winter time. --no trees --ar 16:9 --style raw --v 6.1

The Midjourney Docs have all of these listed out, and they are so fun to play with. If you ever make a mistake, you get a big scary red warning that is actually harmless and tells you what you did wrong.

But what we’ll be concentrating on today is reference images.

Reference images

Prompts can get you very far and usually do the trick, but I’ve found that using reference images is a surefire way to get what you want. The biggest asset you can have when creating good images in Midjourney is a good reference image. A lot of times, you need multiple reference images to create a single image.

Midjourney uses two main kinds of reference images: Character references and Style references. Character references are used to create a similar character from image to image. Style references are used to reference the style of imagery you want to create.

The way you “use” Character and Style references is by linking the reference images in your prompt using Midjourney code. To create a Character reference, you add “--cref {link to image}” at the end of your prompt. A Style reference is similarly “--sref {link to image}”.

A 3D Pixar-style illustration of an orange kitty. --cref {https://cdn.freecodecamp.org/curriculum/cat-photo-app/relaxing-cat.webp}
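A Style reference looks the same; here’s a hypothetical example (the {link} is a placeholder, not a real URL):

A cozy cabin in a snowy forest at dusk. --sref {link to image} --ar 16:9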

The links can be direct links to images online, but I always just create custom links for ease. In your Discord chat with Midjourney, you simply upload an image, right-click it, and copy the image link. When you paste that link into a --cref or --sref, it will be an insanely long, 2,000-character link and your prompt will look all weird and gross. But when you hit enter, Midjourney shortens the link into a nice, short s.mj.run link for you to copy and use later.

Creating reference images

If you want to create something specific using Midjourney, reference images are vital. Reference images are “blueprints” for Midjourney to use when building your image. Finding a good reference image online or scanning your own illustration is usually all you need to complete your projects. Sometimes, however, using authentic images as references can lead to unwanted results. That’s because real-life images are a foreign language to Midjourney, which isn’t helpful when it needs a blueprint to build your image.

In these instances when your reference image just isn’t doing the trick, I’ve found it’s helpful to recreate the reference image within Midjourney to essentially translate your blueprint’s language into Midjourney’s language.

Without getting too scientific here, these AI image creators (known as diffusion models) create things using noise/grain. They start off with a grey image of noise and then begin rearranging all of the pixels a bajillion times (in just a few seconds) until that grey image looks like whatever you asked for. They work primarily with text-based prompts, but proper reference images can help them a lot.

Midjourney works best with synthetic imagery that it has created itself as reference images, because the synthetic grain is arranged in a way it understands. This is why, if you upload a photo of yourself and ask Midjourney to recreate it using --cref, it looks like a sibling you never had and not you: it’s trying to figure out what the hell your image even is. Look at the “Authentic Grain” above and imagine trying to navigate that mess when the “Synthetic Grain” is so organized and easy to follow. That’s also why, if you upload that same image of the sibling you never had and use it in a --cref, it will recreate it almost perfectly.

So where does the “translation” come in? Like all answers in life: ChatGPT.

This is how you do it: upload your authentic reference image to ChatGPT and ask it to describe the image for a text-to-image AI generator. You can tell it to focus on the aspects you care about. For example, if it’s a photo of a person in a park and you want to use it as a character reference, tell GPT to describe the character in detail. If you want to use that photo as a style reference, ask GPT to focus on the style of the photo rather than the subject matter. You can also tell GPT how long you want the description to be. I’ve found a solid 3-5 sentence paragraph does best for photo-real stuff, while cartoonish stuff can be simpler.
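As a concrete (and entirely hypothetical) example, a request to ChatGPT for a character reference might look something like:

“Describe this photo in 3-5 sentences for a text-to-image AI generator. Focus on the person’s appearance, clothing, and pose rather than the park around them.”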

You then upload your reference image to your Discord server and grab the link if you haven’t already, take the description GPT gave you, and add your reference image as a style or character reference link. The AI-generated prompt and Midjourney work together to translate the blueprint so you can now build whatever you’d like.

Example of the process

The goal was to create a realistic-looking surveillance photo and then use that as a style reference to create more fake surveillance photos for a video game.

First, I needed to find an authentic reference image. I found this one of Whitey Bulger on Wikipedia.

Full disclosure, using this photo directly as a reference image produced awesome results! However, MJ clearly struggled when I asked it to change the location, etc. So I continued using my process…

I uploaded the Whitey Bulger photo to ChatGPT and asked it to describe the photo for me. I then took what GPT described and added --sref {link to the Whitey Bulger photo} at the end.

Surveillance-style photograph capturing two men outside a brick house. One man, wearing a blue bathrobe, stands near the entrance, looking at the other man, who is dressed in a light jacket and brown pants. They appear to be engaged in conversation. The scene includes a car parked in the foreground and two wall-mounted lanterns on the brick exterior near the door and window. The photo has a candid, slightly grainy quality, evoking the look of covert surveillance photography. --sref https://s.mj.run/9uiDNG9bIZY --ar 2:1

The photo below is what the prompt above generated.

As you can see, the Whitey Bulger photo produced an AWESOME image. Even better, we now have a “translated blueprint” to use for all surveillance style references going forward. And we can take it a little bit further. If we have a character we wanted to spy on, we can mix character references and style references.

Here’s the character I want to “add” to this image above. Let’s call him Marcello. Marcello was created in Midjourney so he’s a good “blueprint” for a character. If I wanted to use an authentic character, I would need to recreate that character in Midjourney for best results.

All I did was upload this image to ChatGPT and asked it to add Marcello to the description.

Then I got to use some fancier Midjourney code: you can “weight” references, meaning you can lean on one reference image more heavily than another. Character weights (--cw) run from 0 to 100. Style weights (--sw) run from 0 to 1000. It takes a bit of experimenting to understand how weights behave, but once you get the hang of it, it’s second nature.
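As a hypothetical illustration of the syntax (both links are placeholders), weighting a style reference heavily while keeping the character reference loose looks like this:

A man walking down a city street. --cref {link to character image} --cw 30 --sref {link to style image} --sw 1000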

For example, if I wanted to add Marcello to the surveillance photo and gave it a character weight of 100, it would likely try to add in some kind of white background since it’s desperately trying to mimic my character reference 100%. Because of this, I will use a character weight of 30 and a style weight of 1000 since I’m really going for the surveillance look and Marcello is only a part of that.

So with this prompt below:

Surveillance-style photograph capturing two men outside a brick house. The man on the left is wearing a brown leather jacket over a tan shirt, with a broad, muscular build, and sporting a serious expression. He stands near the entrance, looking at the other man, who is dressed in a blue jacket and black pants. He has long hair pulled back in a pony tail. The two appear to be engaged in conversation. The scene includes a car parked in the foreground, bushes, and two wall-mounted lanterns on the brick exterior near the door and window. The photo has a candid, slightly grainy quality, evoking the look of covert surveillance photography. --sref https://s.mj.run/gr8iE3bR7mc --cref https://s.mj.run/Ntss5O_ObdY --ar 16:9 --cw 30 --sw 1000

You get something like this:

You can create anything

So that’s basically the process.

Find a reference photo, get a good description of it using GPT, adjust it to your needs, use it as a reference link, and start generating. If things don’t go the way you like, try recreating the reference image in Midjourney and use the synthetic image as a reference.

Marcello Action Figure

Prompt

Product photo, white background. 4 inch tall plastic, posable action figure of a large white male, mobster, wearing a brown leather jacket, yellow dress shirt, brown dress slacks and brown shoes. --cref {link to Marcello image} --style raw

Marcello Simpsons

Prompt

Simpsons style animation of a large white male, mobster, wearing a brown leather jacket, yellow dress shirt, denim blue jeans and brown shoes --cw 5 --style raw --cref {link to Marcello image}
