May 23, 2024
Midjourney's new style tuner is here. Here's how to use it.



Midjourney is one of the most popular AI art and text-to-image generators. It produces high-quality photorealistic and cinematic works from prompts typed in plain English, and its output has already wound up on TV and in cinemas (as well as on VentureBeat, where we use it along with other tools for article art).

Conceived by former Magic Leap programmer David Holz and launched in the summer of 2022, it has since attracted a community of more than 16 million users in its server on the separate messaging app Discord, and has been steadily updated by a small team of programmers with new features including panning, vary region and an anime-focused mobile app.

But its latest update launched on the evening of Nov. 1, 2023 — called the style tuner — is arguably the most important yet for enterprises, brands and creators looking to tell cohesive stories in the same style. That’s because Midjourney’s new style tuner allows users to generate their unique visual style and apply it to any and potentially all images generated in the application going forward.

Before style tuning, users had to repeat their text descriptions to generate consistent styles across multiple images — and even this was no guarantee, since Midjourney, like most AI art generators, is built to offer a functionally infinite variety of image styles and types.

Now, instead of relying on their language, users can select between a variety of styles and obtain a code to apply to all their works going forward, keeping them in the same aesthetic family. Midjourney users can also copy and paste their code elsewhere to save it for later reference, or share it with other Midjourney users in their organization so they can generate images in that same style. This is huge for enterprises, brands, and anyone seeking to work on group creative projects in a unified style. Here’s how it works:

Where to find Midjourney’s style tuner

Going into the Midjourney Discord server, the user can simply type “/tune” followed by their prompt to begin the process of tuning their styles.

For example, let’s say I want to update the background imagery of my product or service website for the winter to include more snowy scenes and cozy spaces.

I can type in a single prompt idea I have — “a robot wears a cozy sweater and sits in front of a fireplace drinking hot chocolate out of a mug” — after the “/tune,” like this: “/tune a robot wears a cozy sweater and sits in front of a fireplace drinking hot chocolate out of a mug.”

Midjourney’s Discord bot responds with a large automatic message explaining the style-tuning process at a high level and asking if the user wants to continue. The process requires a paid Midjourney subscription plan (they start at $10 per month paid monthly or $96 per year up-front) and uses up some of the fast GPU hours that come with each plan (the amount varies by plan tier, with more expensive plans granting more fast hours). Fast hours generate images more rapidly than the “relaxed” mode.

Selecting style directions and mode and what they mean

This message includes two drop-down menus allowing the user to select different options: the number of “style directions” (16, 32, 64, or 128) and the “mode” (default or raw).

The “style directions” setting indicates how many different images Midjourney will generate from the user’s prompt, each one showing a distinctly different style. The user can then pick a single style from these images, or combine several of them to create a new meta-style.

Importantly, each style direction option costs a different amount of fast GPU hours. For instance, 16 style directions use up 0.15 fast hours, while 128 style directions use up 1.2 fast hours. So the user should think carefully about how many different styles they want to generate and whether they want to spend that much of their allotment.
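To make the trade-off concrete, here is a minimal Python sketch of the cost arithmetic. The article only reports the figures for 16 and 128 style directions. Both work out to the same per-direction rate, so the sketch assumes the cost scales linearly. The values it produces for 32 and 64 are therefore estimates, not numbers from Midjourney, and the function and constant names are hypothetical.

```python
# Fast-hour costs the article reports for two of the four options.
REPORTED_FAST_HOURS = {16: 0.15, 128: 1.2}

# The four choices offered in the "style directions" drop-down menu.
STYLE_DIRECTION_OPTIONS = (16, 32, 64, 128)

# Assumption: cost scales linearly with the number of style directions.
# The two reported figures share this per-direction rate, but the
# in-between values are estimates rather than published prices.
RATE_PER_DIRECTION = REPORTED_FAST_HOURS[16] / 16


def estimated_fast_hours(directions: int) -> float:
    """Estimate the fast hours a /tune job uses for a style direction count."""
    if directions not in STYLE_DIRECTION_OPTIONS:
        raise ValueError(
            f"style directions must be one of {STYLE_DIRECTION_OPTIONS}"
        )
    return round(directions * RATE_PER_DIRECTION, 4)


if __name__ == "__main__":
    for option in STYLE_DIRECTION_OPTIONS:
        hours = estimated_fast_hours(option)
        print(f"{option:>3} style directions: ~{hours} fast hours")
```

Treat the output as a rough budgeting aid only. The confirmation message from the Midjourney bot shows the actual cost before the job runs.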

Meanwhile, the “mode” setting is binary, allowing the user to choose between default or raw, referencing how candid and grainy the photos will appear. Raw images are meant to look more like a film or DSLR camera and as such, may be more photorealistic, but also contain artifacts that the default, sanitized and smooth mode does not.

In our walkthrough for this article, VentureBeat selected 16 style directions and default mode. In our tests, and those reported by several users online, Midjourney was erroneously giving users one level more of style directions than they asked for, so in our case we got 32 even though we asked for 16.

After selecting your mode and style directions, the Midjourney bot will ask you if you are sure you want to continue and show you again how many credits you’re using up, and if you press the green button, you can continue. The process can take up to 2 minutes.

Where to find the different styles to choose from

After Midjourney finishes processing your style tuner options, the bot should respond with a message saying “Style Tuner Ready! Your custom style tuner has finished generating. You can now view, share and generate styles here:” followed by a URL to the Midjourney Tuner website (the domain is tuner.midjourney.com).

The resulting URL should contain a random string of letters and numbers at the end. We’ve removed ours for security purposes in the screenshot below.

Clicking the URL takes the user out of the Discord app and onto the Midjourney website in their browser.

There, the user will see a customized yet default message from Midjourney showing the user’s prompt language and explaining how to finish the tuning process. Namely, Midjourney asks the user to select between two different options with labeled buttons: “Compare two styles at a time” or “Pick your favorite from a big grid.”

In the first instance, “compare two styles at a time,” Midjourney displays the style directions generated from the option you selected previously in Discord, arranged in rows of two. In our case, that’s 16 rows. Each row contains two 2×2 image grids, so 8 images per row.

The user can then choose one 2×2 grid from each row, for as many rows as they would like, and Midjourney will make a style informed by the combination of those grids. You can tell which grid is selected by the white outline that appears around it.

So, if I chose the image on the right from the first row, and the image on the left from the bottom row, Midjourney would apply both of those image styles into a combined style and the user could apply that combined style to all images going forward. As Midjourney notes on the bottom of this selection page, selecting more choices from each row results in a more “nuanced and aligned” style while selecting only a few options will result in a “bold style.”

The second option, “Pick your favorite from a big grid,” lets the user choose just one image from a single grid containing one image for each style direction the user set previously. In our case for this article, that’s a total of 32 images arranged in an 8×4 grid. This option is more precise and less ambiguous than the “compare two styles” option, but also more limiting as a result.
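The two selection views differ only in how the style directions are laid out, which a short Python sketch can make explicit. It assumes each style direction appears as one grid of four images in the compare view, which is consistent with the 8 images per row reported above, and as a single image in the big grid. The `TunerLayout` type and `tuner_layout` function are hypothetical names used for illustration.

```python
from dataclasses import dataclass

# Assumption: each style direction is previewed as one grid of four images.
IMAGES_PER_GRID = 4
GRIDS_PER_COMPARE_ROW = 2


@dataclass(frozen=True)
class TunerLayout:
    directions: int              # style directions actually generated
    compare_rows: int            # rows shown in "compare two styles at a time"
    images_per_compare_row: int  # images visible in each of those rows
    big_grid_images: int         # images shown in "pick your favorite"


def tuner_layout(directions: int) -> TunerLayout:
    """Work out how many rows and images each selection view displays."""
    if directions <= 0 or directions % GRIDS_PER_COMPARE_ROW != 0:
        raise ValueError("directions must be a positive even number")
    return TunerLayout(
        directions=directions,
        compare_rows=directions // GRIDS_PER_COMPARE_ROW,
        images_per_compare_row=GRIDS_PER_COMPARE_ROW * IMAGES_PER_GRID,
        big_grid_images=directions,
    )


if __name__ == "__main__":
    # The walkthrough ended up with 32 style directions.
    print(tuner_layout(32))
```

For the 32 style directions in this walkthrough, the sketch gives 16 compare rows of 8 images each and 32 images in the big grid, matching the counts described above.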

In our case, for this article, we will select the “compare two styles at a time” option, choose 5 grids in total and leave it to the algorithms to decide what the combined style looks like.

Applying your freshly tuned style going forward to new images and prompts

Whatever number of rows or images a user selects to base their style on, Midjourney will automatically apply that style and turn it into a shortcode of numerals and letters that the user can manually copy and paste for all prompts going forward. That shortcode appears in several places at the bottom of the user’s unique Style Tuner page, both in a section marked “Your code is:” followed by the code, and then also in a sample prompt based on the original the user provided at the very bottom in a persistent overlay chyron element.

The user can then either copy this code and save it somewhere, or copy their entire original prompt with the code added from the bottom chyron. You can also redo this whole style by pressing the small “refresh” icon at the bottom (circular arrows).

Then, the user will need to return to the Midjourney Discord server and paste the code in after their prompt as follows: “/imagine a robot wears a cozy sweater and sits in front of a fireplace drinking hot chocolate out of a mug --style [INSERT STYLE CODE HERE]”

Here’s our resulting 2×2 grid of images using the original prompt and our freshly generated style:

We like the fourth one best, so we will select that one to upscale by clicking “U4” and voila, there is our resulting cozy robot drinking hot chocolate by the fireplace!

Now let’s apply the same style to a new prompt by copying and pasting/manually adding the “--style” language to the end of our new prompt, like so: “a robot family opens presents --style [INSERT STYLE CODE HERE]” Here’s the result (after choosing one from our 2×2 grid):

Not bad! Note this is after a few regenerations going back and forth. The style code also works alongside other parameters in your prompt, including aspect ratio/dimensions. Here’s a 16:9 version using the same prompt but written like so: “a robot family opens presents --ar 16:9 --style [INSERT STYLE CODE HERE]”

Cute but a little wonky. We might suggest continuing to refine this one.
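For teams that keep a shared style code in a document or script, the prompt patterns above can be assembled with a small helper. This is a minimal Python sketch of plain string formatting, not a Midjourney API. The function name is hypothetical, and the style code in the usage example is a made-up placeholder rather than a real code. The resulting text still has to be pasted after the /imagine command in Discord by hand.

```python
from typing import Optional


def build_styled_prompt(
    description: str,
    style_code: str,
    aspect_ratio: Optional[str] = None,
) -> str:
    """Return prompt text with the optional --ar and the --style parameter.

    Parameters are appended after the description, in the same order the
    article uses: aspect ratio first, then the style code.
    """
    description = description.strip()
    style_code = style_code.strip()
    if not description:
        raise ValueError("description must not be empty")
    if not style_code:
        raise ValueError("style_code must not be empty")

    parts = [description]
    if aspect_ratio:
        parts.append(f"--ar {aspect_ratio}")
    parts.append(f"--style {style_code}")
    return " ".join(parts)


if __name__ == "__main__":
    # "EXAMPLECODE" is a placeholder, not a real style code.
    shared_code = "EXAMPLECODE"
    print(build_styled_prompt("a robot family opens presents", shared_code))
    print(build_styled_prompt("a robot family opens presents", shared_code, "16:9"))
```

Keeping the style code in one variable means everyone on a project applies exactly the same string, which is the point of sharing a tuned style in the first place.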
