Bill discovering the TV show Gravity Falls
It's the 12th time they've watched Dreamscaperers over and over again. It won't be the last, either.
I apologize if the Axolotl looks weird. It's my first time drawing him and I couldn't quite figure out what his proportions are :(
A friendly wizard and style reference.
Midjourney has just released both version 6 of its niji anime engine and the first version of its "style reference" tool.
Functionally this is a variation of the image prompting system (explained here), which breaks a submitted image down into the 'token language' the AI uses internally and uses that as a supplement to a text prompt. "Style Reference" (or 'sref') lets you do this with up to three images, but with only the tokens associated with 'style' being drawn upon.
This is not to be confused with style transfer, a much older and very different AI art process.
But what is a style in this context? And how does it affect generation?
Prompt: a blue axolotl-anthro wizard in a red-and-yellow swirl-pattern robe, holding a sheleighleigh made of purple wood and a potion full of glowing green energy drink. A blue-and-green ladybug familiar stands near his feet, white background, fullbody image
Settings: --niji 6, --style raw --s 50 --seed 1762468963
Here, I've tested the same seed and prompt with a number of reference images.
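For anyone who wants to repeat this kind of comparison, here's a rough sketch of how the prompt strings could be put together. The `build_prompt` helper and the example.com URLs are my own placeholders, not anything Midjourney provides. It only builds the text you'd paste into Midjourney, reusing the prompt and settings above and adding one `--sref` image per run.

```python
# Hypothetical helper: builds prompt strings only, it does not talk to Midjourney.
PROMPT = (
    "a blue axolotl-anthro wizard in a red-and-yellow swirl-pattern robe, "
    "holding a sheleighleigh made of purple wood and a potion full of glowing "
    "green energy drink. A blue-and-green ladybug familiar stands near his feet, "
    "white background, fullbody image"
)

def build_prompt(text, sref_urls=(), niji=6, style="raw", stylize=50, seed=None):
    """Join a text prompt with the settings used in this post."""
    parts = [text, f"--niji {niji}", f"--style {style}", f"--s {stylize}"]
    if seed is not None:
        parts.append(f"--seed {seed}")
    if sref_urls:
        # One or more reference image URLs, separated by spaces.
        parts.append("--sref " + " ".join(sref_urls))
    return " ".join(parts)

# Placeholder URLs standing in for the actual reference images.
references = [
    "https://example.com/woodcut.png",
    "https://example.com/okapi-fakemon.png",
]

# Same prompt and seed, one style reference per run.
for url in references:
    print(build_prompt(PROMPT, sref_urls=[url], seed=1762468963))
```

Leaving out `seed` gives the seedless version of the same test.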
My semiorganized ramblings under the fold
The first thing I note is that style reference affects the gen so much that same-seed/different-style-ref comparisons are kind of pointless. Too much of the pose, composition and content changes for the seed to matter, so for future style ref tests, I'm probably going to drop the seeds.
The second thing I note is that there are certain limitations. You need to change up your prompt for things like photography, and the system interprets styles using its own criteria, not ours. If image prompting misinterprets something, so will style ref, but perhaps not in the same way.
This is notable for the one prompted with a scan from the Nuremberg Chronicle (first row). It recognizes that it's a woodcut and emulates that general vibe nicely, but MJ is highly tuned for aesthetics, and emulating real world jank and clumsiness is a weak area. This is one of the earliest printed books with illustrations (in Europe, at least). Later examples build on that skillset, so the dataset for woodcuts is going to be largely of a higher apparent quality.
In short, with Midjourney, additional prompt work is needed to replicate the look of early jank or intentionally 'ugly' art styles, and even as recently as v6 I've had no luck with things like midcentury Hanna-Barbera-esque cheap TV animation styles or shitty 1990s CGI.
Style reference can help: I've gotten some pretty good cheap 80s-90s TV animation looking stuff from v6 niji and style ref in my early tests:
Color observations: Absent specific requests in the prompt, SREF will stick pretty close to the palette and lighting conditions of the referenced image. With such instructions, you get blending, so the one referencing the okapi fakemon (second row from bottom), for instance, has a lot of colors the reference image doesn't have, but they're similar in vibrancy and saturation.
One limitation, however, is it doesn't apply to the aspects of the gen that come from any image prompts, so it will always blend the style of the style reference with the style aspects inherited from the image prompt, and that is very strong compared to the style ref.
Using the dog as the image prompt, the TFTM reformatting as the style reference, and the text prompt "a cute older yorkie dog sitting on a bedspread", we get the image on the left. Dropping the image prompt weight to .25 gets us the center option, and removing the image prompt entirely produces the one on the right.
I expect this will be patched eventually, or general image prompting may fall out of favor compared to a combination of style ref and the upcoming character reference option, which will be the same thing, but will only reference the tokens associated with the character in the reference image. Depending on how it works, that could have a lot of uses.
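As a rough sketch, the three variants could be written out like this. The `yorkie_variants` function and the example.com URLs are placeholders of mine, and I'm assuming the usual layout of image prompt URL first, then the text, then parameters, with `--iw` for image prompt weight. Other settings are left out.

```python
# Hypothetical helper: builds the three prompt strings for the comparison.
def yorkie_variants(image_url, sref_url, text):
    base = f"{text} --sref {sref_url}"
    return [
        f"{image_url} {base}",            # left: image prompt at default weight
        f"{image_url} {base} --iw 0.25",  # center: image prompt weight lowered
        base,                             # right: no image prompt at all
    ]

variants = yorkie_variants(
    "https://example.com/dog.png",         # placeholder for the dog photo
    "https://example.com/tftm-style.png",  # placeholder for the style reference
    "a cute older yorkie dog sitting on a bedspread",
)
for v in variants:
    print(v)
```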
Stay tuned for more experiments. There's some good potential for freaky, unexplored aesthetics with combinations of multiple style refs and text prompts.