M3
This update is big in some ways, but I will keep this post short.
In essence, I have overhauled the design thinking from M1 and M2 by replacing the hybrid structure with a purely horizontal one. Navigation across the entire canvas is now (excitingly) keyboard-based.
I have also done away with Ultimate Upscaler and focused on an upscaling solution using TTPlanet’s tiling approach.
The JSON link is at the bottom of the page.
Overview
The UI is now laid out horizontally, versus the hybrid approach of past workflows. The canvas is now so large that I can’t fit the upscaling groups into the frame on a 24” screen.
Bookmarks
Everyone knows that panning and dragging become a real pain in the wrists the moment workflows grow complex and sprawling. Many choose to compact everything into a tight spot by hiding the connections and secondary nodes behind the main ones. I used to stack the groups vertically as another solution. I believe neither is as effective as pinning each group to its own bookmark node and accessing it through a simple keyboard shortcut.
In my case, I’ve tagged each group with a number: Model Loaders is 1, PuLID is 2, the prompting area is 6, and the upscalers begin at 7. Hopping from one to the other becomes a matter of keystrokes, not mouse drags and zooms.
Call it what you want, but for me that is a real game changer, and why I speak of it first in this post. 🙌
Highres fix
I used to generate at high resolution (1536px+) right off the bat, as it would help me reach even higher resolutions in subsequent upscales faster and with better quality. In my latest tests, however, I have found that generating at the base 1024px and then “highres fixing” it to 2K is actually faster than shooting straight to 1.5K or 1.8K, by about 10-15 seconds on a 3090.
Prompt adherence isn’t as good as when generating at higher resolutions, but the speed gain means I can batch generate faster and thus iterate more quickly.
As of now, the resolution pipeline works like this:
Base Gen @ 1K
Highres fix @ 2K
Upscale Stage 1 @ 4K
Upscale Stage 2 @ 6K
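The scale factor between consecutive stages follows directly from the list above. Here is a minimal sketch, assuming "1K", "2K", "4K" and "6K" stand for 1024, 2048, 4096 and 6144 px on the long side (the pixel values are my assumption, the post only gives the rounded labels):

```python
# Stage names come from the list above; the pixel values are assumed,
# since the post only gives rounded labels (1K, 2K, 4K, 6K).
STAGES = [
    ("Base Gen", 1024),
    ("Highres fix", 2048),
    ("Upscale Stage 1", 4096),
    ("Upscale Stage 2", 6144),
]


def stage_factors(stages):
    """Return (name, factor) for each stage relative to the one before it."""
    return [
        (name, size / prev_size)
        for (_, prev_size), (name, size) in zip(stages, stages[1:])
    ]


for name, factor in stage_factors(STAGES):
    print(f"{name}: {factor:g}x")
```

Under those assumed sizes, every step stays within a 1.5x to 2x jump rather than a single 4x one.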
Upscaling
Prior M workflows relied on Flux Ultimate Upscale. Before those, I ran SDXL ControlNet tiled upscaling. Both had their pros and cons.
This one feels like a hybrid. It breaks the image down into tiles and runs each one through a vision model (Florence, in this case). It then prompts them individually, combines them and generates the final composite in one go. In my tests, it has consistently outperformed both prior solutions in speed as well as detail quality.
Another upside is the ability to push the denoising further with no visible hallucinations from tile to tile. I leave mine around 0.4-0.6 for the first pass (Upscale A) and 0.2-0.3 for the second (Upscale B), depending on the situation.
Everything is automated here, from tile size, which depends on the upscale resolution, to the prompting and compositing.
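To give a rough idea of what the tiling step involves, here is a generic sketch of cutting an image into overlapping tiles. It is not TTPlanet’s actual code and not what the workflow nodes run: the function name, the 1024px tile size and the 128px overlap are all assumptions for illustration, and the workflow derives its own tile size automatically.

```python
def tile_boxes(width, height, tile=1024, overlap=128):
    """Return (left, top, right, bottom) boxes that cover the whole image.

    Hypothetical helper: tile size and overlap are illustrative values.
    Neighbouring tiles share at least `overlap` pixels so seams can be blended.
    """
    if overlap >= tile:
        raise ValueError("overlap must be smaller than tile")
    step = tile - overlap

    def starts(length):
        # A single tile is enough when the image is smaller than one tile.
        if length <= tile:
            return [0]
        positions = list(range(0, length - tile, step))
        positions.append(length - tile)  # last tile sits flush with the edge
        return positions

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]


boxes = tile_boxes(4096, 4096)
print(len(boxes), boxes[0], boxes[-1])
```

Each box would then be cropped, captioned by the vision model, and fed back as a per-tile prompt before the composite is generated.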
On an RTX 3090, going from 2K to 4K takes 2 minutes, and reaching 6K takes around an additional 5-6 minutes.
A 3rd stage (to reach 12K+) is being tested and will be released in the next version.
On a side note: I have grown accustomed to breaking the upscaling into 1.5x - 2x stages rather than the classic 4x ones because details are better preserved and enhanced through each stage. It is also a faster iterative approach, since you can pause after the first stage and refine it quickly before moving on in the chain.
Notes everywhere
I’m now annotating as much as I can everywhere, which is another reason why I keep the post short. The full info is now embedded directly in the workflow. You’ll find the links to the models, how to circumvent known errors, and suggestions on how to best tune the parameters.
Whenever you see a black node, it means info is there for the viewing.