Stable Diffusion for 2D and 3D artists

Upscaling the result from 512×512 to 2048×2048. This can be considered upscale stage one. For upscaling such small pictures, Stable Diffusion does a much better job than the Topaz product.

The second upscale stage creates an 8k picture from the 2k one, but it is not mandatory. Both solutions (SD and Gigapixel) have their own strengths and weaknesses. Luckily for us, they turn out to be great partners rather than competitors in the upscale fight.
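The two stages are plain arithmetic on the edge length. Here is a minimal sketch, assuming a 4x factor per stage (this matches 512 to 2048, and gives 8192 for the "8k" stage). The function name is made up for illustration.

```python
def upscale_stages(start_px, factor, stages):
    """Return the edge length after each upscale stage, starting size first."""
    sizes = [start_px]
    for _ in range(stages):
        sizes.append(sizes[-1] * factor)
    return sizes


# Stage one: 512 -> 2048; stage two: 2048 -> 8192 ("8k").
# The 4x factor per stage is an assumption, not a setting taken from the tools.
sizes = upscale_stages(512, 4, 2)
print(sizes)

# The pixel count grows with the square of the factor (16x per stage here),
# which is why the second stage is so much heavier than the first.
pixel_ratio = [s * s // (512 * 512) for s in sizes]
print(pixel_ratio)
```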

Pay attention to the grass. The Gigapixel sharpening slider tends to interpret lines in weird ways, even at values of 1-5 (out of 100). On the other hand, SD does not cope well with removing blur and adds a lot of noise.

In this picture, take a look at the background elements that should be slightly blurred.

The current flow map of my upscaling notes. As you can see, the SD branch splits in two: I can reveal that it is about the extension versus the “extra tab” modules. For anybody willing to run their own tests, I strongly suggest making a similar flowchart. The best option is the Ultimate SD extension.

Example of an organic upscale with ESRGAN x4 via the extension, the one I consider the best of all tested solutions (which include countless models from the upscale wikipedia).

Sadly, ESRGAN does not handle lips that well. One possible solution would be to use CodeFormer as a middle step, or simple masking.

Once again, an example of quick concept art for a photogrammetry piece. Stable Diffusion allows great customisation and adds something extra each time. For example, instead of black surroundings, we got tree leaves around the model.

A slightly more complicated example of concept creation with the Pix2Pix module. This one requires two stages, but the workflow was still basically "make everything covered in snow". We have reached the magic wand stage for 2D graphics.

The key to success with Pix2Pix is using a mask. Thanks to it, the module and Stable Diffusion ignore our model and work around it. Another thing is giving Stable Diffusion a very rough “something” as input information. As long as it is green and has shade variation, it's fine.
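The effect of the mask can be pictured as a per-pixel composite: where the mask covers the model, the original pixel is kept, and everywhere else the generated pixel is used. This is only a conceptual sketch in plain Python on grayscale values, not what the Pix2Pix module does internally. `composite_with_mask` is a hypothetical helper and the tiny 2×2 "images" are made-up data.

```python
def composite_with_mask(original, generated, mask):
    """Keep original pixels where mask is 1.0 (protected), generated where 0.0.

    All three arguments are equally sized 2D lists of floats in [0, 1].
    Fractional mask values mix the two images linearly.
    """
    result = []
    for o_row, g_row, m_row in zip(original, generated, mask):
        result.append([m * o + (1.0 - m) * g
                       for o, g, m in zip(o_row, g_row, m_row)])
    return result


# Made-up data: the model sits in the right column of the render.
original = [[0.2, 0.8], [0.2, 0.8]]
generated = [[1.0, 1.0], [1.0, 1.0]]   # "everything covered in snow"
mask = [[0.0, 1.0], [0.0, 1.0]]        # 1.0 marks the model to protect

composited = composite_with_mask(original, generated, mask)
print(composited)
```

The left column takes the snowy output while the right column keeps the model untouched, which is the "work around it" behaviour described above.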

The 4th picture shows what we can expect from Pix2Pix if we do not use a mask. It is much better to first work on the main model and get a good winter "conversion", and then add fake snow with a brush as a suggestion to Pix2Pix of what we want to see.

Example output from a custom LoRA sub-model. Not the best, I must admit, but apparently only because the training data wasn't the best. My suggestion for everybody: backgrounds and angles are the key to success. Variations, situations and examples are the best food for a LoRA.

Example of my training dataset. Various angles and light scenarios. The huge mistake was not adding backgrounds and zoomed out/in examples. Because of that, the LoRA learned the concept in that single position and background, which limits the flexibility of the sub-model.

The internet is still not sure whether we should use regularization pictures or not. One thing is sure: they tell the learning process that we want something around that shape (a deformed pumpkin is better than a picture of a cat when we try to create a LoRA).

Other example outputs from the custom LoRA. As you can see, the background may change, but the position or rotation of the pumpkin not so much. Light still works.

Result of ControlNet guidance as a concept tool. This one requires the largest number of steps and the most Stable Diffusion knowledge. But do not worry, img2img and ControlNet let you make a lot of mistakes and still get decent results.

Settings for the first ControlNet stage of concept creation. We can use many modules, but for this one I decided to use "scribbles". Everything else was left at default.

Results of generation with scribbles. You can even see the scribbles themselves as the last picture (2D master level!). The method doesn't matter, the result does, and we got it right away (the 4th generation caught my eye).

Stage 2 of CN. This time I wanted some variations of the scene, but strongly based on the existing input. For that, we can use Depth, HED or Canny. I use Canny most of the time, tweaking the sliders from time to time, as they are responsible for the amount of detail.
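Why the Canny sliders change the amount of detail can be shown with a toy example. The sketch below assumes the two sliders are the low and high thresholds of Canny's double-threshold (hysteresis) step. It works on a single made-up row of gradient magnitudes rather than a real image, and `hysteresis_1d` is a hypothetical helper, not part of ControlNet.

```python
def hysteresis_1d(gradient, low, high):
    """Classify gradient magnitudes the way Canny's double threshold does.

    A sample is an edge if it is >= high, or if it is >= low and connected
    (through neighbouring samples that are also >= low) to a sample >= high.
    """
    n = len(gradient)
    edge = [g >= high for g in gradient]
    changed = True
    while changed:
        changed = False
        for i in range(n):
            if edge[i] or gradient[i] < low:
                continue
            left = i > 0 and edge[i - 1]
            right = i < n - 1 and edge[i + 1]
            if left or right:
                edge[i] = True
                changed = True
    return edge


# Made-up gradient magnitudes along one image row (0-255 scale).
gradient = [10, 60, 120, 220, 130, 40, 90, 20]

# Lowering both thresholds keeps more samples as edges, i.e. more detail.
edge_counts = []
for low, high in [(100, 200), (50, 150), (30, 80)]:
    edge_counts.append(sum(hysteresis_1d(gradient, low, high)))
print(edge_counts)
```

With higher thresholds only the strongest contour survives. With lower ones, weaker lines start to guide the generation as well.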

The third stage uses a render engine (Maverick in my case) to match the perspective and light on our model. Stable Diffusion can emulate the play of light, but not that well.

Almost done. Now we need to blend our model into the scene (blurred edges, depth of field etc.). You can use inpaint or a custom mask.
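Blurred edges come down to a mask that fades out instead of ending abruptly. Below is a minimal sketch of such a feathered mask on a single pixel row, assuming a simple linear falloff. Both helper names and the numbers are made up for illustration, and real tools usually apply a Gaussian blur to the mask instead.

```python
def feathered_alpha(width, start, end, feather):
    """Alpha for one row: 1.0 inside [start, end), fading linearly to 0.0
    over `feather` pixels on each side."""
    alpha = []
    for x in range(width):
        if start <= x < end:
            alpha.append(1.0)
        else:
            # Distance in pixels to the nearest fully opaque pixel.
            dist = start - x if x < start else x - (end - 1)
            alpha.append(max(0.0, 1.0 - dist / feather))
    return alpha


def blend_row(model_row, scene_row, alpha):
    """Mix the rendered model into the scene using the per-pixel alpha."""
    return [a * m + (1.0 - a) * s
            for m, s, a in zip(model_row, scene_row, alpha)]


alpha = feathered_alpha(width=8, start=3, end=5, feather=2)
print(alpha)

# White model (1.0) over a black scene (0.0): edge pixels end up half mixed.
blended = blend_row([1.0] * 8, [0.0] * 8, alpha)
print(blended)
```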

The last stage is about upscaling and again blending our 2k rendered pumpkin with the SD output (except the edges, which remain blurred as they should). I decided to add a mug and a basket of apples in the background. Notice how nicely the light and blur came out!

Here is a summary of the past months of constant playing with Stable Diffusion. I'm not sure if there was or will be “software” that improves, changes and expands so fast.
I decided to present 4 potential uses other than the pure anime/porn image generation that dominates the Stable Diffusion world.
LoRA, Pix2Pix and ControlNet are very optional, considering that everything can be done faster or better in 3D, but upscaling is something worth exploring.

In the following days I plan to write in-depth blog posts where I will explain each of the above with more images and information.
