---
license: mit
base_model: runwayml/stable-diffusion-v1-5
tags:
- stable-diffusion
- diffusion
- distillation
- flow-matching
- geometric-deep-learning
- research
library_name: diffusers
pipeline_tag: text-to-image
---

# Why do I hear boss music?

## 10000 steps

Currently retraining the scale, but it was trained with many raw unscaled latents and it makes the default output hazy.

![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/6GFXrQy6vm8h2mdkK5mvD.png)

Use this to correctly orient the output to the correct VAE scale.

## Shift 2 is the training target

![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/3aUl0td4RiDL9yjMw87KT.png)

Higher or lower may yield different results.

## use this

![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/zXNIFANpK7Yqmm4oPUuUR.png)

a castle at sunset
![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/fOeEzWg-VgA7s8ubmKcnv.png)

a mountain view with a beautiful landscape
![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/Tsk2QSKd6cH0eJ-H_iJ_C.png)

a woman sitting on the bus
![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/UIQ29npfiE1KfFLOJbCZv.png)

a carrot on a cake
![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/hWTxprkxdeu8E_E0iqV8J.png)

a refrigerator to the left of a table
![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/_puDnUG_xuazq6soFqfVj.png)

a mad scientist's laboratory with strange gagets and mechanisms
![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/qgZvxpGSwODJ9dxUxi4iA.png)

steampunk goku
![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/IITrYMTxNm3BApR-txYmW.png)

a man standing on top of a table in the middle of a room full of curtains.
![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/P-vleYAQAhHxvXYLLHBjk.png)

## 5000 steps

![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/QEAkOA49IHvHeLTFvhe-O.png)

![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/LfGEMW5AWdDIf3bFFZsOD.png)

![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/tdwAqMrA6b3zy51G6Wu1k.png)

![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/eaoQ3iY_QIEfhwA5SK0zV.png)

a mad scientists laboratory
![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/xqDeCGbxWMhAfD4QV9w2B.png)

## 4000 steps

Utilizing this synthesized image set here: https://huggingface.co/datasets/AbstractPhil/sd15-latent-distillation-500k

As of typing this, the 500k isn't finished synthesizing. It's at around 200k, which should be more than enough to get a baseline.

At 4000 steps the new flow matching trainer is already manifesting results.

![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/_h52WVv4rgvzk2H08Jpmy.png)

![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/n--fn2cNfsYmi7e3SqmXc.png)

![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/XXT9NEEtYtIUrF52hJFWO.png)

![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/lhXF0_fOUyandv_hUC3xN.png)

Within 4000 steps at batch 16 the pretrained flow matching SD1.5 model is already building convergence.

This model was the sd15-flow-matching-try2 aka Lune variation, and I can say for certain she is most definitely not burned.

The trainer is in the files.
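The trainer in the files is the reference for what was actually run. For anyone who just wants a feel for what a shifted flow matching target looks like, here is a rough sketch. It assumes the timestep shift formula commonly used by SD3-style flow matching schedulers and a plain linear interpolation between latent and noise (t = 0 clean, t = 1 noise). Neither is confirmed to match this trainer, and every function name below is hypothetical.

```python
import random


def shift_timestep(t: float, shift: float = 2.0) -> float:
    """Warp a timestep t in [0, 1].

    Assumed formula (SD3-style): shift > 1 pushes samples toward t = 1,
    i.e. toward the noisier end under the convention used here.
    """
    return shift * t / (1.0 + (shift - 1.0) * t)


def flow_matching_pair(latent, noise, t):
    """Return (x_t, velocity_target) for linear-interpolation flow matching.

    Assumed convention: t = 0 is the clean latent, t = 1 is pure noise,
    so the velocity target is (noise - latent).
    """
    x_t = [(1.0 - t) * x + t * n for x, n in zip(latent, noise)]
    velocity = [n - x for x, n in zip(latent, noise)]
    return x_t, velocity


def sample_training_example(latent, shift: float = 2.0, rng=random):
    """Draw one (t, x_t, velocity_target) triple for a flattened latent."""
    t = shift_timestep(rng.random(), shift)       # uniform draw, then shift
    noise = [rng.gauss(0.0, 1.0) for _ in latent]  # standard normal noise
    x_t, velocity = flow_matching_pair(latent, noise, t)
    return t, x_t, velocity


if __name__ == "__main__":
    t, x_t, v = sample_training_example([0.1, -0.3, 0.7], shift=2.0)
    print(t, x_t, v)
```

A model trained this way would take `x_t` and `t` and regress toward the velocity target, usually with a mean squared error loss. Sampling with a shift other than the training value only changes how the steps are spaced along the trajectory.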