---
license: other
license_name: flux-1-dev-non-commercial-license
library_name: diffusers
pipeline_tag: text-to-image
tags:
- flux
- text-to-image
- image-generation
- fp16
---

# FLUX.1-dev FP16

High-quality text-to-image generation model from Black Forest Labs. This repository contains the FLUX.1-dev model in FP16 precision for high output quality and compatibility with modern GPUs.

## Model Description

FLUX.1-dev is a state-of-the-art text-to-image diffusion model designed for high-fidelity image generation. This FP16 version keeps the weights in 16-bit half precision without further quantization, which suits creative professionals and researchers who prioritize image quality over memory use.

**Key Capabilities**:
- High-resolution text-to-image generation
- Advanced prompt understanding with T5-XXL text encoder
- Superior detail and coherence in generated images
- Wide range of artistic styles and subjects
- Multi-text encoder architecture (CLIP + T5)

## Repository Contents

```
flux-dev-fp16/
├── checkpoints/flux/
│   └── flux1-dev-fp16.safetensors    # 23 GB - Complete model checkpoint
├── clip/
│   └── t5xxl_fp16.safetensors        # 9.2 GB - T5-XXL text encoder
├── clip_vision/
│   └── clip_vision_h.safetensors     # CLIP vision encoder
├── diffusion_models/flux/
│   └── flux1-dev-fp16.safetensors    # 23 GB - Diffusion model
├── text_encoders/
│   ├── clip-vit-large.safetensors    # 1.6 GB - CLIP ViT-Large encoder
│   ├── clip_g.safetensors            # 1.3 GB - CLIP-G encoder
│   ├── clip_l.safetensors            # 235 MB - CLIP-L encoder
│   └── t5xxl_fp16.safetensors        # 9.2 GB - T5-XXL encoder
└── vae/flux/
    └── flux-vae-bf16.safetensors     # 160 MB - VAE decoder (BF16)

Total Size: ~72 GB
```

## Hardware Requirements

### Minimum Requirements
- **VRAM**: 24 GB (RTX 3090, RTX 4090, A5000)
- **RAM**: 32 GB system memory
- **Disk Space**: 80 GB free space
- **GPU**: NVIDIA GPU with Compute Capability 7.0+ (Volta or newer)

### Recommended Requirements
- **VRAM**: 32+ GB (RTX 6000 Ada, A6000, H100)
- **RAM**: 64 GB system memory
- **Disk Space**: 100+ GB for workspace and outputs
- **GPU**:
  NVIDIA RTX 4090 or professional GPUs

### Performance Notes
- FP16 gives the best quality of the two variants but has the highest VRAM usage
- Consider the FP8 version if VRAM is limited (see `flux-dev-fp8` directory)
- Generation time: ~30-60 seconds per image at 1024x1024 (depending on GPU)

## Usage Examples

### Using with Diffusers Library

```python
import torch
from diffusers import FluxPipeline

# Load the pipeline with local model files.
# from_pretrained() expects a Diffusers-format folder
# (model_index.json plus one subfolder per component).
pipe = FluxPipeline.from_pretrained(
    "E:/huggingface/flux-dev-fp16",
    torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

# Generate an image
prompt = "A majestic lion standing on a cliff at sunset, cinematic lighting, photorealistic"
image = pipe(
    prompt=prompt,
    num_inference_steps=50,
    guidance_scale=7.5,
    height=1024,
    width=1024
).images[0]

image.save("output.png")
```

### Using with ComfyUI

1. Copy model files to ComfyUI directories:
   - `checkpoints/flux/flux1-dev-fp16.safetensors` → `ComfyUI/models/checkpoints/`
   - `text_encoders/*.safetensors` → `ComfyUI/models/clip/`
   - `vae/flux/flux-vae-bf16.safetensors` → `ComfyUI/models/vae/`

2.
   In ComfyUI:
   - Load Checkpoint: Select `flux1-dev-fp16`
   - Text Encoder: Automatically loaded
   - VAE: Select `flux-vae-bf16`

### Using Individual Components

```python
import torch
from diffusers import AutoencoderKL
from transformers import T5EncoderModel, CLIPTextModel

# from_pretrained() needs a folder containing config.json and has no `filename`
# argument, so it cannot read the bare .safetensors files in this repository.
# As an assumption, the components are loaded here from the Diffusers-format
# upstream repository instead (gated on the Hub: accept the license and log in first).
repo = "black-forest-labs/FLUX.1-dev"

# Load text encoders
t5_encoder = T5EncoderModel.from_pretrained(
    repo, subfolder="text_encoder_2", torch_dtype=torch.float16
)
clip_encoder = CLIPTextModel.from_pretrained(
    repo, subfolder="text_encoder", torch_dtype=torch.float16
)

# Load VAE
vae = AutoencoderKL.from_pretrained(
    repo, subfolder="vae", torch_dtype=torch.bfloat16
)
```

## Model Specifications

**Architecture**:
- **Type**: Latent Diffusion Transformer
- **Parameters**: ~12B (diffusion model)
- **Text Encoders**:
  - T5-XXL: 4.7B parameters (FP16)
  - CLIP-G: 1.3 GB checkpoint (roughly 650M parameters at 2 bytes each)
  - CLIP-L: 235 MB checkpoint (roughly 120M parameters at 2 bytes each)
- **VAE**: BF16 precision (160 MB checkpoint, roughly 80M parameters)

**Precision**:
- **Diffusion Model**: FP16 (float16)
- **Text Encoders**: FP16 (float16)
- **VAE**: BF16 (bfloat16)

**Format**:
- `.safetensors` - Secure tensor format with fast loading

**Resolution Support**:
- Native: 1024x1024
- Range: 512x512 to 2048x2048
- Aspect ratios: Supports non-square resolutions

## Performance Tips

### Memory Optimization

```python
# Compute attention in slices to lower peak VRAM
pipe.enable_attention_slicing()

# Enable VAE tiling for high resolutions
pipe.vae.enable_tiling()

# Use CPU offloading if VRAM limited (slower).
# Call this instead of pipe.to("cuda"), not in addition to it.
pipe.enable_sequential_cpu_offload()
```

### Speed Optimization

```python
# Use torch.compile for faster inference (PyTorch 2.0+).
# FLUX's denoiser is a transformer, exposed as pipe.transformer.
pipe.transformer = torch.compile(pipe.transformer, mode="reduce-overhead", fullgraph=True)

# Reduce inference steps (trade quality for speed)
image = pipe(prompt, num_inference_steps=25).images[0]  # The usage example above uses 50
```

### Quality Optimization
- Use 50-75 inference steps for best quality
- Guidance scale: 7-9 for balanced results
-
  Higher guidance (10-15) for stronger prompt adherence
- Consider prompt engineering for better results

## License

FLUX.1-dev is released by Black Forest Labs under the **FLUX.1 [dev] Non-Commercial License**.

**Usage Terms**:
- ⚠️ Non-commercial use only; commercial use requires a separate license from Black Forest Labs
- ⚠️ Modification and redistribution are subject to the terms of that license
- ⚠️ Requires attribution to Black Forest Labs

See the LICENSE file for full terms.

## Citation

If you use this model in your research or projects, please cite:

```bibtex
@misc{flux-dev,
  title={FLUX.1-dev: High-Quality Text-to-Image Generation},
  author={Black Forest Labs},
  year={2024},
  howpublished={\url{https://blackforestlabs.ai/}}
}
```

## Related Resources

- **Official Website**: https://blackforestlabs.ai/
- **Model Card**: https://huggingface.co/black-forest-labs/FLUX.1-dev
- **Documentation**: https://huggingface.co/docs/diffusers/en/api/pipelines/flux
- **Community**: https://huggingface.co/black-forest-labs

## Version Information

- **Model Version**: FLUX.1-dev
- **Precision**: FP16
- **Release**: 2024
- **README Version**: v1.4

---

For the FP8 precision version (lower VRAM usage), see `E:/huggingface/flux-dev-fp8/`
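
When choosing between the FP16 and FP8 versions, a rough estimate of weight size can help. The sketch below multiplies a parameter count by the bytes stored per parameter. The helper name `weight_size_gb` and the `BYTES_PER_PARAM` table are illustrative, not part of any library, and the parameter counts are the approximate figures from Model Specifications. The result covers weights only; activations and framework overhead come on top.

```python
# Bytes stored per parameter for the precisions mentioned in this README.
BYTES_PER_PARAM = {"fp16": 2, "bf16": 2, "fp8": 1}


def weight_size_gb(num_params: float, precision: str) -> float:
    """Approximate weight size in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    # Approximate parameter counts taken from the Model Specifications section.
    components = {"diffusion model": 12e9, "T5-XXL encoder": 4.7e9}
    for precision in ("fp16", "fp8"):
        for name, params in components.items():
            size = weight_size_gb(params, precision)
            print(f"{name} @ {precision}: ~{size:.1f} GB")
```

By this estimate the ~12B-parameter diffusion model needs about 24 GB for weights in FP16 and about half that in FP8, which is in line with the 23 GB checkpoint listed under Repository Contents.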