FLUX 2 Klein 4B Base with LoRA support
Black Forest Labs’ FLUX 2 Klein 4B Base is a text-to-image model designed specifically for fine-tuning and LoRA training. This version on Replicate supports fast LoRA inference, which means you can run the base model with custom LoRA adaptations trained on your own images or styles.
What makes this different
Most image models you run are already trained and optimized for speed. This base model is different. It’s the undistilled foundation model that keeps the complete training signal, which makes it ideal when you want maximum flexibility and control. Think of it as the raw material that’s ready to be shaped into exactly what you need.
The “klein” name comes from the German word for “small.” At 4 billion parameters, it’s significantly smaller than most high-quality image models, but it still generates surprisingly good images. The real value is that it’s open source under Apache 2.0.
When to use this model
This model is for you if you’re training custom LoRA adaptations. LoRA (Low-Rank Adaptation) is a technique that lets you fine-tune a model on specific styles, subjects, or concepts without retraining the entire model. It’s how you teach a model to generate images in your specific visual style, maintain consistent characters across images, or capture the look of your brand.
You’d use this base model when you want to run LoRA weights you’ve trained yourself. Since it’s undistilled, it preserves the full training signal, which means your LoRA adaptations tend to work better and produce more diverse outputs than they would on a distilled model.
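The “low-rank” part is what keeps LoRA training cheap: instead of learning a full update to each weight matrix, you learn two thin factors whose product can never exceed rank r. A minimal numeric sketch (the dimensions here are illustrative, not the model’s actual layer sizes):

```python
import numpy as np

d, r = 3072, 16          # illustrative hidden size and LoRA rank
full_update = d * d      # parameters in a full weight-matrix delta
lora_update = 2 * d * r  # factor A is (r, d), factor B is (d, r)
ratio = lora_update / full_update
print(f"LoRA trains {ratio:.1%} of a full fine-tune's per-layer parameters")

# whatever values A and B learn, their product can never exceed rank r
rng = np.random.default_rng(0)
A = rng.normal(size=(4, 64))   # small matrices just for the rank check
B = rng.normal(size=(64, 4))
assert np.linalg.matrix_rank(B @ A) <= 4
```

With these numbers the adapter is about 1% of the size of a full per-layer update, which is why LoRA files are megabytes rather than gigabytes.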
Some practical uses:

- Training LoRA weights on your own artistic style and running them here for fast generation
- Creating consistent characters or products across multiple images by applying custom LoRA adaptations
- Building specialized image generators for specific domains like architectural visualization, product photography, or illustration styles
- Research and experimentation with how different LoRA training approaches affect output quality
How it works
FLUX 2 Klein uses a rectified flow transformer architecture. Without getting too technical, this means it learns direct paths between random noise and clean images, rather than the traditional approach of gradually removing noise over many steps. This makes it efficient while maintaining quality.
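The straight-path idea can be sketched in a few lines. Here `x1` stands in for a clean image latent and `x0` for pure noise; the model is trained to predict the constant velocity `x1 - x0` along the line between them. This is a toy illustration of the training objective, not the actual FLUX code:

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(size=4)        # stand-in for a clean image latent
x0 = rng.normal(size=4)        # random noise
t, dt = 0.3, 0.1

x_t = (1 - t) * x0 + t * x1    # rectified flow: straight-line interpolation
v_target = x1 - x0             # velocity is constant along that line

# one Euler step with the true velocity lands exactly back on the line
x_next = x_t + dt * v_target
assert np.allclose(x_next, (1 - (t + dt)) * x0 + (t + dt) * x1)
```

Because the true path is a straight line, a model that predicts the velocity well can take large integration steps, which is what makes the approach efficient compared to many-step denoising.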
The model combines a vision-language understanding component (based on Qwen 3) with a rectified flow transformer. The vision-language part brings in knowledge about the world and semantic understanding, while the transformer handles spatial composition, materials, and visual structure. A specialized autoencoder handles the conversion between pixel space and the compressed latent space where generation happens.
For LoRA inference, the model loads your custom LoRA weights on top of this base model, modifying specific parts of the network to produce images that match your training data while keeping the base model’s general capabilities intact.
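Conceptually, applying LoRA weights is just adding a scaled low-rank delta to the frozen base weights of the targeted layers. A sketch with made-up dimensions (real loaders apply this per layer and typically fold in an `alpha / rank` scaling factor):

```python
import numpy as np

rng = np.random.default_rng(1)
d, r = 64, 4                         # illustrative layer width and LoRA rank
W_base = rng.normal(size=(d, d))     # frozen base-model weight
A = rng.normal(size=(r, d)) * 0.01   # trained down-projection
B = rng.normal(size=(d, r)) * 0.01   # trained up-projection

scale = 1.0
W_merged = W_base + scale * (B @ A)  # weight actually used at inference

# turning the adapter off (scale = 0) recovers the base model exactly,
# and the delta itself can never exceed rank r
assert np.allclose(W_base + 0.0 * (B @ A), W_base)
assert np.linalg.matrix_rank(B @ A) <= r
```

The low rank of the delta is why the base model’s general capabilities survive: the adapter can only nudge the network along a small number of directions rather than rewrite every weight.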
Technical details
Since this is the base model, it gives you higher output diversity compared to distilled models. The distilled variants are optimized for speed and work great for most use cases, but the base model is better when you’re doing research, custom training, or need that extra control over the generation process.
What it’s good at
Text-to-image generation with detailed prompts. The model handles complex multi-part instructions well and understands compositional rules better than many models its size. You can reference colors using hex codes, specify exact arrangements, and describe intricate scenes with multiple elements.
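For instance, a compositional prompt in this style might look like the following. This is a hypothetical prompt written for illustration, not one taken from the model card:

```
Flat-lay product photo of a ceramic mug in #2E5D4B green, centered on a
white oak table, a linen napkin folded to its left, soft morning side
light, shallow depth of field
```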
It also supports multi-reference image conditioning, which means you can provide multiple reference images to maintain consistent subjects, styles, or compositions across generations. This is particularly useful when you’re using LoRA weights trained on specific characters or visual styles.
The model has been trained with safety measures to reduce harmful content generation. It went through multiple rounds of safety fine-tuning and evaluation before release, and Black Forest Labs conducted third-party testing specifically focused on preventing abuse.
Limitations
This model isn’t designed to provide factual information. If it generates text in images, that text might be inaccurate or distorted. It’s an image generation model, not a knowledge base.
Like any statistical model trained on internet data, it may reflect biases present in the training data. The model is released under Apache 2.0, which means you can use it commercially, but you’re responsible for how you use it.
Being the base model rather than a distilled variant means it’s slower than the distilled FLUX 2 Klein 4B. If you need sub-second generation and aren’t running custom LoRA weights, you might want the distilled version instead. This base model is specifically for cases where you need the full training signal for fine-tuning or LoRA work.
About the license
This model is fully open source under the Apache 2.0 license. You can use it for commercial projects, modify it, or build on top of it without needing special licensing from Black Forest Labs. Your outputs are yours to use however you want.
Resources
For the official model weights and documentation, see Black Forest Labs’ Hugging Face repository at black-forest-labs/FLUX.2-klein-base-4B. For detailed API documentation, check out Replicate’s API docs.
You can try the model on the Replicate Playground at replicate.com/playground.