black-forest-labs/flux-2-flex

Max-quality image generation and editing with support for ten reference images


Flux 2 Flex

Flux 2 Flex from Black Forest Labs gives you control over the quality-speed trade-off when generating images. Unlike typical text-to-image models that make these decisions for you, Flex lets you adjust how many steps to run and how strongly the model should follow your prompt.

This is the model to reach for when you’re iterating on designs, creating typography-heavy work like infographics or UI mockups, or editing images with multiple reference photos.

What it’s good at

Text rendering is where Flex really shines. The model can reliably generate clean typography, readable captions, and complex layouts—the kind of stuff that usually takes multiple attempts with other models. If you’re making memes, posters, or product mockups with text, this is your model.

The multi-reference feature lets you blend up to ten images into a single output while keeping things coherent. Use this for style transfer, compositional guidance, or when you want to combine elements from different sources.

Practical tips

Start simple with plain text prompts. You don’t need to overthink it:

A neon-lit cyberpunk alley at night, rain-slick streets, reflective puddles

For more control, use a structured approach. Describe what you want in clear sections:

Scene: Modern coffee shop interior with large windows
Subjects: Barista preparing espresso, two customers chatting at a table
Lighting: Warm afternoon sunlight streaming through windows
Style: Photorealistic with shallow depth of field
Camera: Shot at eye level with 35mm lens
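A structured prompt like the one above is just labeled lines joined together, so it's easy to assemble programmatically. A minimal sketch (the section labels mirror the example; they're a convention, not a required schema):

```python
# Sketch: assembling a structured Flux 2 Flex prompt from labeled sections.
# The labels follow the example above; they are a convention, not a schema.

def build_prompt(sections: dict[str, str]) -> str:
    """Join labeled sections into a single prompt string."""
    return "\n".join(f"{label}: {text}" for label, text in sections.items())

prompt = build_prompt({
    "Scene": "Modern coffee shop interior with large windows",
    "Subjects": "Barista preparing espresso, two customers chatting at a table",
    "Lighting": "Warm afternoon sunlight streaming through windows",
    "Style": "Photorealistic with shallow depth of field",
    "Camera": "Shot at eye level with 35mm lens",
})
print(prompt)
```

Because dicts preserve insertion order, the sections come out in the order you wrote them.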

Flex doesn’t support negative prompts. Instead of saying what you don’t want, be specific about what you do want: say “clean background” instead of “no clutter.”

How the parameters work

Guidance scale controls how closely the model sticks to your prompt. Lower values (around 2-3) give the model more creative freedom. Higher values (4-5) make it follow your instructions more literally. Start around 3.5 and adjust based on what you’re seeing.

Number of steps is your quality dial. Fewer steps (6-10) generate images quickly but with less detail—good for rapid prototyping. More steps (20-50) take longer but produce sharper results with better typography. For most work, 20 steps hits a good balance.
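One way to work with these two dials is to define a few presets that match the ranges above. A minimal sketch, assuming the input keys are named `steps` and `guidance` (check the model’s API schema on Replicate for the actual names):

```python
# Sketch: mapping the quality-speed trade-off onto input payloads.
# The keys "steps" and "guidance" are assumed names, not confirmed
# parameters -- verify against the model's API schema on Replicate.

PRESETS = {
    "draft":    {"steps": 8,  "guidance": 3.5},  # fast iteration, less detail
    "balanced": {"steps": 20, "guidance": 3.5},  # good default for most work
    "final":    {"steps": 50, "guidance": 4.0},  # sharpest detail and typography
}

def flex_input(prompt: str, preset: str = "balanced") -> dict:
    """Build an input payload for a Flux 2 Flex run."""
    return {"prompt": prompt, **PRESETS[preset]}

payload = flex_input("A neon-lit cyberpunk alley at night", "draft")
```

Iterating with `draft` and rendering the keeper with `final` is usually cheaper than tuning both knobs per run.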

Multi-reference images let you upload up to ten reference photos (14 MB total). The model will extract style, composition, and other visual elements to inform the generation. This is useful for maintaining consistent characters across outputs or matching a specific aesthetic.
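Since requests that break the limits just fail, it's worth checking them client-side first. A small sketch of the two documented limits (ten images, 14 MB combined); the file paths are hypothetical:

```python
# Sketch: client-side checks for the multi-reference limits described above
# (at most ten images, 14 MB combined). File paths are hypothetical.

import os

MAX_REFS = 10
MAX_TOTAL_BYTES = 14 * 1024 * 1024  # 14 MB combined limit

def check_references(paths: list[str]) -> None:
    """Raise ValueError if the reference set exceeds the documented limits."""
    if len(paths) > MAX_REFS:
        raise ValueError(f"Flex accepts at most {MAX_REFS} reference images")
    total = sum(os.path.getsize(p) for p in paths)
    if total > MAX_TOTAL_BYTES:
        raise ValueError("Combined reference images exceed 14 MB")
```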

Image editing

Flex handles editing at resolutions up to 4 megapixels. Upload a base image and describe what you want to change or add. The model will make modifications while keeping the rest of the image coherent.

For style transfer or compositional changes, provide reference images alongside your edit instructions. The model uses these to understand the direction you’re going.
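Putting the pieces together, an edit request combines a base image, an instruction, and optional references. A sketch of the payload; the key names `image` and `image_input` are assumptions based on common Replicate schemas, so check the model’s API page before relying on them:

```python
# Sketch: assembling an edit request for Flux 2 Flex. The key names
# ("image", "image_input") are assumed, not confirmed -- check the
# model's API schema. Values can be URLs or open file handles when
# used with Replicate's Python client.

def build_edit_input(instruction: str, base_image, refs=()) -> dict:
    """Combine a base image, an edit instruction, and optional references."""
    inputs = {
        "prompt": instruction,   # describe what to change or add
        "image": base_image,     # the image being edited
    }
    if refs:
        # references steer style or composition during the edit
        inputs["image_input"] = list(refs)
    return inputs

# Hypothetical usage with Replicate's Python client (needs an API token):
#   import replicate
#   out = replicate.run(
#       "black-forest-labs/flux-2-flex",
#       input=build_edit_input("make the sky overcast",
#                              open("room.png", "rb")),
#   )
```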

What to expect

The default output is 1024 × 1024 pixels in PNG format. The model accepts PNG or JPEG inputs for reference images and editing tasks.

Generation time varies based on your step count—expect 10-30 seconds for typical runs. Higher resolutions and more steps will take longer.

This model is built for production workflows where you need reliable typography and the flexibility to fine-tune quality versus speed. If you’re making quick sketches or need the absolute highest fidelity, check out the other models in the Flux family.

Try it yourself

You can experiment with Flux 2 Flex on the Replicate Playground at replicate.com/playground.