Which image editing model should I use?

Posted September 23, 2025 by

shridharathi

Use Playground to compare image editing models

Replicate Playground

In the past few weeks, nearly every major AI lab has released an image editing model. The first was FLUX.1 Kontext from Black Forest Labs in May, which stood out for style transformations and simple image edits. Since then, we’ve seen a wave of models, each strong in its own way.

With so many options, it can be hard to figure out which one works best for your needs. In this post, we’re putting them head to head and evaluating each across a range of image editing tasks. By the end, you should have a clear sense of which one fits your workflow.

To start, here’s an overview of the cost and average inference time for each model we’re evaluating.

Model	Lab	Price per Image	Inference Time
FLUX.1 Kontext [dev]	Black Forest Labs	$0.025	1.7 seconds
FLUX.1 Kontext [pro]	Black Forest Labs	$0.04	4.4 seconds
FLUX.1 Kontext [max]	Black Forest Labs	$0.08	4.9 seconds
Qwen Image Edit	Alibaba	$0.03	2.9 seconds
Qwen Image Edit Plus	Alibaba	$0.03	16 seconds
Nano Banana	Google	$0.039	10 seconds
SeedEdit 3.0	ByteDance	$0.03	13 seconds
Seedream 4	ByteDance	$0.03	14 seconds
GPT Image 1	OpenAI	$0.01-$0.25	40 seconds

The cheapest is GPT-image-1 from OpenAI which starts $0.01 per image, but it has the longest generation time (around 40 seconds). FLUX.1 Kontext [dev] (optimized by Pruna AI) is the fastest at 1.9s per generation and is also one of the cheaper options, but there is, of course, a trade off with image editing quality for hyper-optimized models.

For our tests, we’re evaluating the base model from each AI lab. Specifically, for FLUX.1 Kontext and Qwen, we’ll only show results from FLUX.1 Kontext [pro] and Qwen Image Edit.

Let’s put these models up to the test.

Object removal

The first task we’re looking at is object removal. This is a basic task that one should be able to do in Photoshop. In particular, if we remove objects that are in front of other elements of an image, how well is the model able to interpolate what is behind the removed object?

We tested this with an image of the Golden Gate Bridge.

Original Golden Gate Bridge image — Original image

Here’s how different image editing models perform when tasked with removing the bridge from the image:

Remove the bridge

GPT-image-1 object removal result — GPT-image-1

FLUX.1 Kontext object removal result — FLUX.1 Kontext [pro]

Nano Banana object removal result — Nano Banana

Qwen VL object removal result — Qwen Image Edit

SeedEdit object removal result — SeedEdit 3.0

Seedream object removal result — Seedream 4

Winners: SeedEdit 3.0 and Qwen Image Edit

Loser: FLUX.1 Kontext [pro]

The model that struggled the most was FLUX.1 Kontext [pro], which left the two towers in place. Nano Banana removed the entire bridge but failed to keep the background hills consistent. GPT-image-1 smoothed out the building in the bottom left corner but did successfully remove the bridge. The other models handled the task well.

Front view comparison

Another common image editing task is changing the viewing angles of the object in the image.

Original image for front view transformation — Original image

Let’s see which image models can give us the front-facing view of this character and her cat while maintaining character consistency.

Show the front view of the woman and the cat

GPT-4 Vision front view result — GPT-image-1

FLUX.1 Kontext front view result — FLUX.1 Kontext [pro]

Nano Banana front view result — Nano Banana

Qwen Image Edit front view result — Qwen Image Edit

SeedEdit front view result — SeedEdit 3.0

Winner: Qwen Image Edit

Loser: SeedEdit 3.0

Only GPT-image-1 and Qwen Image Edit gave us the head-on view we were looking for, albeit GPT-image-1 did not seem to maintain character consistency. FLUX.1 Kontext [pro] and Nano Banana did fairly well at showing a front view of our character; both of them even managed to preserve the tattoo on the character’s arm. The ByteDance models struggled the most — SeedEdit did not turn the character at all and Seedream did not preserve our character.

Background editing

Background editing requires models to understand object boundaries and generate coherent environments. Here’s how different image editing models perform when tasked with editing or replacing backgrounds:

Make the background a jungle

GPT-image background editing result — GPT-image-1

FLUX.1 Kontext background editing result — FLUX.1 Kontext [pro]

Nano Banana background editing result — Nano Banana

Qwen background editing result — Qwen Image Edit

SeedEdit background editing result — SeedEdit 3.0

Seedream background editing result — Seedream 4

Winner: SeedEdit 3.0 and Seedream 4

Loser: Nano Banana

Nano Banana performs the worst here, cutting out a small piece of the character and placing it on a generic jungle background. The ByteDance Seed models do the best, with strong character consistency, natural lighting, and believable placement. FLUX.1 Kontext [pro] comes close but doesn’t fully land it, while GPT-image-1 and Qwen generate characters that look noticeably different. Qwen also smooths out the textures, making the result feel less detailed.

Text editing

Text editing within images represents one of the most challenging and impressive capabilities of modern image editing models. The ability to understand, modify, and generate text while maintaining proper typography, perspective, and lighting is a remarkable technical achievement that was nearly impossible even a year ago.

In this evaluation, we are looking for which image models preserve the original font of the text and maintain the physical elements of the signage (i.e. sign texture/color, placement of the surrounding words, etc.)

Let’s see if we change the word “seven” to “eight” in the following image:

Original image for text editing — Original image

Change ‘seven’ to ‘eight’

GPT-image-1 text editing result — GPT-image-1

FLUX.1 Kontext text editing result — FLUX.1 Kontext [pro]

Nano Banana text editing result — Nano Banana

Qwen text editing result — Qwen Image Edit

SeedEdit text editing result — SeedEdit 3.0

Seedream text editing result — Seedream 4

Winners: FLUX.1 Kontext [pro] and Nano Banana

Losers: GPT-image-1 and Seedream 4

The favorites here are FLUX.1 Kontext [pro] and Nano Banana which were able to introduce the word “eight” naturally with consistent type and placement. Even the paper-like texture of the note is preserved in these edits. With Seededit and Qwen, the word “eight” is sticking out and clearly looks edited in. GPT-image-1 looks visually appealing but did not maintain the original note. Seedream’s typography looks fine but it did produce an artifact in the “to:” section of the note.

Style transfer

Style transfer showcases each model’s ability to understand artistic styles and apply them while preserving the original image’s content and composition. Some models excel at capturing fine artistic details while others focus on maintaining structural integrity.

Here’s how these models handle style transfer tasks, specifically converting images to an oil painting style:

Original image for oil painting style transfer — Original image

Transform this into an oil painting

GPT-image-1 oil painting style transfer result — GPT-image-1

FLUX.1 Kontext oil painting style transfer result — FLUX.1 Kontext [pro]

Nano Banana oil painting style transfer result — Nano Banana

Qwen oil painting style transfer result — Qwen

SeedEdit oil painting style transfer result — SeedEdit

Seedream oil painting style transfer result — Seedream

Winner: Nano Banana

Loser: FLUX.1 Kontext [pro]

This task yielded interesting results across all the models as each have different ideas on what an oil painting should look like. Nano Banana and Seedream look closest to the original image, offering an air-brushed, well-blended look. GPT-image also has the short strokes but also has its signature yellow tint. Qwen and FLUX.1 Kontext [pro] are quite similar, with more of a painterly, unblended look (yet both of these also have the yellow tint).

Takeaways

After evaluating these six image editing models across five distinct tasks — object removal, perspective transformation, background editing, text manipulation, and style transfer — there are some clear winners that can guide your choice based on specific needs and priorities.

Object removal: Most models succeeded, but FLUX.1 Kontext [pro] struggled
Perspective changes: GPT Image 1 and Qwen Image Edit best achieved the requested front-facing views with character consistency
Background editing: The ByteDance models (SeedEdit and Seedream) clearly dominated with natural integration of the character with the jungle landscape
Text editing: FLUX.1 Kontext and Nano Banana preserved typography and texture most effectively
Style transfer: Nano Banana and Seedream maintained closest resemblance to originals while achieving nice artistic effects

Keep in mind that these were all surface-level experimentations and the above recommendations might not be enough to justify your choice of model.

Need to experiment some more? Check out Replicate’s playground to parallely test and compare image editing models (or any model):

Replicate playground showing image editing models — Playground on Replicate

It’s what we used to create this post!

As always, chat with us on Discord and follow us on X to keep up with the latest.

Next: Introducing our new search API