Which image editing model should I use?

Posted September 23, 2025 by

In the past few weeks, nearly every major AI lab has released an image editing model. The first was FLUX.1 Kontext from Black Forest Labs in May, which stood out for style transformations and simple image edits. Since then, we’ve seen a wave of models, each strong in its own way.

With so many options, it can be hard to figure out which one works best for your needs. In this post, we’re putting them head to head and evaluating each across a range of image editing tasks. By the end, you should have a clear sense of which one fits your workflow.

To start, here’s an overview of the cost and average inference time for each model we’re evaluating.

| Model | Lab | Price per Image | Inference Time |
| --- | --- | --- | --- |
| FLUX.1 Kontext [dev] | Black Forest Labs | $0.025 | 1.7 seconds |
| FLUX.1 Kontext [pro] | Black Forest Labs | $0.04 | 4.4 seconds |
| FLUX.1 Kontext [max] | Black Forest Labs | $0.08 | 4.9 seconds |
| Qwen Image Edit | Alibaba | $0.03 | 2.9 seconds |
| Qwen Image Edit Plus | Alibaba | $0.03 | 16 seconds |
| Nano Banana | Google | $0.039 | 10 seconds |
| SeedEdit 3.0 | ByteDance | $0.03 | 13 seconds |
| Seedream 4 | ByteDance | $0.03 | 14 seconds |
| GPT-image-1 | OpenAI | $0.01–$0.25 | 40 seconds |

The cheapest is GPT-image-1 from OpenAI, which starts at $0.01 per image, but it also has the longest generation time (around 40 seconds). FLUX.1 Kontext [dev] (optimized by Pruna AI) is the fastest at 1.7 seconds per generation and is also one of the cheaper options, though hyper-optimized models inevitably trade off some image editing quality.
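If you want to weigh these numbers yourself, the table is small enough to work with directly. Here's a minimal Python sketch with the figures copied from the table above (GPT-image-1 uses its starting price; treat all values as point-in-time estimates, not guarantees):

```python
# Pricing and latency from the comparison table, as data.
models = {
    "FLUX.1 Kontext [dev]": {"price": 0.025, "seconds": 1.7},
    "FLUX.1 Kontext [pro]": {"price": 0.04, "seconds": 4.4},
    "FLUX.1 Kontext [max]": {"price": 0.08, "seconds": 4.9},
    "Qwen Image Edit": {"price": 0.03, "seconds": 2.9},
    "Qwen Image Edit Plus": {"price": 0.03, "seconds": 16},
    "Nano Banana": {"price": 0.039, "seconds": 10},
    "SeedEdit 3.0": {"price": 0.03, "seconds": 13},
    "Seedream 4": {"price": 0.03, "seconds": 14},
    "GPT-image-1": {"price": 0.01, "seconds": 40},  # price starts at $0.01
}

# Rank by whichever constraint matters for your workflow.
fastest = min(models, key=lambda m: models[m]["seconds"])
cheapest = min(models, key=lambda m: models[m]["price"])
print(fastest)   # FLUX.1 Kontext [dev]
print(cheapest)  # GPT-image-1
```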

For our tests, we’re evaluating the base model from each AI lab. Specifically, for FLUX.1 Kontext and Qwen, we’ll only show results from FLUX.1 Kontext [pro] and Qwen Image Edit.

Let’s put these models to the test.
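If you'd rather run these edits from code than from the Playground, here's a minimal sketch using Replicate's Python client. The input field names and the source image URL are illustrative assumptions; check each model's page on Replicate for its exact input schema.

```python
# Sketch: running an image edit on Replicate.
# Field names ("prompt", "input_image") are typical for editing models
# but are assumptions here; the image URL below is hypothetical.
import os

def build_edit_input(prompt: str, image_url: str) -> dict:
    """Assemble a typical input payload for an image editing model."""
    return {"prompt": prompt, "input_image": image_url}

payload = build_edit_input(
    "Remove the bridge",
    "https://example.com/golden-gate.jpg",  # hypothetical source image
)

# The actual call needs an API token, so guard it for local experiments.
if os.environ.get("REPLICATE_API_TOKEN"):
    import replicate
    output = replicate.run("black-forest-labs/flux-kontext-pro", input=payload)
    print(output)
```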

Object removal

The first task we’re looking at is object removal, a basic edit you could do manually in Photoshop. In particular, when we remove an object that sits in front of other elements of an image, how well can the model fill in what was behind it?

We tested this with an image of the Golden Gate Bridge.

Original Golden Gate Bridge image
Original image

Here’s how different image editing models perform when tasked with removing the bridge from the image:

Remove the bridge

GPT-image-1 object removal result
GPT-image-1
FLUX.1 Kontext object removal result
FLUX.1 Kontext [pro]
Nano Banana object removal result
Nano Banana
Qwen Image Edit object removal result
Qwen Image Edit
SeedEdit object removal result
SeedEdit 3.0
Seedream object removal result
Seedream 4

Winners: SeedEdit 3.0 and Qwen Image Edit

Loser: FLUX.1 Kontext [pro]

The model that struggled the most was FLUX.1 Kontext [pro], which left the two towers in place. Nano Banana removed the entire bridge but failed to keep the background hills consistent. GPT-image-1 smoothed out the building in the bottom left corner but did successfully remove the bridge. The other models handled the task well.

Front view comparison

Another common image editing task is changing the viewing angle of the subject in an image.

Original image for front view transformation
Original image

Let’s see which image models can give us the front-facing view of this character and her cat while maintaining character consistency.

Show the front view of the woman and the cat

GPT-image-1 front view result
GPT-image-1
FLUX.1 Kontext front view result
FLUX.1 Kontext [pro]
Nano Banana front view result
Nano Banana
Qwen Image Edit front view result
Qwen Image Edit
SeedEdit front view result
SeedEdit 3.0
Seedream front view result
Seedream 4

Winner: Qwen Image Edit

Loser: SeedEdit 3.0

Only GPT-image-1 and Qwen Image Edit gave us the head-on view we were looking for, although GPT-image-1 did not maintain character consistency. FLUX.1 Kontext [pro] and Nano Banana did fairly well at showing a front view of our character; both even managed to preserve the tattoo on the character’s arm. The ByteDance models struggled the most — SeedEdit 3.0 did not turn the character at all, and Seedream 4 did not preserve our character.

Background editing

Background editing requires models to understand object boundaries and generate coherent environments. Here’s how different image editing models perform when tasked with editing or replacing backgrounds:

Original image for background editing
Original image

Make the background a jungle

GPT-image background editing result
GPT-image-1
FLUX.1 Kontext background editing result
FLUX.1 Kontext [pro]
Nano Banana background editing result
Nano Banana
Qwen background editing result
Qwen Image Edit
SeedEdit background editing result
SeedEdit 3.0
Seedream background editing result
Seedream 4

Winner: SeedEdit 3.0 and Seedream 4

Loser: Nano Banana

Nano Banana performs the worst here, cutting out a small piece of the character and placing it on a generic jungle background. The ByteDance Seed models do the best, with strong character consistency, natural lighting, and believable placement. FLUX.1 Kontext [pro] comes close but doesn’t fully land it, while GPT-image-1 and Qwen generate characters that look noticeably different. Qwen also smooths out the textures, making the result feel less detailed.

Text editing

Text editing within images is one of the most challenging capabilities of modern image editing models. Modifying or generating text while maintaining proper typography, perspective, and lighting was nearly impossible even a year ago.

In this evaluation, we are looking for which image models preserve the original font of the text and maintain the physical elements of the signage (e.g. sign texture/color, placement of the surrounding words, etc.).

Let’s see what happens when we change the word “seven” to “eight” in the following image:

Original image for text editing
Original image

Change ‘seven’ to ‘eight’

GPT-image-1 text editing result
GPT-image-1
FLUX.1 Kontext text editing result
FLUX.1 Kontext [pro]
Nano Banana text editing result
Nano Banana
Qwen text editing result
Qwen Image Edit
SeedEdit text editing result
SeedEdit 3.0
Seedream text editing result
Seedream 4

Winners: FLUX.1 Kontext [pro] and Nano Banana

Losers: GPT-image-1 and Seedream 4

The favorites here are FLUX.1 Kontext [pro] and Nano Banana, which introduced the word “eight” naturally with consistent type and placement. Even the paper-like texture of the note is preserved in these edits. With SeedEdit 3.0 and Qwen Image Edit, the word “eight” sticks out and clearly looks edited in. GPT-image-1 looks visually appealing but did not maintain the original note. Seedream 4’s typography looks fine, but it produced an artifact in the “to:” section of the note.

Style transfer

Style transfer showcases each model’s ability to understand artistic styles and apply them while preserving the original image’s content and composition. Some models excel at capturing fine artistic details while others focus on maintaining structural integrity.

Here’s how these models handle style transfer tasks, specifically converting images to an oil painting style:

Original image for oil painting style transfer
Original image

Transform this into an oil painting

GPT-image-1 oil painting style transfer result
GPT-image-1
FLUX.1 Kontext oil painting style transfer result
FLUX.1 Kontext [pro]
Nano Banana oil painting style transfer result
Nano Banana
Qwen oil painting style transfer result
Qwen Image Edit
SeedEdit oil painting style transfer result
SeedEdit 3.0
Seedream oil painting style transfer result
Seedream 4

Winner: Nano Banana

Loser: FLUX.1 Kontext [pro]

This task yielded interesting results across all the models, as each has a different idea of what an oil painting should look like. Nano Banana and Seedream 4 stay closest to the original image, offering an airbrushed, well-blended look. GPT-image-1 also uses short strokes, but adds its signature yellow tint. Qwen Image Edit and FLUX.1 Kontext [pro] are quite similar, with a more painterly, unblended look (both of these also have the yellow tint).

Takeaways

After evaluating these six image editing models across five distinct tasks — object removal, perspective transformation, background editing, text manipulation, and style transfer — there are some clear winners that can guide your choice based on specific needs and priorities.

  • Object removal: Most models succeeded, but FLUX.1 Kontext [pro] struggled
  • Perspective changes: GPT-image-1 and Qwen Image Edit best achieved the requested front-facing views with character consistency
  • Background editing: The ByteDance models (SeedEdit and Seedream) clearly dominated with natural integration of the character with the jungle landscape
  • Text editing: FLUX.1 Kontext and Nano Banana preserved typography and texture most effectively
  • Style transfer: Nano Banana and Seedream maintained closest resemblance to originals while achieving nice artistic effects

Keep in mind that these were all surface-level experiments, and the recommendations above alone may not be enough to justify your choice of model.

Need to experiment some more? Check out Replicate’s Playground to test and compare image editing models (or any models) side by side:

Replicate playground showing image editing models
Playground on Replicate

It’s what we used to create this post!

As always, chat with us on Discord and follow us on X to keep up with the latest.