bytedance / pulid

📖 PuLID: Pure and Lightning ID Customization via Contrastive Alignment

Warm

Public
3M runs
L40S

Input

main_face_image

file (required)

ID image (main)

auxiliary_face_image1

file

Additional ID image (auxiliary)

auxiliary_face_image2

file

Additional ID image (auxiliary)

auxiliary_face_image3

file

Additional ID image (auxiliary)

prompt

string

portrait, impressionist painting, loose brushwork, vibrant color, light and shadow play

Prompt

Default: "portrait,color,cinematic,in garden,soft light,detailed face"

negative_prompt

string

flaws in the eyes, flaws in the face, flaws, lowres, non-HDRi, low quality, worst quality,artifacts noise, text, watermark, glitch, deformed, mutated, ugly, disfigured, hands, low resolution, partially rendered objects,  deformed or partially rendered eyes, deformed, deformed eyeballs, cross-eyed,blurry

Negative Prompt

Default: "flaws in the eyes, flaws in the face, flaws, lowres, non-HDRi, low quality, worst quality,artifacts noise, text, watermark, glitch, deformed, mutated, ugly, disfigured, hands, low resolution, partially rendered objects, deformed or partially rendered eyes, deformed, deformed eyeballs, cross-eyed,blurry"

cfg_scale

number

(minimum: 1, maximum: 1.5)

CFG scale. Recommended range: [1, 1.5]. A value of 1 is faster.

Default: 1.2

num_steps

integer

(minimum: 1, maximum: 100)

Steps

Default: 4

image_height

integer

(minimum: 512, maximum: 2024)

Height

Default: 1024

image_width

integer

(minimum: 512, maximum: 2024)

Width

Default: 768

identity_scale

number

(minimum: 0, maximum: 5)

ID scale

Default: 0.8

generation_mode

string

mode

Default: "fidelity"

mix_identities

boolean

ID Mix (turn this on if you want to mix two ID images; otherwise, turn it off)

Default: false

seed

integer

Random seed. Leave blank to randomize the seed

num_samples

integer

(minimum: 1, maximum: 8)

Num samples

Default: 4

output_format

string

Format of the output images

Default: "webp"

output_quality

integer

(minimum: 0, maximum: 100)

Quality of the output images, from 0 to 100. 100 is best quality, 0 is lowest quality.

Default: 80
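
The ranges listed above can be checked on the client before a request is sent. The following is a minimal Python sketch, not part of the model or of Replicate's client: `build_input` and `NUMERIC_RANGES` are hypothetical names, and the bounds are copied from the schema above. Parameters without a listed range (such as `generation_mode`) are passed through unchecked.

```python
from typing import Any, Dict, Optional, Sequence

# (minimum, maximum) pairs copied from the input schema above.
NUMERIC_RANGES = {
    "cfg_scale": (1, 1.5),
    "num_steps": (1, 100),
    "image_height": (512, 2024),
    "image_width": (512, 2024),
    "identity_scale": (0, 5),
    "num_samples": (1, 8),
    "output_quality": (0, 100),
}


def build_input(
    main_face_image: str,
    prompt: str,
    auxiliary_face_images: Sequence[str] = (),
    seed: Optional[int] = None,
    **overrides: Any,
) -> Dict[str, Any]:
    """Assemble an input dict, rejecting values outside the documented ranges."""
    if len(auxiliary_face_images) > 3:
        raise ValueError("the schema lists only three auxiliary image slots")
    for name, value in overrides.items():
        if name in NUMERIC_RANGES:
            low, high = NUMERIC_RANGES[name]
            if not low <= value <= high:
                raise ValueError(f"{name}={value} is outside [{low}, {high}]")
    payload: Dict[str, Any] = {"main_face_image": main_face_image, "prompt": prompt}
    # auxiliary_face_image1..3 are optional; only send the ones provided.
    for index, image in enumerate(auxiliary_face_images, start=1):
        payload[f"auxiliary_face_image{index}"] = image
    if seed is not None:  # omitting the seed leaves it to be randomized
        payload["seed"] = seed
    payload.update(overrides)
    return payload
```

For example, `build_input(url, "portrait", cfg_scale=1.2, num_steps=4)` returns a dict that can be passed as `input`, while `cfg_scale=2.0` raises a `ValueError` locally instead of failing on the server.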

Run this model in Node.js with one line of code:

npx create-replicate --model=bytedance/pulid

or set up a project from scratch

Install Replicate’s Node.js client library:

npm install replicate

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Import and set up the client:

import Replicate from "replicate";
import fs from "node:fs";

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});

Run bytedance/pulid using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

const output = await replicate.run(
  "bytedance/pulid:43d309c37ab4e62361e5e29b8e9e867fb2dcbcec77ae91206a8d95ac5dd451a0",
  {
    input: {
      prompt: "portrait, impressionist painting, loose brushwork, vibrant color, light and shadow play",
      cfg_scale: 1.2,
      num_steps: 4,
      image_width: 768,
      num_samples: 4,
      image_height: 1024,
      output_format: "webp",
      identity_scale: 0.8,
      mix_identities: false,
      output_quality: 80,
      generation_mode: "fidelity",
      main_face_image: "https://replicate.delivery/pbxt/Kr6iendsvYS0F3MLmwRZ8q07XIMEJdemnQI3Cmq9nNrauJbq/zcy.webp",
      negative_prompt: "flaws in the eyes, flaws in the face, flaws, lowres, non-HDRi, low quality, worst quality,artifacts noise, text, watermark, glitch, deformed, mutated, ugly, disfigured, hands, low resolution, partially rendered objects,  deformed or partially rendered eyes, deformed, deformed eyeballs, cross-eyed,blurry"
    }
  }
);

// To access the file URL:
console.log(output[0].url()); //=> "http://example.com"

// To write the file to disk (this example requests WebP output):
await fs.promises.writeFile("my-image.webp", output[0]);

To learn more, take a look at the guide on getting started with Node.js.

Install Replicate’s Python client library:

pip install replicate

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Import the client:

import replicate

Run bytedance/pulid using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

output = replicate.run(
    "bytedance/pulid:43d309c37ab4e62361e5e29b8e9e867fb2dcbcec77ae91206a8d95ac5dd451a0",
    input={
        "prompt": "portrait, impressionist painting, loose brushwork, vibrant color, light and shadow play",
        "cfg_scale": 1.2,
        "num_steps": 4,
        "image_width": 768,
        "num_samples": 4,
        "image_height": 1024,
        "output_format": "webp",
        "identity_scale": 0.8,
        "mix_identities": False,
        "output_quality": 80,
        "generation_mode": "fidelity",
        "main_face_image": "https://replicate.delivery/pbxt/Kr6iendsvYS0F3MLmwRZ8q07XIMEJdemnQI3Cmq9nNrauJbq/zcy.webp",
        "negative_prompt": "flaws in the eyes, flaws in the face, flaws, lowres, non-HDRi, low quality, worst quality,artifacts noise, text, watermark, glitch, deformed, mutated, ugly, disfigured, hands, low resolution, partially rendered objects,  deformed or partially rendered eyes, deformed, deformed eyeballs, cross-eyed,blurry"
    }
)
print(output)

To learn more, take a look at the guide on getting started with Python.
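
The Python snippet above only prints the result. A minimal sketch for saving the images follows. It assumes that each item in `output` is either a file-like object with a `read()` method (as recent versions of the Python client return) or a plain URL string (as older versions did). `save_outputs` is a hypothetical helper name, and the file names mirror the ones shown in the model's logs.

```python
import urllib.request
from pathlib import Path
from typing import Iterable, List


def save_outputs(outputs: Iterable, directory: str = ".", extension: str = "webp") -> List[Path]:
    """Write each output image to disk and return the paths written."""
    target = Path(directory)
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, item in enumerate(outputs):
        if hasattr(item, "read"):
            data = item.read()  # file-like output object
        else:
            # Assumed fallback: the item is a plain URL string.
            with urllib.request.urlopen(str(item)) as response:
                data = response.read()
        path = target / f"output_image_{index}.{extension}"
        path.write_bytes(data)
        paths.append(path)
    return paths
```

Calling `save_outputs(output, "results")` after `replicate.run(...)` would write `results/output_image_0.webp` and so on. Pass `extension` to match whatever `output_format` was requested.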

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Run bytedance/pulid using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d $'{
    "version": "bytedance/pulid:43d309c37ab4e62361e5e29b8e9e867fb2dcbcec77ae91206a8d95ac5dd451a0",
    "input": {
      "prompt": "portrait, impressionist painting, loose brushwork, vibrant color, light and shadow play",
      "cfg_scale": 1.2,
      "num_steps": 4,
      "image_width": 768,
      "num_samples": 4,
      "image_height": 1024,
      "output_format": "webp",
      "identity_scale": 0.8,
      "mix_identities": false,
      "output_quality": 80,
      "generation_mode": "fidelity",
      "main_face_image": "https://replicate.delivery/pbxt/Kr6iendsvYS0F3MLmwRZ8q07XIMEJdemnQI3Cmq9nNrauJbq/zcy.webp",
      "negative_prompt": "flaws in the eyes, flaws in the face, flaws, lowres, non-HDRi, low quality, worst quality,artifacts noise, text, watermark, glitch, deformed, mutated, ugly, disfigured, hands, low resolution, partially rendered objects,  deformed or partially rendered eyes, deformed, deformed eyeballs, cross-eyed,blurry"
    }
  }' \
  https://api.replicate.com/v1/predictions

To learn more, take a look at Replicate’s HTTP API reference docs.
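
For readers who prefer to stay in Python without the client library, the same request can be sketched with the standard library. This mirrors the curl command above (same URL, headers, and version string). `build_request` and `send` are hypothetical helper names, and error handling is left out.

```python
import json
import urllib.request

API_URL = "https://api.replicate.com/v1/predictions"
VERSION = "bytedance/pulid:43d309c37ab4e62361e5e29b8e9e867fb2dcbcec77ae91206a8d95ac5dd451a0"


def build_request(token: str, model_input: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request that the curl command issues."""
    body = json.dumps({"version": VERSION, "input": model_input}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Prefer": "wait",  # same header as the curl example
        },
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON prediction object."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A call such as `send(build_request(os.environ["REPLICATE_API_TOKEN"], {...}))` should return a prediction object shaped like the one in the Output section. Keeping construction and sending separate makes the request easy to inspect without network access.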

Output

{
  "completed_at": "2024-05-03T17:03:40.427024Z",
  "created_at": "2024-05-03T17:03:32.933000Z",
  "data_removed": false,
  "error": null,
  "id": "wkqmbjhy8nrgj0cf7xmvky7p58",
  "input": {
    "prompt": "portrait, impressionist painting, loose brushwork, vibrant color, light and shadow play",
    "cfg_scale": 1.2,
    "num_steps": 4,
    "image_width": 768,
    "num_samples": 4,
    "image_height": 1024,
    "output_format": "webp",
    "identity_scale": 0.8,
    "mix_identities": false,
    "output_quality": 80,
    "generation_mode": "fidelity",
    "main_face_image": "https://replicate.delivery/pbxt/Kr6iendsvYS0F3MLmwRZ8q07XIMEJdemnQI3Cmq9nNrauJbq/zcy.webp",
    "negative_prompt": "flaws in the eyes, flaws in the face, flaws, lowres, non-HDRi, low quality, worst quality,artifacts noise, text, watermark, glitch, deformed, mutated, ugly, disfigured, hands, low resolution, partially rendered objects,  deformed or partially rendered eyes, deformed, deformed eyeballs, cross-eyed,blurry"
  },
  "logs": "Using seed: 61631\n[!] (<class 'cog.types.Path'>) main_face_image=/tmp/tmpk7az_pmizcy.webp\n[!] (<class 'NoneType'>) auxiliary_face_image1=None\n[!] (<class 'NoneType'>) auxiliary_face_image2=None\n[!] (<class 'NoneType'>) auxiliary_face_image3=None\n[!] (<class 'str'>) prompt=portrait, impressionist painting, loose brushwork, vibrant color, light and shadow play\n[!] (<class 'str'>) negative_prompt=flaws in the eyes, flaws in the face, flaws, lowres, non-HDRi, low quality, worst quality,artifacts noise, text, watermark, glitch, deformed, mutated, ugly, disfigured, hands, low resolution, partially rendered objects,  deformed or partially rendered eyes, deformed, deformed eyeballs, cross-eyed,blurry\n[!] (<class 'float'>) cfg_scale=1.2\n[!] (<class 'int'>) num_samples=4\n[!] (<class 'int'>) seed=61631\n[!] (<class 'int'>) num_steps=4\n[!] (<class 'int'>) image_height=1024\n[!] (<class 'int'>) image_width=768\n[!] (<class 'float'>) identity_scale=0.8\n[!] (<class 'str'>) generation_mode=fidelity\n[!] (<class 'bool'>) mix_identities=False\n  0%|          | 0/4 [00:00<?, ?it/s]\n 25%|██▌       | 1/4 [00:00<00:00,  5.54it/s]\n 50%|█████     | 2/4 [00:00<00:00,  6.73it/s]\n 75%|███████▌  | 3/4 [00:00<00:00,  6.12it/s]\n100%|██████████| 4/4 [00:00<00:00,  5.87it/s]\n100%|██████████| 4/4 [00:00<00:00,  5.98it/s]\n  0%|          | 0/4 [00:00<?, ?it/s]\n 25%|██▌       | 1/4 [00:00<00:00,  5.53it/s]\n 50%|█████     | 2/4 [00:00<00:00,  6.70it/s]\n 75%|███████▌  | 3/4 [00:00<00:00,  6.09it/s]\n100%|██████████| 4/4 [00:00<00:00,  5.85it/s]\n100%|██████████| 4/4 [00:00<00:00,  5.95it/s]\n  0%|          | 0/4 [00:00<?, ?it/s]\n 25%|██▌       | 1/4 [00:00<00:00,  5.50it/s]\n 50%|█████     | 2/4 [00:00<00:00,  6.66it/s]\n 75%|███████▌  | 3/4 [00:00<00:00,  6.08it/s]\n100%|██████████| 4/4 [00:00<00:00,  5.84it/s]\n100%|██████████| 4/4 [00:00<00:00,  5.94it/s]\n  0%|          | 0/4 [00:00<?, ?it/s]\n 25%|██▌       | 1/4 [00:00<00:00,  5.52it/s]\n 50%|█████     | 2/4 [00:00<00:00,  6.68it/s]\n 75%|███████▌  | 3/4 [00:00<00:00,  6.08it/s]\n100%|██████████| 4/4 [00:00<00:00,  5.83it/s]\n100%|██████████| 4/4 [00:00<00:00,  5.94it/s]\n[~] Saving to output_image_0.webp...\n[~] Output format: WEBP\n[~] Output quality: 80\n[~] Saving to output_image_1.webp...\n[~] Output format: WEBP\n[~] Output quality: 80\n[~] Saving to output_image_2.webp...\n[~] Output format: WEBP\n[~] Output quality: 80\n[~] Saving to output_image_3.webp...\n[~] Output format: WEBP\n[~] Output quality: 80",
  "metrics": {
    "predict_time": 7.456445,
    "total_time": 7.494024
  },
  "output": [
    "https://replicate.delivery/pbxt/nDciJ2jxtSYcCVxGxprSzvycVWR6fIHyJYQFeyDDwDSqrswSA/output_image_0.webp",
    "https://replicate.delivery/pbxt/f8UCxdcfXNrYokQMYOUFPzBnVsiRY6Ok1Eotaorg14ZrrswSA/output_image_1.webp",
    "https://replicate.delivery/pbxt/k7YRRLjBos7EBhiOz8OD7kGrgFwdbemQTbvsQAOmizo1VWYJA/output_image_2.webp",
    "https://replicate.delivery/pbxt/Tif4TmIemNjDiksQWaGTUzG4momOuqeGcEqAMhQHxUdYXZhlA/output_image_3.webp"
  ],
  "started_at": "2024-05-03T17:03:32.970579Z",
  "status": "succeeded",
  "urls": {
    "get": "https://api.replicate.com/v1/predictions/wkqmbjhy8nrgj0cf7xmvky7p58",
    "cancel": "https://api.replicate.com/v1/predictions/wkqmbjhy8nrgj0cf7xmvky7p58/cancel"
  },
  "version": "c169c3b8f6952cf895d043d7b56830b4e9a3e9409a026004e9efbd9da42912b4"
}
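
The timestamps in this prediction object are consistent with its `metrics` block, which can be verified with a few lines of Python. This is an illustrative sketch: `timings` is a hypothetical helper, and it assumes every timestamp carries fractional seconds and a trailing `Z`, as the three above do.

```python
from datetime import datetime

# Format of the timestamps shown above, e.g. "2024-05-03T17:03:32.933000Z".
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def parse_timestamp(value: str) -> datetime:
    return datetime.strptime(value, TIMESTAMP_FORMAT)


def timings(prediction: dict) -> dict:
    """Split a prediction's lifetime into queued, running, and total seconds."""
    created = parse_timestamp(prediction["created_at"])
    started = parse_timestamp(prediction["started_at"])
    completed = parse_timestamp(prediction["completed_at"])
    return {
        "queued": (started - created).total_seconds(),
        "running": (completed - started).total_seconds(),
        "total": (completed - created).total_seconds(),
    }
```

For the prediction above, `running` works out to 7.456445 seconds and `total` to 7.494024 seconds, matching `predict_time` and `total_time`. The remaining 0.037579 seconds is the gap between `created_at` and `started_at`.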

Generated in

7.5 seconds

Logs

Using seed: 61631
[!] (<class 'cog.types.Path'>) main_face_image=/tmp/tmpk7az_pmizcy.webp
[!] (<class 'NoneType'>) auxiliary_face_image1=None
[!] (<class 'NoneType'>) auxiliary_face_image2=None
[!] (<class 'NoneType'>) auxiliary_face_image3=None
[!] (<class 'str'>) prompt=portrait, impressionist painting, loose brushwork, vibrant color, light and shadow play
[!] (<class 'str'>) negative_prompt=flaws in the eyes, flaws in the face, flaws, lowres, non-HDRi, low quality, worst quality,artifacts noise, text, watermark, glitch, deformed, mutated, ugly, disfigured, hands, low resolution, partially rendered objects,  deformed or partially rendered eyes, deformed, deformed eyeballs, cross-eyed,blurry
[!] (<class 'float'>) cfg_scale=1.2
[!] (<class 'int'>) num_samples=4
[!] (<class 'int'>) seed=61631
[!] (<class 'int'>) num_steps=4
[!] (<class 'int'>) image_height=1024
[!] (<class 'int'>) image_width=768
[!] (<class 'float'>) identity_scale=0.8
[!] (<class 'str'>) generation_mode=fidelity
[!] (<class 'bool'>) mix_identities=False
  0%|          | 0/4 [00:00<?, ?it/s]
 25%|██▌       | 1/4 [00:00<00:00,  5.54it/s]
 50%|█████     | 2/4 [00:00<00:00,  6.73it/s]
 75%|███████▌  | 3/4 [00:00<00:00,  6.12it/s]
100%|██████████| 4/4 [00:00<00:00,  5.87it/s]
100%|██████████| 4/4 [00:00<00:00,  5.98it/s]
  0%|          | 0/4 [00:00<?, ?it/s]
 25%|██▌       | 1/4 [00:00<00:00,  5.53it/s]
 50%|█████     | 2/4 [00:00<00:00,  6.70it/s]
 75%|███████▌  | 3/4 [00:00<00:00,  6.09it/s]
100%|██████████| 4/4 [00:00<00:00,  5.85it/s]
100%|██████████| 4/4 [00:00<00:00,  5.95it/s]
  0%|          | 0/4 [00:00<?, ?it/s]
 25%|██▌       | 1/4 [00:00<00:00,  5.50it/s]
 50%|█████     | 2/4 [00:00<00:00,  6.66it/s]
 75%|███████▌  | 3/4 [00:00<00:00,  6.08it/s]
100%|██████████| 4/4 [00:00<00:00,  5.84it/s]
100%|██████████| 4/4 [00:00<00:00,  5.94it/s]
  0%|          | 0/4 [00:00<?, ?it/s]
 25%|██▌       | 1/4 [00:00<00:00,  5.52it/s]
 50%|█████     | 2/4 [00:00<00:00,  6.68it/s]
 75%|███████▌  | 3/4 [00:00<00:00,  6.08it/s]
100%|██████████| 4/4 [00:00<00:00,  5.83it/s]
100%|██████████| 4/4 [00:00<00:00,  5.94it/s]
[~] Saving to output_image_0.webp...
[~] Output format: WEBP
[~] Output quality: 80
[~] Saving to output_image_1.webp...
[~] Output format: WEBP
[~] Output quality: 80
[~] Saving to output_image_2.webp...
[~] Output format: WEBP
[~] Output quality: 80
[~] Saving to output_image_3.webp...
[~] Output format: WEBP
[~] Output quality: 80
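
When `seed` is left blank, the only record of the value used is the first log line. A small sketch for recovering it, so that a run can be repeated with the same seed, is shown below. `seed_from_logs` is a hypothetical helper, and it assumes the log keeps the `Using seed: N` format shown above.

```python
import re
from typing import Optional

# Matches the first line of the logs above, e.g. "Using seed: 61631".
SEED_PATTERN = re.compile(r"^Using seed: (\d+)$", re.MULTILINE)


def seed_from_logs(logs: str) -> Optional[int]:
    """Return the seed reported in the logs, or None if no such line exists."""
    match = SEED_PATTERN.search(logs)
    return int(match.group(1)) if match else None
```

Passing the returned value as the `seed` input on a later run, with all other inputs unchanged, is the usual way to attempt to reproduce a result.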

This output was created using a different version of the model, bytedance/pulid:c169c3b8.

Run time and cost

This model costs approximately $0.0016 to run on Replicate, or 625 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.
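
The two figures in the paragraph above are the same number expressed two ways, as this short Python sketch shows. The per-run cost is the approximate value quoted above and varies with inputs, so the results are estimates only. The function names are illustrative.

```python
COST_PER_RUN_USD = 0.0016  # approximate figure quoted above; varies with inputs


def runs_per_dollar(cost_per_run: float = COST_PER_RUN_USD) -> int:
    """How many runs one dollar buys at the given per-run cost."""
    return round(1 / cost_per_run)


def estimated_cost(runs: int, cost_per_run: float = COST_PER_RUN_USD) -> float:
    """Estimated spend in USD for a number of runs."""
    return runs * cost_per_run
```

With the quoted figure, `runs_per_dollar()` gives 625, and 1,000 runs come to roughly $1.60.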

This model runs on Nvidia L40S GPU hardware. Predictions typically complete within 2 seconds.

Readme

PuLID: Pure and Lightning ID Customization (Classic Version)

Welcome to PuLID v1.0, a tuning-free ID customization solution for text-to-image models. This is the classic version of PuLID, designed to work with Stable Diffusion XL.

🆕 Looking for PuLID for FLUX? Check out our FLUX-PuLID demo!

About PuLID

PuLID (Pure and Lightning ID customization) is an AI model that customizes images, especially faces, while preserving important identity features. Here’s what PuLID does:

Adds a specific identity (like a person’s face) to a text-to-image model without altering the model’s core functionality.
Creates images with high identity similarity.
Allows modification of attributes, styles, and backgrounds using text prompts.
Maintains consistency in image elements like background, lighting, and style.
Provides extensive options for editing and refining generated images.

PuLID's approach combines several techniques:
- Two main components: a standard diffusion training branch and a Lightning T2I branch.
- Dedicated methods for facial identity comprehension.
- A "contrastive alignment" technique to ensure image consistency.
- Rapid image generation while maintaining identity accuracy.

Applications of PuLID include:
- Creating personalized avatars and characters
- Facial editing and enhancement
- Developing digital art
- Producing prototypes and visualizations

How to Use This Replicate Demo

Upload an image containing the identity you wish to customize.
Enter a text prompt describing the image you want to generate.
Adjust the settings as needed (refer to “Advanced Settings” below).
Click to generate your customized image!

Advanced Settings

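Seed (seed): Set a specific seed for reproducible results. Leave it blank to randomize.
Guidance Scale (cfg_scale): Controls how closely the image adheres to your text prompt. The recommended range is 1 to 1.5, and a value of 1 is faster.
Number of Inference Steps (num_steps): More steps can lead to higher quality but increase generation time. The default is 4.
Negative Prompt (negative_prompt): Describe elements you want to avoid in the generated image.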

Useful Tips

For best results, use clear, front-facing images of faces as your identity input.
Experiment with different prompts to explore various styles and scenarios.
If the generated image doesn’t capture the identity well, try adjusting the guidance scale or increasing the number of inference steps.
Use negative prompts to refine your results, especially for avoiding unwanted elements.
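
The tip about adjusting the guidance scale and step count is easier to follow with a fixed seed, so that differences between outputs come from the settings rather than from the random seed. A minimal sketch that builds such a grid of inputs is shown below. `sweep` is a hypothetical helper, and the parameter names are taken from the input schema.

```python
from itertools import product
from typing import Any, Dict, Iterable, List


def sweep(
    base: Dict[str, Any],
    cfg_scales: Iterable[float],
    step_counts: Iterable[int],
    seed: int = 0,
) -> List[Dict[str, Any]]:
    """Return one input dict per (cfg_scale, num_steps) pair, all sharing a seed."""
    variants = []
    for cfg_scale, num_steps in product(cfg_scales, step_counts):
        variant = dict(base)  # copy so the base input is left untouched
        variant.update(cfg_scale=cfg_scale, num_steps=num_steps, seed=seed)
        variants.append(variant)
    return variants
```

Each returned dict can be passed as `input` to a separate run. For example, `sweep(base, [1.0, 1.2, 1.5], [4, 8])` yields six variants to compare side by side.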

Limitations

While PuLID performs well on a wide range of identities, results may vary depending on the input image quality and facial characteristics.
Very complex or abstract prompts might lead to less accurate identity preservation.
The model works best with front-facing, clear images of faces.

Learn More

For more technical details, latest updates, and additional examples, visit our GitHub repository.

If you find PuLID helpful, please star our repo or share it with others!

Questions or Suggestions?

If you have questions or ideas for improvement, please open an issue on our GitHub repository.

Citation

If you use PuLID in your work, please cite:

@article{guo2024pulid,
  title={PuLID: Pure and Lightning ID Customization via Contrastive Alignment},
  author={Guo, Zinan and Wu, Yanze and Chen, Zhuowei and Chen, Lang and He, Qian},
  journal={arXiv preprint arXiv:2404.16022},
  year={2024}
}

Support

For updates and more AI content, follow the lead developers:
- Yanze Wu: GitHub
- Zinan Guo: Email