daanelson/minigpt-4 | Run with an API on Replicate

Input

image

*file

Image to discuss

prompt

*string


This llama's name is Dave. Write me a story about how Dave found his skateboard.

Prompt for mini-gpt4 regarding input image

num_beams

integer

(minimum: 1, maximum: 10)

Number of beams for beam search decoding

Default: 3
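To illustrate what num_beams controls, here is a minimal beam search sketch over a toy next-token table. The table `TOY_MODEL` and the function `beam_search` are hypothetical names invented for this example and are not part of the model or of Replicate's API; the real decoder works on the language model's full vocabulary.

```python
import math

# Hypothetical toy "model": next-token probabilities keyed by the last token.
TOY_MODEL = {
    "<s>": {"a": 0.6, "b": 0.4},
    "a": {"x": 0.5, "y": 0.5},
    "b": {"x": 0.9, "y": 0.1},
}


def beam_search(model, start, steps, num_beams):
    """Keep the num_beams highest-scoring partial sequences at every step."""
    beams = [([start], 0.0)]  # (tokens, cumulative log-probability)
    for _ in range(steps):
        candidates = []
        for tokens, score in beams:
            for token, prob in model[tokens[-1]].items():
                candidates.append((tokens + [token], score + math.log(prob)))
        # Sort by score, best first, and prune to the beam width.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:num_beams]
    return beams[0][0]


greedy = beam_search(TOY_MODEL, "<s>", steps=2, num_beams=1)
wider = beam_search(TOY_MODEL, "<s>", steps=2, num_beams=2)
print(greedy, wider)
```

With one beam the search commits to "a" at the first step and cannot recover; with two beams it also keeps "b" and finds the more probable two-token sequence. Larger values explore more alternatives at the price of more computation.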

temperature

number

(minimum: 0.01, maximum: 2)

Temperature for generating tokens, lower = more predictable results

Default: 1
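The effect of temperature can be sketched as a softmax over logits divided by the temperature. This is the standard formulation, shown here as an illustration; the function name `softmax_with_temperature` and the example logits are assumptions made for this sketch, not code from the model.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # illustrative values only
sharp = softmax_with_temperature(logits, 0.5)  # low temperature: peaked
flat = softmax_with_temperature(logits, 2.0)   # high temperature: flatter
print(sharp, flat)
```

A low temperature concentrates probability on the most likely token, which is why lower values give more predictable results.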

top_p

number

(minimum: 0, maximum: 1)

Sample only from the most likely tokens whose cumulative probability reaches top_p (nucleus sampling)

Default: 0.9
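As a rough sketch of top_p (nucleus) sampling: tokens are ranked by probability and the smallest set whose cumulative probability reaches top_p is kept, then one token is drawn from that set. The helper names `top_p_filter` and `sample_top_p` and the probabilities below are hypothetical and chosen for illustration.

```python
import random


def top_p_filter(probs, top_p):
    """Return indices of the smallest set of tokens whose mass reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for index in order:
        kept.append(index)
        cumulative += probs[index]
        if cumulative >= top_p:
            break
    return kept


def sample_top_p(probs, top_p, rng=random):
    """Draw one token index from the nucleus, weighted by its probability."""
    kept = top_p_filter(probs, top_p)
    weights = [probs[i] for i in kept]  # random.choices renormalizes weights
    return rng.choices(kept, weights=weights, k=1)[0]


probs = [0.5, 0.3, 0.15, 0.05]  # illustrative distribution
print(top_p_filter(probs, 0.9))
print(sample_top_p(probs, 0.9, random.Random(0)))
```

With these probabilities and top_p = 0.9, the least likely token is excluded before sampling.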

repetition_penalty

number

(minimum: 0.01, maximum: 5)

Penalty for repeated words in generated text; 1 is no penalty, values greater than 1 discourage repetition, less than 1 encourage it.

Default: 1
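One common way to apply a repetition penalty, used for example by the Hugging Face Transformers logits processor, is to divide positive logits of already-generated tokens by the penalty and multiply negative ones by it. Whether this deployment applies the penalty in exactly this way is an assumption; the function name `apply_repetition_penalty` is hypothetical.

```python
def apply_repetition_penalty(logits, generated_ids, penalty):
    """Rescale the logits of tokens that already appear in the output."""
    adjusted = list(logits)
    for token_id in set(generated_ids):
        if adjusted[token_id] > 0:
            adjusted[token_id] /= penalty  # penalty > 1 lowers a positive logit
        else:
            adjusted[token_id] *= penalty  # penalty > 1 pushes a negative logit lower
    return adjusted


logits = [2.0, -1.0, 0.5]  # illustrative values only
print(apply_repetition_penalty(logits, generated_ids=[0, 1], penalty=2.0))
print(apply_repetition_penalty(logits, generated_ids=[0, 1], penalty=1.0))
```

A penalty of 1 leaves the logits unchanged, matching the "no penalty" default described above.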

max_new_tokens

integer

(minimum: 1)

Maximum number of new tokens to generate

Default: 3000

max_length

integer

(minimum: 1)

Total length of prompt and output in tokens

Default: 4000
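The ranges and defaults listed above can be checked on the client before a request is sent. The following is a minimal sketch; `INPUT_SPEC` and `build_input` are hypothetical helpers written for this example and are not part of Replicate's client library, which performs its own validation on the server side.

```python
# (type, minimum, maximum, default) for each optional input documented above.
INPUT_SPEC = {
    "num_beams": (int, 1, 10, 3),
    "temperature": (float, 0.01, 2, 1),
    "top_p": (float, 0, 1, 0.9),
    "repetition_penalty": (float, 0.01, 5, 1),
    "max_new_tokens": (int, 1, None, 3000),
    "max_length": (int, 1, None, 4000),
}


def build_input(image, prompt, **overrides):
    """Return an input dict with defaults filled in and ranges checked."""
    unknown = set(overrides) - set(INPUT_SPEC)
    if unknown:
        raise ValueError(f"unknown inputs: {sorted(unknown)}")
    payload = {"image": image, "prompt": prompt}
    for name, (kind, low, high, default) in INPUT_SPEC.items():
        value = overrides.get(name, default)
        allowed = (int,) if kind is int else (int, float)
        if isinstance(value, bool) or not isinstance(value, allowed):
            raise TypeError(f"{name} must be a number of type {kind.__name__}")
        if value < low or (high is not None and value > high):
            raise ValueError(f"{name}={value} is outside the documented range")
        payload[name] = value
    return payload


# The URL below is a placeholder, not a real image.
print(build_input("https://example.com/llama.png", "Describe this image.", num_beams=5))
```

The returned dictionary has the same shape as the `input` argument used in the client examples on this page.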

Run this model in Node.js with one line of code:

npx create-replicate --model=daanelson/minigpt-4

Or set up a project from scratch:

Install Replicate’s Node.js client library:

npm install replicate

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Import and set up the client:

import Replicate from "replicate";

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});

Run daanelson/minigpt-4 using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

const output = await replicate.run(
  "daanelson/minigpt-4:e447a8583cffd86ce3b93f9c2cd24f2eae603d99ace6afa94b33a08e94a3cd06",
  {
    input: {
      image: "https://replicate.delivery/pbxt/IqG1MbemhULihtfr62URRZbI29XtcPsnOYASrTDQ6u5oSqv9/llama_13b.png",
      top_p: 0.9,
      prompt: "This llama's name is Dave. Write me a story about how Dave found his skateboard.",
      num_beams: 5,
      max_length: 4000,
      temperature: 1.32,
      max_new_tokens: 3000,
      repetition_penalty: 1
    }
  }
);

console.log(output);

To learn more, take a look at the guide on getting started with Node.js.

Install Replicate’s Python client library:

pip install replicate

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Import the client:

import replicate

Run daanelson/minigpt-4 using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

output = replicate.run(
    "daanelson/minigpt-4:e447a8583cffd86ce3b93f9c2cd24f2eae603d99ace6afa94b33a08e94a3cd06",
    input={
        "image": "https://replicate.delivery/pbxt/IqG1MbemhULihtfr62URRZbI29XtcPsnOYASrTDQ6u5oSqv9/llama_13b.png",
        "top_p": 0.9,
        "prompt": "This llama's name is Dave. Write me a story about how Dave found his skateboard.",
        "num_beams": 5,
        "max_length": 4000,
        "temperature": 1.32,
        "max_new_tokens": 3000,
        "repetition_penalty": 1
    }
)

print(output)

To learn more, take a look at the guide on getting started with Python.

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Run daanelson/minigpt-4 using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d $'{
    "version": "daanelson/minigpt-4:e447a8583cffd86ce3b93f9c2cd24f2eae603d99ace6afa94b33a08e94a3cd06",
    "input": {
      "image": "https://replicate.delivery/pbxt/IqG1MbemhULihtfr62URRZbI29XtcPsnOYASrTDQ6u5oSqv9/llama_13b.png",
      "top_p": 0.9,
      "prompt": "This llama\'s name is Dave. Write me a story about how Dave found his skateboard.",
      "num_beams": 5,
      "max_length": 4000,
      "temperature": 1.32,
      "max_new_tokens": 3000,
      "repetition_penalty": 1
    }
  }' \
  https://api.replicate.com/v1/predictions

To learn more, take a look at Replicate’s HTTP API reference docs.

You can run this model locally using Cog. First, install Cog:

brew install cog

If you don’t have Homebrew, there are other installation options available.

Run this to download the model and run it in your local environment:

cog predict r8.im/daanelson/minigpt-4@sha256:e447a8583cffd86ce3b93f9c2cd24f2eae603d99ace6afa94b33a08e94a3cd06 \
  -i 'image="https://replicate.delivery/pbxt/IqG1MbemhULihtfr62URRZbI29XtcPsnOYASrTDQ6u5oSqv9/llama_13b.png"' \
  -i 'top_p=0.9' \
  -i $'prompt="This llama\'s name is Dave. Write me a story about how Dave found his skateboard."' \
  -i 'num_beams=5' \
  -i 'max_length=4000' \
  -i 'temperature=1.32' \
  -i 'max_new_tokens=3000' \
  -i 'repetition_penalty=1'

To learn more, take a look at the Cog documentation.

Alternatively, run this to download the model and serve it locally with Docker:

docker run -d -p 5000:5000 --gpus=all r8.im/daanelson/minigpt-4@sha256:e447a8583cffd86ce3b93f9c2cd24f2eae603d99ace6afa94b33a08e94a3cd06
curl -s -X POST \
  -H "Content-Type: application/json" \
  -d $'{
    "input": {
      "image": "https://replicate.delivery/pbxt/IqG1MbemhULihtfr62URRZbI29XtcPsnOYASrTDQ6u5oSqv9/llama_13b.png",
      "top_p": 0.9,
      "prompt": "This llama\'s name is Dave. Write me a story about how Dave found his skateboard.",
      "num_beams": 5,
      "max_length": 4000,
      "temperature": 1.32,
      "max_new_tokens": 3000,
      "repetition_penalty": 1
    }
  }' \
  http://localhost:5000/predictions

To learn more, take a look at the Cog documentation.

Output

Dave the llama was feeling very bored one day. He had been wandering around the city for hours, but there was nothing interesting to do. Suddenly, he saw a skateboard lying on the ground. He decided to try it out, and as soon as he started riding it, he felt a rush of excitement. He rode around the city, enjoying the feeling of the wind in his hair and the freedom of being on his own. As he rode, he saw all sorts of interesting things that he had never noticed before. He even met some new friends along the way. After a while, Dave realized that he had found his true passion - skateboarding. From then on, he spent all his free time riding his skateboard and exploring the city.

Generated in 17.0 seconds


This output was created using a different version of the model, daanelson/minigpt-4:b96a2f33.

Run time and cost

This model costs approximately $0.0057 to run on Replicate, or 175 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.
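The relationship between the per-run cost and the runs-per-dollar figure is simple arithmetic, sketched below. The function names are hypothetical, and the $0.0057 figure is the approximate value quoted above, so real bills will differ with inputs and run time.

```python
import math

COST_PER_RUN_USD = 0.0057  # approximate figure quoted on this page


def runs_per_dollar(cost_per_run=COST_PER_RUN_USD):
    """Whole number of runs that fit in one dollar at the given cost."""
    return math.floor(1 / cost_per_run)


def estimated_cost(runs, cost_per_run=COST_PER_RUN_USD):
    """Approximate total cost in dollars for a number of runs."""
    return round(runs * cost_per_run, 4)


print(runs_per_dollar())     # 175
print(estimated_cost(1000))  # 5.7
```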

This model runs on Nvidia A100 (80GB) GPU hardware. Predictions typically complete within 5 seconds.

Readme

Model description

MiniGPT-4 is a multimodal model that lets users prompt a language model with an image and some text. This enables users to ask questions about images, generate HTML from website mockups, write advertisements for fictional products, and more. It can function as a chatbot with longer back-and-forth conversations, though this implementation is a simple question-and-answer model.

MiniGPT-4 consists of a frozen vision encoder with a pretrained ViT and Q-Former, a single linear projection layer, and a frozen Vicuna large language model. Only the linear layer needs to be trained; it aligns the visual features with Vicuna's input space.
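The data flow described above can be sketched with a toy example in plain Python: a frozen encoder produces visual tokens, a single trainable linear layer maps them to the language model's embedding size, and the result is combined with the text embeddings. Everything here is an assumption made for illustration: the tiny dimensions, the random stand-in for the ViT and Q-Former, the names `frozen_vision_encoder`, `Projection`, and `build_llm_inputs`, and the choice to simply prepend the visual tokens to the text.

```python
import random


def frozen_vision_encoder(image, num_tokens=4, dim=3):
    """Stand-in for the frozen ViT + Q-Former: returns fixed visual tokens."""
    rng = random.Random(0)  # fixed seed: the encoder is never updated
    return [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(num_tokens)]


class Projection:
    """The only trainable part in this sketch: one linear layer."""

    def __init__(self, in_dim, out_dim, seed=1):
        rng = random.Random(seed)
        self.weight = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
                       for _ in range(out_dim)]
        self.bias = [0.0] * out_dim

    def __call__(self, tokens):
        # Apply y = W x + b to every visual token.
        return [
            [sum(w * v for w, v in zip(row, token)) + b
             for row, b in zip(self.weight, self.bias)]
            for token in tokens
        ]


def build_llm_inputs(image, text_embeddings, projection):
    """Project visual tokens into the LLM space and prepend them to the text."""
    visual_tokens = frozen_vision_encoder(image)
    return projection(visual_tokens) + text_embeddings


projection = Projection(in_dim=3, out_dim=5)
text = [[0.0] * 5 for _ in range(2)]  # two placeholder text embeddings
sequence = build_llm_inputs(None, text, projection)
print(len(sequence), len(sequence[0]))
```

Because the encoder and the language model stay frozen, only the weights and bias of the projection would change during training.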

Intended use

MiniGPT-4 is useful for various applications that require image understanding, including:

- Describing an image and its context
- Writing stories about images of characters
- Describing recipes from images of food
- Etc.

Citation

@article{zhu2023minigpt,
  title={MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models},
  author={Zhu, Deyao and Chen, Jun and Shen, Xiaoqian and Li, Xiang and Elhoseiny, Mohamed},
  journal={arXiv preprint arXiv:2304.10592},
  year={2023}
}