nateraw/mixtral-8x7b-32kseqlen – Run with an API on Replicate

nateraw / mixtral-8x7b-32kseqlen

The Mixtral-8x7B Large Language Model (LLM) is a pretrained generative Sparse Mixture of Experts.

Public
15.1K runs
4x A100 (80GB)

Input

Run this model in Node.js with one line of code:

npx create-replicate --model=nateraw/mixtral-8x7b-32kseqlen

Or set up a project from scratch:

Install Replicate’s Node.js client library:

npm install replicate

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Import and set up the client:

import Replicate from "replicate";

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});

Run nateraw/mixtral-8x7b-32kseqlen using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

const output = await replicate.run(
  "nateraw/mixtral-8x7b-32kseqlen:defd13a450b135eab8f24e23f9659c71eed07b8de1b0dde9503a6e8a961d5510",
  {
    input: {
      top_p: 0.9,
      prompt: "Here are the top 10 useful Hindi phrases for your upcoming trip to India:\n\n1. ",
      temperature: 0.6,
      max_new_tokens: 512
    }
  }
);

console.log(output);

To learn more, take a look at the guide on getting started with Node.js.

To run the model from Python instead, install Replicate’s Python client library:

pip install replicate

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Import the client:

import replicate

Run nateraw/mixtral-8x7b-32kseqlen using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

output = replicate.run(
    "nateraw/mixtral-8x7b-32kseqlen:defd13a450b135eab8f24e23f9659c71eed07b8de1b0dde9503a6e8a961d5510",
    input={
        "top_p": 0.9,
        "prompt": "Here are the top 10 useful Hindi phrases for your upcoming trip to India:\n\n1. ",
        "temperature": 0.6,
        "max_new_tokens": 512
    }
)
print(output)

To learn more, take a look at the guide on getting started with Python.
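The Python call above prints whatever `replicate.run` returns. The sample response further down shows the output as a single string, but depending on the client version and the model's output schema, a language model's output may also arrive as a sequence of string chunks (an assumption; check the model's schema). A minimal sketch of a helper that handles both shapes follows; `collect_output` is a hypothetical name, not part of the `replicate` library, and the chunk list stands in for a real response.

```python
from collections.abc import Iterable


def collect_output(output) -> str:
    """Return the model output as one string.

    Assumption: the client hands back either a single string or an
    iterable of string chunks; both shapes are handled here.
    """
    if isinstance(output, str):
        return output
    if isinstance(output, Iterable):
        return "".join(str(chunk) for chunk in output)
    return str(output)


# The prompt ends with "1. ", so the completion continues the numbered list.
prompt = "Here are the top 10 useful Hindi phrases for your upcoming trip to India:\n\n1. "

# Stand-in for a chunked response; a real call would pass replicate.run(...)'s result.
completion = collect_output(["नमस्ते ", "(Namaste)"])
full_text = prompt + completion
print(full_text)
```

Because the prompt already contains the "1. " prefix, concatenating the prompt and the completion reconstructs the full numbered list.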

To call the HTTP API directly with cURL, set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Run nateraw/mixtral-8x7b-32kseqlen using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d $'{
    "version": "nateraw/mixtral-8x7b-32kseqlen:defd13a450b135eab8f24e23f9659c71eed07b8de1b0dde9503a6e8a961d5510",
    "input": {
      "top_p": 0.9,
      "prompt": "Here are the top 10 useful Hindi phrases for your upcoming trip to India:\\n\\n1. ",
      "temperature": 0.6,
      "max_new_tokens": 512
    }
  }' \
  https://api.replicate.com/v1/predictions

To learn more, take a look at Replicate’s HTTP API reference docs.
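The cURL example relies on shell quoting (`$'...'` with `\\n`) to get newline escapes into the JSON body. As a rough sketch, the same request can be assembled with Python's standard library, where `json.dumps` handles the escaping. The URL, headers, version string and input values are copied from the cURL command; the function name `build_prediction_request` is hypothetical, and the request is only built here, not sent.

```python
import json
import os
import urllib.request

API_URL = "https://api.replicate.com/v1/predictions"
VERSION = (
    "nateraw/mixtral-8x7b-32kseqlen:"
    "defd13a450b135eab8f24e23f9659c71eed07b8de1b0dde9503a6e8a961d5510"
)


def build_prediction_request(token: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) the POST request from the cURL example."""
    body = {
        "version": VERSION,
        "input": {
            "top_p": 0.9,
            "prompt": prompt,
            "temperature": 0.6,
            "max_new_tokens": 512,
        },
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),  # newlines become \n in the JSON text
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Prefer": "wait",
        },
        method="POST",
    )


request = build_prediction_request(
    os.environ.get("REPLICATE_API_TOKEN", "<paste-your-token-here>"),
    "Here are the top 10 useful Hindi phrases for your upcoming trip to India:\n\n1. ",
)
print(request.get_method(), request.full_url)
# Sending it would be urllib.request.urlopen(request), which needs a valid token.
```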

Output

नमस्ते (Namaste): This is the most common greeting in India and is used to say “hello” or “goodbye.” It is a respectful way to greet someone and can be used with anyone, regardless of age or gender.
2. धन्यवाद (Dhanyavaad): This is the Hindi word for “thank you” and is used to express gratitude. It is important to use this phrase when someone does something for you or when you receive a gift or favor.
3. क्या हाल है? (Kya Haal Hai?): This phrase means “how are you?” and is used to ask someone about their well-being. It is a polite way to start a conversation and can be used with anyone, regardless of age or gender.
4. मैं बहुत अच्छा हूँ (Main Bahaaut Acha Hoon): This phrase means “I am very good” and is used as a response to the question “how are you?” It is a polite way to let someone know that you are doing well.
5. क्या आप हिंदी बोलते हैं? (Kya Aap Hindi Bolte Hain?): This phrase means “do you speak Hindi?” and is used to ask someone if they speak the language. It is a polite way to start a conversation and can be used with anyone, regardless of age or gender.
6. क्या आप इंग्लिश बोलते हैं? (Kya Aap English Bolte Hain?): This phrase means “do you speak English?” and is used to ask someone if they speak the language. It is a polite way to start a conversation and can be used with anyone, regardless of age or gender.
7. क्या आप कैसे पहुँचे? (Kya Aap Kaise Pahunche?): This phrase means “how did you get here?” and is used to ask someone how they arrived at a particular location. It is a polite way to start a conversation and can be used with anyone, regardless of age or gender.

{
  "completed_at": "2023-12-09T04:21:23.099483Z",
  "created_at": "2023-12-09T04:18:24.555008Z",
  "data_removed": false,
  "error": null,
  "id": "7g7hs33buoiqp2rdoszwzy5yse",
  "input": {
    "top_p": 0.9,
    "prompt": "Here are the top 10 useful Hindi phrases for your upcoming trip to India:\n\n1. ",
    "temperature": 0.6,
    "max_new_tokens": 512
  },
  "logs": "generation took 38.182s",
  "metrics": {
    "predict_time": 38.18934,
    "total_time": 178.544475
  },
  "output": "नमस्ते (Namaste): This is the most common greeting in India and is used to say “hello” or “goodbye.” It is a respectful way to greet someone and can be used with anyone, regardless of age or gender.\n2. धन्यवाद (Dhanyavaad): This is the Hindi word for “thank you” and is used to express gratitude. It is important to use this phrase when someone does something for you or when you receive a gift or favor.\n3. क्या हाल है? (Kya Haal Hai?): This phrase means “how are you?” and is used to ask someone about their well-being. It is a polite way to start a conversation and can be used with anyone, regardless of age or gender.\n4. मैं बहुत अच्छा हूँ (Main Bahaaut Acha Hoon): This phrase means “I am very good” and is used as a response to the question “how are you?” It is a polite way to let someone know that you are doing well.\n5. क्या आप हिंदी बोलते हैं? (Kya Aap Hindi Bolte Hain?): This phrase means “do you speak Hindi?” and is used to ask someone if they speak the language. It is a polite way to start a conversation and can be used with anyone, regardless of age or gender.\n6. क्या आप इंग्लिश बोलते हैं? (Kya Aap English Bolte Hain?): This phrase means “do you speak English?” and is used to ask someone if they speak the language. It is a polite way to start a conversation and can be used with anyone, regardless of age or gender.\n7. क्या आप कैसे पहुँचे? (Kya Aap Kaise Pahunche?): This phrase means “how did you get here?” and is used to ask someone how they arrived at a particular location. It is a polite way to start a conversation and can be used with anyone, regardless of age or gender.\n",
  "started_at": "2023-12-09T04:20:44.910143Z",
  "status": "succeeded",
  "urls": {
    "get": "https://api.replicate.com/v1/predictions/7g7hs33buoiqp2rdoszwzy5yse",
    "cancel": "https://api.replicate.com/v1/predictions/7g7hs33buoiqp2rdoszwzy5yse/cancel"
  },
  "version": "f8125aef9cd96d879f4e5c5c1ff78618818e62939ab76ab1e07425ac75d453bc"
}
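The timestamps in the response above can be checked against its `metrics` block. The sketch below parses `created_at`, `started_at` and `completed_at` as copied from the response and derives three intervals; the variable names are illustrative. The wait between creation and start is about 140 seconds, which accounts for most of `total_time`. The response does not say whether that wait was queueing or model boot time.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def parse(timestamp: str) -> datetime:
    """Parse a UTC timestamp in the format used by the response."""
    return datetime.strptime(timestamp, TIMESTAMP_FORMAT)


created = parse("2023-12-09T04:18:24.555008Z")
started = parse("2023-12-09T04:20:44.910143Z")
completed = parse("2023-12-09T04:21:23.099483Z")

wait_seconds = (started - created).total_seconds()       # time before the run started
predict_seconds = (completed - started).total_seconds()  # matches metrics.predict_time
total_seconds = (completed - created).total_seconds()    # matches metrics.total_time

print(f"wait:    {wait_seconds:.3f} s")     # wait:    140.355 s
print(f"predict: {predict_seconds:.3f} s")  # predict: 38.189 s
print(f"total:   {total_seconds:.3f} s")    # total:   178.544 s
```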

Generated in 38.2 seconds

This output was created using a different version of the model, nateraw/mixtral-8x7b-32kseqlen:f8125aef.

Run time and cost

This model costs approximately $0.26 to run on Replicate, or 3 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.

This model runs on 4x Nvidia A100 (80GB) GPU hardware. Predictions typically complete within 46 seconds. The predict time for this model varies significantly based on the inputs.
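The cost figures above can be related with a little arithmetic. This is a rough sketch under two assumptions the page does not state: that the $0.26 estimate corresponds to the typical 46-second prediction, and that cost scales linearly with predict time. The helper `estimate_cost` is hypothetical, and the resulting per-second rate is only implied by these figures, not a published price.

```python
import math

COST_PER_RUN_USD = 0.26       # approximate cost per run, from the page
TYPICAL_PREDICT_SECONDS = 46  # typical completion time, from the page

# Whole runs that fit in one dollar: 1 / 0.26 is about 3.85, so 3 complete runs.
runs_per_dollar = math.floor(1 / COST_PER_RUN_USD)

# Implied per-second rate, ASSUMING the cost estimate matches the typical run.
implied_rate_usd_per_second = COST_PER_RUN_USD / TYPICAL_PREDICT_SECONDS


def estimate_cost(predict_seconds: float) -> float:
    """Estimate the cost of a run, assuming cost is linear in predict time."""
    return predict_seconds * implied_rate_usd_per_second


print(runs_per_dollar)
print(round(implied_rate_usd_per_second, 5))
# Example: the 38.2-second prediction shown in the Output section.
print(round(estimate_cost(38.18934), 3))
```

Under these assumptions, a shorter prediction such as the 38-second example would cost somewhat less than the quoted $0.26, which is consistent with the page's note that cost varies with the inputs.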

Readme

The Mixtral-8x7B Large Language Model (LLM) is a pretrained generative Sparse Mixture of Experts. The Mixtral-8x7B outperforms Llama 2 70B on most benchmarks we tested.

For full details of this model, please read our release blog post or the model card on Hugging Face.