Llama 3.1 is the latest language model from Meta. Its largest version is a massive 405 billion parameter model that rivals GPT-4 in quality, with a context window of 128,000 tokens.
With Replicate, you can run Llama 3.1 in the cloud with one line of code.
Before you dive in, try Llama 3.1 in our API playground.
Try tweaking the prompt and see how Llama 3.1 responds. Most models on Replicate have an interactive API playground like this, available on the model page: https://replicate.com/meta/meta-llama-3.1-405b-instruct
The API playground is a great way to get a feel for what a model can do, and provides copyable code snippets in a variety of languages to help you get started.
You can run Llama 3.1 with our official JavaScript client:
Install Replicate's Node.js client library
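For example, using npm (yarn or pnpm work too):

```shell
npm install replicate
```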
Set the REPLICATE_API_TOKEN environment variable
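For example, in a shell session (the token below is a placeholder; use your own):

```shell
export REPLICATE_API_TOKEN=r8_xxxxxxxxxxxxxxxxxxxxxxxx
```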
(You can generate an API token in your account. Keep it to yourself.)
Import and set up the client
Run meta/meta-llama-3.1-405b-instruct using Replicate's API. Check out the model's schema for an overview of inputs and outputs.
To learn more, take a look at the guide on getting started with Node.js.
You can run Llama 3.1 with our official Python client:
Install Replicate's Python client library
Set the REPLICATE_API_TOKEN environment variable
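For example (the token below is a placeholder; use your own):

```shell
export REPLICATE_API_TOKEN=r8_xxxxxxxxxxxxxxxxxxxxxxxx
```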
(You can generate an API token in your account. Keep it to yourself.)
Import the client
Run meta/meta-llama-3.1-405b-instruct using Replicate's API. Check out the model's schema for an overview of inputs and outputs.
To learn more, take a look at the guide on getting started with Python.
You can call the HTTP API directly with tools like cURL:
Set the REPLICATE_API_TOKEN environment variable
(You can generate an API token in your account. Keep it to yourself.)
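For example (the token below is a placeholder; use your own):

```shell
export REPLICATE_API_TOKEN=r8_xxxxxxxxxxxxxxxxxxxxxxxx
```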
Run meta/meta-llama-3.1-405b-instruct using Replicate's API. Check out the model's schema for an overview of inputs and outputs.
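A sketch of the request, with a hypothetical prompt:

```shell
curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"input": {"prompt": "Write a haiku about llamas", "max_tokens": 1024}}' \
  https://api.replicate.com/v1/models/meta/meta-llama-3.1-405b-instruct/predictions
```

The response is a JSON prediction object; if it is still running, poll the URL in its `urls.get` field until the `status` is `succeeded`.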
To learn more, take a look at Replicate's HTTP API reference docs.
You can also run Llama using other Replicate client libraries for Go, Swift, and others.
Llama 3.1 405B is currently the only variant available on Replicate. This model represents the cutting edge of open-source language models.
Llama 3.1 comes with a strong focus on responsible AI development. Meta has introduced several tools and resources to help developers use the model safely and ethically.
We recommend reviewing these resources when building applications with Llama 3.1. For more information, check out the Purple Llama GitHub repository.
If you want a place to start, we've built a demo chat app in Next.js that can be deployed on Vercel.
Try it out on llama3.replicate.dev. Take a look at the GitHub README to learn how to customize and deploy it.
Happy hacking! 🦙