Llama 3 is the latest language model from Meta. It has state-of-the-art performance and an 8,000-token context window, double that of Llama 2.
With Replicate, you can run Llama 3 in the cloud with one line of code.
Before you dive in, try Llama 3 in our API playground.
Try tweaking the prompt and see how Llama 3 responds. Most models on Replicate have an interactive API playground like this, available on the model page: https://replicate.com/meta/meta-llama-3-70b-instruct
The API playground is a great way to get a feel for what a model can do, and provides copyable code snippets in a variety of languages to help you get started.
You can run Llama 3 with our official JavaScript client:
Install Replicate's Node.js client library
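For example, with npm:

```bash
npm install replicate
```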
Set the REPLICATE_API_TOKEN environment variable
(You can generate an API token in your account. Keep it to yourself.)
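In your shell, that looks something like this (substitute your own token):

```bash
export REPLICATE_API_TOKEN=<your-api-token>
```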
Import and set up the client
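Pass your token to the client when you construct it:

```javascript
import Replicate from "replicate";

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});
```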
Run meta/meta-llama-3-70b-instruct using Replicate's API. Check out the model's schema for an overview of inputs and outputs.
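Here's a minimal sketch that streams the model's output as it's generated; the prompt is just an example:

```javascript
const input = {
  prompt: "Can you write a poem about open source machine learning?",
};

// Stream tokens back as the model generates them.
for await (const event of replicate.stream("meta/meta-llama-3-70b-instruct", { input })) {
  process.stdout.write(`${event}`);
}
```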
To learn more, take a look at the guide on getting started with Node.js.
You can run Llama 3 with our official Python client:
Install Replicate's Python client library
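For example, with pip:

```bash
pip install replicate
```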
Set the REPLICATE_API_TOKEN environment variable
(You can generate an API token in your account. Keep it to yourself.)
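Same as before:

```bash
export REPLICATE_API_TOKEN=<your-api-token>
```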
Import the client
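The Python client picks up the token from the REPLICATE_API_TOKEN environment variable automatically:

```python
import replicate
```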
Run meta/meta-llama-3-70b-instruct using Replicate's API. Check out the model's schema for an overview of inputs and outputs.
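A minimal sketch; the prompt is just an example:

```python
# The model returns its output as a list of strings;
# join them to get the full response.
output = replicate.run(
    "meta/meta-llama-3-70b-instruct",
    input={"prompt": "Can you write a poem about open source machine learning?"},
)
print("".join(output))
```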
To learn more, take a look at the guide on getting started with Python.
You can call the HTTP API directly with tools like cURL:
Set the REPLICATE_API_TOKEN environment variable
(You can generate an API token in your account. Keep it to yourself.)
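As before:

```bash
export REPLICATE_API_TOKEN=<your-api-token>
```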
Run meta/meta-llama-3-70b-instruct using Replicate's API. Check out the model's schema for an overview of inputs and outputs.
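Here's a sketch that creates a prediction on the model's predictions endpoint; the prompt is just an example:

```bash
curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"input": {"prompt": "Can you write a poem about open source machine learning?"}}' \
  https://api.replicate.com/v1/models/meta/meta-llama-3-70b-instruct/predictions
```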
To learn more, take a look at Replicate's HTTP API reference docs.
You can also run Llama 3 using Replicate's other client libraries for Go, Swift, and more.
There are four Llama 3 models on Replicate, each with its own strengths. Llama 3 comes in two parameter sizes, 8 billion and 70 billion, and each size is available as both a base model and an instruction-tuned chat model.
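At the time of writing, those are:

- meta/meta-llama-3-8b
- meta/meta-llama-3-70b
- meta/meta-llama-3-8b-instruct
- meta/meta-llama-3-70b-instruct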
If you want a place to start, we've built a demo chat app in Next.js that can be deployed on Vercel:
Try it out on llama3.replicate.dev. Take a look at the GitHub README to learn how to customize and deploy it.
Happy hacking! 🦙