Llama 2 is the first open-source language model of the same caliber as OpenAI’s models, and because it’s open source you can hack it to do new things that aren’t possible with GPT-4.
Like become a better poet. Talk like Homer Simpson. Write Midjourney prompts. Or replace your best friends.
One of the main reasons to fine-tune models is so you can use a small model to do a task that would normally require a large model. This means you can do the same task, but cheaper and faster. For example, the 7 billion parameter Llama 2 model isn't good at summarizing text out of the box, but we can teach it how.
In this guide, we'll show you how to create a text summarizer. We'll be using Llama 2 7B, an open-source large language model from Meta, and fine-tuning it on a dataset of messenger-like conversations paired with summaries. When we're done, you'll be able to distill chat transcripts, emails, webpages, and other documents into a brief summary. Short and sweet.
Here are the Llama models on Replicate that you can fine-tune:
If your model will be responding to instructions from users, use one of the chat models. If you're just completing text, use one of the base models.
Your training data should be in a JSONL text file.
In this guide, we’ll be using the SAMSum dataset, transformed into JSONL.
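Each line of the JSONL file is a single JSON object holding one training example. A SAMSum conversation might look something like this once transformed (the prompt and completion field names here are an assumption for illustration — check the Llama 2 trainer's documentation for the exact fields it expects):

```json
{"prompt": "Summarize the following conversation.\n\nAmanda: I baked cookies. Do you want some?\nJerry: Sure!\nAmanda: I'll bring you some tomorrow :-)\n\nSummary:", "completion": "Amanda baked cookies and will bring Jerry some tomorrow."}
```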
You need to create an empty model on Replicate for your trained model. When your training finishes, it will be pushed as a new version to this model.
Go to replicate.com/create and create a new model called “llama2-summarizer”.
Authenticate by setting your token in an environment variable:
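```shell
export REPLICATE_API_TOKEN=<paste-your-token-here>
```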
Find your API token in your account settings.
Install the Python library:
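```shell
pip install replicate
```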
And kick off training, replacing the destination name with your username and the name of your new model.
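With the Replicate Python client, the call looks roughly like this (the model version and the training parameters such as train_data and num_train_epochs are placeholders — copy the current version hash and the exact parameter names from the Llama 2 model's "Train" tab):

```python
import replicate

training = replicate.trainings.create(
    # The base model to fine-tune; replace <version-id> with the current
    # version hash from the Llama 2 7B model page on Replicate.
    version="meta/llama-2-7b:<version-id>",
    # Training inputs are defined by the model: train_data should point to
    # your JSONL file, and the other parameters are listed in the "Train" tab.
    input={
        "train_data": "https://your-storage.example.com/samsum.jsonl",
        "num_train_epochs": 3,
    },
    # The empty model you created earlier; the trained weights will be
    # pushed to it as a new version.
    destination="your-username/llama2-summarizer",
)

print(training.id, training.status)
```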
It takes these arguments:
version: The model to train, in the format {username}/{model}:{version}.
input: The training data and params to pass to the training process, which are defined by the model. Llama 2's params can be found in the model's "Train" tab.
destination: The model to push the trained version to, in the format your-username/your-model-name.
Once you've kicked off your training, visit replicate.com/trainings in your browser to monitor the progress.
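If you'd rather check on it from code, a small sketch using the Python client's trainings.get (reusing the training object returned by the create call above) could look like this:

```python
import time
import replicate

# Poll until the training reaches a terminal state
while training.status not in ("succeeded", "failed", "canceled"):
    time.sleep(60)
    training = replicate.trainings.get(training.id)

print(training.status)
```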
You can now run your model from the web or with an API. To use your model in the browser, go to your model page.
To use your model with an API, run the version from the training output.
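With the Python client, that looks roughly like this (the model name and version hash are placeholders — use the ones from your own training output — and the prompt should follow the same format as your training data):

```python
import replicate

output = replicate.run(
    # Replace with the destination model and version hash from your training output
    "your-username/llama2-summarizer:<version-id>",
    input={
        "prompt": "Summarize the following conversation.\n\n"
                  "Amanda: I baked cookies. Do you want some?\n"
                  "Jerry: Sure!\n"
                  "Amanda: I'll bring you some tomorrow :-)\n\n"
                  "Summary:"
    },
)

# Language models on Replicate stream their output, so print tokens as they arrive
for token in output:
    print(token, end="")
```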
That's it! You've fine-tuned Llama 2 and can run your new model with an API.
Happy hacking! 🦙