Train and run Stanford Alpaca on your own machine

Posted by @zeke

LLaMA is a new language model from Meta Research that rivals much larger closed-source models like GPT-3. Similar to Stable Diffusion, there’s been a ton of experimentation and innovation since the model was publicly released. As Simon Willison articulated, LLaMA is easy to run on your own hardware, large enough to be useful, and open enough to be tinkered with.

LLaMA is powerful, but it was not built for answering questions. It functions more like a fancy version of autocomplete than a conversational bot. This is where Stanford’s Alpaca comes in. Alpaca is a fine-tuned version of LLaMA that can respond to instructions like ChatGPT. And, like LLaMA, it’s open-source.

The problem is, the weights for Alpaca have not been released, so you can’t tinker with it. We do have all the component parts we need to replicate it though: the LLaMA weights, the training data, and the training script.

In this post we’ll show you how to train Alpaca so you can tinker with it on your own machine.

Note: LLaMA and anything built on LLaMA is for research purposes only. You can’t build anything commercial with it.

Prerequisites

  • LLaMA weights. They are only available for research use. To apply for access, fill out this Meta Research form.
  • GPU machine. You’ll need a Linux machine with one or more 80GB A100 GPUs. It’ll be faster if you get a machine with more GPUs – we used four. We’ve had success with Google Cloud. You can follow our instructions here.
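
Once you’ve provisioned the machine, it’s worth confirming your GPUs are visible before going any further. nvidia-smi ships with the NVIDIA drivers:

nvidia-smi --query-gpu=name,memory.total --format=csv

Each of your A100s should report roughly 80GB of memory.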

Step 1: Clone the Alpaca repository

We’ve created a fork of the Alpaca repository that adds a Cog file that’ll set up all the dependencies for you.

Log into your GPU instance via SSH. Clone the repository by running:

git clone https://github.com/replicate/cog_stanford_alpaca
cd cog_stanford_alpaca
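
If the cog command isn’t installed on your machine yet, the install commands from the Cog README download the latest release binary (Cog also needs Docker, which many GPU cloud images already include):

sudo curl -o /usr/local/bin/cog -L "https://github.com/replicate/cog/releases/latest/download/cog_$(uname -s)_$(uname -m)"
sudo chmod +x /usr/local/bin/cog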

Step 2: Convert the LLaMA weights

If you haven’t already, download the LLaMA weights (see the prerequisites above for how to request access from Meta Research).

Put your downloaded weights in a folder called unconverted-weights. The folder hierarchy should look something like this:

unconverted-weights
├── 7B
│   ├── checklist.chk
│   ├── consolidated.00.pth
│   └── params.json
├── tokenizer.model
└── tokenizer_checklist.chk
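
The download from Meta includes MD5 checklists, so you can verify that the files weren’t corrupted in transit. A quick check, assuming the standard md5sum tool is available:

cd unconverted-weights
md5sum -c tokenizer_checklist.chk
(cd 7B && md5sum -c checklist.chk)
cd ..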

Convert the weights from the original PyTorch checkpoint to the Hugging Face Transformers format using this command:

cog run python -m transformers.models.llama.convert_llama_weights_to_hf \
  --input_dir unconverted-weights \
  --model_size 7B \
  --output_dir weights

Your final directory structure should look like this:

weights
├── llama-7b
└── tokenizer
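
As a rough sanity check, the converted 7B model should be around 13GB of 16-bit weights:

du -sh weights/llama-7b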

Step 3: Train the model

Kick off the training:

cog run ./train_model.sh

This will take about an hour and a half on four A100s, so you might want to go and do some programming while your model is programming itself.
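
If your machine has a different number of GPUs than four, you may need to adjust train_model.sh. The upstream Alpaca repository launches training with torchrun, where --nproc_per_node sets the GPU count. Here’s an abbreviated sketch of that invocation — the port is arbitrary, the paths here assume the directories above, and the script has the full argument list:

torchrun --nproc_per_node=4 --master_port=29500 train.py \
  --model_name_or_path ./weights/llama-7b \
  --data_path ./alpaca_data.json \
  --bf16 True \
  --output_dir ./output \
  --num_train_epochs 3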

Step 4: Run the model

When that’s finished, you can run Alpaca:

$ cog predict -i prompt="Tell me something about alpacas."

Alpacas are a species of South American camelid and are closely related to llamas. They are smaller than llamas and have a finer fleece, which is used to make clothing and other crafts. Alpacas are social animals that live in herds and can come in two colors: white and brown. They are very easy to take care of and require minimal grooming.

Next steps

Here are some ideas for what you could do next:

  • Fine-tune the model or constrain the decoder to create a model for a particular task.
  • Experiment with different interfaces for interacting with the model. Where could you talk to it?
  • Push the model to Replicate to run it in the cloud (see the sketch below). This is handy if you want an API to build interfaces, or to run large-scale evaluation in parallel. You’ll need to keep the model private so the weights aren’t public.
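
For that last idea, pushing with Cog looks roughly like this, once you’ve created a private model on Replicate (substitute your own username and model name):

cog login
cog push r8.im/<your-username>/<your-model-name>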

Just remember that you can only use Alpaca for non-commercial research. Eventually, we expect models like this will be released with more permissive licenses that will allow them to be used for all sorts of things — chat bots, coding assistants, and so on.

Open-source language models are just getting started, and we can’t wait to see what you build.

We’re going to be posting more guides to hacking on open-source language models. Follow us on Twitter for updates.