replicate / gpt-j-6b

A large language model by EleutherAI

Run time and cost

This model costs approximately $0.031 per run on Replicate, or about 32 runs per $1, though the exact cost varies with your inputs. It is also open source, so you can run it on your own computer with Docker.
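
Running locally works through Cog's standard HTTP interface: once the container is up, you send predictions to it over HTTP. Below is a minimal sketch that assumes the container is already running on port 5000 and that the model takes a prompt input; the image reference in the comment and the input field name are assumptions, so check the model's schema before relying on them.

import requests

# Assumes the container has been started locally, e.g. with something like:
#   docker run -d -p 5000:5000 r8.im/replicate/gpt-j-6b@sha256:<version>
# (the image reference is illustrative). Cog containers expose a /predictions
# endpoint that accepts the model's inputs as JSON and returns the output.
resp = requests.post(
    "http://localhost:5000/predictions",
    json={"input": {"prompt": "The theory of relativity states that"}},
)
resp.raise_for_status()
print(resp.json()["output"])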

This model runs on Nvidia A100 (80GB) GPU hardware. Predictions typically complete within 23 seconds.
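
To see what a single run looks like against the hosted model, here is a minimal sketch using replicate-python; the prompt input name is an assumption, so check the model's input schema for the exact parameters.

import replicate

# Run one prediction against the hosted model on Replicate. The version hash
# matches the fine-tuning example below; the input field name is an assumption.
output = replicate.run(
    "replicate/gpt-j-6b:b3546aeec6c9891f0dd9929c2d3bedbf013c12e02e7dd0346af09c37e008c827",
    input={"prompt": "The theory of relativity states that"},
)
print(output)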

Readme

GPT-J-6B

GPT-J-6B is a 6 billion parameter language model by EleutherAI.

Official page: https://huggingface.co/EleutherAI/gpt-j-6b

Fine-tuning

If you have access to the training beta, you can fine-tune this model.

Here’s an example using replicate-python:

import replicate

# Start a fine-tuning job. The resulting model is pushed to `destination`,
# which must be a model you have created under your Replicate account.
training = replicate.trainings.create(
    version="replicate/gpt-j-6b:b3546aeec6c9891f0dd9929c2d3bedbf013c12e02e7dd0346af09c37e008c827",
    input={
        "train_data": "https://storage.googleapis.com/dan-scratch-public/fine-tuning/70k_samples.jsonl",
    },
    destination="my-username/my-model"
)
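
Trainings run asynchronously, so the call above returns immediately. Here is a minimal sketch of polling for completion with the same client (the status values are the Replicate API's terminal states):

import time

# Poll until the training reaches a terminal state.
while training.status not in ("succeeded", "failed", "canceled"):
    time.sleep(30)
    training = replicate.trainings.get(training.id)

print(training.status)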

Training takes these input parameters:

  • train_data (required): URL to a JSONL file where each line has the format {"prompt": ..., "completion": ...} (see the sample line after this list)
  • epochs (optional, default=1): Number of times to iterate over the training dataset
  • max_steps (optional, default=-1): Maximum number of training steps. Unlimited if max_steps=-1
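
Putting these together, here is a hedged sketch of a training call that sets the optional parameters, preceded by an example of what one line of train_data might look like (the prompt/completion text and parameter values are illustrative, not from the model's documentation):

# One line of the JSONL training file might look like this (illustrative):
# {"prompt": "Q: Who created GPT-J-6B?\nA:", "completion": " EleutherAI."}

import replicate

training = replicate.trainings.create(
    version="replicate/gpt-j-6b:b3546aeec6c9891f0dd9929c2d3bedbf013c12e02e7dd0346af09c37e008c827",
    input={
        "train_data": "https://storage.googleapis.com/dan-scratch-public/fine-tuning/70k_samples.jsonl",
        "epochs": 3,        # iterate over the dataset three times
        "max_steps": 1000,  # stop after at most 1,000 training steps
    },
    destination="my-username/my-model"
)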