replicate / gpt-j-6b

A large language model by EleutherAI

  • Public
  • 9.6K runs

Run time and cost

This model costs approximately $0.0019 per run on Replicate (about 526 runs per $1), though the cost varies with your inputs. The model is also open source, so you can run it on your own computer with Docker.
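To make the pricing arithmetic concrete, here is a minimal sketch that converts the quoted per-run cost into runs per dollar and a rough batch estimate. The function names are hypothetical, and the per-run figure is only the approximate value stated above; actual cost depends on your inputs.

```python
import math

# Approximate per-run cost quoted on this page (an estimate, not a fixed price).
COST_PER_RUN_USD = 0.0019


def runs_per_dollar(cost_per_run: float = COST_PER_RUN_USD) -> int:
    """Whole number of runs that fit in one dollar at the given per-run cost."""
    return math.floor(1 / cost_per_run)


def estimated_cost(num_runs: int, cost_per_run: float = COST_PER_RUN_USD) -> float:
    """Rough total cost in USD for num_runs predictions."""
    return num_runs * cost_per_run


if __name__ == "__main__":
    print(runs_per_dollar())                 # 526
    print(f"${estimated_cost(1000):.2f}")    # $1.90
```

Treat the result as a ballpark figure only, since prediction time (and therefore cost) varies with prompt and output length.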

This model runs on Nvidia A100 (80GB) GPU hardware. Predictions typically complete within 2 seconds, but prediction time varies significantly with the inputs.

Readme

GPT-J-6B

GPT-J-6B is a 6 billion parameter language model by EleutherAI.

Official page: https://huggingface.co/EleutherAI/gpt-j-6b

Fine-tuning

If you have access to the training beta, you can fine-tune this model.

Here’s an example using replicate-python:

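import replicate

# Start a fine-tuning job from the base GPT-J-6B version.
# `destination` names the model that will receive the fine-tuned version.
training = replicate.trainings.create(
    version="replicate/gpt-j-6b:b3546aeec6c9891f0dd9929c2d3bedbf013c12e02e7dd0346af09c37e008c827",
    input={
        "train_data": "https://storage.googleapis.com/dan-scratch-public/fine-tuning/70k_samples.jsonl",
    },
    destination="my-username/my-model",
)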

Training takes these input parameters:

  • train_data (required): URL of a JSONL file in which each line is a JSON object of the form {"prompt": ..., "completion": ...}
  • epochs (optional, default=1): Number of times to iterate over the training dataset
  • max_steps (optional, default=-1): Maximum number of training steps. Unlimited if max_steps=-1
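As an illustration of the train_data format described above, the following sketch writes prompt/completion pairs to a JSONL file and checks that every line has both keys. It uses only the Python standard library; the helper names, the sample pairs, and the choice to skip blank lines are assumptions made for this example, not part of Replicate's tooling.

```python
import json
from pathlib import Path


def write_training_jsonl(pairs, path):
    """Write (prompt, completion) pairs as one JSON object per line."""
    path = Path(path)
    with path.open("w", encoding="utf-8") as f:
        for prompt, completion in pairs:
            record = {"prompt": prompt, "completion": completion}
            f.write(json.dumps(record) + "\n")
    return path


def validate_training_jsonl(path):
    """Return the number of records; raise ValueError on a malformed line."""
    count = 0
    with Path(path).open(encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            if not line.strip():
                continue  # assumption: blank lines are ignored
            record = json.loads(line)  # JSONDecodeError is a ValueError subclass
            if not isinstance(record, dict) or not {"prompt", "completion"} <= record.keys():
                raise ValueError(f"line {line_no}: expected 'prompt' and 'completion' keys")
            count += 1
    return count


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    samples = [
        ("Translate to French: cheese", "fromage"),
        ("Translate to French: bread", "pain"),
    ]
    out = write_training_jsonl(samples, "train.jsonl")
    print(validate_training_jsonl(out))  # 2
```

Once the file is uploaded somewhere publicly reachable, its URL is what goes in the train_data input; epochs and max_steps can be added to the same input dictionary if you want to override their defaults.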