replicate / gpt-j-6b

A large language model by EleutherAI

Run time and cost

This model runs on Nvidia A100 (40GB) GPU hardware. Predictions typically complete within 12 seconds, though prediction time varies significantly with the inputs.
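
For context, here is a minimal sketch of running a prediction with replicate-python. The "prompt" input name is an assumption based on typical text-generation models, not taken from this page, and depending on your client version you may need to pin a specific version hash:

import replicate

# Run a single prediction. The "prompt" input name is an assumption;
# check the model's schema for the exact input parameters.
output = replicate.run(
    "replicate/gpt-j-6b",
    input={"prompt": "In a shocking finding, scientists discovered"},
)
print(output)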

GPT-J-6B

GPT-J-6B is a 6-billion-parameter language model by EleutherAI.

Official page: https://huggingface.co/EleutherAI/gpt-j-6b

Fine-tuning

If you have access to the training beta, you can fine-tune this model.

Here’s an example using replicate-python:

import replicate

# Kick off a fine-tuning job. "destination" is the model that will own
# the resulting fine-tuned version.
training = replicate.trainings.create(
    version="replicate/gpt-j-6b:b3546aeec6c9891f0dd9929c2d3bedbf013c12e02e7dd0346af09c37e008c827",
    input={
        "train_data": "https://storage.googleapis.com/dan-scratch-public/fine-tuning/70k_samples.jsonl",
    },
    destination="my-username/my-model"
)
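
Trainings run asynchronously, so the call above returns immediately. As a sketch (assuming the standard replicate-python Training object, and continuing from the snippet above), you can poll for completion like this:

import time

# Poll until the training reaches a terminal state. Status values
# include "starting", "processing", "succeeded", "failed", and "canceled".
while training.status not in {"succeeded", "failed", "canceled"}:
    time.sleep(30)
    training = replicate.trainings.get(training.id)

print(training.status)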

Training takes these input parameters:

  • train_data (required): URL to a JSONL file where each line has the format {"prompt": ..., "completion": ...} (see the sketch after this list)
  • epochs (optional, default=1): Number of times to iterate over the training dataset
  • max_steps (optional, default=-1): Maximum number of training steps. Unlimited if max_steps=-1
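
For reference, here is a sketch of how a train_data file might be produced. The example rows are illustrative, not taken from the linked dataset:

import json

# Two illustrative training examples; real data would follow the same shape.
examples = [
    {"prompt": "Q: What is the capital of France?\nA:", "completion": " Paris"},
    {"prompt": "Q: What is 2 + 2?\nA:", "completion": " 4"},
]

# Write one JSON object per line, as the trainer expects.
with open("train_data.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")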