replicate / gpt-j-6b

A large language model by EleutherAI

  • Public
  • 8.1K runs
  • GitHub
  • License


Run time and cost

This model runs on Nvidia A100 (40GB) GPU hardware. Predictions typically complete within 1 second.

Readme

GPT-J-6B

GPT-J-6B is a 6 billion parameter language model by EleutherAI.

Official page: https://huggingface.co/EleutherAI/gpt-j-6b

Fine-tuning

If you have access to the training beta, you can fine-tune this model.

Here’s an example using replicate-python:

import replicate

# Starts a fine-tuning job; the resulting model is pushed to `destination`
training = replicate.trainings.create(
    version="replicate/gpt-j-6b:b3546aeec6c9891f0dd9929c2d3bedbf013c12e02e7dd0346af09c37e008c827",
    input={
        "train_data": "https://storage.googleapis.com/dan-scratch-public/fine-tuning/70k_samples.jsonl",
    },
    destination="my-username/my-model",
)

Training takes these input parameters:

  • train_data (required): URL to a file in JSONL where each line is in the format {"prompt": ..., "completion": ...}
  • epochs (optional, default=1): Number of times to iterate over the training dataset
  • max_steps (optional, default=-1): Maximum number of training steps. Unlimited if max_steps=-1
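As a minimal sketch, a train_data file in the JSONL format described above can be written with the standard library's json module. The filename and the prompt/completion pairs here are illustrative, not part of the official API:

```python
import json

# Illustrative prompt/completion pairs; substitute your own data
examples = [
    {"prompt": "Q: What is the capital of France?\nA:", "completion": " Paris"},
    {"prompt": "Q: What is 2 + 2?\nA:", "completion": " 4"},
]

# JSONL: one JSON object per line, as train_data expects
with open("train_data.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")

# Sanity check: every line must parse and contain both required keys
with open("train_data.jsonl") as f:
    for line in f:
        record = json.loads(line)
        assert "prompt" in record and "completion" in record
```

Host the resulting file at a publicly reachable URL and pass that URL as the train_data input.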