replicate / gpt-j-6b

A large language model by EleutherAI

  • Public
  • 9.5K runs
  • A100 (80GB)
  • GitHub
  • License

Input

prompt
string (required)

The input prompt.

max_length
integer
(minimum: 1)

Maximum number of tokens to generate. A word is typically 1-2 tokens.

Default: 500

decoding
string

Choose a decoding method: "top_p" or "top_k".

Default: "top_p"

top_k
integer

Valid if you choose top_k decoding. The number of highest-probability vocabulary tokens to keep for top-k filtering.

Default: 50

top_p
number
(minimum: 0.01, maximum: 1)

Valid if you choose top_p decoding. Samples from the most likely tokens whose cumulative probability is at most p; lower this value to ignore less likely tokens.

Default: 1

temperature
number
(minimum: 0.01, maximum: 5)

Adjusts the randomness of outputs: values greater than 1 are more random, values near 0 are more deterministic. 0.75 is a good starting value.

Default: 0.75

repetition_penalty
number
(minimum: 0.01, maximum: 5)

Penalty for repeated words in generated text: 1 means no penalty, values greater than 1 discourage repetition, and values less than 1 encourage it.

Default: 1.2
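
Here's a minimal sketch of calling the model with replicate-python, assuming the parameter names listed above and reusing the model version ID from the fine-tuning example below; the prompt and input values are illustrative:

import replicate

# Sketch: the version ID is taken from the fine-tuning example below;
# all input values here are illustrative defaults.
output = replicate.run(
    "replicate/gpt-j-6b:b3546aeec6c9891f0dd9929c2d3bedbf013c12e02e7dd0346af09c37e008c827",
    input={
        "prompt": "Q. Is a hot dog a sandwich?",
        "max_length": 500,
        "decoding": "top_p",
        "top_p": 1,
        "temperature": 0.75,
        "repetition_penalty": 1.2,
    },
)

# The output is typically a sequence of generated text chunks.
print("".join(output))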

Output

A. Yes, it is a sandwich because it's eaten between two pieces of bread or other edible products and served in the same way as sandwiches. An open-faced sandwich would be an example

Run time and cost

This model costs approximately $0.031 to run on Replicate, or about 32 runs per $1, but this varies depending on your inputs. It is also open source, and you can run it on your own computer with Docker.

This model runs on Nvidia A100 (80GB) GPU hardware. Predictions typically complete within 23 seconds.
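
As a quick sanity check, the per-second rate implied by the two figures above:

# Back-of-the-envelope check, using only the figures quoted above.
cost_per_run = 0.031      # dollars per run (approximate)
seconds_per_run = 23      # typical prediction time

print(f"${cost_per_run / seconds_per_run:.4f} per second")  # ~$0.0013/s
print(f"{1 / cost_per_run:.0f} runs per $1")                # ~32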

Readme

GPT-J-6B

GPT-J-6B is a 6 billion parameter language model by EleutherAI.

Official page: https://huggingface.co/EleutherAI/gpt-j-6b

Fine-tuning

If you have access to the training beta, you can fine-tune this model.

Here’s an example using replicate-python:

import replicate

# Kick off a fine-tuning job; the resulting model is pushed to `destination`.
training = replicate.trainings.create(
    version="replicate/gpt-j-6b:b3546aeec6c9891f0dd9929c2d3bedbf013c12e02e7dd0346af09c37e008c827",
    input={
        "train_data": "https://storage.googleapis.com/dan-scratch-public/fine-tuning/70k_samples.jsonl",
    },
    destination="my-username/my-model",
)

Training takes these input parameters:

  • train_data (required): URL to a file in JSONL format, where each line is a JSON object of the form {"prompt": ..., "completion": ...} (see the sample lines after this list)
  • epochs (optional, default=1): Number of times to iterate over the training dataset
  • max_steps (optional, default=-1): Maximum number of training steps. Unlimited if max_steps=-1
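
For reference, each line of the train_data file is a standalone JSON object; here are a couple of illustrative lines (the contents are invented for the example):

{"prompt": "Q. Is a hot dog a sandwich?", "completion": "A. Opinions differ, but it does meet the bread-plus-filling definition."}
{"prompt": "Q. Who created GPT-J-6B?", "completion": "A. EleutherAI."}

Training runs asynchronously, so the create call returns immediately. Here's a sketch of polling the job until it reaches a terminal state, using replicate-python's trainings.get (the 30-second interval is arbitrary):

import time

# Poll until the training reaches a terminal state.
while training.status not in ("succeeded", "failed", "canceled"):
    time.sleep(30)
    training = replicate.trainings.get(training.id)

print(training.status)

Once the job succeeds, the fine-tuned model is available at the destination you specified (my-username/my-model in the example above) and can be run like any other model.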