stability-ai / stablelm-base-alpha-7b

7B parameter base version of Stability AI's language model

This model has no enabled versions.

Readme

Model description

StableLM-Base-Alpha-7B is a 7B parameter decoder-only language model.

The StableLM-Alpha models are trained on a new dataset that builds on The Pile and contains roughly 1.5 trillion tokens, about 3x the size of The Pile. The models will be trained on up to 1.5 trillion tokens, and their context length is 4096 tokens.
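
As a rough illustration of how a decoder-only checkpoint like this is used, here is a minimal generation sketch. It assumes the publicly released Hugging Face checkpoint stabilityai/stablelm-base-alpha-7b and the transformers library; it is not a Replicate API call.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumes the public Hugging Face checkpoint (repo name is an assumption, not part of this page).
tokenizer = AutoTokenizer.from_pretrained("stabilityai/stablelm-base-alpha-7b")
model = AutoModelForCausalLM.from_pretrained(
    "stabilityai/stablelm-base-alpha-7b",
    torch_dtype=torch.float16,
    device_map="auto",
)

inputs = tokenizer("The weather today is", return_tensors="pt").to(model.device)
# The model supports a 4096-token context window.
tokens = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
)
print(tokenizer.decode(tokens[0], skip_special_tokens=True))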

An upcoming technical report will document the model specifications and the training settings.

Fine-tuning

If you have access to the training beta, you can fine-tune this model. Read more in our guide to fine-tuning.

Here’s an example using replicate-python:

import replicate

# Create a fine-tuning job; the resulting model is pushed to `destination`.
training = replicate.trainings.create(
    version="stability-ai/stablelm-base-alpha-7b:34e80c92e0a78fdb91b9421f63b2688093c546d9d039450a7342a16f9556f188",
    input={
        # Publicly hosted file of {"prompt": ..., "completion": ...} records
        "train_data": "https://storage.googleapis.com/dan-scratch-public/fine-tuning/70k_samples.jsonl",
    },
    destination="my-username/my-model",
)
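
The create call returns immediately. One way to wait for the job to finish is to poll it with the same client; a small sketch (the exact terminal status strings are an assumption based on Replicate's standard prediction statuses):

import time

# Poll until the training reaches a terminal state.
while training.status not in ("succeeded", "failed", "canceled"):
    time.sleep(60)
    training = replicate.trainings.get(training.id)

print(training.status)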

Fine-tuning takes these input parameters:

  • train_data (required): URL to a file where each row is a JSON record in the format {"prompt": ..., "completion": ...}. The file can be JSONL or a single JSON list (see the example after this list).
  • train_batch_size (optional, default=1): Training batch size. For a model of this size, we recommend keeping the batch size small and increasing gradient_accumulation_steps.
  • gradient_accumulation_steps (optional, default=8): Number of training steps (each of train_batch_size) to store gradients for before performing an optimizer step.
  • learning_rate (optional, default=2e-5): Learning rate for the optimizer.
  • num_train_epochs (optional, default=1): Number of epochs (iterations over the entire training dataset) to train for.
  • warmup_ratio (optional, default=0.03): Fraction of all training steps used for a linear learning-rate warmup.
  • logging_steps (optional, default=1): Prints loss and other logging info every logging_steps steps.
  • max_steps (optional, default=-1): Maximum number of training steps. Unlimited if max_steps=-1.
  • lora_rank (optional, default=8): Rank of the LoRA matrices.
  • lora_alpha (optional, default=16): Alpha parameter for scaling LoRA weights; weights are scaled by alpha/rank.
  • lora_dropout (optional, default=0.1): Dropout for LoRA training.
  • lora_target_modules (optional, default='q_proj,v_proj'): Comma-separated list of target modules to fine-tune with LoRA.
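
For reference, a train_data file in JSONL form could look like this (the prompts and completions below are made-up placeholders):

{"prompt": "What is the capital of France?", "completion": "The capital of France is Paris."}
{"prompt": "Summarize: The quick brown fox jumps over the lazy dog.", "completion": "A fox jumps over a dog."}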

Read our guide to fine-tuning a model to learn more.
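
To make the LoRA parameters above concrete, here is a hypothetical configuration sketch using the peft library. The trainer behind this endpoint is not published, so this only illustrates what the defaults mean, not the actual implementation:

from peft import LoraConfig

# Mirrors the fine-tuning defaults listed above (an illustration, not the trainer's own code).
lora_config = LoraConfig(
    r=8,                                  # lora_rank
    lora_alpha=16,                        # weights are scaled by lora_alpha / r = 2.0
    lora_dropout=0.1,                     # lora_dropout
    target_modules=["q_proj", "v_proj"],  # lora_target_modules
    task_type="CAUSAL_LM",
)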

Licenses

  • Base model checkpoints (StableLM-Base-Alpha) are licensed under the Creative Commons license (CC BY-SA-4.0). Under the license, you must give credit to Stability AI, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests that Stability AI endorses you or your use.

  • Fine-tuned checkpoints (StableLM-Tuned-Alpha) are licensed under the Non-Commercial Creative Commons license (CC BY-NC-SA-4.0), in line with the original non-commercial license specified by Stanford Alpaca.

  • All code in this repository is licensed under the Apache License 2.0.