## Model description
StableLM-Base-Alpha-3B is a 3B parameter decoder-only language model.
The StableLM-Alpha models are trained on a new dataset that builds on The Pile and contains 1.5 trillion tokens, roughly 3x the size of The Pile. These models will be trained on up to 1.5 trillion tokens. The context length for these models is 4096 tokens.
An upcoming technical report will document the model specifications and the training settings.
## Fine-tuning
If you have access to the training beta, you can fine-tune this model.
Here’s an example using replicate-python:
```python
import replicate

training = replicate.trainings.create(
    version="replicate/stablelm-base-alpha-3b:1a55f731ed1787bee29f94ad6661fb8eb7d63ca4fee1c983b710afc66954677a",
    input={
        "train_data": "https://storage.googleapis.com/dan-scratch-public/fine-tuning/70k_samples.jsonl",
    },
    destination="my-username/my-model",
)
```
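After kicking off a training, you can poll its status with the same client and, once it succeeds, run the fine-tuned model from the destination you specified. The sketch below is a minimal illustration: the training ID, the placeholder version ID, and the example prompt are made up, and the exact shape of the model's output may differ.

```python
import time
import replicate

# Fetch the training by ID (or reuse the `training` object returned by
# replicate.trainings.create above) and poll until it reaches a terminal state.
training = replicate.trainings.get("<training-id>")  # placeholder ID
while training.status not in ("succeeded", "failed", "canceled"):
    time.sleep(60)
    training = replicate.trainings.get(training.id)

print(training.status)

if training.status == "succeeded":
    # The destination model ("my-username/my-model") now has a new version with
    # the fine-tuned weights. The version ID below is a placeholder -- take the
    # real one from the model page or the API.
    output = replicate.run(
        "my-username/my-model:<version-id>",
        input={"prompt": "What is a large language model?"},
    )
    print(output)
```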
Fine-tuning takes these input parameters:
- `train_data` (required): URL to a file where each row is a JSON record in the format `{"prompt": ..., "completion": ...}`. Can be JSONL or a single JSON list (see the example after this list).
- `train_batch_size` (optional, default=1): Train batch size. For llama-13B, we recommend keeping the batch size small and increasing `gradient_accumulation_steps` instead.
- `gradient_accumulation_steps` (optional, default=8): Number of training steps (each of `train_batch_size`) to accumulate gradients for before performing an optimizer step.
- `learning_rate` (optional, default=2e-5): Learning rate.
- `num_train_epochs` (optional, default=1): Number of epochs (iterations over the entire training dataset) to train for.
- `warmup_ratio` (optional, default=0.03): Percentage of all training steps used for a linear learning-rate warmup.
- `logging_steps` (optional, default=1): Print loss and other logging info every `logging_steps` steps.
- `max_steps` (optional, default=-1): Maximum number of training steps. Unlimited if `max_steps=-1`.
- `lora_rank` (optional, default=8): Rank of the LoRA matrices.
- `lora_alpha` (optional, default=16): Alpha parameter for scaling LoRA weights; weights are scaled by alpha/rank.
- `lora_dropout` (optional, default=0.1): Dropout for LoRA training.
- `lora_target_modules` (optional, default=`q_proj,v_proj`): Comma-separated list of target modules to fine-tune with LoRA (see the illustrative config sketch below).
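For reference, here is a minimal sketch of how a `train_data` file in the `{"prompt": ..., "completion": ...}` format might be produced. The example records and the output filename are made up for illustration; host the resulting file at a publicly reachable URL and pass that URL as `train_data`.

```python
import json

# Hypothetical example records -- replace with your own data.
records = [
    {"prompt": "What is the capital of France?", "completion": "The capital of France is Paris."},
    {"prompt": "Name a decoder-only language model.", "completion": "StableLM-Base-Alpha-3B is a decoder-only language model."},
]

# Write one JSON object per line (JSONL). A single JSON list, i.e.
# json.dump(records, f), is also accepted per the parameter docs above.
with open("train_data.jsonl", "w") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")
```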
Read our guide to fine-tuning a model to learn more.
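The `lora_*` parameters follow the usual LoRA conventions. As an illustration only (this README does not say which library the trainer uses internally), here is how the defaults above would map onto a Hugging Face `peft` `LoraConfig`; with the defaults, the LoRA update is scaled by `lora_alpha / lora_rank` = 16 / 8 = 2.

```python
from peft import LoraConfig

# Illustrative mapping of the fine-tuning defaults onto a peft config.
# This is an assumption about how the parameters are typically interpreted,
# not a description of Replicate's internal trainer.
lora_config = LoraConfig(
    r=8,                                  # lora_rank
    lora_alpha=16,                        # scaling factor: lora_alpha / r = 2.0
    lora_dropout=0.1,                     # lora_dropout
    target_modules=["q_proj", "v_proj"],  # lora_target_modules
)
```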
## Acknowledgements
`StableLM-Tuned-Alpha` would not have been possible without the helpful hand of Dakota Mahan (@dmayhem93).
## Licenses
- Base model checkpoints (`StableLM-Base-Alpha`) are licensed under the Creative Commons license (CC BY-SA-4.0). Under the license, you must give credit to Stability AI, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests that Stability AI endorses you or your use.
- Fine-tuned checkpoints (`StableLM-Tuned-Alpha`) are licensed under the Non-Commercial Creative Commons license (CC BY-NC-SA-4.0), in line with the original non-commercial license specified by Stanford Alpaca.
- All code in this repository is licensed under the Apache License 2.0.