StableLM-Tuned-Alpha-7B is a 7B-parameter decoder-only language model built on top of the StableLM-Base-Alpha models and further fine-tuned on various chat and instruction-following datasets.
The StableLM-Alpha models are trained on a new dataset that builds on The Pile and contains 1.5 trillion tokens, roughly 3x the size of The Pile. The models will be trained on up to 1.5 trillion tokens. Their context length is 4096 tokens.
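Because the context window is fixed, the prompt and the generated tokens have to share those 4096 positions. The sketch below shows one way to budget for that, assuming the prompt is already tokenized into a list of IDs. The helper name `fit_to_context` and the choice to drop the oldest tokens are illustrative assumptions, not part of the StableLM code.

```python
CONTEXT_LENGTH = 4096  # context length stated above for the StableLM-Alpha models


def fit_to_context(prompt_ids, max_new_tokens, context_length=CONTEXT_LENGTH):
    """Trim a tokenized prompt so prompt + generated tokens fit in the window.

    Hypothetical helper: keeps the most recent tokens and drops the oldest.
    """
    if not 0 <= max_new_tokens < context_length:
        raise ValueError("max_new_tokens must be in [0, context_length)")
    # Positions left for the prompt once generation room is reserved.
    budget = context_length - max_new_tokens
    return list(prompt_ids[-budget:])
```

For example, reserving 256 tokens for generation leaves 3840 positions for the prompt. Prompts shorter than that are returned unchanged.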
An upcoming technical report will document the model specifications and the training settings.
As a proof of concept, we also fine-tuned the model with Stanford Alpaca's procedure using a combination of five recent datasets for conversational agents: Stanford's Alpaca, Nomic-AI's gpt4all, RyokoAI's ShareGPT52K, Databricks Labs' Dolly, and Anthropic's HH. We will be releasing these models as StableLM-Tuned-Alpha.
StableLM-Tuned-Alpha would not have been possible without the helping hand of Dakota Mahan (@dmayhem93).
Base model checkpoints (StableLM-Base-Alpha) are licensed under the Creative Commons license (CC BY-SA-4.0). Under the license, you must give credit to Stability AI, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests that Stability AI endorses you or your use.
Fine-tuned checkpoints (StableLM-Tuned-Alpha) are licensed under the Non-Commercial Creative Commons license (CC BY-NC-SA-4.0), in line with the original non-commercial license specified by Stanford Alpaca.
All code in this repository is licensed under the Apache License 2.0.