adirik / mamba-2.8b-chat

Mamba 2.8B state space language model fine tuned for chat

  • Public
  • 76 runs
  • GitHub
  • Paper
  • License

Input

Output

Run time and cost

This model runs on Nvidia A40 (Large) GPU hardware. Predictions typically complete within 106 seconds. The predict time for this model varies significantly based on the inputs.

Readme

Mamba-Chat

Mamba-Chat is the first chat language model based on mamba, which is a language model that leverages state-space model architecture. See the original repo and paper for more details.

Basic Usage

The API input arguments are as follows:
- message: The input message to the chatbot.
- message_history: The chat history as json string to condition the chatbot on.
- temperature: Adjusts randomness of outputs, greater than 1 is random and 0 is deterministic, 0.75 is a good starting value.
- top_p: Samples from the top p percentage of most likely tokens during text decoding, lower to ignore less likely tokens.
- top_k: Samples from the top k most likely tokens during text decoding, lower to ignore less likely tokens.
- repetition_penalty: Penalty for repeated words in generated text; 1 is no penalty, values greater than 1 discourage repetition, less than 1 encourage it.
- seed: The seed parameter for deterministic text generation. A specific seed can be used to reproduce results or left blank for random generation.

References

@article{mamba,
  title={Mamba: Linear-Time Sequence Modeling with Selective State Spaces},
  author={Gu, Albert and Dao, Tri},
  journal={arXiv preprint arXiv:2312.00752},
  year={2023}
}