adirik / mamba-790m

Base version of Mamba 790M, a 790 million parameter state space language model

  • Public
  • 47 runs
  • L40S
  • GitHub
  • Paper
  • License

Input

prompt
string

Text prompt to send to the model.

max_length
integer
(minimum: 1, maximum: 5000)

Maximum number of tokens to generate. A word is generally 2-3 tokens.

Default: 100

temperature
number
(minimum: 0.1, maximum: 5)

Adjusts the randomness of outputs: lower values are more deterministic, values greater than 1 increasingly random; 0.75 is a good starting value.

Default: 1

top_p
number
(minimum: 0.01, maximum: 1)

When decoding text, samples from the top p fraction of most likely tokens; lower to ignore less likely tokens.

Default: 1

top_k
integer

When decoding text, samples from the top k most likely tokens; lower to ignore less likely tokens.

Default: 1

repetition_penalty
number
(minimum: 0.01, maximum: 10)

Penalty for repeated words in generated text: 1 is no penalty, values greater than 1 discourage repetition, values less than 1 encourage it.

Default: 1.2

seed
integer

Seed for the random number generator; leave blank for a random seed.
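The sampling parameters above follow standard top-k and top-p (nucleus) decoding. A minimal sketch of how the two filters interact — not this model's actual implementation, and using a toy probability table:

```python
def filter_top_k_top_p(probs, top_k, top_p):
    """Sketch of standard top-k / top-p filtering (not the model's own code).

    probs: dict mapping token -> probability.
    Returns a renormalized dict over the tokens that survive both filters.
    """
    # Rank tokens by probability, most likely first.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    # top_k: keep only the k most likely tokens.
    ranked = ranked[:top_k]
    # top_p: keep the smallest prefix whose cumulative probability reaches p.
    kept, total = [], 0.0
    for tok, p in ranked:
        kept.append((tok, p))
        total += p
        if total >= top_p:
            break
    # Renormalize what remains so probabilities sum to 1.
    norm = sum(p for _, p in kept)
    return {tok: p / norm for tok, p in kept}
```

With `top_k=1`, only the single most likely token survives, so decoding becomes greedy regardless of `top_p` — which is why the defaults above (`top_k: 1`) produce fairly deterministic output unless you raise `top_k`.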

Output

I'm good. I just got back from the doctor's office, and he said that my blood pressure is normal for me at this age... so it was a little bit of an adjustment to get used too but overall everything has been great! So thank God!! :)<|endoftext|>Q: tag in HTML5 not working properly with IE9 <div class="container"> <!-- <h1>Hello World</p><!-- </br /> --> // works fine in all

Run time and cost

This model runs on Nvidia L40S GPU hardware. We don't yet have enough runs of this model to provide performance information.

Readme

Mamba

Mamba is a large language model with a state space model architecture that shows promising performance on information-dense data such as language modeling. See the original repo and paper for details.

Basic Usage

The API input arguments are as follows:

  • prompt: Text prompt to send to the model.
  • max_length: Maximum number of tokens to generate. A word is generally 2-3 tokens.
  • temperature: Adjusts the randomness of outputs; lower values are more deterministic and values greater than 1 increasingly random. 0.75 is a good starting value.
  • top_p: When decoding text, samples from the top p fraction of most likely tokens; lower to ignore less likely tokens.
  • top_k: When decoding text, samples from the top k most likely tokens; lower to ignore less likely tokens.
  • repetition_penalty: Penalty for repeated words in generated text; 1 is no penalty, values greater than 1 discourage repetition, values less than 1 encourage it.
  • seed: Seed for deterministic text generation. Set a specific seed to reproduce results, or leave blank for random generation.
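As a sketch, the arguments above can be assembled into an input payload and sent with the Replicate Python client. The payload values and the validation helper are illustrative; only the parameter names and ranges come from this page:

```python
# Input payload using the documented parameters (values are example choices).
input_payload = {
    "prompt": "Mamba is a state space model that",
    "max_length": 100,           # integer, 1-5000 (default 100)
    "temperature": 0.75,         # number, 0.1-5 (default 1)
    "top_p": 1.0,                # number, 0.01-1 (default 1)
    "top_k": 1,                  # integer (default 1)
    "repetition_penalty": 1.2,   # number, 0.01-10 (default 1.2)
}

def validate(payload):
    """Check a payload against the ranges documented above (helper is ours,
    not part of the API)."""
    assert 1 <= payload["max_length"] <= 5000
    assert 0.1 <= payload["temperature"] <= 5
    assert 0.01 <= payload["top_p"] <= 1
    assert payload["top_k"] >= 1
    assert 0.01 <= payload["repetition_penalty"] <= 10
    return True

if __name__ == "__main__":
    validate(input_payload)
    # To actually call the model (requires `pip install replicate` and a
    # REPLICATE_API_TOKEN in the environment), uncomment:
    # import replicate
    # output = replicate.run("adirik/mamba-790m", input=input_payload)
    # print("".join(output))
```

The network call is left commented out so the snippet runs without credentials; `replicate.run` streams or returns the generated text depending on client version.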

References

@article{mamba,
  title={Mamba: Linear-Time Sequence Modeling with Selective State Spaces},
  author={Gu, Albert and Dao, Tri},
  journal={arXiv preprint arXiv:2312.00752},
  year={2023}
}