Run time and cost
This model runs on 4x Nvidia A100 (80GB) GPU hardware.
Predictions typically complete within 78 seconds, though predict time varies significantly with the inputs.
The Mixtral-8x7B Large Language Model (LLM) is a pretrained generative sparse mixture-of-experts model. It outperforms Llama 2 70B on most benchmarks we tested.
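To illustrate the sparse mixture-of-experts idea, here is a minimal NumPy sketch of top-2 routing over 8 experts. This is a toy, not Mixtral's actual implementation: the router and the "experts" below are hypothetical random linear maps, and real MoE layers use learned feed-forward networks with per-token routing inside a transformer block.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

rng = np.random.default_rng(0)
d, n_experts, top_k = 16, 8, 2  # 8 experts, 2 active per token, as in Mixtral

# Hypothetical stand-ins: a router matrix and 8 tiny "expert" linear maps.
router = rng.standard_normal((d, n_experts))
experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]

def moe_layer(x):
    # The router scores every expert for this token...
    logits = x @ router
    # ...but only the top-2 experts are evaluated (the "sparse" part).
    top = np.argsort(logits)[-top_k:]
    weights = softmax(logits[top])
    # The output is the gate-weighted sum of just those experts.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

x = rng.standard_normal(d)
y = moe_layer(x)
```

Because only 2 of the 8 experts run per token, inference cost scales with the active parameters rather than the full parameter count, which is why a 8x7B mixture can be served far more cheaply than a dense model of the same total size.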
For full details of this model, please read our release blog post or the model card on Hugging Face.