edoproch/deepseekr1-distilled-llama-70b-ollama | Run with an API on Replicate

Run time and cost

This model runs on Nvidia A100 (80GB) GPU hardware. We don't yet have enough runs of this model to provide performance information.

Readme

🚀 Meet DeepSeek-R1 distilled into Llama 70B! Unlike other similar models on Replicate, this one has its weights cached, so you don’t have to waste time downloading them every time. ⏳💨

But wait, there’s more! 🎉 It’s also quantized, meaning the weights take up far less memory and you get way better efficiency with barely any loss in output quality. Smarter, faster, and optimized just for you! ⚡🔥
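To see why quantization matters on a single A100 (80GB), here is a rough back-of-the-envelope sketch. It assumes a nominal 70 billion parameters and counts only the weights (no KV cache, activations, or runtime overhead). The bit widths are illustrative, since this page does not say which quantization level the model uses, and `weight_memory_gb` is a hypothetical helper, not part of any library.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the weights alone, in decimal gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 70e9       # assumption: nominal parameter count of a "70B" model
GPU_MEMORY_GB = 80    # Nvidia A100 (80GB), as listed on this page

# Illustrative bit widths only; the actual quantization level is not stated here.
for bits in (16, 8, 4):
    size = weight_memory_gb(N_PARAMS, bits)
    verdict = "fits" if size < GPU_MEMORY_GB else "does not fit"
    print(f"{bits:>2}-bit weights: {size:6.1f} GB, {verdict} on one {GPU_MEMORY_GB} GB GPU")
```

Under these assumptions, 16-bit weights alone would exceed the GPU's memory, while 8-bit and 4-bit weights fit, with 4-bit leaving the most headroom for the KV cache and longer contexts.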