paragekbote | Replicate

An optimized gemma-3-4b setup with INT8 weight-only quantization, torch.compile and sparsity for efficient inference.

45 runs

Public

An optimized Flux.1-dev Img2Img setup delivering blazing-fast inference, memory efficiency and dynamic LoRA hotswapping.

27 runs

Public

phi-4-reasoning-plus tuned for scalable inference with long context using Unsloth.

31 runs

Public

SmolLM3-3B with Pruna for lightning-fast, memory-efficient AI inference.

23 runs

Public

A blazing-fast inference setup for Flux.1-dev with dynamic LoRA hotswapping.

40 runs

Public