paragekbote/phi-4-reasoning-plus-unsloth | Run with an API on Replicate

phi-4-reasoning-plus tuned for scalable inference with long context using Unsloth.

Run time and cost

This model runs on Nvidia L40S GPU hardware. We don't yet have enough runs of this model to provide performance information.

Readme

Overview

unsloth/phi-4-reasoning-plus is an optimized version of Microsoft’s Phi-4-reasoning-plus model, using Unsloth techniques for faster inference and a smaller memory footprint.

Features

Reasoning-first LLM: Tuned for step-by-step logical explanations, problem solving, and structured reasoning.
Unsloth-optimized: Memory-efficient kernel fusion and quantization enable smooth inference on smaller GPUs.
Natural conversation: Responds well in dialogue without requiring heavy prompting.
Flexible decoding: Supports temperature, top-p and max_new_tokens for controlling creativity vs. coherence.
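
To make the decoding controls above more concrete, here is a minimal pure-Python sketch of what temperature and top-p conceptually do to a next-token distribution. It is an illustration only, not this model's actual sampling code; the function names and the toy logits are made up for the example.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities; a lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose mass reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / mass
    return filtered

# Hypothetical scores for four candidate tokens.
toy_logits = [2.0, 1.0, 0.5, -1.0]

cool = softmax_with_temperature(toy_logits, 0.5)  # more peaked, more predictable
warm = softmax_with_temperature(toy_logits, 1.5)  # flatter, more varied
nucleus = top_p_filter(softmax_with_temperature(toy_logits, 1.0), 0.8)

print("T=0.5:", [round(p, 3) for p in cool])
print("T=1.5:", [round(p, 3) for p in warm])
print("top_p=0.8:", [round(p, 3) for p in nucleus])
```

In this toy case, the lower temperature puts more probability on the top token, and top-p at 0.8 discards the two least likely tokens before sampling. max_new_tokens is separate from both: it only caps how many tokens are generated.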

Usage Tips

Keep prompts natural: Short, conversational input works well (e.g., “Explain why the sky is blue in simple steps.”).
Control verbosity: Adjust max_new_tokens depending on whether you want concise or detailed explanations.
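
The tips above can be sketched as a small helper that assembles a request payload. The field names (`prompt`, `temperature`, `top_p`, `max_new_tokens`) are assumed from the parameters named in this README, and the default values and token budgets are hypothetical; check the model's API tab for the actual input schema.

```python
def build_input(prompt, detailed=False, temperature=0.7, top_p=0.9):
    """Assemble a hypothetical input payload for the model.

    The keys mirror the parameters named in the README; the exact schema
    and sensible defaults are assumptions, not confirmed values.
    """
    prompt = prompt.strip()
    if not prompt:
        raise ValueError("prompt must not be empty")
    return {
        "prompt": prompt,
        "temperature": temperature,
        "top_p": top_p,
        # A larger budget leaves room for longer step-by-step explanations.
        "max_new_tokens": 1024 if detailed else 256,
    }

short_request = build_input("Explain why the sky is blue in simple steps.")
long_request = build_input(
    "Explain why the sky is blue in simple steps.", detailed=True
)
print(short_request)
print(long_request)
```

A dictionary like this could then be passed as the `input` argument to the Replicate Python client's `replicate.run`, together with the model reference shown on the API tab.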

Use Cases

Education & tutoring (step-by-step explanations)
Logical reasoning & math walkthroughs
Chain-of-thought style outputs

Model created 5 months, 4 weeks ago

Model updated 2 months ago