nvidia / chronoedit
ChronoEdit-14B enables physics-aware image editing and action-conditioned world simulation through temporal reasoning.
nvidia / nemotron-nano-v2-12b-vl
A multi-modal AI model for visual Q&A, summarization, and data extraction, supporting text, images, and video.
nvidia / canary-qwen-2.5b
🎤The best open-source speech-to-text model as of Jul 2025, transcribing audio with record 5.63% WER and enabling AI tasks like summarization directly from speech✨
nvidia / sana-sprint-1.6b
SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation
nvidia / pdf-to-podcast
Transform PDFs into AI podcasts for engaging on-the-go audio content.
nvidia / sana
A fast image model with wide artistic range and resolutions up to 4096x4096
nvidia / parakeet-rnnt-1.1b
🗣️ Nvidia + Suno.ai's speech-to-text conversion with high accuracy and efficiency 📝
nvidia / prismer
A Vision-Language Model with An Ensemble of Experts