craftfulcharles/waveform

Generate Waveform videos using an audio file and an image.

Public
106 runs

Run time and cost

This model runs on CPU hardware. We don't yet have enough runs of this model to provide performance information.

Readme

🎵 Pulse: Audio Waveform Video Generator

Generate beautiful, dynamic waveform videos from any audio file. This Cog model creates a stylish dotted waveform and overlays it on a background image that can pulse to the music or slowly zoom in.

Run this model on Replicate »

✨ Features

This model takes any audio file and an optional image and turns them into a shareable MP4 video.

  • Dotted Waveform: Creates a “dot” style audio visualization.
  • Image Background: Overlay the waveform on any background image.
  • Dynamic Effects:
    • Pulse: Make the image “thump” or pulse in time with the music’s amplitude.
    • Zoom In: Add a slow, cinematic zoom-in effect to the image.
  • Smooth Pulse: The pulse effect is smoothed so it looks natural rather than “jerky” (see the sketch after this list).
  • Auto-Fill & Crop: Your background image is automatically scaled to fill the entire video frame, cropping from the center (no black bars!).
  • Customizable: Control dot size, color, spacing, FPS, and more.
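
The exact smoothing code isn't published in this Readme, but a natural reading of the pulse_smoothing input (see the table below) is an exponential moving average over the per-frame audio amplitude. A minimal sketch, with an illustrative function name and usage that are not the model's actual code:

def smooth_amplitudes(amplitudes, smoothing=0.7):
    # Exponential moving average: smoothing=0.0 passes the raw (jerky)
    # envelope through, while values near 0.9 heavily damp frame-to-frame jumps.
    smoothed = []
    prev = 0.0
    for a in amplitudes:
        prev = smoothing * prev + (1.0 - smoothing) * a
        smoothed.append(prev)
    return smoothed

# Hypothetical use for the "pulse" effect: scale the background image each frame.
# frame_scale = zoom_start + pulse_intensity * smoothed_amplitude_for_this_frame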

🚀 Using the Replicate API

You can run this model directly from your own Python code using the Replicate API client.

First, install the client:

pip install replicate

Then, set your API token as an environment variable:

export REPLICATE_API_TOKEN=[your_api_token]

Now you can run predictions:

import replicate

# --- Example 1: Pulsing Image ---
output_pulse = replicate.run(
    "your-username/your-model-name:version-id",
    input={
        "audio_file": open("path/to/my-song.mp3", "rb"),
        "image_file": open("path/to/my-image.jpg", "rb"),
        "image_effect": "pulse",
        "dot_color": "#FF0080",
        "pulse_intensity": 0.5,
        "pulse_smoothing": 0.7
    }
)
print(f"Pulsing video at: {output_pulse}")


# --- Example 2: Zoom-In Effect ---
output_zoom = replicate.run(
    "your-username/your-model-name:version-id",
    input={
        "audio_file": open("path/to/my-audio.wav", "rb"),
        "image_file": open("path/to/my-background.png", "rb"),
        "image_effect": "zoom_in",
        "dot_color": "#FFFFFF",
        "zoom_start": 1.0,
        "zoom_end": 1.5
    }
)
print(f"Zooming video at: {output_zoom}")

⚙️ Model Inputs

The model accepts the following inputs:

| Parameter | Type | Description | Default |
| --- | --- | --- | --- |
| audio_file | Path | Input audio file. | Required |
| image_file | Path | Optional background image. | None |
| dot_size | int | Size of the dots in pixels. | 6 |
| dot_spacing | int | Spacing between dots in pixels. | 6 |
| height | int | Height of the output video in pixels. | 720 |
| width | int | Width of the output video in pixels. | 1280 |
| max_height | int | Maximum height of the visualization as a percentage of the frame. | 30 |
| dot_color | str | Dot color in hex format. | "#00FFFF" |
| fps | int | Frames per second. | 10 |
| image_effect | str | Effect to apply to the image (pulse, zoom_in, or none). | "pulse" |
| pulse_intensity | float | Intensity of the pulse effect (maximum scale added to the base). | 0.1 |
| pulse_smoothing | float | Smoothing factor for the pulse (0.0 = jerky, 0.9 = smooth). | 0.7 |
| zoom_start | float | Base image scale (and starting scale for zoom_in). | 1.0 |
| zoom_end | float | Ending zoom scale (for the zoom_in effect). | 1.2 |
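
Since image_file is optional, you can also render the waveform on its own. A minimal call that relies mostly on the defaults above might look like this (the model identifier and file path are placeholders, as in the earlier examples):

import replicate

# Waveform only: no background image, no image effect
output_waveform = replicate.run(
    "your-username/your-model-name:version-id",
    input={
        "audio_file": open("path/to/my-song.mp3", "rb"),
        "image_effect": "none",
        "fps": 30  # smoother motion than the default of 10
    }
)
print(f"Waveform-only video at: {output_waveform}")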

📤 Model Output

The model returns a Path (string URL) to the generated .mp4 video file.
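
If you want a local copy rather than a URL, one option is to download it with Python's standard library. This snippet continues from Example 1 above; the local filename is just an example:

import urllib.request

# output_pulse (from Example 1) is a URL string pointing at the generated .mp4
urllib.request.urlretrieve(output_pulse, "pulse_waveform.mp4")
print("Saved pulse_waveform.mp4")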


💻 Local Development

To run this model locally, you’ll need to install Cog.

  1. Clone the repository:

    git clone https://github.com/your-username/your-repo-name.git
    cd your-repo-name

  2. Install dependencies: The cog.yaml file lists all system and Python dependencies. They will be automatically installed inside a Docker container when you run the model.

  3. Run a prediction:

    cog predict \
      -i audio_file=@"path/to/song.mp3" \
      -i image_file=@"path/to/image.png" \
      -i image_effect="pulse" \
      -i pulse_intensity=0.5

The predictor writes the video to /tmp/output.mp4 inside the container; cog predict then copies the result out to your working directory.
