nohamoamary / image-captioning-with-visual-attention

datasets: Flickr8k

  • Public
  • 10.8K runs
  • GitHub
  • Paper

Input

Output

Run time and cost

This model runs on Nvidia T4 GPU hardware. Predictions typically complete within 1 seconds.

Readme

attention based model that automatically learns to describe the content of images. datasets: Flickr8k