attention based model that automatically learns to describe the content of images. datasets: Flickr8k
nohamoamary/image-captioning-with-visual-attention
Run time and cost
Predictions run on CPU hardware. Predictions typically complete within 2 seconds. The predict time for this model varies significantly based on the inputs.