Apollo 7B - An Exploration of Video Understanding in Large Multimodal Models