Upload a video and query it using natural language or propositions
Evaluate a video against text prompts using NeuS-V