Models: 371

Active filters: video-text-to-text

nkkbr/ViCA2-thinkng

Video-Text-to-Text • 8B • Updated May 15 • 1

nkkbr/ViCA-thinking

Video-Text-to-Text • 8B • Updated May 7 • 1

andrewt28/qwen2.5-omni-3b-keyboard-video-text

Video-Text-to-Text • Updated May 8 • 3

trinhvg/ViDRiP_LLaVA_video

Video-Text-to-Text • 8B • Updated 22 days ago • 7

QiWang98/VideoRFT

Video-Text-to-Text • 8B • Updated 27 days ago • 124 • 4

QiWang98/VideoRFT-SFT

Video-Text-to-Text • 8B • Updated 27 days ago • 164

friedrichor/Unite-Base-Qwen2-VL-2B

Feature Extraction • 2B • Updated Jun 10 • 458

friedrichor/Unite-Base-Qwen2-VL-7B

Feature Extraction • 8B • Updated Jun 10 • 47 • 1

friedrichor/Unite-Instruct-Qwen2-VL-2B

Feature Extraction • 2B • Updated Jun 10 • 6 • 1

friedrichor/Unite-Instruct-Qwen2-VL-7B

Feature Extraction • 8B • Updated Jun 10 • 7

chancharikm/qwen2.5-vl-72b-cam-motion

Video-Text-to-Text • 73B • Updated Sep 19 • 21

Haoz0206/Omni-R1

Video-Text-to-Text • 9B • Updated May 28 • 50 • 23

BBBBCHAN/LLaVA-Scissor-baseline-7B

Video-Text-to-Text • 8B • Updated Jul 1 • 22 • 3

BBBBCHAN/LLaVA-Scissor-baseline-0.5B

Video-Text-to-Text • 0.9B • Updated Jul 1 • 18 • 4

second-state/SmolVLM2-2.2B-Instruct-GGUF

Image-Text-to-Text • 2B • Updated May 29 • 820 • 3

gaianet/SmolVLM2-2.2B-Instruct-GGUF

Image-Text-to-Text • 2B • Updated May 28 • 141

Falconss1/TW-GRPO

Video-Text-to-Text • 8B • Updated Jun 15 • 131

Diankun/Spatial-MLLM-subset-sft

Video-Text-to-Text • 5B • Updated Jun 10 • 11k • 3

BAAI/Video-XL-2

Video-Text-to-Text • 8B • Updated Jun 6 • 532 • 54

yunzhuyunzhu/flexselect_llava_video

Video-Text-to-Text • 0.5B • Updated Jun 10 • 28

yunzhuyunzhu/flexselect_qwen2.5vl

Video-Text-to-Text • 0.5B • Updated Jun 10 • 10

yunzhuyunzhu/flexselect_internvl2.5

Video-Text-to-Text • 0.5B • Updated Jun 10 • 7

QiWang98/VideoRFT-3B

Video-Text-to-Text • 4B • Updated 27 days ago • 68

QiWang98/VideoRFT-SFT-3B

Video-Text-to-Text • 4B • Updated 27 days ago • 63

Alrightalright/DreamFrame-Related

Video-Text-to-Text • Updated Jun 11

Darwin-Project/MUSEG-7B

Video-Text-to-Text • 8B • Updated Jun 9 • 56

Darwin-Project/MUSEG-3B

Video-Text-to-Text • 4B • Updated Jun 9 • 13

Uni-MoE/VerIPO-7B-v1.0

Video-Text-to-Text • 8B • Updated Jun 6 • 3

Mungert/SkyCaptioner-V1-GGUF

Video-Text-to-Text • 8B • Updated Sep 24 • 295 • 5

DAMO-NLP-SG/VideoRefer-VideoLLaMA3-7B

Video-Text-to-Text • 8B • Updated Jun 19 • 62 • 11