GigaBrain-0: A World Model-Powered Vision-Language-Action Model • arXiv:2510.19430 • Published Oct 2025 • 44 upvotes
OpenGPT-4o-Image: A Comprehensive Dataset for Advanced Image Generation and Editing • arXiv:2509.24900 • Published Sep 29, 2025 • 53 upvotes
Latent Zoning Network: A Unified Principle for Generative Modeling, Representation Learning, and Classification • arXiv:2509.15591 • Published Sep 19, 2025 • 45 upvotes
MANZANO: A Simple and Scalable Unified Multimodal Model with a Hybrid Vision Tokenizer • arXiv:2509.16197 • Published Sep 19, 2025 • 54 upvotes
SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension • arXiv:2404.16790 • Published Apr 25, 2024 • 10 upvotes
RynnVLA-001: Using Human Demonstrations to Improve Robot Manipulation • arXiv:2509.15212 • Published Sep 18, 2025 • 21 upvotes
ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data • arXiv:2509.15221 • Published Sep 18, 2025 • 109 upvotes
WorldForge: Unlocking Emergent 3D/4D Generation in Video Diffusion Models via Training-Free Guidance • arXiv:2509.15130 • Published Sep 18, 2025 • 30 upvotes
LazyDrag: Enabling Stable Drag-Based Editing on Multi-Modal Diffusion Transformers via Explicit Correspondence • arXiv:2509.12203 • Published Sep 15, 2025 • 19 upvotes
VLA-Adapter: An Effective Paradigm for Tiny-Scale Vision-Language-Action Models • arXiv:2509.09372 • Published Sep 11, 2025 • 236 upvotes
Visual Representation Alignment for Multimodal Large Language Models • arXiv:2509.07979 • Published Sep 9, 2025 • 83 upvotes
F1: A Vision-Language-Action Model Bridging Understanding and Generation to Actions • arXiv:2509.06951 • Published Sep 8, 2025 • 31 upvotes