plmsmile

AI & ML interests

None yet

Recent Activity

liked a Space 4 days ago

HuggingFaceTB/smol-training-playbook

upvoted an article 5 months ago

GRPO for GUI Grounding Done Right

reacted to VirtualOasis's post with 🔥 6 months ago

Automatic Multi-Modal Research Agent

I am thinking of building an Automatic Research Agent that can boost creativity!

Input: topics or data sources
Processing: automated deep research
Output: multimodal results (such as reports, videos, audio, diagrams) and multi-platform publishing

There is a three-stage process. In the initial stage, text-based content is output in markdown format, which allows for user review before it is transformed into other formats such as PDF or HTML. The second stage transforms the output into other modalities, like audio, video, diagrams, and translations into different languages. The final stage focuses on publishing multi-modal content across multiple platforms such as X, GitHub, Hugging Face, YouTube, and podcasts.

Organizations

None yet

plmsmile's datasets

None public yet