Datasets used to train SmolDocling
HuggingFaceM4
Team
company
WebSight is a dataset of 823,000 HTML/CSS code samples representing synthetically generated English websites, each paired with a corresponding screenshot.
- HuggingFaceM4/WebSight
  Viewer • Updated • 2.75M • 13k • 376
- HuggingFaceM4/VLM_WebSight_finetuned
  Text Generation • 8B • Updated • 699 • 191
- Screenshot to HTML
  ⚡ 911 • Convert screenshots to HTML code
- Unlocking the conversion of Web Screenshots into HTML Code with the WebSight Dataset
  Paper • 2403.09029 • Published • 55
Collection gathering artifacts related to OBELICS
Idefics2-8B is a foundation vision-language model. In this collection, you will find the models, datasets and demo related to its creation.
- IDEFICS2 Playground
  🐨 169 • Chat with an AI assistant using text and images
- HuggingFaceM4/idefics2-8b
  Image-Text-to-Text • 8B • Updated • 9.62k • 617
- HuggingFaceM4/idefics2-8b-chatty
  Image-Text-to-Text • 8B • Updated • 317 • 95
- HuggingFaceM4/idefics2-8b-base
  Image-Text-to-Text • 8B • Updated • 1.94k • 28
Collection assembling all the models and spaces related to IDEFICS