AI & ML interests

https://github.com/huggingface/cookbook

sergiopaniego
posted an update 17 days ago
Meet OpenEnv 👋, an open ecosystem of environments for intelligent agents. Build, share, and test agents safely and consistently.

Ideal for training with TRL (we include examples 🤓), deployment, and community collaboration via the HF Hub.

Blog: https://huggingface.co/blog/openenv
Hub for Environments: openenv
OpenEnv repo: https://github.com/meta-pytorch/OpenEnv
Try it out using TRL: https://huggingface.co/docs/trl/main/en/openenv
merve
posted an update 20 days ago
deepseek-ai/DeepSeek-OCR is out! 🔥 my take ⬇️
> pretty insane it can parse and re-render charts in HTML
> it uses CLIP and SAM features concatenated, so better grounding
> very efficient vision-tokens-to-performance ratio
> covers 100 languages
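The concatenation point can be sketched with a toy NumPy example. This is an illustration of the general idea only; the shapes and variable names below are invented and are not DeepSeek-OCR's real dimensions:

```python
import numpy as np

# Toy sketch: fuse two vision encoders' outputs per vision token.
# Shapes are made up for illustration, not the real model's.
n_tokens, d = 64, 256
sam_like = np.ones((n_tokens, d))   # local, window-level features (SAM-style)
clip_like = np.ones((n_tokens, d))  # global, semantic features (CLIP-style)

# Concatenate along the feature dimension so each vision token
# carries both the local and the global view.
fused = np.concatenate([sam_like, clip_like], axis=-1)
print(fused.shape)  # (64, 512)
```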
sergiopaniego
posted an update 23 days ago
New drop! 💥 The VLM Object Understanding Comparison Space now runs with Qwen3-VL-4B and moondream3.

You can compare how models reason about images 🧠

Bonus: thanks to @ariG23498, you now get auto-suggested prompts to explore faster.

Let's gooo

sergiopaniego/vlm_object_understanding
sergiopaniego
posted an update 25 days ago
@Qwen released their new small and dense VLMs (Qwen3-VL).

They're incredibly capable and one of my all-time favourite VLMs.

🤗 We've prepared some resources to help you get started.

> Fine-tune Qwen3-VL-4B with SFT or GRPO (free Colab notebooks):
> SFT: https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/sft_qwen_vl.ipynb
> GRPO: https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/grpo_qwen3_vl.ipynb

> Compare object detection vs. Moondream3:
sergiopaniego/vlm_object_understanding

> Fine-tune from the CLI using TRL:
https://github.com/kashif/Qwen3-VL/blob/trl-sft/qwen-vl-finetune/README.md#trl-based-training-single-gpu
sergiopaniego
posted an update 30 days ago
Super nice intro to fine-tuning with TRL, just dropped by @google (runs free on Colab)!

They use SFT + QLoRA to fine-tune the tiny Gemma 3 270M model for emoji generation.

Here's what the fine-tuned model generates for the prompt "I'm learning to tweet" → 🐦🗣💻

Colab: https://colab.research.google.com/github/google-gemini/gemma-cookbook/blob/main/Demos/Emoji-Gemma-on-Web/resources/Fine_tune_Gemma_3_270M_for_emoji_generation.ipynb
Try it out: google/emoji-gemma
Learn more: https://developers.googleblog.com/en/own-your-ai-fine-tune-gemma-3-270m-for-on-device/
sergiopaniego
posted an update about 1 month ago
Online training methods (e.g., GRPO) require real-time generation, a compute- and memory-heavy bottleneck.

TRL has built-in vLLM support, and in this new recipe we show how to leverage it for efficient online training. Run it on Colab ⚡, then scale to multi-GPU/multi-node!

🧑‍🍳 recipe: https://huggingface.co/learn/cookbook/grpo_vllm_online_training
sergiopaniego
posted an update about 1 month ago
A few days ago, Thinking Machines Lab released "LoRA Without Regret", showing that LoRA can match full fine-tuning performance when configured correctly.

Naturally, we decided to reproduce the results with TRL and release a guide!

https://huggingface.co/docs/trl/main/en/lora_without_regret
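For intuition, the LoRA update amounts to W_eff = W + (α/r)·BA with B zero-initialized, so the adapted layer starts identical to the base model. A minimal NumPy sketch with toy shapes (these are illustrative sizes, not the guide's settings):

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 16, 32, 4, 8   # toy sizes; rank r << min(d_out, d_in)

W = rng.normal(size=(d_out, d_in))     # frozen base weight
A = rng.normal(size=(r, d_in))         # trainable down-projection
B = np.zeros((d_out, r))               # trainable up-projection, zero-init

# Effective weight with the LoRA update applied
W_eff = W + (alpha / r) * (B @ A)

# With B zero-initialized, the adapted layer matches the base exactly at init
assert np.allclose(W_eff, W)
```

Only A and B train: here that is r·(d_in + d_out) = 192 parameters versus d_in·d_out = 512 for the full matrix, and the gap widens fast at real model sizes.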
sergiopaniego
posted an update about 1 month ago
You need to try this tool! 🫡

My colleague @Molbap built an interactive HF Space to explore the modular support of open models in transformers over time.

👀 You'll spot things like 🦙 llama defining many models, or which ones could be modular next.

Try it: Molbap/transformers-modular-refactor
sergiopaniego
posted an update about 1 month ago
How fast can you spin up a Hugging Face Inference Endpoint with vLLM to deploy a state-of-the-art OCR model?

Let's break it down step by step.

1️⃣ Create your endpoint
Go to Hugging Face Endpoints → + NEW
Select Deploy from Hub → rednote-hilab/dots.ocr → Configure 🛠️

2️⃣ Configure hardware & container
Pick hardware: AWS/GPU/L4 ⚡
Set container: vLLM 🐇
Click Create ✅

3️⃣ Update endpoint settings
Container URI: vllm/vllm-openai:nightly → Update
Advanced: add the flag --trust-remote-code → Update ⚠️

4️⃣ Run inference
Download the script 📝: ariG23498/useful-scripts
Set your HF_TOKEN and update base_url in the script.
Run it. ✅

Your OCR model is now live via HF Inference Endpoints!
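Step 4 boils down to sending an OpenAI-compatible chat request to the endpoint, since the vLLM container exposes that API. A stdlib-only sketch of what such a request could look like; the endpoint URL, image URL, and prompt are placeholders (use the linked script for the real thing), and the network call itself is left commented out:

```python
import json
import os
import urllib.request

# Placeholder: replace with your endpoint's base_url from step 4
BASE_URL = "https://my-endpoint.endpoints.huggingface.cloud/v1"

def build_ocr_request(image_url: str) -> dict:
    """Build an OpenAI-compatible chat payload for the vLLM server."""
    return {
        "model": "rednote-hilab/dots.ocr",
        "messages": [{
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": image_url}},
                {"type": "text", "text": "Extract the text from this image."},
            ],
        }],
    }

payload = build_ocr_request("https://example.com/sample-doc.png")

# Uncomment to actually call the endpoint (requires HF_TOKEN to be set):
# req = urllib.request.Request(
#     f"{BASE_URL}/chat/completions",
#     data=json.dumps(payload).encode(),
#     headers={
#         "Authorization": f"Bearer {os.environ['HF_TOKEN']}",
#         "Content-Type": "application/json",
#     },
# )
# resp = json.load(urllib.request.urlopen(req))
# print(resp["choices"][0]["message"]["content"])
```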
sergiopaniego
posted an update about 2 months ago
💥 Tons of new material just landed in the smol-course! 🧑‍💻

> evaluation
> alignment
> VLMs
> quizzes
> assignments!
> certificates! 👩‍🎓

go learn! 👉 https://huggingface.co/learn/smol-course/unit0/1
merve
posted an update about 2 months ago
large AI labs open-sourced a ton of models last week 🔥
here's a few picks, find even more here merve/sep-16-releases-68d13ea4c547f02f95842f05 🤝
> IBM released a new Docling model with 258M params based on Granite (A2.0) 📝 ibm-granite/granite-docling-258M
> Xiaomi released a 7B audio LM with base and instruct variants (MIT) XiaomiMiMo/mimo-audio-68cc7202692c27dae881cce0
> DecartAI released Lucy Edit, an open Nano Banana 🍌 (NC) decart-ai/Lucy-Edit-Dev
> OpenGVLab released a family of agentic computer-use models (3B/7B/32B) with the dataset 💻 OpenGVLab/scalecua-68c912cf56f7ff4c8e034003
> Meituan LongCat released a thinking version of LongCat-Flash 💭 meituan-longcat/LongCat-Flash-Thinking
sergiopaniego
posted an update about 2 months ago
This summer TRL leveled up for multimodal alignment 🌞

✅ New VLM alignment methods (MPO, GRPO, GSPO)
✅ Extended RLOO & Online DPO for VLMs
✅ Native SFT support
✅ Ready-to-use training scripts

🔗 https://huggingface.co/blog/trl-vlm-alignment
merve
posted an update about 2 months ago
IBM just released a small Swiss Army knife for document models: granite-docling-258M on Hugging Face 🔥

> not only a document converter: it can also do document question answering and understands multiple languages 🤯
> best part: released under the Apache 2.0 license 👏 use it in your commercial projects!
> it supports transformers, vLLM and MLX from the get-go! 🤗
> built on SigLIP2 & granite-165M

model: ibm-granite/granite-docling-258M
demo: ibm-granite/granite-docling-258m-demo 💗