huggingface-course (The LLM Course)

posted an update 6 days ago

Post

5141

fine-tuning a 14B model with TRL + SFT on a free Colab (T4 GPU)?
thanks to the latest TRL optimizations, you actually can!
sharing a new notebook showing how to do it 😎

colab: https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/sft_trl_lora_qlora.ipynb

notebooks in TRL: https://github.com/huggingface/trl/tree/main/examples/notebooks

2 replies

·

sergiopaniego

posted an update 8 days ago

Post

337

Gave a smol 🤏 intro to Agents using smolagents last Monday!
Sharing the slides in case you're curious. They serve as a gentle first step into the Agents Course we developed at @huggingface 🫶🫶

Course: https://huggingface.co/learn/agents-course/unit0/introduction

Workshop material: https://github.com/sergiopaniego/talks/tree/main/intro_to_agents

sergiopaniego

posted an update 11 days ago

Post

3036

Sharing the slides from yesterday's talk about "Fine Tuning with TRL" from the @TogetherAgent x @huggingface workshop we hosted in our Paris office 🎃!

Link: https://github.com/sergiopaniego/talks/blob/main/fine_tuning_with_trl/Fine%20tuning%20with%20TRL%20(Oct%2025).pdf

sergiopaniego

posted an update 12 days ago

Post

362

On-Policy distillation is trendy! and super useful!

HuggingFaceH4/on-policy-distillation

sergiopaniego

posted an update 18 days ago

Post

2799

Meet OpenEnv 👋, an open ecosystem of environments for intelligent agents. Build, share, and test agents safely and consistently.

Ideal for training with TRL (we include examples🤓), deployment, and community collaboration via the HF Hub

Blog: https://huggingface.co/blog/openenv
Hub for Environments:

openenv
OpenEnv repo: https://github.com/meta-pytorch/OpenEnv
Try it out using TRL: https://huggingface.co/docs/trl/main/en/openenv

1 reply

·

sergiopaniego

posted an update 24 days ago

Post

1925

New drop! 💥 The VLM Object Understanding Comparison Space now runs with Qwen3-VL-4B and moondream3.

You can compare how models reason about images 🧠

Bonus: thanks to @ariG23498 , you now get auto-suggested prompts to explore faster.

Let’s gooo

sergiopaniego/vlm_object_understanding

sergiopaniego

posted an update 24 days ago

Post

880

New drop! 💥 The VLM Object Understanding Comparison Space now runs with Qwen3-VL-4B and moondream3.

You can compare how models reason about images 🧠

Bonus: thanks to @ariG23498 , you now get auto-suggested prompts to explore faster.

Let’s gooo

sergiopaniego/vlm_object_understanding

sergiopaniego

posted an update 27 days ago

Post

2300

@Qwen released their new small and dense VLMs (Qwen3-VL).

They're incredibly capable and one of my all-time favourite VLMs.

🤗 We’ve prepared some resources to help you get started.

> Fine-tune Qwen3-VL-4B with SFT or GRPO (free Colab notebooks):
> SFT: https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/sft_qwen_vl.ipynb
> GRPO: https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/grpo_qwen3_vl.ipynb

> Compare object detection vs. Moondream3:
sergiopaniego/vlm_object_understanding

> Fine-tune from the CLI using TRL:
https://github.com/kashif/Qwen3-VL/blob/trl-sft/qwen-vl-finetune/README.md#trl-based-training-single-gpu

lvwerra

authored a paper 28 days ago

BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution

Paper • 2510.08697 • Published Oct 9 • 35

burtenshaw

in huggingface-course/supervised_finetuning_quiz 28 days ago

Update app.py

1

#1 opened about 1 month ago by

John6666

sergiopaniego

posted an update about 1 month ago

Post

1464

Super nice intro to fine-tuning with TRL, just dropped by @google (runs free on Colab)!

They use SFT + QLoRA to fine-tune the tiny Gemma 3 270M model for emoji generation

Here’s what the fine-tuned model generates for the prompt: “I'm learning to tweet” → 🐦🗣💻

Colab: https://colab.research.google.com/github/google-gemini/gemma-cookbook/blob/main/Demos/Emoji-Gemma-on-Web/resources/Fine_tune_Gemma_3_270M_for_emoji_generation.ipynb
Try it out: google/emoji-gemma
Learn more: https://developers.googleblog.com/en/own-your-ai-fine-tune-gemma-3-270m-for-on-device/

sergiopaniego

posted an update about 1 month ago

Post

2416

Online training methods (e.g., GRPO) require real-time generation, a compute- and memory-heavy bottleneck.

TRL has built-in vLLM support and in this new recipe, we show how to leverage it for efficient online training. Run on Colab ⚡, scale to multi-GPU/multi-node!

🧑‍🍳 recipe: https://huggingface.co/learn/cookbook/grpo_vllm_online_training

1 reply

·

sergiopaniego

posted an update about 1 month ago

Post

2897

A few days ago, Thinking Machines Lab released “LoRA Without Regret”, showing that LoRA can match full fine-tuning performance when configured right.

Naturally, we decided to reproduce the results with TRL and release a guide!

https://huggingface.co/docs/trl/main/en/lora_without_regret

burtenshaw

authored a paper about 1 month ago

A Cartography of Open Collaboration in Open Source AI: Mapping Practices, Motivations, and Governance in 14 Open Large Language Model Projects

Paper • 2509.25397 • Published Sep 29 • 10

sergiopaniego

posted an update about 1 month ago

Post

596

Want to deploy open models using vLLM as the inference engine?
We just released a step-by-step guide on how to do it with @huggingface Inference Endpoints, now available in the vLLM docs.

let the gpus go brrr

https://docs.vllm.ai/en/latest/deployment/frameworks/hf_inference_endpoints.html

sergiopaniego

posted an update about 2 months ago

Post

493

You need to try this tool! 🫡

My colleague @Molbap built an interactive HF Space to explore the modular support of open models in transformers over time

👀 You’ll spot things like 🦙 llama defining many models or which ones could be modular next

Try it: Molbap/transformers-modular-refactor

sergiopaniego

posted an update about 2 months ago

Post

485

How fast can you create an endpoint in Hugging Face Inference Endpoints with a new model + vLLM to deploy a state-of-the-art OCR model?

Let’s break it down step by step.

1️⃣ Create your endpoint
Go to Hugging Face Endpoints → + NEW
Select Deploy from Hub → rednote-hilab/dots.ocr → Configure 🛠️

2️⃣ Configure hardware & container
Pick hardware: AWS/GPU/L4 ⚡
Set container: vLLM 🐇
Click Create ✅

3️⃣ Update endpoint settings
Container: Container URI: vllm/vllm-openai:nightly → Update
Advanced: add flag --trust-remote-code → Update ⚠️

4️⃣ Run inference
Download the script 📝: ariG23498/useful-scripts
Set your HF_TOKEN and update base_url in the script.
Run it. ✅

Your OCR model is now live via HF Inference Endpoints!

sergiopaniego

posted an update about 2 months ago

Post

3469

💥 Tons of new material just landed in the smol-course! 🧑‍💻

> evaluation
> alignment
> VLMs
> quizzes
> assignments!
> certificates!👩‍🎓

go learn! 👉 https://huggingface.co/learn/smol-course/unit0/1

1 reply

·

sergiopaniego

posted an update about 2 months ago

Post

1401

This summer TRL leveled up for multimodal alignment 🌞

✅ New VLM alignment methods (MPO, GRPO, GSPO)
✅ Extended RLOO & Online DPO for VLMs
✅ Native SFT support
✅ Ready-to-use training scripts

🔗 https://huggingface.co/blog/trl-vlm-alignment

sergiopaniego

posted an update about 2 months ago

Post

565

You can now use any open LLM as your coding assistant in VS Code with the @huggingface Provider for GitHub Copilot Chat.

Just pick your fav open model and start building!

Vibe-coding is all you need!?

learn more: https://huggingface.co/docs/inference-providers/en/guides/vscode

1 reply

·

The LLM Course

AI & ML interests

Recent Activity

BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution

Update app.py

A Cartography of Open Collaboration in Open Source AI: Mapping Practices, Motivations, and Governance in 14 Open Large Language Model Projects

AI & ML interests

Recent Activity

Team members 10

huggingface-course's activity

Update app.py