Liang0223/Qwen-2.5-Math-1.5B-DPO

Text Generation

text-generation-inference

Add model card #1

by nielsr HF Staff - opened Aug 24

base: refs/heads/main ← from: refs/pr/1

README.md +9 -3

@@ -1,3 +1,9 @@
----
-license: mit
----

+---
+license: mit
+library_name: transformers
+pipeline_tag: text-generation
+---
+This model was presented in the paper [On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification](https://huggingface.co/papers/2508.05629).
+Code: https://github.com/yongliang-wu/DFT?tab=readme-ov-file
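
The metadata added in this diff (`library_name: transformers`, `pipeline_tag: text-generation`) suggests the checkpoint can be loaded through the standard `transformers` text-generation pipeline. The following is a minimal sketch under that assumption, not a usage recipe confirmed by the model card. The repo id is taken from this page; `build_prompt`, `generate_answer`, and the plain "Question/Answer" prompt format are hypothetical, since the card does not specify a prompt template or generation settings.

```python
# Repo id as shown on this page.
MODEL_ID = "Liang0223/Qwen-2.5-Math-1.5B-DPO"


def build_prompt(question: str) -> str:
    """Hypothetical plain-text prompt; the model card specifies no template."""
    return f"Question: {question.strip()}\nAnswer:"


def generate_answer(question: str, max_new_tokens: int = 256) -> str:
    """Generate a completion with the transformers text-generation pipeline.

    Assumes the checkpoint loads with default pipeline settings, which the
    card's `library_name` and `pipeline_tag` metadata imply but do not prove.
    """
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=MODEL_ID)
    outputs = generator(
        build_prompt(question),
        max_new_tokens=max_new_tokens,
        do_sample=False,  # greedy decoding, chosen here for reproducibility
    )
    # The pipeline returns a list of dicts keyed by "generated_text".
    return outputs[0]["generated_text"]
```

Calling `generate_answer("What is 12 * 7?")` downloads the weights on first use; the linked DFT repository may document a different prompt format or evaluation setup.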