---
library_name: transformers
license: apache-2.0
license_link: https://huggingface.co/Qwen/Qwen3-32B/blob/main/LICENSE
pipeline_tag: text-generation
---
<div align="center">

# 🧩 ReForm: Reflective Autoformalization with Prospective Bounded Sequence Optimization

<a href="https://arxiv.org/pdf/2510.24592"><img src="https://img.shields.io/badge/Paper-arXiv-d63031?logo=arxiv&logoColor=white"></a>
<a href="https://huggingface.co/collections/GuoxinChen/reform"><img src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Models-0984e3"></a>
<a href="https://github.com/Chen-GX/ReForm"><img src="https://img.shields.io/badge/GitHub-ReForm-black?logo=github"></a>

</div>

**ReForm** is a reflective **Autoformalization** framework that enables large language models to *generate → verify → refine* formal mathematical statements in an integrated self-corrective loop. 
It introduces **Prospective Bounded Sequence Optimization (PBSO)** — a novel reinforcement learning algorithm designed for heterogeneous rewards at different sequence positions — enabling stable, reflective training and substantial gains in semantic fidelity.

---

## 🚀 Highlights

- 🪞 **Reflective Autoformalization Paradigm**  
  Turns single-pass translation into an iterative “generate–verify–refine” cycle, allowing the model to autonomously detect and correct semantic errors.

- ⚖️ **Prospective Bounded Sequence Optimization (PBSO)**  
  A reinforcement learning algorithm with position-specific rewards for both task accuracy and critique quality, ensuring stable and interpretable optimization.

- 📈 **State-of-the-art Semantic Consistency**  
  ReForm achieves an **average +22.6pp improvement** over the strongest baseline across four formalization benchmarks (miniF2F, ProofNet, Putnam, and AIME 2025).

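To make the idea of heterogeneous, position-specific rewards more concrete, here is a toy sketch. It is not the paper's implementation: the segment boundary, reward values, and clipping bound are invented for illustration. The point is only that different sequence positions (the formalization attempt vs. the self-critique) can receive different rewards, each clipped to a fixed bound so no single position dominates the update.

```python
from typing import List

def positional_rewards(seq_len: int, critique_start: int,
                       r_task: float, r_critique: float,
                       bound: float = 1.0) -> List[float]:
    """Assign a bounded, position-specific reward to every token.

    Tokens before `critique_start` form the formalization attempt and
    receive the task-accuracy reward; the remaining tokens form the
    self-critique and receive the critique-quality reward. Both are
    clipped to [-bound, bound].
    """
    def clip(r: float) -> float:
        return max(-bound, min(bound, r))

    return [clip(r_task) if i < critique_start else clip(r_critique)
            for i in range(seq_len)]

# Hypothetical values: a strong task reward (clipped to the bound)
# and a mildly negative critique reward.
rewards = positional_rewards(seq_len=6, critique_start=4,
                             r_task=2.0, r_critique=-0.5)
print(rewards)  # [1.0, 1.0, 1.0, 1.0, -0.5, -0.5]
```

See the paper for the actual PBSO objective; this sketch only illustrates why a single sequence-level scalar reward would be insufficient here.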
---

<div align="center">
  <img src="https://github.com/Chen-GX/ReForm/raw/main/images/benchmark_comparison.png" width="100%">
  <br>
  <sub><b>Figure:</b> ReForm consistently outperforms Goedel-V2 and Kimina-Autoformalizer on all benchmarks.</sub>
</div>

---

## 💡 Quick Start

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "GuoxinChen/ReForm-8B"  # or "GuoxinChen/ReForm-32B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

prompt = "Think step by step to translate the mathematical problem in natural language to Lean 4, and verify the consistency.\nLet $a_1, a_2,\\cdots, a_n$ be real constants, $x$ a real variable, and $f(x)=\\cos(a_1+x)+\\frac{1}{2}\\cos(a_2+x)+\\frac{1}{4}\\cos(a_3+x)+\\cdots+\\frac{1}{2^{n-1}}\\cos(a_n+x).$ Given that $f(x_1)=f(x_2)=0,$ prove that $x_2-x_1=m\\pi$ for some integer $m.$"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# A large token budget leaves room for the full generate–verify–refine output.
outputs = model.generate(**inputs, max_new_tokens=32768)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```


For more details, please refer to our [GitHub repository](https://github.com/Chen-GX/ReForm).

## 📚 Citation

If you find ReForm useful for your research, please cite:

```bibtex
@misc{chen2025reform,
      title={ReForm: Reflective Autoformalization with Prospective Bounded Sequence Optimization}, 
      author={Guoxin Chen and Jing Wu and Xinjie Chen and Wayne Xin Zhao and Ruihua Song and Chengxi Li and Kai Fan and Dayiheng Liu and Minpeng Liao},
      year={2025},
      eprint={2510.24592},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2510.24592}, 
}
```