xingxm committed on
Commit 00e494d · verified · 1 Parent(s): 40a7683

Update README.md

Files changed (1)
  1. README.md +91 -3
README.md CHANGED
@@ -1,3 +1,91 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: other
+ license_name: apache-2.0
+ license_link: https://huggingface.co/PromptEnhancer/PromptEnhancer-32B/blob/main/License.txt
+ language:
+ - zh
+ - en
+ tags:
+ - text-to-image
+ - prompt-enhancement
+ - prompt-rewriting
+ - chain-of-thought
+ pipeline_tag: text-generation
+ library_name: transformers
+ base_model: Qwen/Qwen2.5-VL-32B-Instruct
+ ---
+
+ # PromptEnhancerV2 (32B) - Img2Img Edit
+
+ PromptEnhancerV2 is a multimodal language model fine-tuned to enhance and rewrite instructions for image-to-image editing. It refines an editing instruction using both the instruction text and the provided image, preserving the original intent while producing a clearer, more structured, and logically consistent prompt suitable for downstream image editing tasks.
+
+ ## Model Details
+
+ ### Model Description
+
+ PromptEnhancerV2 (Img2Img Edit) is a specialized vision-language prompt-rewriting model that uses chain-of-thought reasoning to enhance user editing instructions with visual context.
+
+ - **Model type:** Vision-Language Model for Prompt Enhancement
+ - **Language(s) (NLP):** Chinese (zh), English (en)
+ - **License:** Apache-2.0
+ - **Finetuned from model:** Qwen/Qwen2.5-VL-32B-Instruct
+
+ ### Model Sources
+
+ - **Repository:** https://github.com/ximinng/PromptEnhancer
+ - **Paper:** https://arxiv.org/abs/2509.04545
+ - **Homepage:** https://hunyuan-promptenhancer.github.io/
+
+ ## How to Get Started with the Model
+
+ - **1. Clone the repository and install dependencies:**
+
+ ```bash
+ git clone https://github.com/ximinng/PromptEnhancer.git
+ cd PromptEnhancer
+ pip install -r requirements.txt
+ ```
+
+ - **2. Download the model:**
+
+ ```bash
+ huggingface-cli download PromptEnhancer/PromptEnhancer-Img2img-Edit --local-dir ./models/promptenhancer-img2img-edit
+ ```
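
For scripted setups, the same download can be sketched in Python with `huggingface_hub.snapshot_download`. This is a minimal sketch and not part of the repository: the helper names `default_local_dir` and `download_model` are hypothetical, and the repo id and target directory are copied from the CLI command above.

```python
from pathlib import Path

# Repo id copied from the CLI command above.
REPO_ID = "PromptEnhancer/PromptEnhancer-Img2img-Edit"


def default_local_dir(repo_id: str, root: str = "./models") -> Path:
    """Hypothetical helper: map a repo id to a lowercase folder under `root`."""
    return Path(root) / repo_id.split("/")[-1].lower()


def download_model(repo_id: str = REPO_ID, root: str = "./models") -> Path:
    """Download every file in the repo and return the local directory."""
    # Imported lazily so the path helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    target = default_local_dir(repo_id, root)
    snapshot_download(repo_id=repo_id, local_dir=str(target))
    return target
```

Calling `download_model()` once should leave the weights in the same directory that the CLI command targets, so the path can be passed on as `model_path` in the next step.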
+
+ - **3. Use the model:**
+
+ ```python
+ from inference.prompt_enhancer_img2img import PromptEnhancerImg2Img
+
+ # Initialize the model
+ models_root_path = "./models/promptenhancer-img2img-edit"
+ enhancer = PromptEnhancerImg2Img(model_path=models_root_path, device_map="auto")
+
+ # Enhance an editing instruction with image context (Chinese or English)
+ # English: "Remove the watermark at the bottom of the image and keep the main subject unchanged"
+ edit_instruction = "去掉图片底部的水印,保留主体不变"
+ image_path = "./examples/sample_image.png"
+
+ enhanced_prompt = enhancer.predict(
+     edit_instruction=edit_instruction,
+     image_path=image_path,
+     temperature=0.1,
+     top_p=0.9,
+     max_new_tokens=2048,
+ )
+
+ print("Enhanced:", enhanced_prompt)
+ ```
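
When several edits need rewriting, the call above can be wrapped in a small loop. The sketch below is an illustration under stated assumptions, not part of the repository: `EditJob` and `enhance_batch` are hypothetical names, the return value of `predict` is assumed to be a string, and the only interface relied on is the `predict(edit_instruction=..., image_path=..., ...)` call shown above.

```python
from dataclasses import dataclass
from typing import Any, Iterable, List


@dataclass
class EditJob:
    """One image plus the instruction to rewrite for it (hypothetical container)."""
    image_path: str
    edit_instruction: str


def enhance_batch(enhancer: Any, jobs: Iterable[EditJob], **gen_kwargs: Any) -> List[str]:
    """Call enhancer.predict once per job; fall back to the raw instruction on error."""
    # Defaults mirror the sampling settings used in the example above.
    params = {"temperature": 0.1, "top_p": 0.9, "max_new_tokens": 2048, **gen_kwargs}
    results = []
    for job in jobs:
        try:
            results.append(
                enhancer.predict(
                    edit_instruction=job.edit_instruction,
                    image_path=job.image_path,
                    **params,
                )
            )
        except Exception as exc:  # keep the batch going if one image fails
            print(f"Skipping enhancement for {job.image_path}: {exc}")
            results.append(job.edit_instruction)
    return results
```

Falling back to the original instruction keeps the output list aligned with the input jobs, so a downstream editing step can still run on every image.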
+
+ ## Citation
+
+ If you find this model useful, please consider citing:
+
+ **BibTeX:**
+ ```bibtex
+ @article{promptenhancer,
+   title={PromptEnhancer: A Simple Approach to Enhance Text-to-Image Models via Chain-of-Thought Prompt Rewriting},
+   author={Wang, Linqing and Xing, Ximing and Cheng, Yiji and Zhao, Zhiyuan and Donghao, Li and Tiankai, Hang and Zhenxi, Li and Tao, Jiale and Wang, QiXun and Li, Ruihuang and Chen, Comi and Li, Xin and Wu, Mingrui and Deng, Xinchi and Gu, Shuyang and Wang, Chunyu and Lu, Qinglin},
+   journal={arXiv preprint arXiv:2509.04545},
+   year={2025}
+ }
+ ```