PromptEnhancer/PromptEnhancer-Img2img-Edit


xingxm committed on Sep 30

Commit 00e494d · verified · 1 Parent(s): 40a7683

Update README.md

Files changed (1)

README.md +91 -3


@@ -1,3 +1,91 @@
----
-license: apache-2.0
----

+---
+license: other
+license_name: apache-2.0
+license_link: https://huggingface.co/PromptEnhancer/PromptEnhancer-32B/blob/main/License.txt
+language:
+- zh
+- en
+tags:
+- text-to-image
+- prompt-enhancement
+- prompt-rewriting
+- chain-of-thought
+pipeline_tag: text-generation
+library_name: transformers
+base_model: Qwen/Qwen2.5-VL-32B-Instruct
+---
+# PromptEnhancerV2 (32B) - Img2Img Edit
+PromptEnhancerV2 is a multimodal language model fine-tuned to enhance and rewrite instructions for image-to-image editing. It refines an editing instruction using both the input text and the provided image, preserving the original intent while producing a clearer, structured, and logically consistent prompt suitable for downstream image editing tasks.
+## Model Details
+### Model Description
+PromptEnhancerV2 (Img2Img Edit) is a specialized vision-language prompt rewriting model that employs chain-of-thought reasoning to enhance user editing instructions with visual context.
+- **Model type:** Vision-Language Model for Prompt Enhancement
+- **Language(s) (NLP):** Chinese (zh), English (en)
+- **License:** Apache-2.0
+- **Finetuned from model:** Qwen/Qwen2.5-VL-32B-Instruct
+### Model Sources
+- **Repository:** https://github.com/ximinng/PromptEnhancer
+- **Paper:** https://arxiv.org/abs/2509.04545
+- **Homepage:** https://hunyuan-promptenhancer.github.io/
+## How to Get Started with the Model
+- **1. Clone the repository and install dependencies:**
+```bash
+git clone https://github.com/ximinng/PromptEnhancer.git
+cd PromptEnhancer
+pip install -r requirements.txt
+```
+- **2. Download the model:**
+```bash
+huggingface-cli download PromptEnhancer/PromptEnhancer-Img2img-Edit --local-dir ./models/promptenhancer-img2img-edit
+```
+- **3. Use the model:**
+```python
+from inference.prompt_enhancer_img2img import PromptEnhancerImg2Img
+# Initialize the model
+models_root_path = "./models/promptenhancer-img2img-edit"
+enhancer = PromptEnhancerImg2Img(model_path=models_root_path, device_map="auto")
+# Enhance an editing instruction with image context (Chinese or English)
+edit_instruction = "去掉图片底部的水印，保留主体不变"  # "Remove the watermark at the bottom of the image; keep the main subject unchanged"
+image_path = "./examples/sample_image.png"
+enhanced_prompt = enhancer.predict(
+    edit_instruction=edit_instruction,
+    image_path=image_path,
+    temperature=0.1,
+    top_p=0.9,
+    max_new_tokens=2048
+)
+print("Enhanced:", enhanced_prompt)
+```
+## Citation
+If you find this model useful, please consider citing:
+**BibTeX:**
+```bibtex
+@article{promptenhancer,
+  title={PromptEnhancer: A Simple Approach to Enhance Text-to-Image Models via Chain-of-Thought Prompt Rewriting},
+  author={Wang, Linqing and Xing, Ximing and Cheng, Yiji and Zhao, Zhiyuan and Li, Donghao and Hang, Tiankai and Li, Zhenxi and Tao, Jiale and Wang, QiXun and Li, Ruihuang and Chen, Comi and Li, Xin and Wu, Mingrui and Deng, Xinchi and Gu, Shuyang and Wang, Chunyu and Lu, Qinglin},
+  journal={arXiv preprint arXiv:2509.04545},
+  year={2025}
+}
+```
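
Note on step 3 of the README added above: the `predict` call handles one image and one instruction at a time. The sketch below shows one way to wrap it for a batch of edits. It assumes only the keyword arguments that appear in the diff (`edit_instruction`, `image_path`, `temperature`, `top_p`, `max_new_tokens`) and that `predict` returns something convertible to a string. The names `EditRequest` and `enhance_batch`, and the fallback to the original instruction when the image is missing or the output is empty, are hypothetical and not part of the PromptEnhancer repository.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Iterable, List


@dataclass
class EditRequest:
    """One image plus the editing instruction to enhance (hypothetical helper type)."""
    image_path: str
    edit_instruction: str


def enhance_batch(
    enhancer: Any,
    requests: Iterable[EditRequest],
    temperature: float = 0.1,
    top_p: float = 0.9,
    max_new_tokens: int = 2048,
) -> List[str]:
    """Call enhancer.predict once per request and return results in input order.

    `enhancer` is any object exposing predict(...) with the keyword arguments
    used in the README. Falling back to the original instruction is an
    assumption of this sketch, not documented behavior of the model.
    """
    results: List[str] = []
    for req in requests:
        if not Path(req.image_path).is_file():
            # Missing image: keep the user's instruction unchanged.
            results.append(req.edit_instruction)
            continue
        enhanced = enhancer.predict(
            edit_instruction=req.edit_instruction,
            image_path=req.image_path,
            temperature=temperature,
            top_p=top_p,
            max_new_tokens=max_new_tokens,
        )
        # Empty or whitespace-only output also falls back to the original text.
        results.append(str(enhanced).strip() or req.edit_instruction)
    return results
```

With the `enhancer` object from the README, this would be called as `enhance_batch(enhancer, [EditRequest("./examples/sample_image.png", edit_instruction)])`. Requests are processed sequentially, since the README does not say whether `predict` supports batched or concurrent calls.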