---
base_model:
- Qwen/Qwen2.5-7B-Instruct
language:
- en
- zh
license: mit
pipeline_tag: question-answering
library_name: transformers
tags:
- biology
- finance
- text-generation-inference
- retrieval-augmented-generation
---
# HierSearch: A Hierarchical Enterprise Deep Search Framework Integrating Local and Web Searches
## Model Information
We release the agent model used in **HierSearch: A Hierarchical Enterprise Deep Search Framework Integrating Local and Web Searches**.
<p align="left">
Useful links: 📝 <a href="https://arxiv.org/abs/2508.08088" target="_blank">Paper (arXiv)</a> • 🤗 <a href="https://huggingface.co/papers/2508.08088" target="_blank">Paper (Hugging Face)</a> • 🧩 <a href="https://github.com/plageon/HierSearch" target="_blank">GitHub Repository</a>
</p>
1. We explore the deep search framework in multi-knowledge-source scenarios, propose a hierarchical agentic paradigm, and train it with hierarchical reinforcement learning (HRL);
2. We identify drawbacks of naive information transmission among deep search agents and develop a knowledge refiner suited to multi-knowledge-source scenarios;
3. Our approach to reliable and effective deep search across multiple knowledge sources outperforms existing baselines and the flat-RL solution in various domains.
🌹 If you use this model, please ✨star our **[GitHub repository](https://github.com/plageon/HierSearch)** or upvote our **[paper](https://huggingface.co/papers/2508.08088)** to support us. Your star means a lot!
## Usage
This model is designed as a "planner agent" within the HierSearch framework, coordinating local and web searches to answer complex questions. It is based on `Qwen2.5-7B-Instruct`. You can load and use it with the `transformers` library for general text generation, or refer to the full codebase for the complete deep search functionality.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "plageon/HierSearch-Planner-Agent"

# Load the tokenizer and the model (bfloat16 weights, placed automatically on available devices).
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, device_map="auto")

messages = [
    {"role": "system", "content": "You are a helpful and knowledgeable assistant specializing in enterprise search."},
    {"role": "user", "content": "What are the main findings of the paper 'HierSearch: A Hierarchical Enterprise Deep Search Framework Integrating Local and Web Searches'?"},
]

# Render the messages with the model's chat template and tokenize the resulting prompt.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Pass input_ids and attention_mask together, then keep only the newly generated tokens
# so the printed output does not repeat the prompt.
generated_ids = model.generate(**model_inputs, max_new_tokens=512)
new_tokens = generated_ids[:, model_inputs.input_ids.shape[1]:]
decoded_output = tokenizer.batch_decode(new_tokens, skip_special_tokens=True)[0]
print(decoded_output)
```
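To give a rough picture of what "coordinating local and web searches" can look like, here is a minimal, purely illustrative sketch of a planner loop that dispatches sub-queries to two lower-level agents and stops once it decides to answer. It is not the HierSearch implementation: the names `local_search_agent`, `web_search_agent`, `run_planner`, and `scripted_planner`, as well as the action dictionary format, are all assumptions made for this example. The real prompts, action format, and agent interfaces are defined in the GitHub repository.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins for the lower-level search agents.
# The real agents and their interfaces live in the HierSearch codebase.
def local_search_agent(query: str) -> str:
    return f"[local evidence for: {query}]"

def web_search_agent(query: str) -> str:
    return f"[web evidence for: {query}]"

AGENTS: Dict[str, Callable[[str], str]] = {
    "local": local_search_agent,
    "web": web_search_agent,
}

def run_planner(
    question: str,
    plan_step: Callable[[str, List[str]], Dict[str, str]],
    max_turns: int = 4,
) -> str:
    """Ask the planner for an action each turn and dispatch it until it answers."""
    evidence: List[str] = []
    for _ in range(max_turns):
        action = plan_step(question, evidence)
        if action["type"] == "answer":
            return action["content"]
        # Any other action type names the agent that should handle the sub-query.
        evidence.append(AGENTS[action["type"]](action["content"]))
    return "No answer within the turn budget."

def scripted_planner(question: str, evidence: List[str]) -> Dict[str, str]:
    # Stand-in for a call to the planner model (assumed behavior):
    # query local knowledge first, then the web, then answer.
    if len(evidence) == 0:
        return {"type": "local", "content": question}
    if len(evidence) == 1:
        return {"type": "web", "content": question}
    return {"type": "answer", "content": f"Answer based on {len(evidence)} evidence items"}

result = run_planner("What is HierSearch?", scripted_planner)
print(result)
```

In a real setup, `scripted_planner` would be replaced by a function that prompts the planner model and parses its output into an action; how that parsing works depends on the format used by the official codebase.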
## Citation
```bibtex
@misc{tan2025hiersearchhierarchicalenterprisedeep,
title={HierSearch: A Hierarchical Enterprise Deep Search Framework Integrating Local and Web Searches},
author={Jiejun Tan and Zhicheng Dou and Yan Yu and Jiehan Cheng and Qiang Ju and Jian Xie and Ji-Rong Wen},
year={2025},
eprint={2508.08088},
archivePrefix={arXiv},
primaryClass={cs.IR},
url={https://arxiv.org/abs/2508.08088},
}
```