---
base_model:
- Qwen/Qwen2.5-7B-Instruct
language:
- en
- zh
license: mit
pipeline_tag: question-answering
library_name: transformers
tags:
- biology
- finance
- text-generation-inference
- retrieval-augmented-generation
---
# HierSearch: A Hierarchical Enterprise Deep Search Framework Integrating Local and Web Searches
## Model Information
We release the agent model used in HierSearch: A Hierarchical Enterprise Deep Search Framework Integrating Local and Web Searches.
Useful links: 📝 [Paper (arXiv)](https://arxiv.org/abs/2508.08088) • 🤗 Paper (Hugging Face) • 🧩 GitHub Repository
- We explore deep search in multi-knowledge-source scenarios, propose a hierarchical agentic paradigm, and train it with hierarchical reinforcement learning (HRL);
- We identify the drawbacks of naive information transmission among deep search agents and develop a knowledge refiner suited to multi-knowledge-source scenarios;
- Our approach enables reliable and effective deep search across multiple knowledge sources and outperforms existing baselines as well as the flat-RL solution in various domains (a simplified sketch of the hierarchical flow is shown after this list).
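The sketch below only illustrates the hierarchical idea described above; it is not the HierSearch implementation. The actual planner, local/web deep-search agents, knowledge refiner, prompts, and action formats are defined in the GitHub repository, and every function name here (`local_search`, `web_search`, `refine_evidence`) is a hypothetical placeholder.

```python
# Illustrative sketch only (hypothetical names, not the HierSearch code): a high-level
# planner routes a question to low-level local and web deep-search agents, a knowledge
# refiner filters the retrieved evidence, and the planner composes the final answer.

def local_search(query: str) -> list[str]:
    # Placeholder for the local (enterprise knowledge base) deep-search agent.
    return [f"[local evidence for: {query}]"]

def web_search(query: str) -> list[str]:
    # Placeholder for the web deep-search agent.
    return [f"[web evidence for: {query}]"]

def refine_evidence(evidence: list[str]) -> list[str]:
    # Placeholder for the knowledge refiner: drop duplicate or noisy snippets before
    # passing the evidence up to the planner agent.
    return list(dict.fromkeys(evidence))

def plan_and_answer(question: str) -> str:
    # The planner agent (this model) decides which low-level agents to call;
    # for illustration we simply query both knowledge sources.
    evidence = local_search(question) + web_search(question)
    evidence = refine_evidence(evidence)
    return f"Answer to {question!r}, grounded in {len(evidence)} refined evidence snippets."

print(plan_and_answer("Who is the current CTO of the company?"))
```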
🌹 If you use this model, please ✨star our GitHub repository or upvote our paper to support us. Your star means a lot!
## Usage
This model is designed as a "planner agent" within the HierSearch framework, coordinating local and web searches to answer complex questions. It is based on Qwen2.5-7B-Instruct. You can load and use it with the transformers library for general text generation, or refer to the full codebase for the complete deep search functionality.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "plageon/HierSearch-Planner-Agent"

# Load the tokenizer and model (bfloat16, placed automatically on available devices).
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, device_map="auto")

messages = [
    {"role": "system", "content": "You are a helpful and knowledgeable assistant specializing in enterprise search."},
    {"role": "user", "content": "What are the main findings of the paper 'HierSearch: A Hierarchical Enterprise Deep Search Framework Integrating Local and Web Searches'?"},
]

# Build the chat prompt and generate a response.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
generated_ids = model.generate(**model_inputs, max_new_tokens=512)

# Decode only the newly generated tokens, dropping the echoed prompt.
generated_ids = [out[len(inp):] for inp, out in zip(model_inputs.input_ids, generated_ids)]
decoded_output = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(decoded_output)
```
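Note that this snippet only performs standalone chat-style generation. The planner's actual deep search behavior, issuing local and web search actions and consuming the refined evidence, requires the retrieval environment and agent orchestration provided in the GitHub repository.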
## Citation
```bibtex
@misc{tan2025hiersearchhierarchicalenterprisedeep,
      title={HierSearch: A Hierarchical Enterprise Deep Search Framework Integrating Local and Web Searches},
      author={Jiejun Tan and Zhicheng Dou and Yan Yu and Jiehan Cheng and Qiang Ju and Jian Xie and Ji-Rong Wen},
      year={2025},
      eprint={2508.08088},
      archivePrefix={arXiv},
      primaryClass={cs.IR},
      url={https://arxiv.org/abs/2508.08088},
}
```