---
base_model:
- Qwen/Qwen2.5-7B-Instruct
language:
- en
- zh
license: mit
pipeline_tag: question-answering
library_name: transformers
tags:
- biology
- finance
- text-generation-inference
- retrieval-augmented-generation
---
# HierSearch: A Hierarchical Enterprise Deep Search Framework Integrating Local and Web Searches
## Model Information
We release the agent model used in HierSearch: A Hierarchical Enterprise Deep Search Framework Integrating Local and Web Searches.
Useful links: 📝 [Paper (arXiv)](https://arxiv.org/abs/2508.08088) • 🤗 Paper (Hugging Face) • 🧩 GitHub Repository
- We explore deep search in multi-knowledge-source scenarios, propose a hierarchical agentic paradigm, and train it with hierarchical reinforcement learning (HRL);
- We identify the drawbacks of naive information transmission among deep search agents and develop a knowledge refiner suited to multi-knowledge-source scenarios;
- Our approach enables reliable and effective deep search across multiple knowledge sources and outperforms existing baselines as well as the flat-RL solution in various domains (a simplified sketch of the hierarchical flow is shown after this list).
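The sketch below only illustrates the hierarchical idea described above; it is not the HierSearch implementation. The actual planner, local/web deep-search agents, knowledge refiner, prompts, and action formats are defined in the GitHub repository, and every function name here (`local_search`, `web_search`, `refine_evidence`) is a hypothetical placeholder.

```python
# Illustrative sketch only (hypothetical names, not the HierSearch code): a high-level
# planner routes a question to low-level local and web deep-search agents, a knowledge
# refiner filters the retrieved evidence, and the planner composes the final answer.

def local_search(query: str) -> list[str]:
    # Placeholder for the local (enterprise knowledge base) deep-search agent.
    return [f"[local evidence for: {query}]"]

def web_search(query: str) -> list[str]:
    # Placeholder for the web deep-search agent.
    return [f"[web evidence for: {query}]"]

def refine_evidence(evidence: list[str]) -> list[str]:
    # Placeholder for the knowledge refiner: drop duplicate or noisy snippets before
    # passing the evidence up to the planner agent.
    return list(dict.fromkeys(evidence))

def plan_and_answer(question: str) -> str:
    # The planner agent (this model) decides which low-level agents to call;
    # for illustration we simply query both knowledge sources.
    evidence = local_search(question) + web_search(question)
    evidence = refine_evidence(evidence)
    return f"Answer to {question!r}, grounded in {len(evidence)} refined evidence snippets."

print(plan_and_answer("Who is the current CTO of the company?"))
```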
🌹 If you use this model, please ✨star our GitHub repository or upvote our paper to support us. Your star means a lot!
## Usage
This model is designed as a "planner agent" within the HierSearch framework, coordinating local and web searches to answer complex questions. It is based on Qwen2.5-7B-Instruct. You can load and use it with the transformers library for general text generation, or refer to the full codebase for the complete deep search functionality.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "plageon/HierSearch-Planner-Agent"

# Load the tokenizer and model (bfloat16, placed automatically on available devices).
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, device_map="auto")

messages = [
    {"role": "system", "content": "You are a helpful and knowledgeable assistant specializing in enterprise search."},
    {"role": "user", "content": "What are the main findings of the paper 'HierSearch: A Hierarchical Enterprise Deep Search Framework Integrating Local and Web Searches'?"},
]

# Build the chat prompt and generate a response.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
generated_ids = model.generate(**model_inputs, max_new_tokens=512)

# Decode only the newly generated tokens, dropping the echoed prompt.
generated_ids = [out[len(inp):] for inp, out in zip(model_inputs.input_ids, generated_ids)]
decoded_output = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(decoded_output)
```
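Note that this snippet only performs standalone chat-style generation. The planner's actual deep search behavior, issuing local and web search actions and consuming the refined evidence, requires the retrieval environment and agent orchestration provided in the GitHub repository.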
## Citation
```bibtex
@misc{tan2025hiersearchhierarchicalenterprisedeep,
      title={HierSearch: A Hierarchical Enterprise Deep Search Framework Integrating Local and Web Searches},
      author={Jiejun Tan and Zhicheng Dou and Yan Yu and Jiehan Cheng and Qiang Ju and Jian Xie and Ji-Rong Wen},
      year={2025},
      eprint={2508.08088},
      archivePrefix={arXiv},
      primaryClass={cs.IR},
      url={https://arxiv.org/abs/2508.08088},
}
```