| # Advanced RAG Chatbot - User Guide | |
| ## What's New? | |
| ### 1. Multiple Images & Texts Support in `/index` API | |
| The `/index` endpoint now supports indexing multiple texts and images in a single request (max 10 each). | |
| **Before:** | |
| ```python | |
| # Old: Only 1 text and 1 image | |
| data = { | |
| 'id': 'doc1', | |
| 'text': 'Single text', | |
| } | |
| files = {'image': open('image.jpg', 'rb')} | |
| ``` | |
| **After:** | |
| ```python | |
| import requests | |
| # New: Multiple texts and images (max 10 each) | |
| data = { | |
| 'id': 'doc1', | |
| 'texts': ['Text 1', 'Text 2', 'Text 3'], # Up to 10 | |
| } | |
| files = [ | |
| ('images', open('image1.jpg', 'rb')), | |
| ('images', open('image2.jpg', 'rb')), | |
| ('images', open('image3.jpg', 'rb')), # Up to 10 | |
| ] | |
| response = requests.post('http://localhost:8000/index', data=data, files=files) | |
| ``` | |
| **Example with cURL:** | |
| ```bash | |
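| # The text values are Vietnamese: "Music event in Hanoi", "Taking place on 20/10/2025", | |
| # "Venue: National Convention Center". The image file names are placeholders. | |
| curl -X POST "http://localhost:8000/index" \ | |
|   -F "id=event123" \ | |
|   -F "texts=Sự kiện âm nhạc tại Hà Nội" \ | |
|   -F "texts=Diễn ra vào ngày 20/10/2025" \ | |
|   -F "texts=Địa điểm: Trung tâm Hội nghị Quốc gia" \ | |
|   -F "images=@image1.jpg" \ | |
|   -F "images=@image2.jpg" \ | |
|   -F "images=@image3.jpg" | |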
| ``` | |
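The 10-item limits can also be checked on the client before anything is uploaded, and the image files closed once the request finishes. The sketch below is one way to do that. `MAX_ITEMS`, `build_index_request`, and `post_index` are hypothetical helper names, not part of the API, and the sketch assumes the server reads repeated `texts` and `images` form fields as in the examples above.

```python
from contextlib import ExitStack

MAX_ITEMS = 10  # limit stated in this guide for both texts and images


def build_index_request(doc_id, texts=(), image_paths=()):
    """Check the limits and return (form data, image paths) for /index."""
    texts, image_paths = list(texts), list(image_paths)
    if len(texts) > MAX_ITEMS:
        raise ValueError(f"at most {MAX_ITEMS} texts per request, got {len(texts)}")
    if len(image_paths) > MAX_ITEMS:
        raise ValueError(f"at most {MAX_ITEMS} images per request, got {len(image_paths)}")
    return {"id": doc_id, "texts": texts}, image_paths


def post_index(base_url, doc_id, texts=(), image_paths=()):
    """Send the request; every image file is closed when the with-block exits."""
    import requests  # imported here so the validation above needs no third-party package

    data, image_paths = build_index_request(doc_id, texts, image_paths)
    with ExitStack() as stack:
        files = [("images", stack.enter_context(open(path, "rb"))) for path in image_paths]
        return requests.post(f"{base_url}/index", data=data, files=files, timeout=30)
```

A call such as `post_index('http://localhost:8000', 'doc1', ['Text 1', 'Text 2'], ['image1.jpg'])` then fails fast with a `ValueError` instead of waiting for the server to reject an oversized request.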
| ### 2. Advanced RAG Pipeline in `/chat` API | |
| The `/chat` endpoint now runs a multi-stage RAG pipeline to improve response quality: | |
| #### Key Improvements: | |
| 1. **Query Expansion**: Automatically generates variants of your question | |
| 2. **Multi-Query Retrieval**: Runs a search for each query variant | |
| 3. **Reranking**: Re-scores the retrieved results by relevance | |
| 4. **Contextual Compression**: Keeps only the most relevant parts of each result | |
| 5. **Better Prompt Engineering**: Uses prompts optimized for the LLM | |
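To make the five stages concrete, here is a toy, self-contained sketch of how they can fit together. It is not the server's implementation: word overlap stands in for embedding similarity, query expansion uses a hypothetical hand-written synonym table, and every function name is made up for illustration. Only the keys of the returned statistics are modelled on the `rag_stats` field described in this guide.

```python
import re

# Hypothetical synonym table; a real system might ask an LLM for paraphrases.
SYNONYMS = {"when": ["date"], "where": ["venue", "location"]}
STOPWORDS = {"a", "and", "at", "is", "the", "to"}


def tokens(text):
    """Lowercased word set without a few stopwords."""
    return set(re.findall(r"\w+", text.lower())) - STOPWORDS


def score(query, text):
    """Toy relevance in [0, 1]: share of query tokens that occur in the text."""
    q = tokens(query)
    return len(q & tokens(text)) / len(q) if q else 0.0


def expand_query(query):
    """Query expansion: the original query plus one variant per synonym."""
    variants, query_tokens = [query], tokens(query)
    for word, alternatives in SYNONYMS.items():
        if word in query_tokens:
            variants += [f"{query} {alt}" for alt in alternatives]
    return variants


def retrieve(queries, docs, top_k):
    """Multi-query retrieval: union of the top_k matches of every variant."""
    found = {}
    for q in queries:
        ranked = sorted(docs, key=lambda doc_id: score(q, docs[doc_id]), reverse=True)
        for doc_id in ranked[:top_k]:
            if score(q, docs[doc_id]) > 0:
                found[doc_id] = docs[doc_id]
    return found


def rerank(query, found, score_threshold, top_k):
    """Reranking: re-score against the original query and apply the threshold."""
    scored = [(score(query, text), doc_id) for doc_id, text in found.items()]
    scored = [pair for pair in scored if pair[0] >= score_threshold]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:top_k]]


def compress(query, text):
    """Contextual compression: keep sentences that share a token with the query."""
    sentences = re.split(r"(?<=[.!?])\s+", text)
    return " ".join(s for s in sentences if tokens(query) & tokens(s))


def build_prompt(query, passages):
    """Prompt construction: numbered context passages followed by the question."""
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, 1))
    return f"Answer using only the context below.\n{context}\nQuestion: {query}"


def run_pipeline(query, docs, top_k=5, score_threshold=0.5):
    """Run all stages and return (prompt, per-stage statistics)."""
    variants = expand_query(query)
    found = retrieve(variants, docs, top_k)
    ranked = rerank(query, found, score_threshold, top_k)
    passages = [p for p in (compress(query, docs[d]) for d in ranked) if p]
    stats = {
        "original_query": query,
        "expanded_queries": variants[1:],
        "initial_results": len(found),
        "after_rerank": len(ranked),
        "after_compression": len(passages),
    }
    return build_prompt(query, passages), stats
```

Calling `run_pipeline(question, docs)` with `docs` as a dict of document ID to text returns the final prompt and the statistics for each stage, which makes it easy to see where documents are dropped when a threshold is changed.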
| #### How to Use: | |
| **Basic Usage (Auto-enabled):** | |
| ```python | |
| import requests | |
| response = requests.post('http://localhost:8000/chat', json={ | |
| 'message': 'Dao có nguy hiểm không?', # "Are knives dangerous?" | |
| 'use_rag': True, | |
| 'use_advanced_rag': True, # Default: True | |
| 'hf_token': 'hf_xxxxx' | |
| }) | |
| result = response.json() | |
| print("Response:", result['response']) | |
| print("RAG Stats:", result['rag_stats']) # See pipeline statistics | |
| ``` | |
| **Advanced Configuration:** | |
| ```python | |
| response = requests.post('http://localhost:8000/chat', json={ | |
| 'message': 'Làm sao để tạo event mới?', # "How do I create a new event?" | |
| 'use_rag': True, | |
| 'use_advanced_rag': True, | |
| # RAG Pipeline Options | |
| 'use_query_expansion': True, # Expand query with variations | |
| 'use_reranking': True, # Rerank results | |
| 'use_compression': True, # Compress context | |
| 'score_threshold': 0.5, # Min relevance score (0-1) | |
| 'top_k': 5, # Number of documents to retrieve | |
| # LLM Options | |
| 'max_tokens': 512, | |
| 'temperature': 0.7, | |
| 'hf_token': 'hf_xxxxx' | |
| }) | |
| ``` | |
| **Disable Advanced RAG (Use Basic):** | |
| ```python | |
| response = requests.post('http://localhost:8000/chat', json={ | |
| 'message': 'Your question', | |
| 'use_rag': True, | |
| 'use_advanced_rag': False, # Use basic RAG | |
| }) | |
| ``` | |
| ## API Changes Summary | |
| ### `/index` Endpoint | |
| **Old Parameters:** | |
| - `id`: str (required) | |
| - `text`: str (required) | |
| - `image`: UploadFile (optional) | |
| **New Parameters:** | |
| - `id`: str (required) | |
| - `texts`: List[str] (optional, max 10) | |
| - `images`: List[UploadFile] (optional, max 10) | |
| **Response:** | |
| ```json | |
| { | |
| "success": true, | |
| "id": "doc123", | |
| "message": "Đã index thành công document doc123 với 3 texts và 2 images" | |
| } | |
| ``` | |
| The `message` value is returned in Vietnamese; here it reads "Successfully indexed document doc123 with 3 texts and 2 images". | |
| ### `/chat` Endpoint | |
| **New Parameters:** | |
| - `use_advanced_rag`: bool (default: True) - Enable advanced RAG | |
| - `use_query_expansion`: bool (default: True) - Expand query | |
| - `use_reranking`: bool (default: True) - Rerank results | |
| - `use_compression`: bool (default: True) - Compress context | |
| - `score_threshold`: float (default: 0.5) - Min relevance score | |
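One way to keep these defaults in a single place on the client is a small request object. This is a sketch only: `ChatRequest` and `to_payload` are hypothetical names, the `use_rag` default is an assumption (the examples in this guide always send `True`), and the 0-1 range check follows the `score_threshold` comment in the Advanced Configuration example.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ChatRequest:
    message: str
    use_rag: bool = True  # assumption: not a documented default
    use_advanced_rag: bool = True
    use_query_expansion: bool = True
    use_reranking: bool = True
    use_compression: bool = True
    score_threshold: float = 0.5
    top_k: Optional[int] = None  # left out of the payload unless set
    hf_token: Optional[str] = None  # left out of the payload unless set

    def to_payload(self):
        """Return the JSON body, rejecting a threshold outside the 0-1 range."""
        if not 0.0 <= self.score_threshold <= 1.0:
            raise ValueError("score_threshold must be between 0 and 1")
        return {key: value for key, value in asdict(self).items() if value is not None}
```

The result can be passed directly as `json=ChatRequest(message='Your question').to_payload()` in the `requests.post` calls shown earlier.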
| **Response (New):** | |
| ```json | |
| { | |
| "response": "AI generated answer...", | |
| "context_used": [...], | |
| "timestamp": "2025-10-29T...", | |
| "rag_stats": { | |
| "original_query": "Your question", | |
| "expanded_queries": ["Query variant 1", "Query variant 2"], | |
| "initial_results": 10, | |
| "after_rerank": 5, | |
| "after_compression": 5 | |
| } | |
| } | |
| ``` | |
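The `rag_stats` object is easiest to read as a funnel. The helper below is a minimal sketch that prints one line per stage and adds a hint when a stage is empty. `summarize_rag_stats` is a hypothetical name, and it assumes the keys shown in the response above are present, falling back to 0 when they are not.

```python
def summarize_rag_stats(stats):
    """Return one line per pipeline stage, plus a hint when a stage is empty."""
    stages = [
        ("retrieved", stats.get("initial_results", 0)),
        ("after reranking", stats.get("after_rerank", 0)),
        ("after compression", stats.get("after_compression", 0)),
    ]
    lines = [f"query variants: {len(stats.get('expanded_queries', []))}"]
    lines += [f"{label}: {count}" for label, count in stages]
    if any(count == 0 for _, count in stages):
        lines.append("hint: a stage returned 0 documents; score_threshold may be too high")
    return lines
```

For example, `print('\n'.join(summarize_rag_stats(result['rag_stats'])))` after a `/chat` call.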
| ## Complete Examples | |
| ### Example 1: Index Multiple Social Media Posts | |
| ```python | |
| import requests | |
| # Index a social media event with multiple posts and images | |
| data = { | |
| 'id': 'event_festival_2025', | |
| 'texts': [ | |
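| 'Festival âm nhạc quốc tế Hà Nội 2025', # "Hanoi International Music Festival 2025" | |
| 'Ngày 15-17 tháng 11 năm 2025', # "15-17 November 2025" | |
| 'Địa điểm: Công viên Thống Nhất', # "Venue: Thong Nhat Park" | |
| 'Line-up: Sơn Tùng MTP, Đen Vâu, Hoàng Thùy Linh', # performer names | |
| 'Giá vé từ 500.000đ - 2.000.000đ' # "Tickets from 500,000 to 2,000,000 VND" | |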
| ] | |
| } | |
| files = [ | |
| ('images', open('poster_festival.jpg', 'rb')), | |
| ('images', open('lineup.jpg', 'rb')), | |
| ('images', open('venue_map.jpg', 'rb')) | |
| ] | |
| response = requests.post('http://localhost:8000/index', data=data, files=files) | |
| print(response.json()) | |
| ``` | |
| ### Example 2: Advanced RAG Chat | |
| ```python | |
| import requests | |
| # Chat with advanced RAG | |
| chat_response = requests.post('http://localhost:8000/chat', json={ | |
| 'message': 'Festival âm nhạc Hà Nội diễn ra khi nào và ở đâu?', # "When and where does the Hanoi music festival take place?" | |
| 'use_rag': True, | |
| 'use_advanced_rag': True, | |
| 'top_k': 3, | |
| 'score_threshold': 0.6, | |
| 'hf_token': 'your_hf_token_here' | |
| }) | |
| result = chat_response.json() | |
| print("Answer:", result['response']) | |
| print("\nRetrieved Context:") | |
| for ctx in result['context_used']: | |
|     print(f"- [{ctx['id']}] Confidence: {ctx['confidence']:.2%}") | |
| print("\nRAG Pipeline Stats:") | |
| print(f"- Original query: {result['rag_stats']['original_query']}") | |
| print(f"- Query variants: {result['rag_stats']['expanded_queries']}") | |
| print(f"- Documents retrieved: {result['rag_stats']['initial_results']}") | |
| print(f"- After reranking: {result['rag_stats']['after_rerank']}") | |
| ``` | |
| ## Performance Comparison | |
| | Feature | Basic RAG | Advanced RAG | | |
| |---------|-----------|--------------| | |
| | Query Understanding | Single query | Multiple query variants | | |
| | Retrieval Method | Direct vector search | Multi-query vector search | |
| | Result Ranking | Score from DB | Reranked with semantic similarity | | |
| | Context Quality | Full text | Compressed, relevant parts only | | |
| | Response Accuracy | Baseline | Higher, especially on complex or ambiguous queries | |
| | Response Time | Faster | Slower, because of the extra expansion, reranking, and compression steps | |
| ## When to Use What? | |
| **Use Basic RAG when:** | |
| - You need fast response time | |
| - Queries are straightforward | |
| - Context is already well-structured | |
| **Use Advanced RAG when:** | |
| - You need higher accuracy | |
| - Queries are complex or ambiguous | |
| - Context documents are long | |
| - You want better relevance | |
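If it is unclear which mode fits your data, sending the same question in both modes and timing each call is a quick check. The sketch below assumes a caller-supplied `send` function that posts a payload to `/chat` and returns the decoded JSON. `compare_modes` and `send` are hypothetical names introduced here.

```python
import time


def compare_modes(send, message):
    """Send the same question in basic and advanced mode and time each call."""
    report = {}
    for name, advanced in (("basic", False), ("advanced", True)):
        payload = {"message": message, "use_rag": True, "use_advanced_rag": advanced}
        start = time.perf_counter()
        result = send(payload)
        report[name] = {
            "seconds": time.perf_counter() - start,
            "response": result.get("response"),
            "contexts": len(result.get("context_used", [])),
        }
    return report
```

With `requests`, `send` can be as small as `lambda payload: requests.post('http://localhost:8000/chat', json=payload).json()`. A single call is a noisy measurement, so repeat it over several representative questions before drawing conclusions.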
| ## Troubleshooting | |
| ### Error: "Tối đa 10 texts" | |
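| The message means "Maximum 10 texts": the request contains more than 10 texts. Send at most 10 per request. | |
| ### Error: "Tối đa 10 images" | |
| The message means "Maximum 10 images": the request contains more than 10 images. Send at most 10 per request. | |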
| ### RAG stats show 0 results | |
| Your `score_threshold` may be too high, so every retrieved document is filtered out. Try lowering it, for example to a value in the 0.3-0.5 range. | |
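Lowering the threshold can be automated by retrying with progressively smaller values until the response contains context. This is a sketch under assumptions: `chat_with_backoff` and `send` are hypothetical names, `send` is any function that posts a payload to `/chat` and returns the decoded JSON, and an empty `context_used` list is taken to mean that nothing passed the threshold.

```python
def chat_with_backoff(send, message, thresholds=(0.5, 0.4, 0.3)):
    """Retry with lower score thresholds; return (last result, threshold used)."""
    if not thresholds:
        raise ValueError("thresholds must not be empty")
    result, used = None, None
    for used in thresholds:
        result = send({
            "message": message,
            "use_rag": True,
            "use_advanced_rag": True,
            "score_threshold": used,
        })
        if result.get("context_used"):
            break  # some context survived the filter, stop lowering
    return result, used
```

If even the lowest threshold returns no context, the last response is returned unchanged, which suggests the problem is elsewhere, for example that no matching document has been indexed.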
| ## Next Steps | |
| To further improve RAG, consider: | |
| 1. **Add BM25 Hybrid Search**: Combine dense + sparse retrieval | |
| 2. **Use Cross-Encoder for Reranking**: Better than embedding similarity | |
| 3. **Implement Query Decomposition**: Break complex queries into sub-queries | |
| 4. **Add Citation/Source Tracking**: Show which document each fact comes from | |
| 5. **Integrate RAG-Anything**: For advanced multimodal document processing | |
| For RAG-Anything integration (more complex), see: https://github.com/HKUDS/RAG-Anything | |
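For the first item in the list, one common way to combine a dense (vector) ranking with a sparse (BM25) ranking is reciprocal rank fusion, which needs only the two ordered ID lists and no score normalization. The sketch below shows the fusion step alone. It is not part of the current service, the function name is illustrative, and producing the BM25 ranking itself is left out.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of document IDs; k=60 is the commonly used constant."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every document it contains.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

Documents that appear near the top of both lists end up first, while a document found by only one retriever still stays in the fused result.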