---
license: mit
datasets:
- Lyon28/datasets-caca-3500
language:
- id
tags:
- retrieval
- qa
- indonesian
- bm25
- tfidf
---

# Chatbot Caca - Retrieval-Based QA

A BM25 + TF-IDF based chatbot for Indonesian-language question answering (QA).

## Model Details

- **Type:** Retrieval-based QA System
- **Size:** 2.69 MB
- **Algorithm:** Hybrid BM25 + TF-IDF + Fuzzy Matching
- **Dataset:** datasets-caca-3500 (3,500 QA pairs)
- **Language:** Indonesian

## Usage

```python
# Install dependencies first (shell, or prefix with "!" in a notebook cell):
#   pip install rank-bm25 scikit-learn huggingface-hub

import pickle

from huggingface_hub import hf_hub_download

# Download the pickled model from the Hugging Face Hub
model_path = hf_hub_download(
    repo_id="Lyon28/caca-based-chatbot",
    filename="chatbot_caca.pkl"
)

# Load the model (pickle can execute code on load, so only load files you trust)
with open(model_path, 'rb') as f:
    data = pickle.load(f)

print(f"Loaded {len(data['qa_pairs'])} QA pairs!")
```

## Performance

- Query speed: < 10 ms per query
- Accuracy: High for paraphrase matching
- Memory: ~3 MB

## Credits

Created by Lyon28
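
## Example: Hybrid Scoring Sketch

The hybrid BM25 + TF-IDF + fuzzy matching approach listed under Model Details can be illustrated with a minimal, standard-library-only sketch. This is not the author's implementation: the toy `QA_PAIRS` data, the `answer` function, the score weights, and the rejection threshold are all assumptions made for illustration, and the actual layout of `data['qa_pairs']` inside `chatbot_caca.pkl` may differ.

```python
import math
from collections import Counter
from difflib import SequenceMatcher

# Hypothetical toy data; the real QA pair format in the pickle is not documented.
QA_PAIRS = [
    ("siapa nama kamu", "Nama saya Caca."),
    ("apa kabar hari ini", "Kabar saya baik, terima kasih."),
    ("jam berapa sekarang", "Maaf, saya tidak bisa melihat jam."),
]

def tokenize(text):
    return text.lower().split()

DOCS = [tokenize(q) for q, _ in QA_PAIRS]
N = len(DOCS)
AVGDL = sum(len(d) for d in DOCS) / N
DF = Counter(term for d in DOCS for term in set(d))  # document frequency

def bm25_scores(query, k1=1.5, b=0.75):
    """Okapi BM25 score of the query against every stored question."""
    scores = []
    for doc in DOCS:
        tf = Counter(doc)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (N - DF[term] + 0.5) / (DF[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc) / AVGDL)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

def tfidf_vector(tokens):
    """Sparse TF-IDF vector with smoothed IDF."""
    tf = Counter(tokens)
    return {t: c * (math.log((1 + N) / (1 + DF[t])) + 1) for t, c in tf.items()}

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def tfidf_scores(query):
    qv = tfidf_vector(tokenize(query))
    return [cosine(qv, tfidf_vector(d)) for d in DOCS]

def fuzzy_scores(query):
    """Character-level similarity, useful for typos."""
    return [SequenceMatcher(None, query.lower(), q).ratio() for q, _ in QA_PAIRS]

def normalize(scores):
    top = max(scores)
    return [s / top if top > 0 else 0.0 for s in scores]

def answer(query, weights=(0.5, 0.3, 0.2), threshold=0.3):
    """Return the best answer, or None if no question scores above threshold.

    The weights and threshold are illustrative assumptions, not tuned values.
    """
    parts = [normalize(bm25_scores(query)), tfidf_scores(query), fuzzy_scores(query)]
    combined = [sum(w * p[i] for w, p in zip(weights, parts)) for i in range(N)]
    best = max(range(N), key=combined.__getitem__)
    return QA_PAIRS[best][1] if combined[best] >= threshold else None

print(answer("nama kamu siapa"))  # → Nama saya Caca.
```

Word order does not affect the BM25 or TF-IDF scores, which is why the reordered query still matches the first stored question. A query that shares no words or characters with any stored question, such as `answer("xyz")`, falls below the threshold and returns `None`.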