---
sdk: gradio
python_version: "3.10"
app_file: app.py
---

# Farsi Audio Chatbot

This is a Gradio-based application that lets users speak in Farsi, receive a response from a chatbot, and hear that response as Farsi audio.

## Prerequisites

- Python 3.8 or higher

## Installation

1. Clone this repository.
2. Create and activate a virtual environment.
3. Install dependencies: `pip install -r requirements.txt`
4. Run the application: `python app.py`

## How It Works

- **Speech-to-Text (STT)**: Uses [Whisper small](https://huggingface.co/openai/whisper-small) to convert Farsi speech to text.
- **Natural Language Processing (NLP)**: Uses [HooshvareLab/gpt2-fa](https://huggingface.co/HooshvareLab/gpt2-fa) to generate Farsi text responses.
- **Text-to-Speech (TTS)**: Uses [edge-tts](https://github.com/rany2/edge-tts) with the `fa-IR-FaridNeural` voice for Farsi audio output.

## Deployment on Hugging Face Spaces

To deploy on [Hugging Face Spaces](https://huggingface.co/spaces):

1. Create a new Space.
2. Upload this repository, including `requirements.txt`, `app.py`, and `README.md`.
3. Ensure the Space has sufficient resources (at least 2 GB RAM; a GPU is optional).
4. The app will build and run automatically.

**Note**: The current version processes audio inputs discretely (via button click). Continuous streaming would require additional work, such as real-time processing of audio chunks.

## Limitations

- Whisper small may have reduced accuracy in noisy environments.
- GPT2-fa is suitable for short responses but may struggle with complex conversations.
- Continuous audio streaming is not yet implemented.

## Citations

- Whisper (Speech-to-Text): [openai/whisper-small](https://huggingface.co/openai/whisper-small)
- Chatbot (NLP): [HooshvareLab/gpt2-fa](https://huggingface.co/HooshvareLab/gpt2-fa)
- edge-tts (Text-to-Speech): [edge-tts](https://github.com/rany2/edge-tts)
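
## Pipeline Sketch

The three stages listed under "How It Works" can be chained roughly as shown here. This is a minimal sketch and not necessarily how `app.py` is written: the function names (`transcribe`, `generate_reply`, `synthesize`, `respond`), the generation settings, and the `reply.mp3` output path are assumptions made for illustration. Only the model IDs and the voice name come from this README.

```python
import asyncio
from functools import lru_cache

VOICE = "fa-IR-FaridNeural"  # voice named in this README


@lru_cache(maxsize=1)
def _asr():
    # Heavy dependency, imported lazily and loaded once.
    from transformers import pipeline
    return pipeline("automatic-speech-recognition", model="openai/whisper-small")


@lru_cache(maxsize=1)
def _generator():
    from transformers import pipeline
    return pipeline("text-generation", model="HooshvareLab/gpt2-fa")


def transcribe(audio_path: str) -> str:
    """Farsi speech -> text. Forcing the language is an assumption;
    Whisper can also auto-detect it."""
    result = _asr()(
        audio_path,
        generate_kwargs={"language": "persian", "task": "transcribe"},
    )
    return result["text"].strip()


def generate_reply(prompt: str) -> str:
    """Farsi text -> Farsi reply. Sampling settings are illustrative."""
    out = _generator()(
        prompt,
        max_new_tokens=50,
        do_sample=True,
        top_p=0.95,
        return_full_text=False,  # return only the continuation, not the prompt
    )
    return out[0]["generated_text"].strip()


def synthesize(text: str, out_path: str = "reply.mp3") -> str:
    """Farsi text -> MP3 file via edge-tts (requires network access)."""
    import edge_tts
    asyncio.run(edge_tts.Communicate(text, VOICE).save(out_path))
    return out_path


def respond(audio_path, stt=transcribe, nlp=generate_reply, tts=synthesize):
    """Run STT -> NLP -> TTS; returns (reply_text, reply_audio_path).
    The stages are injectable so each one can be swapped or stubbed."""
    text = stt(audio_path)
    reply = nlp(text)
    return reply, tts(reply)
```

A function like `respond` could serve as the handler behind the button click, taking the recorded audio file path and returning the reply text and audio path. Because the stages are passed in as parameters, the chain can be exercised with stub functions without downloading any models.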