---
title: Safe O Bot
emoji: π
colorFrom: red
colorTo: gray
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: true
short_description: Complete Moderation tool, blocking harmful links, spam, etc.
---

# Text Safety Analyzer: Multi-model pipeline

A Hugging Face Space / project template that analyzes input text for multiple safety signals:

- Harm/toxicity detection, including who is harmed (author, reader, or target), via a multi-model ensemble
- AI jailbreak / filter-bypass pattern detection (heuristics + optional model)
- Filter-obfuscation detection (homoglyphs, separators, zero-width characters)
- Hidden/obfuscated URL detection (heuristics + malicious-URL model)
- ASCII-art / low-entropy payload detection

This project intentionally focuses on **detection** and explanation. It does NOT provide ways to bypass safety protections.
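
As a rough illustration of the heuristic side of these signals, the sketch below flags defanged URLs, separator-split words, and low-entropy payloads. It is not the code in `classifier.py`: the function names, regular expressions, and the `low_entropy_threshold` default are assumptions chosen for the example.

```python
import math
import re
from collections import Counter

# Hypothetical pattern for "defanged" URLs such as hxxp://example[.]com
# or "example dot com". The real project may use different rules.
OBFUSCATED_URL = re.compile(
    r"hxxps?://|\[\.\]|\(\.\)|\s+dot\s+(?:com|net|org)\b", re.IGNORECASE
)

# Hypothetical pattern for words split by separators, e.g. "f.r.e.e".
SEPARATED_LETTERS = re.compile(r"\b(?:[A-Za-z][.\-_*]){3,}[A-Za-z]\b")


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of the character distribution, in bits."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def heuristic_flags(text: str, low_entropy_threshold: float = 2.0) -> dict:
    """Return one boolean per heuristic signal (threshold is an assumed default)."""
    return {
        "obfuscated_url": bool(OBFUSCATED_URL.search(text)),
        "separator_obfuscation": bool(SEPARATED_LETTERS.search(text)),
        # Only long inputs are checked, since short strings are naturally low entropy.
        "low_entropy": len(text) >= 20
        and shannon_entropy(text) < low_entropy_threshold,
    }
```

Calling `heuristic_flags("visit hxxp://evil[.]com now")` sets only the `obfuscated_url` flag, while a long run of one repeated character sets `low_entropy`. The threshold would need tuning against labeled examples.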

---

## Files

- `classifier.py`: Core pipeline (normalization, heuristics, multi-model inference, aggregation, and explanations).
- `app.py`: Gradio demo ready for Hugging Face Spaces.
- `requirements.txt`: Python dependencies.
- `examples/`: Not included by default; place labeled examples here for tuning thresholds and unit tests.

---

## Architecture & design

1. **Normalization step**: homoglyph mapping, zero-width removal, whitespace collapse.
2. **Heuristic detectors**: regex-based detection of obfuscated URLs, ASCII art, and jailbreak patterns, plus low-entropy checks.
3. **Model ensemble**: several models can be loaded for specific tasks:
   - Harm / toxicity models (English and multilingual)
   - Malicious-URL classifier
4. **Aggregation & explanation**: combine model outputs and heuristic flags, and present explainable reasons with model names and scores.
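
The first and last steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation in `classifier.py`: the homoglyph table is a tiny sample, and `normalize`, `aggregate`, and the `threshold` default are hypothetical names and values.

```python
import re

# Small illustrative homoglyph table (look-alike letters -> ASCII).
# A real table would be much larger; this one is an assumption.
HOMOGLYPHS = str.maketrans({
    "\u0430": "a",  # Cyrillic a
    "\u0435": "e",  # Cyrillic ie
    "\u043e": "o",  # Cyrillic o
    "\u0440": "p",  # Cyrillic er
    "\u0441": "c",  # Cyrillic es
    "\u03bf": "o",  # Greek omicron
})

# Zero-width and invisible characters often used to split filtered words.
ZERO_WIDTH = re.compile("[\u200b\u200c\u200d\u2060\ufeff]")


def normalize(text: str) -> str:
    """Remove zero-width characters, map homoglyphs, collapse whitespace."""
    text = ZERO_WIDTH.sub("", text)
    text = text.translate(HOMOGLYPHS)
    return re.sub(r"\s+", " ", text).strip()


def aggregate(model_scores: dict, flags: dict, threshold: float = 0.5) -> dict:
    """Combine per-model scores and heuristic flags into a verdict with reasons."""
    reasons = [
        f"{name} scored {score:.2f}"
        for name, score in model_scores.items()
        if score >= threshold
    ]
    reasons += [f"heuristic: {name}" for name, hit in flags.items() if hit]
    return {"flagged": bool(reasons), "reasons": reasons}
```

Running detectors on the normalized text rather than the raw input means a string such as `"p\u0430y\u200bp\u0430l"` is seen as `"paypal"`. The raw text is still worth keeping, because the presence of homoglyphs or zero-width characters is itself a signal.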

The app is intentionally modular: add models by editing `HARM_MODELS` or `URL_MODEL` in `classifier.py` and reloading.
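
The exact shape of `HARM_MODELS` and `URL_MODEL` is not shown here, so the sketch below assumes a list of model IDs and a single model ID. The IDs are placeholders, and `default_loader` and `load_models` are hypothetical helpers. It also assumes `transformers` is among the dependencies in `requirements.txt`.

```python
# Assumed registry shape; the real classifier.py may differ.
HARM_MODELS = [
    "your-org/toxicity-en",            # placeholder model ID
    "your-org/toxicity-multilingual",  # placeholder model ID
]
URL_MODEL = "your-org/malicious-url"   # placeholder model ID


def default_loader(model_id: str):
    """Load a text-classification pipeline, importing transformers lazily."""
    from transformers import pipeline  # assumed dependency

    return pipeline("text-classification", model=model_id)


def load_models(loader=default_loader) -> dict:
    """Return {model_id: classifier} for every registered model."""
    return {model_id: loader(model_id) for model_id in [*HARM_MODELS, URL_MODEL]}
```

Under this layout, adding a model is a one-line change to `HARM_MODELS`. Passing a stub `loader` lets tests exercise the registry without downloading weights.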

---

## How to run locally

1. Create a virtual environment and install dependencies:

```bash
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt