Safe-o-Bot / README.md

PatoFlamejanteTV

	---
	title: Safe O Bot
	emoji: 💂
	colorFrom: red
	colorTo: gray
	sdk: gradio
	sdk_version: 5.49.1
	app_file: app.py
	pinned: true
	short_description: Complete Moderation tool, blocking harmful links, spam, etc.
	---

	# Text Safety Analyzer — Multi-model pipeline

	A Hugging Face Space / project template that analyzes input text for multiple safety signals:
	- Harm/toxicity detection (who is harmed: author, reader, or target — via multi-model ensemble)
	- AI jailbreak / filter-bypass pattern detection (heuristics + optional model)
	- Filter-obfuscation detection (homoglyphs, separators, zero-width)
	- Hidden/obfuscated URL detection (heuristics + malicious-URL model)
	- ASCII-art / low-entropy payload detection

	This project intentionally focuses on detection and explanation. It does NOT provide ways to bypass safety protections.
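
Two of the signals listed above, zero-width characters and low-entropy payloads, can be illustrated with the standard library alone. The following is a minimal sketch, not the code in `classifier.py`: the function names, the 20-character minimum, and the 2.0 bits-per-character threshold are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter

# Invisible characters commonly used to split words past keyword filters.
ZERO_WIDTH = re.compile("[\u200b\u200c\u200d\u2060\ufeff]")


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def obfuscation_signals(text: str, min_entropy: float = 2.0) -> dict:
    """Collect simple heuristic flags (hypothetical helper).

    The 2.0 threshold and the 20-character minimum are illustrative guesses
    and would need tuning on labeled examples.
    """
    return {
        "zero_width": bool(ZERO_WIDTH.search(text)),
        "low_entropy": len(text) >= 20 and shannon_entropy(text) < min_entropy,
    }
```

Text built from very few distinct characters, such as banner-style ASCII art, scores low on entropy, while ordinary prose usually scores well above the threshold assumed here.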

	---

	## Files
	- `classifier.py` — Core pipeline: normalization, heuristics, multi-model inference, aggregation and explanations.
	- `app.py` — Gradio demo ready for Hugging Face Spaces.
	- `requirements.txt` — Python dependencies.
	- `examples/` — Not included by default; add labeled examples here for threshold tuning and unit tests.

	---

	## Architecture & design
	1. Normalization step — homoglyph mapping, zero-width removal, whitespace collapse.
	2. Heuristic detectors — regex-based detection of obfuscated URLs, ASCII art, and jailbreak patterns, plus low-entropy checks.
	3. Model ensemble — several models can be loaded for specific tasks:
	   - Harm / toxicity models (English and multilingual)
	   - Malicious-URL classifier
	4. Aggregation & explanation — combine model outputs with heuristic flags and report the reasons behind each result, including model names and scores.
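
Steps 1 and 2 can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's implementation: the homoglyph table is a tiny Cyrillic-to-Latin subset, the regexes are deliberately coarse (they will also match things like file names), and `normalize` and `find_hidden_urls` are hypothetical names.

```python
import re
import unicodedata

# Illustrative subset: Cyrillic letters that resemble Latin a, e, o, p, c, x.
HOMOGLYPHS = str.maketrans({
    "\u0430": "a", "\u0435": "e", "\u043e": "o",
    "\u0440": "p", "\u0441": "c", "\u0445": "x",
})
ZERO_WIDTH = re.compile("[\u200b\u200c\u200d\u2060\ufeff]")
# "Defanged" dots such as example[.]com or example(dot)com.
DEFANGED_DOT = re.compile(r"\s*(?:\[\.\]|\(\.\)|\[dot\]|\(dot\))\s*", re.IGNORECASE)
# Coarse URL-like pattern: optional scheme, dotted labels, alphabetic TLD.
URL_LIKE = re.compile(
    r"\b(?:https?://)?[a-z0-9-]+(?:\.[a-z0-9-]+)*\.[a-z]{2,}\b", re.IGNORECASE
)


def normalize(text: str) -> str:
    """Apply NFKC, strip zero-width characters, map homoglyphs, collapse whitespace."""
    text = unicodedata.normalize("NFKC", text)
    text = ZERO_WIDTH.sub("", text)
    text = text.translate(HOMOGLYPHS)
    return re.sub(r"\s+", " ", text).strip()


def find_hidden_urls(raw: str) -> list[str]:
    """Return URL-like strings that only become visible after normalization."""
    cleaned = DEFANGED_DOT.sub(".", normalize(raw))
    visible = set(URL_LIKE.findall(raw))
    return [url for url in URL_LIKE.findall(cleaned) if url not in visible]
```

Comparing matches before and after normalization is one way to separate an ordinary link from one that was deliberately disguised; only the latter is reported here.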

	The app is intentionally modular: to add models, edit `HARM_MODELS` or `URL_MODEL` in `classifier.py` and reload the app.
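
To make the modular design and the aggregation step concrete, here is a hedged sketch. The README names `HARM_MODELS` and `URL_MODEL` but does not show their shape, so the list-of-IDs layout, the placeholder model IDs, the `Scorer` callable, and the 0.5 threshold below are all assumptions.

```python
from typing import Callable

# Assumed shape; the real classifier.py may store these differently.
# The model IDs are placeholders, not real repositories.
HARM_MODELS = ["example-org/toxicity-en", "example-org/toxicity-multilingual"]
URL_MODEL = "example-org/malicious-url"

# Hypothetical scorer signature: (model_id, text) -> score in [0, 1].
Scorer = Callable[[str, str], float]


def aggregate(
    text: str, flags: dict[str, bool], score: Scorer, threshold: float = 0.5
) -> dict:
    """Combine heuristic flags and per-model scores into an explainable result."""
    reasons = [f"heuristic:{name}" for name, hit in flags.items() if hit]
    scores = {model_id: score(model_id, text) for model_id in HARM_MODELS}
    reasons += [
        f"{model_id} scored {value:.2f}"
        for model_id, value in scores.items()
        if value >= threshold
    ]
    return {"flagged": bool(reasons), "scores": scores, "reasons": reasons}
```

Under this layout, adding a model means appending one ID to `HARM_MODELS`; the aggregation loop picks it up without further changes. In a real Space the scorer would presumably wrap a model inference call, which is left out here so the sketch runs without downloading anything.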

	---

	## How to run locally
	1. Create a virtual environment and install dependencies:
	```bash
	python -m venv .venv
	source .venv/bin/activate
	pip install -r requirements.txt