---
title: AI Evaluation Dashboard
emoji: 📊
colorFrom: blue
colorTo: indigo
sdk: docker
pinned: false
app_port: 3000
---

# AI Evaluation Dashboard

This repository is a Next.js application for viewing and authoring AI evaluations. It provides a comprehensive platform for documenting and sharing AI system evaluations across multiple dimensions, including capabilities and risks.

## Project Goals

The AI Evaluation Dashboard aims to:

- **Standardize AI evaluation reporting** across different AI systems and models
- **Facilitate transparency** by providing detailed evaluation cards for AI systems
- **Enable comparative analysis** of AI capabilities and risks
- **Support research and policy** by consolidating evaluation data in an accessible format
- **Promote responsible AI development** through comprehensive risk assessment

## For External Collaborators

### Making Changes to Evaluation Categories and Schema

All evaluation categories, form fields, and data structures are centrally managed in the `schema/` folder. **This is the primary location for making structural changes to the evaluation framework.**

Key schema files:

- **`schema/evaluation-schema.json`** - Defines all evaluation categories (capabilities and risks)
- **`schema/output-schema.json`** - Defines the complete data structure for evaluation outputs
- **`schema/system-info-schema.json`** - Defines form field options for system information
- **`schema/category-details.json`** - Contains detailed descriptions and criteria for each category
- **`schema/form-hints.json`** - Provides help text and guidance for form fields

### Standards and Frameworks Used

The evaluation framework is based on established standards:

- **Risk categories** are derived from **NIST AI 600-1** (the Generative AI Profile of the NIST AI Risk Management Framework)
- **Capability categories** are based on the **OECD Framework for the Classification of AI Systems**

This ensures consistency with international AI governance standards and facilitates interoperability with other evaluation systems.

### Contributing Evaluation Data

Evaluation data files are stored in `public/evaluations/` as JSON files. Each file represents a complete evaluation of an AI system and must conform to the schema defined in `schema/output-schema.json`.

To add a new evaluation:

1. Create a new JSON file in `public/evaluations/`
2. Follow the structure defined in `schema/output-schema.json`
3. Ensure all required fields are populated
4. Validate against the schema before submission (a worked example appears at the end of this README)

## Development Setup

### Run locally

Install dependencies and run the dev server:

```bash
npm ci
npm run dev
```

Build for production and run:

```bash
npm ci
npm run build
NODE_ENV=production PORT=3000 npm run start
```

### Docker (recommended for Hugging Face Spaces)

A `Dockerfile` is included for deploying this app as a dynamic service on Hugging Face Spaces (Docker runtime).

Build the image locally:

```bash
docker build -t ai-eval-dashboard .
```

Run the container (expose port 3000):

```bash
docker run -p 3000:3000 -e HF_TOKEN="$HF_TOKEN" ai-eval-dashboard
```

Visit `http://localhost:3000` to verify.

### Deploy to Hugging Face Spaces

1. Create a new Space at https://huggingface.co/new-space and choose **Docker** as the runtime.
2. Push this repository to the Space's Git remote (or upload files through the UI).

The Space will build the Docker image from the included `Dockerfile` and serve the app on port 3000.

Notes:

- If your build needs native dependencies (e.g. `sharp`), the Docker image may require extra apt packages; update the `Dockerfile` accordingly (a sketch follows below).
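
As a rough sketch only: when `sharp` or another native module fails to build inside the image, the usual cause is missing system libraries such as libvips. The exact packages depend on your base image and dependencies; the commands below assume a Debian/Ubuntu-based image and would typically sit inside a `RUN` instruction in the `Dockerfile`.

```bash
# Hypothetical additions for a Debian/Ubuntu-based image; adjust the package
# names to whatever your native dependencies actually require.
apt-get update
apt-get install -y --no-install-recommends build-essential libvips-dev
rm -rf /var/lib/apt/lists/*   # clean the apt cache to keep the image small
```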
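
## Validating an Evaluation File (Example)

The repository does not prescribe a particular validation tool for step 4 of "Contributing Evaluation Data", so the following is just one possible approach: using the `ajv-cli` package (not a dependency of this project) to check an evaluation file against `schema/output-schema.json`. The file name `my-evaluation.json` is a placeholder.

```bash
# Install the standalone Ajv CLI (assumption: any JSON Schema validator works here)
npm install -g ajv-cli

# Validate a hypothetical evaluation file against the output schema;
# pass --spec=draft2020 (or similar) if the schema declares a newer JSON Schema draft.
ajv validate -s schema/output-schema.json -d public/evaluations/my-evaluation.json
```

If the file fails validation, the CLI prints the offending paths, which makes it easier to fix missing required fields before submitting.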