
API Documentation - Vocal Articulation Assessment v2.0

Swagger/OpenAPI Documentation

This API is built with FastAPI, which generates interactive documentation automatically.

Accessing the Documentation

Once the application is running, the documentation is available at:

1. Swagger UI (Recommended)

https://huggingface.co/spaces/Cyberlace/latihan-artikulasi/docs

or locally:

http://localhost:7860/docs

Features:

  • Interactive API testing
  • Try out endpoints directly from the browser
  • Request/Response schemas
  • Parameter descriptions

2. ReDoc (Alternative Documentation)

https://huggingface.co/spaces/Cyberlace/latihan-artikulasi/redoc

or locally:

http://localhost:7860/redoc

Features:

  • Clean, readable documentation
  • Deep linking
  • Better for reading

3. OpenAPI JSON Schema

https://huggingface.co/spaces/Cyberlace/latihan-artikulasi/openapi.json

or locally:

http://localhost:7860/openapi.json
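
Because the schema is plain JSON, it can also be inspected from a script. The sketch below assumes the standard OpenAPI layout (a top-level `paths` object keyed by path, then by HTTP method). `list_operations` is a hypothetical helper name, and `sample_schema` is an illustrative stand-in for the real `/openapi.json` response, not a copy of it.

```python
HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


def list_operations(schema: dict) -> list[tuple[str, str, str]]:
    """Return (METHOD, path, summary) for every operation in an OpenAPI schema."""
    operations = []
    for path, item in schema.get("paths", {}).items():
        for method, operation in item.items():
            if method.lower() not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            operations.append((method.upper(), path, operation.get("summary", "")))
    # Sort by path, then method, so the listing is stable
    return sorted(operations, key=lambda op: (op[1], op[0]))


# Illustrative stand-in for the parsed /openapi.json body (assumed values)
sample_schema = {
    "paths": {
        "/health": {"get": {"summary": "Health check"}},
        "/score": {"post": {"summary": "Score single audio file"}},
    }
}

for method, path, summary in list_operations(sample_schema):
    print(f"{method:5} {path:10} {summary}")
```

With the real service, the parsed body of a GET request to the `/openapi.json` URL above would be passed in place of `sample_schema`.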

Quick API Overview

Base URL

https://huggingface.co/spaces/Cyberlace/latihan-artikulasi

Endpoints

| Method | Endpoint | Description | Tags |
|--------|----------|-------------|------|
| GET | / | API information | General |
| GET | /health | Health check & model status | System |
| GET | /levels | List all articulation levels | Articulation |
| POST | /score | Score single audio file | Scoring |
| POST | /batch_score | Score multiple audio files | Scoring |

Example Usage

1. Check Health

curl -X GET "https://huggingface.co/spaces/Cyberlace/latihan-artikulasi/health"

Response:

{
  "status": "healthy",
  "model_loaded": true,
  "device": "cpu",
  "whisper_model": "openai/whisper-small"
}

2. Get Levels

curl -X GET "https://huggingface.co/spaces/Cyberlace/latihan-artikulasi/levels"

Response:

{
  "levels": {
    "1": {
      "name": "Vokal Tunggal",
      "difficulty": "Pemula",
      "targets": ["A", "I", "U", "E", "O"]
    },
    ...
  },
  "total_levels": 5
}
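
The `levels` object is keyed by level number as a string, so a client usually needs to sort the keys numerically before displaying them. A minimal sketch, assuming every level carries the `name`, `difficulty`, and `targets` fields shown above; `summarize_levels` is a hypothetical helper, and the sample payload is truncated to the one level this document lists.

```python
def summarize_levels(payload: dict) -> list[str]:
    """Format each level of a /levels response as one human-readable line."""
    lines = []
    # Keys are strings ("1", "2", ...), so sort them as integers
    for key in sorted(payload["levels"], key=int):
        level = payload["levels"][key]
        targets = ", ".join(level["targets"])
        lines.append(f"Level {key}: {level['name']} ({level['difficulty']}) - {targets}")
    return lines


# Sample payload, truncated to the single level shown in the response above
sample_levels = {
    "levels": {
        "1": {
            "name": "Vokal Tunggal",
            "difficulty": "Pemula",
            "targets": ["A", "I", "U", "E", "O"],
        }
    },
    "total_levels": 5,
}

for line in summarize_levels(sample_levels):
    print(line)
```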

3. Score Audio (Python)

import requests

url = "https://huggingface.co/spaces/Cyberlace/latihan-artikulasi/score"
data = {'target_text': 'STRATEGI', 'level': 4}

# Score a single file; the context manager closes the file handle after the upload
with open('recording.wav', 'rb') as audio_file:
    response = requests.post(url, files={'audio': audio_file}, data=data)

# Raise on 4xx/5xx instead of trying to read scores from an error body
response.raise_for_status()
result = response.json()

print(f"Score: {result['overall_score']}")
print(f"Grade: {result['grade']}")
print(f"Transcription: {result['transcription']}")
print(f"Feedback: {result['feedback']}")

4. Score Audio (cURL)

curl -X POST "https://huggingface.co/spaces/Cyberlace/latihan-artikulasi/score" \
  -F "audio=@recording.wav" \
  -F "target_text=STRATEGI" \
  -F "level=4"

5. Batch Score (Python)

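import requests
from contextlib import ExitStack

url = "https://huggingface.co/spaces/Cyberlace/latihan-artikulasi/batch_score"
paths = ['audio1.wav', 'audio2.wav', 'audio3.wav']

# Comma-separated values, one entry per uploaded file
data = {
    'target_texts': 'A,I,U',
    'levels': '1,1,1'
}

# ExitStack closes every opened file once the request has been sent
with ExitStack() as stack:
    files = [('audios', stack.enter_context(open(p, 'rb'))) for p in paths]
    response = requests.post(url, files=files, data=data)

# Raise on 4xx/5xx instead of trying to read results from an error body
response.raise_for_status()
results = response.json()['results']

for r in results:
    print(f"{r['filename']}: Score={r['overall_score']}, Grade={r['grade']}")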
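
The batch request passes targets and levels as comma-separated strings, so a mismatch between the lists is easy to introduce by hand. The sketch below builds both fields from one list of pairs. `build_batch_form` is a hypothetical helper, and rejecting commas inside a target text is an assumption about how the server splits the field, not documented behavior.

```python
def build_batch_form(items: list[tuple[str, int]]) -> dict[str, str]:
    """Turn [(target_text, level), ...] into the comma-separated form fields."""
    for text, level in items:
        if not text.strip():
            raise ValueError("target text cannot be empty")
        if "," in text:
            # Assumption: the server splits on commas, so a comma would shift entries
            raise ValueError(f"target text must not contain a comma: {text!r}")
        if level not in range(1, 6):
            raise ValueError(f"level must be 1-5, got {level}")
    return {
        "target_texts": ",".join(text for text, _ in items),
        "levels": ",".join(str(level) for _, level in items),
    }


form = build_batch_form([("A", 1), ("I", 1), ("U", 1)])
print(form)
```

The returned dictionary can be passed as the `data` argument of the batch request shown above.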

Response Schema

Score Response

{
  "success": true,
  "overall_score": 85.5,
  "grade": "B",
  "clarity_score": 90.0,
  "energy_score": 85.0,
  "speech_rate_score": 80.0,
  "pitch_consistency_score": 88.0,
  "snr_score": 82.0,
  "articulation_score": 87.0,
  "transcription": "STRATEGI",
  "target": "STRATEGI",
  "similarity": 1.0,
  "wer": 0.0,
  "feedback": "Bagus! Pengucapan sudah cukup jelas.",
  "suggestions": [
    "Pertahankan volume suara yang stabil"
  ],
  "audio_features": {
    "duration": 1.234,
    "rms_db": -25.5,
    "zero_crossing_rate": 0.0523,
    "spectral_centroid": 2500.0,
    "spectral_rolloff": 5000.0,
    "spectral_bandwidth": 1800.0,
    "tempo": 120.0
  },
  "level": 4
}
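
For clients that prefer typed access over raw dictionaries, the response can be loaded into a small data class. This is a sketch under assumptions: `ScoreResult` and its `exact_match` property are hypothetical names, only a subset of the fields above is modeled, and the case-insensitive comparison is an illustrative choice rather than the API's own matching rule.

```python
from dataclasses import dataclass, field, fields


@dataclass
class ScoreResult:
    overall_score: float
    grade: str
    transcription: str
    target: str
    similarity: float
    wer: float
    feedback: str
    suggestions: list[str] = field(default_factory=list)

    @classmethod
    def from_dict(cls, payload: dict) -> "ScoreResult":
        # Keep only the keys this class declares; other payload keys are ignored
        names = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in payload.items() if k in names})

    @property
    def exact_match(self) -> bool:
        # Illustrative comparison: ignore case and surrounding whitespace
        return self.transcription.strip().upper() == self.target.strip().upper()


# Subset of the example response above, plus keys the class does not model
sample_response = {
    "success": True,
    "overall_score": 85.5,
    "grade": "B",
    "transcription": "STRATEGI",
    "target": "STRATEGI",
    "similarity": 1.0,
    "wer": 0.0,
    "feedback": "Bagus! Pengucapan sudah cukup jelas.",
    "suggestions": ["Pertahankan volume suara yang stabil"],
    "level": 4,
}

result = ScoreResult.from_dict(sample_response)
print(result.grade, result.overall_score, result.exact_match)
```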

Grading System

  • Grade A (90-100): Excellent - pronunciation is very clear and accurate
  • Grade B (80-89): Good - pronunciation is fairly clear, with minor errors
  • Grade C (70-79): Fair - some errors are present
  • Grade D (60-69): Poor - more practice is needed
  • Grade E (<60): Keep practicing!
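
The grade bands above can be mirrored on the client, for example to color-code results without a second request. The sketch assumes each band starts at its lower bound, so a fractional score such as 89.5 falls into grade B; the server's actual rounding is not documented here, and `score_to_grade` is a hypothetical helper.

```python
def score_to_grade(score: float) -> str:
    """Map a 0-100 overall score to a letter grade using the bands above."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    # Assumption: a band begins at its lower bound (>= 90 is A, >= 80 is B, ...)
    for threshold, grade in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if score >= threshold:
            return grade
    return "E"


for value in (95, 85.5, 72, 60, 41):
    print(value, score_to_grade(value))
```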

Scoring Metrics

  1. Clarity (0-100): ASR accuracy of the Whisper transcription
  2. Energy (0-100): Quality of vocal volume and energy (optimal: -30 to -10 dB)
  3. Speech Rate (0-100): Speaking speed (syllables per second)
  4. Pitch Consistency (0-100): Stability of vocal pitch
  5. SNR (0-100): Signal-to-Noise Ratio (recording quality)
  6. Articulation (0-100): Articulation clarity derived from spectral analysis
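
Since all six metrics share the same 0-100 scale, a client can rank them to show a learner which aspect needs the most work. A minimal sketch using the field names from the Score Response; `rank_metrics` is a hypothetical helper and makes no claim about how the service combines the metrics into `overall_score`.

```python
METRIC_KEYS = [
    "clarity_score",
    "energy_score",
    "speech_rate_score",
    "pitch_consistency_score",
    "snr_score",
    "articulation_score",
]


def rank_metrics(result: dict) -> list[tuple[str, float]]:
    """Return the metric scores present in a response, weakest first."""
    present = [(key, result[key]) for key in METRIC_KEYS if key in result]
    return sorted(present, key=lambda pair: pair[1])


# Metric values copied from the example Score Response
sample_scores = {
    "clarity_score": 90.0,
    "energy_score": 85.0,
    "speech_rate_score": 80.0,
    "pitch_consistency_score": 88.0,
    "snr_score": 82.0,
    "articulation_score": 87.0,
}

weakest_name, weakest_value = rank_metrics(sample_scores)[0]
print(f"Weakest metric: {weakest_name} ({weakest_value})")
```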

Error Handling

Common Errors

503 Service Unavailable

{
  "detail": "Model not loaded"
}

Solution: Wait for the model to finish loading (about 30-60 seconds at startup)
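
Waiting for the model can be automated by polling `/health` until `model_loaded` is true. The sketch below is one possible approach, not part of the API: `wait_until_ready` is a hypothetical helper, the 90-second timeout and 5-second interval are assumed defaults, and the health check is passed in as a callable so the loop does not depend on a particular HTTP library.

```python
import time
from typing import Callable


def wait_until_ready(
    fetch_health: Callable[[], dict],
    timeout: float = 90.0,   # assumed default, a little above the 30-60 s startup
    interval: float = 5.0,   # assumed pause between polls
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the health check until model_loaded is true or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            if fetch_health().get("model_loaded"):
                return True
        except Exception:
            pass  # connection errors or unparsable bodies count as "not ready yet"
        if time.monotonic() >= deadline:
            return False
        sleep(interval)


# Simulated health responses: not loaded on the first poll, loaded on the second
responses = iter([{"model_loaded": False}, {"model_loaded": True}])
print(wait_until_ready(lambda: next(responses), timeout=10, interval=0))
```

With `requests`, the callable could be `lambda: requests.get(health_url, timeout=10).json()`, where `health_url` is the `/health` URL shown earlier.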

400 Bad Request - Invalid Level

{
  "detail": "Invalid level. Must be 1-5. Available levels: [1, 2, 3, 4, 5]"
}

Solution: Use a level from 1 to 5

400 Bad Request - Empty Target

{
  "detail": "target_text cannot be empty"
}

Solution: Provide a valid, non-empty target_text

500 Internal Server Error

{
  "detail": "Error processing audio: [error message]"
}

Solution: Make sure the audio is in a valid format (WAV, MP3, M4A, FLAC, OGG)
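
The errors above can be folded into a single client-side handler that decides whether a retry makes sense. This is a sketch under assumptions: `classify_error` is a hypothetical helper, it relies only on the status codes and the `detail` field listed in this section, and treating 503 as the only retryable case is a design choice rather than documented behavior.

```python
def classify_error(status_code: int, body: dict) -> tuple[bool, str]:
    """Return (retryable, hint) for an error response from the API."""
    detail = body.get("detail", "no detail provided")
    if status_code == 503:
        return True, f"Model still loading, retry shortly ({detail})"
    if status_code == 400:
        return False, f"Fix the request parameters ({detail})"
    if status_code == 500:
        return False, f"Check the audio file and its format ({detail})"
    return False, f"Unexpected status {status_code} ({detail})"


retryable, hint = classify_error(503, {"detail": "Model not loaded"})
print(retryable, hint)
```

In a `requests` client, this would be called with `response.status_code` and `response.json()` whenever `response.ok` is false.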


Testing with Swagger UI

  1. Open: https://huggingface.co/spaces/Cyberlace/latihan-artikulasi/docs
  2. Click the endpoint you want to test (e.g. POST /score)
  3. Click "Try it out"
  4. Fill in the parameters:
    • audio: Upload an audio file
    • target_text: Enter the text (e.g. "STRATEGI")
    • level: Choose 1-5
  5. Click "Execute"
  6. View the response below

Client Libraries

Python

# Install requests
pip install requests

# Example code above

JavaScript/Node.js

const FormData = require('form-data');
const fs = require('fs');
const axios = require('axios');

const form = new FormData();
form.append('audio', fs.createReadStream('recording.wav'));
form.append('target_text', 'STRATEGI');
form.append('level', '4');

axios.post('https://huggingface.co/spaces/Cyberlace/latihan-artikulasi/score', form, {
  headers: form.getHeaders()
})
.then(response => {
  console.log('Score:', response.data.overall_score);
  console.log('Grade:', response.data.grade);
})
.catch(error => console.error(error));

cURL

# See examples above

Rate Limits & Performance

  • Model: Whisper Small (~967 MB)
  • Processing Time: ~2-5 seconds per audio file
  • Max Audio Duration: Recommended < 10 seconds for best results
  • Supported Formats: WAV, MP3, M4A, FLAC, OGG
  • Max File Size: Recommended < 10 MB
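
These recommendations can be checked locally before uploading. The sketch below is a hypothetical pre-flight helper (`check_audio` is not part of the API). It assumes 1 MB means 1024 * 1024 bytes, treats the recommendations as soft limits that produce warnings, and measures duration only for WAV files, because the Python standard library cannot read the other formats.

```python
import wave
from pathlib import Path

SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".m4a", ".flac", ".ogg"}
MAX_BYTES = 10 * 1024 * 1024  # recommended size limit (assumes 1 MB = 1024 * 1024 bytes)
MAX_SECONDS = 10.0            # recommended duration limit


def check_audio(path: str) -> list[str]:
    """Return a list of warnings; an empty list means the file looks fine."""
    p = Path(path)
    warnings = []
    if p.suffix.lower() not in SUPPORTED_EXTENSIONS:
        warnings.append(f"unsupported extension: {p.suffix or '(none)'}")
    if p.stat().st_size >= MAX_BYTES:
        warnings.append("file is 10 MB or larger")
    if p.suffix.lower() == ".wav":
        # Duration is checked only for WAV, which the standard library can parse
        with wave.open(str(p), "rb") as wav:
            duration = wav.getnframes() / wav.getframerate()
        if duration >= MAX_SECONDS:
            warnings.append(f"duration {duration:.1f}s is 10 s or longer")
    return warnings
```

Calling `check_audio('recording.wav')` before the upload returns an empty list when the file meets every recommendation.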

Support & Contact


Last Updated: November 19, 2025