fariedalfarizi committed
Commit 431f09f · 1 Parent(s): 7de9a5a

Add comprehensive Swagger/OpenAPI documentation with detailed endpoint descriptions

Files changed (2)
  1. API_DOCS.md +320 -0
  2. api/routes.py +127 -22
API_DOCS.md ADDED
@@ -0,0 +1,320 @@
+ # API Documentation - Vocal Articulation Assessment v2.0
+
+ ## Swagger/OpenAPI Documentation
+
+ This API is built with **FastAPI**, which generates interactive documentation automatically.
+
+ ### Accessing the Documentation
+
+ Once the application is running, the documentation is available at:
+
+ #### 1. **Swagger UI** (Recommended)
+ ```
+ https://huggingface.co/spaces/Cyberlace/latihan-artikulasi/docs
+ ```
+ or locally:
+ ```
+ http://localhost:7860/docs
+ ```
+
+ **Features:**
+ - 🎯 Interactive API testing
+ - 📝 Try out endpoints directly from the browser
+ - 📋 Request/Response schemas
+ - 🔍 Parameter descriptions
+
+ #### 2. **ReDoc** (Alternative Documentation)
+ ```
+ https://huggingface.co/spaces/Cyberlace/latihan-artikulasi/redoc
+ ```
+ or locally:
+ ```
+ http://localhost:7860/redoc
+ ```
+
+ **Features:**
+ - 📚 Clean, readable documentation
+ - 🔗 Deep linking
+ - 📖 Better for reading
+
+ #### 3. **OpenAPI JSON Schema**
+ ```
+ https://huggingface.co/spaces/Cyberlace/latihan-artikulasi/openapi.json
+ ```
+ or locally:
+ ```
+ http://localhost:7860/openapi.json
+ ```
+
+ ---
+
+ ## Quick API Overview
+
+ ### Base URL
+ ```
+ https://huggingface.co/spaces/Cyberlace/latihan-artikulasi
+ ```
+
+ ### Endpoints
+
+ | Method | Endpoint | Description | Tags |
+ |--------|----------|-------------|------|
+ | `GET` | `/` | API information | General |
+ | `GET` | `/health` | Health check & model status | System |
+ | `GET` | `/levels` | List all articulation levels | Articulation |
+ | `POST` | `/score` | Score single audio file | Scoring |
+ | `POST` | `/batch_score` | Score multiple audio files | Scoring |
+
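The same endpoint table can, in principle, be derived from the OpenAPI document served at `/openapi.json`. The sketch below is an illustration and not part of this commit: `list_operations` is a hypothetical helper, and `sample_spec` is a hand-written excerpt shaped like the table above rather than the real schema.

```python
from typing import Dict, List, Tuple

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


def list_operations(spec: Dict) -> List[Tuple[str, str, str]]:
    """Return (METHOD, path, first tag) for every operation in an OpenAPI dict."""
    rows = []
    for path, item in spec.get("paths", {}).items():
        for method, operation in item.items():
            if method.lower() not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            tags = operation.get("tags") or [""]
            rows.append((method.upper(), path, tags[0]))
    # Sort by path, then method, so the output order is stable
    return sorted(rows, key=lambda row: (row[1], row[0]))


# Hand-written excerpt (assumption); the real document would be loaded
# from the /openapi.json URL listed in the documentation section.
sample_spec = {
    "paths": {
        "/health": {"get": {"tags": ["System"]}},
        "/score": {"post": {"tags": ["Scoring"]}},
    }
}

for method, path, tag in list_operations(sample_spec):
    print(f"{method:5} {path:10} {tag}")
```

Feeding it the parsed JSON from the running service should reproduce the Method, Endpoint, and Tags columns, assuming every route declares at least one tag.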
+ ---
+
+ ## Example Usage
+
+ ### 1. Check Health
+ ```bash
+ curl -X GET "https://huggingface.co/spaces/Cyberlace/latihan-artikulasi/health"
+ ```
+
+ **Response:**
+ ```json
+ {
+   "status": "healthy",
+   "model_loaded": true,
+   "device": "cpu",
+   "whisper_model": "openai/whisper-small"
+ }
+ ```
+
+ ### 2. Get Levels
+ ```bash
+ curl -X GET "https://huggingface.co/spaces/Cyberlace/latihan-artikulasi/levels"
+ ```
+
+ **Response:**
+ ```json
+ {
+   "levels": {
+     "1": {
+       "name": "Vokal Tunggal",
+       "difficulty": "Pemula",
+       "targets": ["A", "I", "U", "E", "O"]
+     },
+     ...
+   },
+   "total_levels": 5
+ }
+ ```
+
+ ### 3. Score Audio (Python)
+ ```python
+ import requests
+
+ # Single file
+ url = "https://huggingface.co/spaces/Cyberlace/latihan-artikulasi/score"
+ data = {'target_text': 'STRATEGI', 'level': 4}
+
+ # Open the recording in a context manager so the handle is closed after upload
+ with open('recording.wav', 'rb') as audio:
+     response = requests.post(url, files={'audio': audio}, data=data)
+ response.raise_for_status()  # surface 4xx/5xx errors before parsing
+ result = response.json()
+
+ print(f"Score: {result['overall_score']}")
+ print(f"Grade: {result['grade']}")
+ print(f"Transcription: {result['transcription']}")
+ print(f"Feedback: {result['feedback']}")
+ ```
+
+ ### 4. Score Audio (cURL)
+ ```bash
+ curl -X POST "https://huggingface.co/spaces/Cyberlace/latihan-artikulasi/score" \
+   -F "audio=@recording.wav" \
+   -F "target_text=STRATEGI" \
+   -F "level=4"
+ ```
+
+ ### 5. Batch Score (Python)
+ ```python
+ import requests
+
+ url = "https://huggingface.co/spaces/Cyberlace/latihan-artikulasi/batch_score"
+
+ files = [
+     ('audios', open('audio1.wav', 'rb')),
+     ('audios', open('audio2.wav', 'rb')),
+     ('audios', open('audio3.wav', 'rb')),
+ ]
+
+ data = {
+     'target_texts': 'A,I,U',
+     'levels': '1,1,1'
+ }
+
+ response = requests.post(url, files=files, data=data)
+ results = response.json()['results']
+
+ for r in results:
+     print(f"{r['filename']}: Score={r['overall_score']}, Grade={r['grade']}")
+ ```
+
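Because `target_texts` and `levels` travel as comma-separated strings, a mismatch in counts is easy to introduce by hand. The helper below is a minimal client-side sketch, not part of this commit: `build_batch_form` is a hypothetical name, and the checks reflect the constraints stated in this document (levels 1-5, one target per file) rather than confirmed server behavior.

```python
from typing import Dict, Sequence, Union


def build_batch_form(
    targets: Sequence[str],
    levels: Union[int, Sequence[int]] = 1,
) -> Dict[str, str]:
    """Build the comma-separated form fields for a batch scoring request."""
    if not targets:
        raise ValueError("at least one target text is required")
    if any("," in target for target in targets):
        # A comma inside a target would be indistinguishable from a separator
        raise ValueError("target texts must not contain commas")
    if isinstance(levels, int):
        # Expand a single level on the client instead of relying on the server
        levels = [levels] * len(targets)
    if len(levels) != len(targets):
        raise ValueError("levels and targets must have the same length")
    if any(not 1 <= level <= 5 for level in levels):
        raise ValueError("levels must be between 1 and 5")
    return {
        "target_texts": ",".join(targets),
        "levels": ",".join(str(level) for level in levels),
    }


print(build_batch_form(["A", "I", "U"]))
# {'target_texts': 'A,I,U', 'levels': '1,1,1'}
```

The returned dictionary can be passed as `data=` in the batch example above. One consequence of the format is that targets containing commas, which full sentences may, cannot be expressed in a batch request as described here.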
+ ---
+
+ ## Response Schema
+
+ ### Score Response
+ ```json
+ {
+   "success": true,
+   "overall_score": 85.5,
+   "grade": "B",
+   "clarity_score": 90.0,
+   "energy_score": 85.0,
+   "speech_rate_score": 80.0,
+   "pitch_consistency_score": 88.0,
+   "snr_score": 82.0,
+   "articulation_score": 87.0,
+   "transcription": "STRATEGI",
+   "target": "STRATEGI",
+   "similarity": 1.0,
+   "wer": 0.0,
+   "feedback": "Bagus! Pengucapan sudah cukup jelas.",
+   "suggestions": [
+     "Pertahankan volume suara yang stabil"
+   ],
+   "audio_features": {
+     "duration": 1.234,
+     "rms_db": -25.5,
+     "zero_crossing_rate": 0.0523,
+     "spectral_centroid": 2500.0,
+     "spectral_rolloff": 5000.0,
+     "spectral_bandwidth": 1800.0,
+     "tempo": 120.0
+   },
+   "level": 4
+ }
+ ```
+
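The `wer` field is a word error rate. As a rough illustration of what that number means, the sketch below computes the textbook definition: word-level edit distance divided by the number of reference words. It is not taken from this commit; the service's own implementation, its text normalization, and whether it clamps values above 1.0 are unknown. The function name `word_error_rate` and the uppercase normalization are assumptions.

```python
from typing import List


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref: List[str] = reference.upper().split()
    hyp: List[str] = hypothesis.upper().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


print(word_error_rate("STRATEGI", "STRATEGI"))  # 0.0
```

With a single-word target, as in the response above, this definition can only yield 0.0 for a match or at least 1.0 for a miss, which is one reason a character-level `similarity` value is useful alongside it.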
+ ### Grading System
+ - **Grade A** (90-100): Excellent - pronunciation is very clear and accurate
+ - **Grade B** (80-89): Good - pronunciation is reasonably clear, with minor errors
+ - **Grade C** (70-79): Fair - some errors
+ - **Grade D** (60-69): Poor - more practice needed
+ - **Grade E** (<60): Keep practicing!
+
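The grade bands can be expressed as a small lookup. This is a sketch under one assumption that the list does not settle: fractional scores such as 89.5 fall into the band whose lower bound they meet, so each threshold is treated as inclusive. `grade_for` is a hypothetical helper, not the function used in `api/routes.py`.

```python
def grade_for(score: float) -> str:
    """Map an overall score (0-100) to a letter grade using inclusive lower bounds."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for threshold, grade in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if score >= threshold:
            return grade
    return "E"


print(grade_for(85.5))  # B
```

Under this reading, 89.5 is a B and 59.9 is an E; the service may round scores differently before grading.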
+ ### Scoring Metrics
+ 1. **Clarity** (0-100): ASR accuracy of the Whisper transcription
+ 2. **Energy** (0-100): Volume and energy quality of the voice (optimal: -30 to -10 dB)
+ 3. **Speech Rate** (0-100): Speaking speed (syllables per second)
+ 4. **Pitch Consistency** (0-100): Stability of vocal pitch
+ 5. **SNR** (0-100): Signal-to-Noise Ratio (recording quality)
+ 6. **Articulation** (0-100): Articulation clarity from spectral analysis
+
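For the Energy metric, the stated optimal window of -30 to -10 dB can be checked locally before uploading. The sketch below assumes the level is an RMS value in dB relative to full scale for samples normalized to [-1.0, 1.0]; that reading is consistent with the negative `rms_db` in the sample response but is not confirmed by this commit. `rms_db` and `in_optimal_energy_range` are hypothetical helpers, and how the service turns the level into a 0-100 score is not shown here.

```python
import math
from typing import Sequence


def rms_db(samples: Sequence[float], floor_db: float = -120.0) -> float:
    """RMS level in dB relative to full scale for samples in [-1.0, 1.0]."""
    if not samples:
        raise ValueError("samples must not be empty")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0.0:
        return floor_db  # digital silence: avoid log10(0)
    return max(floor_db, 20.0 * math.log10(rms))


def in_optimal_energy_range(level_db: float, low: float = -30.0, high: float = -10.0) -> bool:
    """True when the level lies inside the documented optimal window."""
    return low <= level_db <= high


# One second of a 440 Hz sine at amplitude 0.1, sampled at 16 kHz
tone = [0.1 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(16000)]
level = rms_db(tone)
print(round(level, 1), in_optimal_energy_range(level))
```

A sine of amplitude 0.1 has an RMS of about 0.0707, so the printed level is close to -23 dB, inside the window.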
+ ---
+
+ ## Error Handling
+
+ ### Common Errors
+
+ **503 Service Unavailable**
+ ```json
+ {
+   "detail": "Model not loaded"
+ }
+ ```
+ *Solution*: Wait for the model to finish loading (~30-60 seconds at startup)
+
+ **400 Bad Request - Invalid Level**
+ ```json
+ {
+   "detail": "Invalid level. Must be 1-5. Available levels: [1, 2, 3, 4, 5]"
+ }
+ ```
+ *Solution*: Use a level from 1 to 5
+
+ **400 Bad Request - Empty Target**
+ ```json
+ {
+   "detail": "target_text cannot be empty"
+ }
+ ```
+ *Solution*: Provide a valid target_text
+
+ **500 Internal Server Error**
+ ```json
+ {
+   "detail": "Error processing audio: [error message]"
+ }
+ ```
+ *Solution*: Make sure the audio format is valid (WAV, MP3, M4A, FLAC, OGG)
+
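Of the errors above, only the 503 is transient, so a client might retry it and fail fast on everything else. The following is a minimal sketch of that policy, not code from this commit. `call_with_retry` is a hypothetical helper that takes any zero-argument `send` callable returning `(status_code, json_body)`, which keeps it independent of the HTTP library; the default delay of 15 seconds is an assumption loosely based on the 30-60 second load time.

```python
import time
from typing import Callable, Dict, Tuple

Reply = Tuple[int, Dict]


def call_with_retry(
    send: Callable[[], Reply],
    attempts: int = 5,
    delay_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Call send(); retry only on 503, raise immediately on any other error."""
    for attempt in range(1, attempts + 1):
        status, body = send()
        if status == 200:
            return body
        if status == 503 and attempt < attempts:
            sleep(delay_s)  # model is still loading; wait and try again
            continue
        raise RuntimeError(f"HTTP {status}: {body.get('detail', 'unknown error')}")
    raise ValueError("attempts must be at least 1")


# Simulated replies: the model is not ready on the first call, ready on the second
replies = iter([(503, {"detail": "Model not loaded"}), (200, {"grade": "B"})])
result = call_with_retry(lambda: next(replies), sleep=lambda seconds: None)
print(result)  # {'grade': 'B'}
```

With `requests`, `send` could wrap a `requests.post(...)` call and return `(response.status_code, response.json())`; note that file objects would need to be reopened or rewound between attempts.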
+ ---
+
+ ## Testing with Swagger UI
+
+ 1. Open: https://huggingface.co/spaces/Cyberlace/latihan-artikulasi/docs
+ 2. Click the endpoint you want to test (e.g. `POST /score`)
+ 3. Click **"Try it out"**
+ 4. Fill in the parameters:
+    - `audio`: Upload an audio file
+    - `target_text`: Enter the text (e.g. "STRATEGI")
+    - `level`: Choose 1-5
+ 5. Click **"Execute"**
+ 6. View the response below
+
+ ---
+
+ ## Client Libraries
+
+ ### Python
+ ```bash
+ # Install requests
+ pip install requests
+
+ # Then use the Python examples above
+ ```
+
+ ### JavaScript/Node.js
+ ```javascript
+ const FormData = require('form-data');
+ const fs = require('fs');
+ const axios = require('axios');
+
+ const form = new FormData();
+ form.append('audio', fs.createReadStream('recording.wav'));
+ form.append('target_text', 'STRATEGI');
+ form.append('level', '4');
+
+ axios.post('https://huggingface.co/spaces/Cyberlace/latihan-artikulasi/score', form, {
+   headers: form.getHeaders()
+ })
+ .then(response => {
+   console.log('Score:', response.data.overall_score);
+   console.log('Grade:', response.data.grade);
+ })
+ .catch(error => console.error(error));
+ ```
+
+ ### cURL
+ ```bash
+ # See examples above
+ ```
+
+ ---
+
+ ## Rate Limits & Performance
+
+ - **Model**: Whisper Small (~967 MB)
+ - **Processing Time**: ~2-5 seconds per audio file
+ - **Max Audio Duration**: Recommended < 10 seconds for best results
+ - **Supported Formats**: WAV, MP3, M4A, FLAC, OGG
+ - **Max File Size**: Recommended < 10 MB
+
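The format and size recommendations above can be checked on the client before an upload is attempted. This is a sketch only, not part of this commit: `preflight` is a hypothetical helper, the extension check is a proxy for the actual audio format, and reading "10 MB" as 10 × 1024 × 1024 bytes is an assumption.

```python
import os
import tempfile
from typing import List

SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".m4a", ".flac", ".ogg"}
RECOMMENDED_MAX_BYTES = 10 * 1024 * 1024  # "10 MB" read as 10 MiB (assumption)


def preflight(path: str) -> List[str]:
    """Return warnings for an audio file; an empty list means no concerns."""
    warnings = []
    extension = os.path.splitext(path)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        warnings.append(f"unsupported extension: {extension or '(none)'}")
    if os.path.getsize(path) >= RECOMMENDED_MAX_BYTES:
        warnings.append("file is 10 MB or larger")
    return warnings


# Demonstrate on a small temporary file with the wrong extension
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "recording.txt")
    with open(sample, "wb") as handle:
        handle.write(b"\x00" * 1024)
    print(preflight(sample))  # ['unsupported extension: .txt']
```

Checking duration against the 10-second recommendation would require decoding the audio, which is left out to keep the example dependency-free.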
+ ---
+
+ ## Support & Contact
+
+ - **HuggingFace Space**: https://huggingface.co/spaces/Cyberlace/latihan-artikulasi
+ - **Issues**: Report them in HuggingFace Discussions
+ - **Version**: 2.0.0
+ - **License**: MIT
+
+ ---
+
+ **Last Updated**: November 19, 2025
api/routes.py CHANGED
@@ -22,8 +22,40 @@ from core.constants import ARTICULATION_LEVELS
 
  app = FastAPI(
      title="Vocal Articulation Assessment API v2",
-     description="API for assessing Indonesian vocal articulation - multi-level, with Whisper ASR",
-     version="2.0.0"
+     description="""
+ ## API for Indonesian Vocal Articulation Assessment
+
+ A **Whisper ASR**-based scoring system with comprehensive audio analysis across 5 articulation levels.
+
+ ### Features
+ - **ASR-based Clarity Scoring** using the Whisper model
+ - **6 Comprehensive Metrics**: Clarity, Energy, Speech Rate, Pitch Consistency, SNR, Articulation
+ - **Multi-level Support**: Levels 1-5 (vowels → sentences)
+ - **Grading System**: A-E based on the overall score
+
+ ### Documentation
+ - **Swagger UI**: `/docs` (interactive API testing)
+ - **ReDoc**: `/redoc` (alternative documentation)
+ - **OpenAPI JSON**: `/openapi.json`
+
+ ### Endpoints
+ - `GET /` - API information
+ - `GET /health` - Health check & model status
+ - `GET /levels` - List all articulation levels
+ - `POST /score` - Score single audio file
+ - `POST /batch_score` - Score multiple audio files
+ """,
+     version="2.0.0",
+     docs_url="/docs",
+     redoc_url="/redoc",
+     openapi_url="/openapi.json",
+     contact={
+         "name": "Vocal Articulation Assessment Team",
+         "url": "https://huggingface.co/spaces/Cyberlace/latihan-artikulasi",
+     },
+     license_info={
+         "name": "MIT License",
+     }
  )
 
  # CORS middleware
@@ -139,9 +171,19 @@ async def root():
          media_type="application/json"
      )
 
- @app.get("/health", response_model=HealthResponse)
+ @app.get("/health", response_model=HealthResponse, tags=["System"])
  async def health_check():
-     """Health check endpoint"""
+     """
+     ## Health Check
+
+     Check API health status and model loading status.
+
+     **Returns:**
+     - `status`: "healthy" or "unhealthy"
+     - `model_loaded`: Whether Whisper model is loaded
+     - `device`: CPU or CUDA
+     - `whisper_model`: Model name
+     """
      return HealthResponse(
          status="healthy" if scorer is not None else "unhealthy",
          model_loaded=scorer is not None,
@@ -149,27 +191,69 @@ async def health_check():
          whisper_model="openai/whisper-small" if scorer else "not loaded"
      )
 
- @app.get("/levels", response_model=LevelsResponse)
+ @app.get("/levels", response_model=LevelsResponse, tags=["Articulation"])
  async def get_levels():
-     """Get all articulation levels and their targets"""
+     """
+     ## Get Articulation Levels
+
+     Retrieve all available articulation levels with their targets.
+
+     **Levels:**
+     - **Level 1**: Vokal Tunggal (single vowels: A, I, U, E, O)
+     - **Level 2**: Konsonan + Vokal (consonant + vowel: BA, DA, KA, etc.)
+     - **Level 3**: Suku Kata Kompleks (complex syllables: BRA, TRI, etc.)
+     - **Level 4**: Kata Penuh (full words: RUMAH, STRATEGI, etc.)
+     - **Level 5**: Kalimat Lengkap (complete sentences)
+
+     **Returns:**
+     - `levels`: Dictionary of all levels with targets
+     - `total_levels`: Total number of levels (5)
+     """
      return LevelsResponse(
          levels=ARTICULATION_LEVELS,
          total_levels=len(ARTICULATION_LEVELS)
      )
 
- @app.post("/score", response_class=JSONResponse)
+ @app.post("/score", response_class=JSONResponse, tags=["Scoring"])
  async def score_audio(
-     audio: UploadFile = File(..., description="Audio file (WAV, MP3, M4A, etc.)"),
+     audio: UploadFile = File(..., description="Audio file (WAV, MP3, M4A, FLAC, OGG)"),
      target_text: str = Form(..., description="Target text that should be spoken"),
      level: int = Form(1, description="Articulation level (1-5)")
  ):
      """
-     Score an audio file for vocal articulation assessment
+     ## Score Audio File
+
+     Upload audio and receive a comprehensive vocal articulation assessment.
+
+     **Request:**
+     - `audio`: Audio file (format: WAV, MP3, M4A, FLAC, OGG)
+     - `target_text`: Text that should be spoken (e.g. "A", "BA", "STRATEGI")
+     - `level`: Articulation level (1-5)
 
-     Args:
-         audio: Audio file to be scored
-         target_text: Target text that should be spoken
-         level: Articulation level (1=Vowels, 2=Consonants, 3=Syllables, 4=Words, 5=Sentences)
+     **Response:**
+     - `success`: Boolean status
+     - `overall_score`: Overall score (0-100)
+     - `grade`: Grade (A-E)
+     - 6 component scores (clarity, energy, speech_rate, pitch_consistency, snr, articulation)
+     - `transcription`: ASR result for the audio
+     - `target`: Target text (uppercase)
+     - `similarity`: Similarity score (0-1)
+     - `wer`: Word Error Rate (0-1)
+     - `feedback`: Feedback text
+     - `suggestions`: List of suggestions for improvement
+     - `audio_features`: Dictionary of audio features
+     - `level`: Level used
+
+     **Example:**
+     ```python
+     import requests
+
+     files = {'audio': open('recording.wav', 'rb')}
+     data = {'target_text': 'STRATEGI', 'level': 4}
+     response = requests.post('http://localhost:8000/score', files=files, data=data)
+     result = response.json()
+     print(f"Score: {result['overall_score']}, Grade: {result['grade']}")
+     ```
 
      Returns:
          ScoreResponse with the full assessment result
@@ -225,22 +309,43 @@ async def score_audio(
 
          raise HTTPException(status_code=500, detail=f"Error processing audio: {str(e)}")
 
- @app.post("/batch_score")
+ @app.post("/batch_score", tags=["Scoring"])
  async def batch_score_audio(
      audios: List[UploadFile] = File(..., description="Multiple audio files"),
      target_texts: str = Form(..., description="Comma-separated target texts"),
      levels: str = Form("1", description="Comma-separated levels (default: 1 for all)")
  ):
      """
-     Score multiple audio files in a single request
+     ## Batch Score Multiple Audio Files
 
-     Args:
-         audios: List of audio files
-         target_texts: Comma-separated target texts
-         levels: Comma-separated levels (optional, default 1 for all)
-
-     Returns:
-         List of score results
+     Upload several audio files at once and receive an assessment for each one.
+
+     **Request:**
+     - `audios`: List of audio files
+     - `target_texts`: Comma-separated target texts (e.g. "A,I,U,E,O")
+     - `levels`: Comma-separated levels (e.g. "1,1,1,2,2") or a single value applied to all
+
+     **Response:**
+     - `results`: Array of score results (same shape as the /score endpoint)
+     - `total`: Total number of processed files
+
+     **Example:**
+     ```python
+     import requests
+
+     files = [
+         ('audios', open('audio1.wav', 'rb')),
+         ('audios', open('audio2.wav', 'rb')),
+     ]
+     data = {
+         'target_texts': 'A,I',
+         'levels': '1,1'
+     }
+     response = requests.post('http://localhost:8000/batch_score', files=files, data=data)
+     results = response.json()['results']
+     for r in results:
+         print(f"{r['filename']}: {r['overall_score']}")
+     ```
      """
      if scorer is None:
          raise HTTPException(status_code=503, detail="Model not loaded")