DocUA committed on
Commit
8d119a4
·
1 Parent(s): 212d350

Updated the MarkItDown project metadata, including new SDK and Python version information. Added instructions for deploying to Hugging Face Spaces, covering the setup of secrets and environment variables. Changed the dependencies in requirements.txt to match the new library versions.

Files changed (3)
  1. README.md +41 -7
  2. requirements.txt +2 -2
  3. spaces_metadata.yaml +3 -3
README.md CHANGED
@@ -1,8 +1,20 @@
1
  # 🚀 MarkItDown Testing Platform
2
 
3
  **Enterprise-Grade Document Conversion Testing with AI-Powered Analysis**
4
 
5
- [![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/your-username/markitdown-testing-platform)
6
  [![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
7
  [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
8
 
@@ -23,7 +35,7 @@ A comprehensive testing platform for Microsoft's MarkItDown document conversion
23
 
24
  ### Using the Hugging Face Space
25
 
26
- 1. **Visit the Space**: [MarkItDown Testing Platform](https://huggingface.co/spaces/your-username/markitdown-testing-platform)
27
  2. **Upload Document**: Drag & drop or select your document
28
  3. **Configure Analysis**: Enter Gemini API key for AI analysis (optional)
29
  4. **Process**: Click "Process Document" and review results
@@ -78,11 +90,11 @@ A comprehensive testing platform for Microsoft's MarkItDown document conversion
78
 
79
  ### Key Dependencies
80
  ```python
81
- gradio>=4.0.0 # UI framework
82
- markitdown[all]>=0.1.0 # Document conversion
83
- google-genai>=0.1.0 # Gemini integration (new client)
84
- plotly>=5.17.0 # Interactive visualizations
85
- pandas>=1.5.0 # Data processing
86
  ```
87
 
88
  ## 📊 Analysis Capabilities
@@ -164,6 +176,28 @@ export MAX_FILE_SIZE="52428800" # 50MB in bytes
164
  export PROCESSING_TIMEOUT="300" # 5 minutes
165
  ```
166
 
167
  ## 📚 API Reference
168
 
169
  ### Core Processing Pipeline
 
1
+ ---
2
+ title: MarkItDownTestingPlatform
3
+ emoji: 📊
4
+ colorFrom: pink
5
+ colorTo: gray
6
+ sdk: gradio
7
+ sdk_version: 4.44.0
8
+ app_file: app.py
9
+ pinned: false
10
+ short_description: Enterprise-Grade Document Conversion Testing with AI-Powered
11
+ ---
12
+
13
  # 🚀 MarkItDown Testing Platform
14
 
15
  **Enterprise-Grade Document Conversion Testing with AI-Powered Analysis**
16
 
17
+ [![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/DocSA/MarkItDownTestingPlatform)
18
  [![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
19
  [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
20
 
 
35
 
36
  ### Using the Hugging Face Space
37
 
38
+ 1. **Visit the Space**: [MarkItDown Testing Platform](https://huggingface.co/spaces/DocSA/MarkItDownTestingPlatform)
39
  2. **Upload Document**: Drag & drop or select your document
40
  3. **Configure Analysis**: Enter Gemini API key for AI analysis (optional)
41
  4. **Process**: Click "Process Document" and review results
 
90
 
91
  ### Key Dependencies
92
  ```python
93
+ gradio>=4.44.0 # Gradio interface (HF Spaces compatible)
94
+ markitdown[all]>=0.1.0 # Microsoft conversion engine
95
+ google-genai>=1.0.0 # Gemini integration (new client)
96
+ plotly>=5.17.0 # Interactive visualizations
97
+ pandas>=1.5.0 # Data processing
98
  ```
99
 
100
  ## 📊 Analysis Capabilities
 
176
  export PROCESSING_TIMEOUT="300" # 5 minutes
177
  ```
178
 
179
+ ### Deploying to Hugging Face Spaces
180
+
181
+ 1. **Create the Space**
182
+ - Open [huggingface.co/spaces/new](https://huggingface.co/spaces/new)
183
+ - Choose the **Gradio** SDK, the name `DocSA/MarkItDownTestingPlatform`, and the **Python 3.11** runtime
184
+ - `app_file` must remain `app.py`
185
+
186
+ 2. **Push the code**
187
+ ```bash
188
+ git remote add hf https://huggingface.co/spaces/DocSA/MarkItDownTestingPlatform
189
+ git push hf main
190
+ ```
191
+
192
+ 3. **Configure secrets and environment variables**
193
+ - Add the `GEMINI_API_KEY` secret (Settings → Repository secrets → Add)
194
+ - Additional (non-secret) variables: `MAX_FILE_SIZE_MB=50`, `PROCESSING_TIMEOUT=300`, `APP_VERSION=2.0.0-enterprise`
195
+
196
+ 4. **Runtime notes**
197
+ - Gemini analysis is disabled by default; the user enables it manually
198
+ - Default settings: analysis type **Content Summary**, model **Gemini 2.0 Flash**
199
+ - Gemini quota limits are handled by automatic fallback models
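
Editor's sketch (not part of the commit): the README says Gemini quota limits are handled by fallback models, but the diff does not show how. The snippet below is one possible shape for that logic, written without any real Gemini client calls. `QuotaExceeded`, `analyze_with_fallback`, and the `call_model` callback are hypothetical names introduced for illustration; the actual code in `app.py` may work differently.

```python
from typing import Callable, Optional, Sequence, Tuple


class QuotaExceeded(Exception):
    """Hypothetical error the caller-supplied function raises on a quota limit."""


def analyze_with_fallback(
    prompt: str,
    models: Sequence[str],
    call_model: Callable[[str, str], str],
) -> Tuple[str, str]:
    """Try each model in order and return (model_name, result) from the first
    one that does not hit a quota limit.

    `call_model(model, prompt)` is an assumption: it stands in for whatever
    function actually sends the request to Gemini.
    """
    last_error: Optional[Exception] = None
    for model in models:
        try:
            return model, call_model(model, prompt)
        except QuotaExceeded as exc:
            # Remember the failure and move on to the next candidate model.
            last_error = exc
    raise RuntimeError("all candidate models hit their quota") from last_error
```

Only quota errors trigger a fallback here; any other exception propagates to the caller, so unrelated failures are not silently retried against a different model.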
200
+
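
Editor's sketch (not part of the commit): the variables listed in step 3 could be read at startup roughly as follows. The variable names and default values come from the README; the `Settings` class and `load_settings` function are hypothetical and are not taken from `app.py`.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    gemini_api_key: Optional[str]  # Space secret; None keeps Gemini analysis off
    max_file_size_mb: int
    processing_timeout: int        # seconds
    app_version: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from environment variables, using the README defaults."""
    env = os.environ if env is None else env
    return Settings(
        # An empty string is treated the same as a missing secret.
        gemini_api_key=env.get("GEMINI_API_KEY") or None,
        max_file_size_mb=int(env.get("MAX_FILE_SIZE_MB", "50")),
        processing_timeout=int(env.get("PROCESSING_TIMEOUT", "300")),
        app_version=env.get("APP_VERSION", "2.0.0-enterprise"),
    )
```

Passing a plain dict instead of `os.environ` makes the loader easy to exercise in tests without touching the real environment.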
201
  ## 📚 API Reference
202
 
203
  ### Core Processing Pipeline
requirements.txt CHANGED
@@ -2,11 +2,11 @@
2
  # Strategic dependency selection for enterprise-grade reliability
3
 
4
  # Core Framework Dependencies
5
- gradio>=4.0.0,<5.0.0 # UI framework - pinned major version for stability
6
  markitdown[all]>=0.1.0 # Microsoft's document conversion engine
7
 
8
  # LLM Integration - Gemini Focus
9
- google-genai>=0.1.0 # Google Gemini API client (latest)
10
  google-auth>=2.0.0 # Authentication for Google services
11
 
12
  # Data Processing & Visualization
 
2
  # Strategic dependency selection for enterprise-grade reliability
3
 
4
  # Core Framework Dependencies
5
+ gradio>=4.44.0,<5.0.0 # UI framework - aligned with production deployment
6
  markitdown[all]>=0.1.0 # Microsoft's document conversion engine
7
 
8
  # LLM Integration - Gemini Focus
9
+ google-genai>=1.0.0 # Google Gemini API client (latest)
10
  google-auth>=2.0.0 # Authentication for Google services
11
 
12
  # Data Processing & Visualization
spaces_metadata.yaml CHANGED
@@ -6,9 +6,9 @@ emoji: "🚀"
6
  colorFrom: "blue"
7
  colorTo: "purple"
8
  sdk: "gradio"
9
- sdk_version: "4.0.0"
10
  app_file: "app.py"
11
- python_version: "3.10"
12
 
13
  # Space configuration
14
  models:
@@ -74,4 +74,4 @@ custom:
74
  max_file_size: "50MB (HF Spaces free tier)"
75
  processing_timeout: "5 minutes"
76
  memory_optimization: "Stateless architecture with automatic cleanup"
77
- concurrent_processing: "Async pipeline with resource management"
 
6
  colorFrom: "blue"
7
  colorTo: "purple"
8
  sdk: "gradio"
9
+ sdk_version: "4.44.1"
10
  app_file: "app.py"
11
+ python_version: "3.11"
12
 
13
  # Space configuration
14
  models:
 
74
  max_file_size: "50MB (HF Spaces free tier)"
75
  processing_timeout: "5 minutes"
76
  memory_optimization: "Stateless architecture with automatic cleanup"
77
+ concurrent_processing: "Async pipeline with resource management"