# 🐳 Docker-based HuggingFace Space Deployment

Deploy LinguaCustodia Financial AI as a Docker-based API endpoint.
## 🎯 Overview

This creates a professional FastAPI-based endpoint for private LinguaCustodia model inference, deployed as a HuggingFace Space with Docker.
## 📋 Space Configuration

Basic settings:

- Space name: `linguacustodia-financial-api`
- Title: 🏦 LinguaCustodia Financial AI API
- Description: Professional API endpoint for specialized financial AI models
- SDK: Docker
- Hardware: `t4-medium` (T4 Medium GPU)
- Region: `eu-west-3` (Paris, France - EU)
- Visibility: `private` (Private Space)
- Status: ✅ FULLY OPERATIONAL - https://huggingface.co/spaces/jeanbaptdzd/linguacustodia-financial-api
## 🔐 Required Secrets

In your Space Settings > Variables, you need to set:

1. `HF_TOKEN_LC` (Required)

   ```
   HF_TOKEN_LC=your_linguacustodia_token_here
   ```

   - Purpose: access to private LinguaCustodia models
   - Security: keep this token private and secure

2. Docker Hub credentials (Optional - for custom images)

   If you want to push custom Docker images to Docker Hub:

   ```
   DOCKER_USERNAME=your_dockerhub_username
   DOCKER_PASSWORD=your_hf_docker_hub_access_key
   ```

   Note: Use your HF_DOCKER_HUB_ACCESS_KEY as the Docker password for better security.
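Inside `app.py`, the token can be read from the environment at startup so a missing secret fails fast with a clear message. A minimal sketch (the helper name `get_lc_token` is illustrative, not part of the existing code):

```python
import os

def get_lc_token() -> str:
    """Read the LinguaCustodia token set in Space Settings > Variables."""
    token = os.environ.get("HF_TOKEN_LC")
    if not token:
        raise RuntimeError(
            "HF_TOKEN_LC is not set; add it under Space Settings > Variables"
        )
    return token

# The token is then passed to the Hub client when downloading private models,
# e.g. huggingface_hub's snapshot_download(..., token=get_lc_token()).
```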
## 📁 Files to Upload

Upload these files to your Space:

- `Dockerfile` - Docker configuration
- `app.py` - FastAPI application (use `respectful_linguacustodia_config.py` as a base)
- `requirements.txt` - Python dependencies
- `README.md` - Space documentation with proper YAML configuration
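For a Docker Space, the "proper YAML configuration" is the front matter at the top of `README.md`. A minimal sketch, assuming the server listens on port 8000 as described below (field values other than `sdk: docker` are taken from this guide's settings):

```yaml
---
title: LinguaCustodia Financial AI API
emoji: 🏦
sdk: docker
app_port: 8000
pinned: false
---
```

Without `app_port`, HuggingFace expects the container to serve on its default port, so it must match the port your FastAPI server binds to.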
## 🚀 Deployment Steps

1. Create a new Space
   - Go to: https://huggingface.co/new-space
   - Make sure you're logged in with your Pro account (`jeanbaptdzd`)

2. Configure the Space
   - Space name: `linguacustodia-financial-api`
   - Title: 🏦 LinguaCustodia Financial AI API
   - Description: Professional API endpoint for specialized financial AI models
   - SDK: Docker
   - Hardware: `t4-medium`
   - Region: `eu-west-3`
   - Visibility: `private`

3. Upload files
   - Upload all files from your local directory to the Space.

4. Set environment variables
   - In Space Settings > Variables, add `HF_TOKEN_LC` with your LinguaCustodia token
   - Optionally add Docker Hub credentials if needed

5. Deploy
   - Click "Create Space"
   - Wait 10-15 minutes for the Docker build and deployment
   - The Space will be available at: https://huggingface.co/spaces/jeanbaptdzd/linguacustodia-financial-api
## 🧪 API Endpoints

Once deployed, your API will have these endpoints:

Health check:

```
GET /health
```

Root information:

```
GET /
```

List available models:

```
GET /models
```

Load a model:

```
POST /load_model?model_name=LinguaCustodia/llama3.1-8b-fin-v0.3
```

Inference:

```
POST /inference
Content-Type: application/json

{
  "prompt": "What is SFCR in European insurance regulation?",
  "max_tokens": 150,
  "temperature": 0.6
}
```

Note: Uses the official LinguaCustodia parameters (temperature: 0.6, max_tokens: 150).

API documentation:

```
GET /docs
```
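The inference payload and its official defaults can be captured in one place on the client side. A sketch using a stdlib dataclass (the class name `InferenceRequest` is illustrative; it simply mirrors the JSON body above):

```python
import json
from dataclasses import asdict, dataclass

@dataclass
class InferenceRequest:
    """Mirror of the /inference JSON body with official LinguaCustodia defaults."""
    prompt: str
    max_tokens: int = 150
    temperature: float = 0.6

req = InferenceRequest(prompt="What is SFCR in European insurance regulation?")
payload = json.dumps(asdict(req))  # body for POST /inference
```

Centralizing the defaults this way keeps every caller on the official parameters unless it deliberately overrides them.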
## 💡 Example Usage

Test with curl (note that the API is served from the Space's direct URL, `https://jeanbaptdzd-linguacustodia-financial-api.hf.space`, not from the Space page URL):

```bash
# Health check
curl https://jeanbaptdzd-linguacustodia-financial-api.hf.space/health

# Inference (using official LinguaCustodia parameters)
curl -X POST "https://jeanbaptdzd-linguacustodia-financial-api.hf.space/inference" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "What is SFCR in European insurance regulation?",
    "max_tokens": 150,
    "temperature": 0.6
  }'
```
Test with Python:

```python
import requests

# Inference request (using official LinguaCustodia parameters)
response = requests.post(
    "https://jeanbaptdzd-linguacustodia-financial-api.hf.space/inference",
    json={
        "prompt": "What is SFCR in European insurance regulation?",
        "max_tokens": 150,
        "temperature": 0.6,
    },
)
result = response.json()
print(result["response"])
```
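If you'd rather not depend on `requests`, the same call works with only the standard library. A sketch (the direct `*.hf.space` URL and the `query`/`build_payload` helper names are assumptions, not existing code):

```python
import json
import urllib.request

API_BASE = "https://jeanbaptdzd-linguacustodia-financial-api.hf.space"

def build_payload(prompt: str, max_tokens: int = 150,
                  temperature: float = 0.6) -> bytes:
    """Encode the /inference JSON body with the official defaults."""
    return json.dumps({
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }).encode("utf-8")

def query(prompt: str, timeout: float = 120.0) -> dict:
    """POST an inference request and decode the JSON response."""
    req = urllib.request.Request(
        f"{API_BASE}/inference",
        data=build_payload(prompt),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())

# result = query("What is SFCR in European insurance regulation?")
# print(result["response"])
```

The generous timeout matters: the first request after startup can block while the default model finishes loading.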
Test with the provided scripts:

```bash
# Simple test
python test_api.py

# Comprehensive test
python comprehensive_test.py

# Response quality test
python test_response_quality.py
```
## 🔧 Docker Build Process

The Space will automatically:

- Build the Docker image using the Dockerfile
- Install all dependencies from requirements.txt
- Copy the application code
- Start the FastAPI server on port 8000
- Expose the API endpoints
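A minimal sketch of a Dockerfile matching those steps, assuming `uvicorn` serves the FastAPI app object named `app` in `app.py` (the base image, Python version, and server command are assumptions, not the Space's actual Dockerfile):

```dockerfile
FROM python:3.11-slim

WORKDIR /app

# Install dependencies first so Docker layer caching skips this
# step on rebuilds that only change application code
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code
COPY app.py .

# FastAPI server listens on port 8000 (must match app_port in README.md)
EXPOSE 8000
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
```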
## 🎯 Benefits of Docker Deployment

- ✅ Professional API - FastAPI with proper documentation
- ✅ Private model support - native support for private models
- ✅ T4 Medium GPU - cost-effective inference
- ✅ EU region - GDPR compliance
- ✅ Health checks - built-in monitoring
- ✅ Scalable - can handle multiple requests
- ✅ Secure - environment variables for secrets
- ✅ Truncation issue solved - 149 tokens generated (1.9x improvement)
- ✅ Official LinguaCustodia parameters - temperature 0.6, proper EOS tokens
## 🚨 Important Notes

- Model loading: the default model loads on startup (may take 2-3 minutes)
- Memory usage: 8B models need ~16GB RAM, 12B models need ~32GB
- Cost: T4 Medium costs ~$0.50/hour when active
- Security: keep `HF_TOKEN_LC` private and secure
- Monitoring: use the `/health` endpoint to check status
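Since startup takes a few minutes, a simple way to know when the Space is ready is to poll `/health` until it answers. A minimal sketch (the check function is injectable so the retry logic is test-friendly; the direct `*.hf.space` URL and function names are assumptions):

```python
import time
import urllib.request
from typing import Callable

def fetch_health(url: str) -> bool:
    """Return True if the /health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status == 200
    except OSError:
        return False

def wait_until_healthy(check: Callable[[], bool],
                       retries: int = 30, delay: float = 10.0) -> bool:
    """Poll `check` until it succeeds or retries are exhausted.

    With the defaults this waits up to ~5 minutes, which comfortably
    covers the 2-3 minute startup model load."""
    for _ in range(retries):
        if check():
            return True
        time.sleep(delay)
    return False

# Example (hypothetical direct Space URL):
# wait_until_healthy(lambda: fetch_health(
#     "https://jeanbaptdzd-linguacustodia-financial-api.hf.space/health"))
```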
🎯 Ready to deploy? Follow the steps above to create your professional Docker-based API endpoint!