
🐳 Docker-based HuggingFace Space Deployment

Deploy LinguaCustodia Financial AI as a Docker-based API endpoint.

🎯 Overview

This creates a professional FastAPI-based endpoint for private LinguaCustodia model inference, deployed as a HuggingFace Space with Docker.

📋 Space Configuration

Basic Settings:

  • Space name: linguacustodia-financial-api
  • Title: 🏦 LinguaCustodia Financial AI API
  • Description: Professional API endpoint for specialized financial AI models
  • SDK: Docker
  • Hardware: t4-medium (T4 Medium GPU)
  • Region: eu-west-3 (Paris, France - EU)
  • Visibility: private (Private Space)
  • Status: ✅ FULLY OPERATIONAL - https://huggingface.co/spaces/jeanbaptdzd/linguacustodia-financial-api

🔐 Required Secrets

In your Space Settings > Variables, you need to set:

1. HF_TOKEN_LC (Required)

HF_TOKEN_LC=your_linguacustodia_token_here
  • Purpose: Access to private LinguaCustodia models
  • Security: Keep this private and secure
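Inside the container, Space secrets are exposed as ordinary environment variables, so the app can read the token at startup. A minimal sketch (the helper name `get_lc_token` is illustrative, not part of the Space code):

```python
import os

def get_lc_token() -> str:
    """Read the LinguaCustodia HF token injected by the Space secrets."""
    token = os.environ.get("HF_TOKEN_LC")
    if not token:
        raise RuntimeError("HF_TOKEN_LC is not set; add it in Space Settings > Variables")
    return token
```

The returned token can then be passed to `huggingface_hub.login()` or to `from_pretrained(..., token=...)` when loading the private models.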

2. DOCKER_HUB Credentials (Optional - for custom images)

If you want to push custom Docker images to Docker Hub:

DOCKER_USERNAME=your_dockerhub_username
DOCKER_PASSWORD=your_hf_docker_hub_access_key

Note: Use a Docker Hub access token (stored as HF_DOCKER_HUB_ACCESS_KEY) as the Docker password instead of your account password for better security.

πŸ“ Files to Upload

Upload these files to your Space:

  1. Dockerfile - Docker configuration
  2. app.py - FastAPI application (use respectful_linguacustodia_config.py as base)
  3. requirements.txt - Python dependencies
  4. README.md - Space documentation with proper YAML configuration
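A Dockerfile along these lines would satisfy the build (a sketch, assuming app.py exposes a FastAPI instance named `app`; adjust to the actual application file):

```dockerfile
FROM python:3.11-slim

WORKDIR /app

# Install dependencies first to leverage Docker layer caching
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code
COPY app.py .

# Serve the FastAPI app on the port the Space expects
EXPOSE 8000
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
```

Since this serves on 8000 rather than the Docker Space default of 7860, set `app_port: 8000` in the README.md YAML front matter.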

🚀 Deployment Steps

1. Create New Space

  1. Go to: https://huggingface.co/new-space
  2. Make sure you're logged in with your Pro account (jeanbaptdzd)

2. Configure Space

  • Space name: linguacustodia-financial-api
  • Title: 🏦 LinguaCustodia Financial AI API
  • Description: Professional API endpoint for specialized financial AI models
  • SDK: Docker
  • Hardware: t4-medium
  • Region: eu-west-3
  • Visibility: private

3. Upload Files

Upload all files from your local directory to the Space.

4. Set Environment Variables

In Space Settings > Variables:

  • Add HF_TOKEN_LC with your LinguaCustodia token
  • Optionally add Docker Hub credentials if needed

5. Deploy

  • Click "Create Space"
  • Wait 10-15 minutes for Docker build and deployment
  • Space will be available at: https://huggingface.co/spaces/jeanbaptdzd/linguacustodia-financial-api

🧪 API Endpoints

Once deployed, your API will have these endpoints:

Health Check

GET /health

Root Information

GET /

List Available Models

GET /models

Load Model

POST /load_model?model_name=LinguaCustodia/llama3.1-8b-fin-v0.3

Inference

POST /inference
Content-Type: application/json

{
  "prompt": "What is SFCR in European insurance regulation?",
  "max_tokens": 150,
  "temperature": 0.6
}

Note: Uses official LinguaCustodia parameters (temperature: 0.6, max_tokens: 150)

API Documentation

GET /docs
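The endpoints above can be wrapped in a small client. This sketch only builds the HTTP requests with the standard library (the class name is illustrative; the base URL should be the Space's direct *.hf.space address):

```python
import json
import urllib.request

class FinanceAPIClient:
    """Tiny client for the endpoints listed above."""

    def __init__(self, base_url: str):
        # Normalize so paths can be appended directly
        self.base_url = base_url.rstrip("/")

    def _post(self, path: str, payload: dict) -> dict:
        req = urllib.request.Request(
            self.base_url + path,
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)

    def inference(self, prompt: str, max_tokens: int = 150, temperature: float = 0.6) -> dict:
        # Defaults mirror the official LinguaCustodia parameters
        return self._post("/inference", {
            "prompt": prompt,
            "max_tokens": max_tokens,
            "temperature": temperature,
        })
```

Usage would be `FinanceAPIClient("https://jeanbaptdzd-linguacustodia-financial-api.hf.space").inference("What is SFCR?")`.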

💡 Example Usage

Test with curl:

# Health check (use the Space's direct URL, not the Space page URL)
curl https://jeanbaptdzd-linguacustodia-financial-api.hf.space/health

# Inference (using official LinguaCustodia parameters)
curl -X POST "https://jeanbaptdzd-linguacustodia-financial-api.hf.space/inference" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "What is SFCR in European insurance regulation?",
    "max_tokens": 150,
    "temperature": 0.6
  }'

Test with Python:

import requests

# Inference request (using official LinguaCustodia parameters).
# Call the Space's direct URL (*.hf.space), not the Space page URL.
response = requests.post(
    "https://jeanbaptdzd-linguacustodia-financial-api.hf.space/inference",
    json={
        "prompt": "What is SFCR in European insurance regulation?",
        "max_tokens": 150,
        "temperature": 0.6
    }
)
response.raise_for_status()

result = response.json()
print(result["response"])

Test with provided scripts:

# Simple test
python test_api.py

# Comprehensive test
python comprehensive_test.py

# Response quality test
python test_response_quality.py

🔧 Docker Build Process

The Space will automatically:

  1. Build the Docker image using the Dockerfile
  2. Install all dependencies from requirements.txt
  3. Copy the application code
  4. Start the FastAPI server on port 8000
  5. Expose the API endpoints

🎯 Benefits of Docker Deployment

  • ✅ Professional API - FastAPI with proper documentation
  • ✅ Private model support - Native support for private models
  • ✅ T4 Medium GPU - Cost-effective inference
  • ✅ EU region - GDPR compliance
  • ✅ Health checks - Built-in monitoring
  • ✅ Scalable - Can handle multiple requests
  • ✅ Secure - Environment variables for secrets
  • ✅ Truncation issue solved - 149 tokens generated (1.9x improvement)
  • ✅ Official LinguaCustodia parameters - Temperature 0.6, proper EOS tokens

🚨 Important Notes

  • Model Loading: The default model loads on startup (may take 2-3 minutes)
  • Memory Usage: 8B models need ~16GB RAM, 12B models need ~32GB
  • Cost: T4 Medium costs ~$0.50/hour when active
  • Security: Keep HF_TOKEN_LC private and secure
  • Monitoring: Use /health endpoint to check status
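At the quoted ~$0.50/hour, running cost is easy to estimate (a back-of-the-envelope sketch; actual billing depends on the Space's sleep and idle settings):

```python
HOURLY_RATE_USD = 0.50  # approximate T4 Medium rate quoted above

def monthly_cost(hours_per_day: float, days: int = 30) -> float:
    """Estimated monthly cost for a given daily duty cycle."""
    return hours_per_day * days * HOURLY_RATE_USD

print(monthly_cost(24))  # always-on: 360.0
print(monthly_cost(8))   # business hours only: 120.0
```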

🎯 Ready to deploy? Follow the steps above to create your professional Docker-based API endpoint!