
🐳 Docker-based HuggingFace Space Deployment

Deploy LinguaCustodia Financial AI as a Docker-based API endpoint.

🎯 Overview

This creates a professional FastAPI-based endpoint for private LinguaCustodia model inference, deployed as a HuggingFace Space with Docker.

📋 Space Configuration

Basic Settings:

  • Space name: linguacustodia-financial-api
  • Title: 🏦 LinguaCustodia Financial AI API
  • Description: Professional API endpoint for specialized financial AI models
  • SDK: Docker
  • Hardware: t4-medium (T4 Medium GPU)
  • Region: eu-west-3 (Paris, France - EU)
  • Visibility: private (Private Space)
  • Status: ✅ FULLY OPERATIONAL - https://huggingface.co/spaces/jeanbaptdzd/linguacustodia-financial-api

🔐 Required Secrets

In your Space Settings > Variables, you need to set:

1. HF_TOKEN_LC (Required)

HF_TOKEN_LC=your_linguacustodia_token_here
  • Purpose: Access to private LinguaCustodia models
  • Security: Keep this private and secure
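Inside the container, Space secrets are exposed as ordinary environment variables, so the app can read the token at startup. A minimal sketch (the helper name `get_lc_token` is illustrative, not part of the Space code):

```python
import os

def get_lc_token() -> str:
    """Read the LinguaCustodia HF token injected by the Space secrets."""
    token = os.environ.get("HF_TOKEN_LC")
    if not token:
        raise RuntimeError("HF_TOKEN_LC is not set; add it in Space Settings > Variables")
    return token
```

The returned token can then be passed to `huggingface_hub.login()` or to `from_pretrained(..., token=...)` when loading the private models.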

2. DOCKER_HUB Credentials (Optional - for custom images)

If you want to push custom Docker images to Docker Hub:

DOCKER_USERNAME=your_dockerhub_username
DOCKER_PASSWORD=your_hf_docker_hub_access_key

Note: Use a Docker Hub access token (stored as HF_DOCKER_HUB_ACCESS_KEY) as the Docker password instead of your account password for better security.

πŸ“ Files to Upload

Upload these files to your Space:

  1. Dockerfile - Docker configuration
  2. app.py - FastAPI application (use respectful_linguacustodia_config.py as base)
  3. requirements.txt - Python dependencies
  4. README.md - Space documentation with proper YAML configuration
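A Dockerfile along these lines would satisfy the build (a sketch, assuming app.py exposes a FastAPI instance named `app`; adjust to the actual application file):

```dockerfile
FROM python:3.11-slim

WORKDIR /app

# Install dependencies first to leverage Docker layer caching
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code
COPY app.py .

# Serve the FastAPI app on the port the Space expects
EXPOSE 8000
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
```

Since this serves on 8000 rather than the Docker Space default of 7860, set `app_port: 8000` in the README.md YAML front matter.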

🚀 Deployment Steps

1. Create New Space

  1. Go to: https://huggingface.co/new-space
  2. Make sure you're logged in with your Pro account (jeanbaptdzd)

2. Configure Space

  • Space name: linguacustodia-financial-api
  • Title: 🏦 LinguaCustodia Financial AI API
  • Description: Professional API endpoint for specialized financial AI models
  • SDK: Docker
  • Hardware: t4-medium
  • Region: eu-west-3
  • Visibility: private

3. Upload Files

Upload all files from your local directory to the Space.

4. Set Environment Variables

In Space Settings > Variables:

  • Add HF_TOKEN_LC with your LinguaCustodia token
  • Optionally add Docker Hub credentials if needed

5. Deploy

  • Click "Create Space"
  • Wait 10-15 minutes for Docker build and deployment
  • Space will be available at: https://huggingface.co/spaces/jeanbaptdzd/linguacustodia-financial-api

🧪 API Endpoints

Once deployed, your API will have these endpoints:

Health Check

GET /health

Root Information

GET /

List Available Models

GET /models

Load Model

POST /load_model?model_name=LinguaCustodia/llama3.1-8b-fin-v0.3

Inference

POST /inference
Content-Type: application/json

{
  "prompt": "What is SFCR in European insurance regulation?",
  "max_tokens": 150,
  "temperature": 0.6
}

Note: Uses official LinguaCustodia parameters (temperature: 0.6, max_tokens: 150)

API Documentation

GET /docs
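The endpoints above can be wrapped in a small client. This sketch only builds the HTTP requests with the standard library (the class name is illustrative; the base URL should be the Space's direct *.hf.space address):

```python
import json
import urllib.request

class FinanceAPIClient:
    """Tiny client for the endpoints listed above."""

    def __init__(self, base_url: str):
        # Normalize so paths can be appended directly
        self.base_url = base_url.rstrip("/")

    def _post(self, path: str, payload: dict) -> dict:
        req = urllib.request.Request(
            self.base_url + path,
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)

    def inference(self, prompt: str, max_tokens: int = 150, temperature: float = 0.6) -> dict:
        # Defaults mirror the official LinguaCustodia parameters
        return self._post("/inference", {
            "prompt": prompt,
            "max_tokens": max_tokens,
            "temperature": temperature,
        })
```

Usage would be `FinanceAPIClient("https://jeanbaptdzd-linguacustodia-financial-api.hf.space").inference("What is SFCR?")`.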

💡 Example Usage

Test with curl:

# Health check (use the Space's direct URL, not the Space page URL)
curl https://jeanbaptdzd-linguacustodia-financial-api.hf.space/health

# Inference (using official LinguaCustodia parameters)
curl -X POST "https://jeanbaptdzd-linguacustodia-financial-api.hf.space/inference" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "What is SFCR in European insurance regulation?",
    "max_tokens": 150,
    "temperature": 0.6
  }'

Test with Python:

import requests

# Inference request (using official LinguaCustodia parameters).
# Call the Space's direct URL (*.hf.space), not the Space page URL.
response = requests.post(
    "https://jeanbaptdzd-linguacustodia-financial-api.hf.space/inference",
    json={
        "prompt": "What is SFCR in European insurance regulation?",
        "max_tokens": 150,
        "temperature": 0.6
    }
)
response.raise_for_status()

result = response.json()
print(result["response"])

Test with provided scripts:

# Simple test
python test_api.py

# Comprehensive test
python comprehensive_test.py

# Response quality test
python test_response_quality.py

🔧 Docker Build Process

The Space will automatically:

  1. Build the Docker image using the Dockerfile
  2. Install all dependencies from requirements.txt
  3. Copy the application code
  4. Start the FastAPI server on port 8000
  5. Expose the API endpoints

🎯 Benefits of Docker Deployment

  • ✅ Professional API - FastAPI with proper documentation
  • ✅ Private model support - Native support for private models
  • ✅ T4 Medium GPU - Cost-effective inference
  • ✅ EU region - GDPR compliance
  • ✅ Health checks - Built-in monitoring
  • ✅ Scalable - Can handle multiple requests
  • ✅ Secure - Environment variables for secrets
  • ✅ Truncation issue solved - 149 tokens generated (1.9x improvement)
  • ✅ Official LinguaCustodia parameters - Temperature 0.6, proper EOS tokens

🚨 Important Notes

  • Model Loading: The default model loads on startup (may take 2-3 minutes)
  • Memory Usage: 8B models need ~16GB RAM, 12B models need ~32GB
  • Cost: T4 Medium costs ~$0.50/hour when active
  • Security: Keep HF_TOKEN_LC private and secure
  • Monitoring: Use /health endpoint to check status
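At the quoted ~$0.50/hour, running cost is easy to estimate (a back-of-the-envelope sketch; actual billing depends on the Space's sleep and idle settings):

```python
HOURLY_RATE_USD = 0.50  # approximate T4 Medium rate quoted above

def monthly_cost(hours_per_day: float, days: int = 30) -> float:
    """Estimated monthly cost for a given daily duty cycle."""
    return hours_per_day * days * HOURLY_RATE_USD

print(monthly_cost(24))  # always-on: 360.0
print(monthly_cost(8))   # business hours only: 120.0
```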

🎯 Ready to deploy? Follow the steps above to create your professional Docker-based API endpoint!