πŸ—οΈ LinguaCustodia API Architecture

πŸ“‹ Overview

This document describes the clean, scalable architecture for the LinguaCustodia Financial AI API, designed to support multiple models and inference providers (HuggingFace, Scaleway, Koyeb).

🎯 Design Principles

  1. Configuration Pattern: Centralized configuration management
  2. Provider Abstraction: Support multiple inference providers
  3. Model Registry: Easy model switching and management
  4. Separation of Concerns: Clear module boundaries
  5. Solid Logging: Structured, contextual logging
  6. Testability: Easy to test and maintain

πŸ“ Project Structure

LLM-Pro-Fin-Inference/
β”œβ”€β”€ config/                      # Configuration module
β”‚   β”œβ”€β”€ __init__.py             # Exports all configs
β”‚   β”œβ”€β”€ base_config.py          # Base application config
β”‚   β”œβ”€β”€ model_configs.py        # Model-specific configs
β”‚   β”œβ”€β”€ provider_configs.py     # Provider-specific configs
β”‚   └── logging_config.py       # Logging setup
β”‚
β”œβ”€β”€ core/                        # Core business logic
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ storage_manager.py      # Storage abstraction
β”‚   β”œβ”€β”€ model_loader.py         # Model loading abstraction
β”‚   └── inference_engine.py     # Inference abstraction
β”‚
β”œβ”€β”€ providers/                   # Provider implementations
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ base_provider.py        # Abstract base class
β”‚   β”œβ”€β”€ huggingface_provider.py # HF implementation
β”‚   β”œβ”€β”€ scaleway_provider.py    # Scaleway implementation
β”‚   └── koyeb_provider.py       # Koyeb implementation
β”‚
β”œβ”€β”€ api/                         # API layer
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ app.py                  # FastAPI application
β”‚   β”œβ”€β”€ routes.py               # API routes
β”‚   └── models.py               # Pydantic models
β”‚
β”œβ”€β”€ utils/                       # Utilities
β”‚   β”œβ”€β”€ __init__.py
β”‚   └── helpers.py              # Helper functions
β”‚
β”œβ”€β”€ tests/                       # Tests (keep existing)
β”‚   β”œβ”€β”€ test_api.py
β”‚   β”œβ”€β”€ test_model_loading.py
β”‚   └── ...
β”‚
β”œβ”€β”€ docs/                        # Documentation
β”‚   β”œβ”€β”€ ARCHITECTURE.md         # This file
β”‚   β”œβ”€β”€ API_REFERENCE.md        # API documentation
β”‚   └── DEPLOYMENT.md           # Deployment guide
β”‚
β”œβ”€β”€ app.py                       # Main entry point
β”œβ”€β”€ requirements.txt             # Dependencies
β”œβ”€β”€ .env.example                 # Environment template
└── README.md                    # Project overview

πŸ”§ Configuration Pattern

Base Configuration (config/base_config.py)

Purpose: Provides foundational settings and defaults for the entire application.

Features:

  • API settings (host, port, CORS)
  • Storage configuration
  • Logging configuration
  • Environment variable loading
  • Provider selection

Usage:

from config import BaseConfig

config = BaseConfig.from_env()
print(config.to_dict())
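
The exact fields of BaseConfig are not fixed by this document; the following is a minimal sketch of what config/base_config.py could look like, assuming a dataclass with environment-variable fallbacks (all field and variable names below are illustrative):

import os
from dataclasses import asdict, dataclass

@dataclass
class BaseConfig:
    host: str = "0.0.0.0"
    port: int = 8000
    log_level: str = "INFO"
    provider: str = "huggingface"
    storage_path: str = "/data/models"

    @classmethod
    def from_env(cls) -> "BaseConfig":
        # Each field falls back to its default when the variable is unset.
        return cls(
            host=os.getenv("API_HOST", "0.0.0.0"),
            port=int(os.getenv("API_PORT", "8000")),
            log_level=os.getenv("LOG_LEVEL", "INFO"),
            provider=os.getenv("INFERENCE_PROVIDER", "huggingface"),
            storage_path=os.getenv("STORAGE_PATH", "/data/models"),
        )

    def to_dict(self) -> dict:
        return asdict(self)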

Model Configurations (config/model_configs.py)

Purpose: Defines model-specific parameters and generation settings.

Features:

  • Model registry for all LinguaCustodia models
  • Generation configurations per model
  • Memory requirements
  • Hardware recommendations

Usage:

from config import get_model_config, list_available_models

# List available models
models = list_available_models()  # ['llama3.1-8b', 'qwen3-8b', ...]

# Get specific model config
config = get_model_config('llama3.1-8b')
print(config.generation_config.temperature)
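
A plausible shape for the registry behind these helpers; the dataclass fields and repo ids below are placeholders for illustration, not the actual LinguaCustodia configuration:

from dataclasses import dataclass, field

@dataclass
class GenerationConfig:
    temperature: float = 0.7
    top_p: float = 0.95
    max_new_tokens: int = 512

@dataclass
class ModelConfig:
    model_id: str   # HuggingFace repo id (values below are placeholders)
    memory_gb: int  # approximate VRAM needed to load the model
    generation_config: GenerationConfig = field(default_factory=GenerationConfig)

_MODEL_REGISTRY = {
    "llama3.1-8b": ModelConfig("LinguaCustodia/llama3.1-8b", memory_gb=16),
    "qwen3-8b": ModelConfig("LinguaCustodia/qwen3-8b", memory_gb=16),
}

def list_available_models() -> list:
    return sorted(_MODEL_REGISTRY)

def get_model_config(name: str) -> ModelConfig:
    if name not in _MODEL_REGISTRY:
        raise ValueError(f"Unknown model '{name}'; available: {list_available_models()}")
    return _MODEL_REGISTRY[name]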

Provider Configurations (config/provider_configs.py)

Purpose: Defines provider-specific settings for different inference platforms.

Features:

  • Provider registry (HuggingFace, Scaleway, Koyeb)
  • API endpoints and authentication
  • Provider capabilities (streaming, batching)
  • Rate limiting and timeouts

Usage:

from config import get_provider_config

provider = get_provider_config('huggingface')
print(provider.api_endpoint)
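
One way the provider registry could be laid out. The capability flags are assumptions, and the Scaleway and Koyeb endpoints are placeholders (only the HuggingFace Inference API URL shown is a real public endpoint):

from dataclasses import dataclass
from enum import Enum

class ProviderType(str, Enum):
    HUGGINGFACE = "huggingface"
    SCALEWAY = "scaleway"
    KOYEB = "koyeb"

@dataclass
class ProviderConfig:
    name: ProviderType
    api_endpoint: str
    supports_streaming: bool = False
    timeout_seconds: int = 60

_PROVIDER_REGISTRY = {
    ProviderType.HUGGINGFACE: ProviderConfig(
        ProviderType.HUGGINGFACE,
        "https://api-inference.huggingface.co",
        supports_streaming=True,
    ),
    # Placeholder endpoints -- substitute the real deployment URLs.
    ProviderType.SCALEWAY: ProviderConfig(ProviderType.SCALEWAY, "https://scaleway.example"),
    ProviderType.KOYEB: ProviderConfig(ProviderType.KOYEB, "https://koyeb.example"),
}

def get_provider_config(name) -> ProviderConfig:
    return _PROVIDER_REGISTRY[ProviderType(name)]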

Logging Configuration (config/logging_config.py)

Purpose: Provides structured, contextual logging.

Features:

  • Colored console output
  • JSON structured logs
  • File rotation
  • Context managers for extra fields
  • Multiple log levels

Usage:

from config import setup_logging, get_logger, LogContext

# Setup logging (once at startup)
setup_logging(log_level="INFO", log_to_file=True)

# Get logger in any module
logger = get_logger(__name__)
logger.info("Starting application")

# Add context to logs
with LogContext(logger, user_id="123", request_id="abc"):
    logger.info("Processing request")

🎨 Benefits of This Architecture

1. Multi-Provider Support

  • Easy to switch between HuggingFace, Scaleway, Koyeb
  • Consistent interface across providers (see the sketch below)
  • Provider-specific optimizations
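
The consistent interface can be pinned down with an abstract base class. A sketch of what providers/base_provider.py might define, with illustrative method names:

from abc import ABC, abstractmethod

class BaseProvider(ABC):
    """Contract every inference provider must satisfy."""

    @abstractmethod
    def generate(self, prompt: str, max_tokens: int = 256) -> str:
        """Run inference and return the generated text."""

    @abstractmethod
    def health_check(self) -> bool:
        """Return True when the backend is reachable."""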

2. Model Flexibility

  • Easy to add new models
  • Centralized model configurations
  • Model-specific generation parameters

3. Maintainability

  • Clear separation of concerns
  • Small, focused modules
  • Easy to test and debug

4. Scalability

  • Provider abstraction decouples the API from any single backend, so capacity can grow by adding providers
  • Configuration-driven behavior
  • Easy to add new features

5. Production-Ready

  • Proper logging and monitoring
  • Error handling and retries
  • Configuration management

πŸ“¦ Files to Keep

Core Application Files

βœ… app.py                    # Main entry point
βœ… requirements.txt          # Dependencies
βœ… .env.example             # Environment template
βœ… README.md                # Project documentation
βœ… Dockerfile               # Docker configuration

Test Files (All in tests/ directory)

βœ… test_api.py
βœ… test_model_loading.py
βœ… test_private_access.py
βœ… comprehensive_test.py
βœ… test_response_quality.py

Documentation Files

βœ… PROJECT_RULES.md
βœ… MODEL_PARAMETERS_GUIDE.md
βœ… PERSISTENT_STORAGE_SETUP.md
βœ… DOCKER_SPACE_DEPLOYMENT.md

πŸ—‘οΈ Files to Remove

Redundant/Old Implementation Files

❌ space_app.py                    # Old Space app
❌ space_app_with_storage.py       # Old storage app
❌ persistent_storage_app.py       # Old storage app
❌ memory_efficient_app.py         # Old optimized app
❌ respectful_linguacustodia_config.py  # Old config
❌ storage_enabled_respectful_app.py    # Refactored version
❌ app_refactored.py               # Intermediate refactor

Test Files to Relocate (move into tests/)

❌ test_app_locally.py            # Move to tests/
❌ test_fallback_locally.py       # Move to tests/
❌ test_storage_detection.py      # Move to tests/
❌ test_storage_setup.py          # Move to tests/
❌ test_private_endpoint.py       # Move to tests/

Investigation/Temporary Files

❌ investigate_model_configs.py   # One-time investigation
❌ evaluate_remote_models.py      # Development script
❌ verify_*.py                    # All verification scripts

Analysis/Documentation (Archive)

❌ LINGUACUSTODIA_INFERENCE_ANALYSIS.md  # Archive to docs/archive/

πŸš€ Migration Plan

Phase 1: Configuration Layer βœ…

  • Create config module structure
  • Implement base config
  • Implement model configs
  • Implement provider configs
  • Implement logging config

Phase 2: Core Layer (Next)

  • Implement StorageManager
  • Implement ModelLoader
  • Implement InferenceEngine (all three classes sketched below)
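
An illustrative outline of the three Phase 2 classes, consistent with how Example 1 below uses them; the bodies are stubs, not the real implementation:

class StorageManager:
    """Resolves where model weights live (e.g. a persistent /data volume)."""
    def __init__(self, config):
        self.cache_dir = config.storage_path

    def model_path(self, model_id: str) -> str:
        return f"{self.cache_dir}/{model_id}"

class ModelLoader:
    """Loads a model lazily on first use."""
    def __init__(self, config, model_config):
        self.config = config
        self.model_config = model_config
        self.model = None  # real code would load weights via transformers here

class InferenceEngine:
    """Thin wrapper that applies the model's generation config."""
    def __init__(self, loader):
        self.loader = loader

    def generate(self, prompt: str, max_tokens: int = 256) -> str:
        raise NotImplementedError("Phase 2 work item")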

Phase 3: Provider Layer

  • Implement BaseProvider
  • Implement HuggingFaceProvider
  • Implement ScalewayProvider (stub)
  • Implement KoyebProvider (stub)

Phase 4: API Layer

  • Refactor FastAPI app
  • Implement routes module (see the sketch below)
  • Update Pydantic models
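
A minimal sketch of the refactored API layer, assuming FastAPI with a Pydantic request model; the route and field names here are illustrative, not the final API surface:

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="LinguaCustodia Financial AI API")

class GenerateRequest(BaseModel):
    prompt: str
    max_tokens: int = 150

@app.post("/generate")
def generate(request: GenerateRequest) -> dict:
    # Real route would delegate to InferenceEngine.generate(); stubbed here.
    return {"completion": "", "model": "stub"}

@app.get("/health")
def health() -> dict:
    return {"status": "ok"}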

Phase 5: Cleanup

  • Move test files to tests/
  • Remove redundant files
  • Update documentation
  • Update deployment configs

πŸ“ Usage Examples

Example 1: Basic Usage

from config import BaseConfig, get_model_config, setup_logging
from core import StorageManager, ModelLoader, InferenceEngine

# Setup
config = BaseConfig.from_env()
setup_logging(config.log_level)
model_config = get_model_config('llama3.1-8b')

# Initialize
storage = StorageManager(config)
loader = ModelLoader(config, model_config)
engine = InferenceEngine(loader)

# Inference
result = engine.generate("What is SFCR?", max_tokens=150)
print(result)

Example 2: Provider Switching

from config import BaseConfig, ProviderType

# HuggingFace (local)
config = BaseConfig(provider=ProviderType.HUGGINGFACE)

# Scaleway (cloud)
config = BaseConfig(provider=ProviderType.SCALEWAY)

# Koyeb (cloud)
config = BaseConfig(provider=ProviderType.KOYEB)

Example 3: Model Switching

from config import get_model_config

# Load different models
llama_config = get_model_config('llama3.1-8b')
qwen_config = get_model_config('qwen3-8b')
gemma_config = get_model_config('gemma3-12b')

🎯 Next Steps

  1. Review this architecture - Ensure it meets your needs
  2. Implement core layer - StorageManager, ModelLoader, InferenceEngine
  3. Implement provider layer - Start with HuggingFaceProvider
  4. Refactor API layer - Update FastAPI app
  5. Clean up files - Remove redundant files
  6. Update tests - Test new architecture
  7. Deploy - Test in production

πŸ“ž Questions?

This architecture provides:

  • βœ… Configuration pattern for flexibility
  • βœ… Multi-provider support (HF, Scaleway, Koyeb)
  • βœ… Solid logging implementation
  • βœ… Clean, maintainable code structure
  • βœ… A codebase that is easy to extend and test

Ready to proceed with Phase 2 (Core Layer)?