Kwaipilot-KAT-Dev-CPT-LoRA-FFN-Block-Adapter-HyperSwitch

A LoRA adapter fine-tuned from Kwaipilot/KAT-Dev (32B) on the Hyperswitch Rust codebase. It is specialized for payment processing patterns, Hyperswitch architecture, and Rust development practices.

🎯 Model Description

This LoRA adapter was trained on samples extracted from the Hyperswitch codebase to improve code understanding, explanation, and generation within the payment processing domain. The adapter uses an FFN-focused approach (MLP blocks only) for efficient fine-tuning.

  • Base Model: Kwaipilot/KAT-Dev (32B parameters)
  • Training Type: Causal Language Modeling (CLM) with LoRA
  • Domain: Payment Processing, Rust Development
  • Specialization: Hyperswitch codebase patterns and architecture
  • Adapter Focus: Feed-Forward Network (FFN) blocks only

📊 Training Details

LoRA Configuration

r: 64                   # LoRA rank
alpha: 128              # LoRA alpha (2*r)
dropout: 0.05           # LoRA dropout
target_modules:         # FFN blocks only
  - gate_proj           # FFN gating projection
  - up_proj             # FFN up projection
  - down_proj           # FFN down projection
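
To give a feel for what this configuration implies, the sketch below counts the trainable parameters that LoRA adds when only the three FFN projections are adapted, and computes the scaling factor alpha / r. The helper names are hypothetical, and the hidden size, intermediate size, and layer count are illustrative assumptions rather than values read from the KAT-Dev config; substitute the real ones from the base model's `config.json`.

```python
def lora_params_per_linear(d_in: int, d_out: int, r: int) -> int:
    """Parameters LoRA adds to one linear layer: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


def ffn_lora_params(hidden: int, intermediate: int, layers: int, r: int) -> int:
    """Total LoRA parameters when only gate_proj, up_proj and down_proj are adapted."""
    gate = lora_params_per_linear(hidden, intermediate, r)  # hidden -> intermediate
    up = lora_params_per_linear(hidden, intermediate, r)    # hidden -> intermediate
    down = lora_params_per_linear(intermediate, hidden, r)  # intermediate -> hidden
    return layers * (gate + up + down)


r, alpha = 64, 128
scaling = alpha / r  # standard LoRA scales the low-rank update by alpha / r

# ASSUMPTION: illustrative dimensions, not taken from the KAT-Dev config.
hidden, intermediate, layers = 5120, 25600, 64
total = ffn_lora_params(hidden, intermediate, layers, r)
print(f"scaling = {scaling}, trainable LoRA params = {total / 1e6:.0f}M")
```

Under these assumed dimensions the adapter holds a few hundred million trainable parameters, a small fraction of the 32B base model.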

Training Hyperparameters

  • Epochs: 2
  • Batch Size: 8 per device (128 effective with 4 GPUs × gradient accumulation)
  • Learning Rate: 1e-4 (cosine schedule with 3% warmup)
  • Weight Decay: 0.1
  • Max Context: 8,192 tokens
  • Precision: bfloat16 + TF32
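
The batch-size and learning-rate settings above can be sketched as plain arithmetic. The gradient accumulation value of 4 is inferred from 128 / (8 × 4) and is not stated in this card; `total_steps` is a hypothetical placeholder, and the warmup step count uses simple truncation, which may differ by one step from what a given trainer computes.

```python
import math


def effective_batch_size(per_device: int, num_gpus: int, grad_accum_steps: int) -> int:
    """Number of samples contributing to each optimizer step."""
    return per_device * num_gpus * grad_accum_steps


def cosine_lr(step: int, total_steps: int, peak_lr: float = 1e-4,
              warmup_ratio: float = 0.03) -> float:
    """Linear warmup to peak_lr, then cosine decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# ASSUMPTION: 4 accumulation steps, inferred from 128 / (8 * 4).
print(effective_batch_size(per_device=8, num_gpus=4, grad_accum_steps=4))

# ASSUMPTION: total_steps is a placeholder; the real value depends on dataset size.
total_steps = 1000
for step in (0, 15, 30, 500, 1000):
    print(step, f"{cosine_lr(step, total_steps):.2e}")
```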

🚀 Usage

Quick Start

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Kwaipilot/KAT-Dev",
    dtype=torch.bfloat16,  # older transformers releases use torch_dtype= instead
    device_map="auto",
    trust_remote_code=True
)

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(
    "Kwaipilot/KAT-Dev",
    trust_remote_code=True
)

# Load LoRA adapter
model = PeftModel.from_pretrained(
    base_model, 
    "AdityaNarayan/Kwaipilot-KAT-Dev-CPT-LoRA-FFN-Block-Adapter-HyperSwitch"
)

# Generate code
prompt = """// File: hyperswitch/crates/router/src/core/payments.rs
// Task: Complete the payment validation function

pub fn validate_payment_method("""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.2,  # Lower temperature for code generation
    do_sample=True,
    top_p=0.95,
    pad_token_id=tokenizer.eos_token_id
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Recommended Settings

For Code Generation

generation_config = {
    "max_new_tokens": 256,
    "temperature": 0.2,
    "top_p": 0.95,
    "do_sample": True,
    "repetition_penalty": 1.1
}

For Code Explanation

generation_config = {
    "max_new_tokens": 512,
    "temperature": 0.5,
    "top_p": 0.9,
    "do_sample": True
}

For Documentation Generation

generation_config = {
    "max_new_tokens": 384,
    "temperature": 0.4,
    "top_p": 0.9,
    "do_sample": True
}
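
One way to use the three presets above is to keep them in a single lookup table and select by task. This is a minimal sketch; the task keys and the `get_generation_config` helper are hypothetical names, and the values are copied from the settings listed above.

```python
GENERATION_PRESETS = {
    "code": {
        "max_new_tokens": 256,
        "temperature": 0.2,
        "top_p": 0.95,
        "do_sample": True,
        "repetition_penalty": 1.1,
    },
    "explanation": {
        "max_new_tokens": 512,
        "temperature": 0.5,
        "top_p": 0.9,
        "do_sample": True,
    },
    "documentation": {
        "max_new_tokens": 384,
        "temperature": 0.4,
        "top_p": 0.9,
        "do_sample": True,
    },
}


def get_generation_config(task: str, **overrides) -> dict:
    """Return a copy of the preset for `task`, with optional per-call overrides."""
    if task not in GENERATION_PRESETS:
        raise ValueError(f"unknown task {task!r}; expected one of {sorted(GENERATION_PRESETS)}")
    config = dict(GENERATION_PRESETS[task])  # copy so the preset itself is never mutated
    config.update(overrides)
    return config


config = get_generation_config("explanation", max_new_tokens=768)
print(config)
```

The returned dictionary can be unpacked into the call from the Quick Start, as in `model.generate(**inputs, **config, pad_token_id=tokenizer.eos_token_id)`.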

πŸ› οΈ Technical Specifications

  • Context Window: 8,192 tokens
  • Precision: bfloat16
  • Memory Usage:
    • Inference: ~65GB VRAM (base model + adapter)
  • Inference Speed: Optimized with Flash Attention 2
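
The ~65GB inference figure is roughly what the weights alone require in bfloat16, as the back-of-the-envelope sketch below shows. The adapter parameter count is a hypothetical placeholder, and the estimate ignores activations and the KV cache, which grow with batch size and context length.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Memory for model weights in decimal GB; bfloat16 uses 2 bytes per parameter."""
    return num_params * bytes_per_param / 1e9


base_gb = weight_memory_gb(32e9)  # 32B-parameter base model in bfloat16

# ASSUMPTION: placeholder adapter size; the real count depends on the base model's dimensions.
adapter_gb = weight_memory_gb(0.4e9)

print(f"base = {base_gb:.1f} GB, adapter = {adapter_gb:.1f} GB, "
      f"total weights = {base_gb + adapter_gb:.1f} GB")
```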

πŸ™ Acknowledgments

  • Kwaipilot Team for the excellent KAT-Dev base model
  • Juspay/Hyperswitch Team for the open-source payment processing platform
  • Hugging Face for transformers, PEFT, and TRL libraries

📞 Citation

@misc{hyperswitch-kat-dev-ffn-lora-2025,
  title={Kwaipilot-KAT-Dev-CPT-LoRA-FFN-Block-Adapter-HyperSwitch},
  author={Aditya Narayan},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/AdityaNarayan/Kwaipilot-KAT-Dev-CPT-LoRA-FFN-Block-Adapter-HyperSwitch},
}

Note: This is an FFN-block-only LoRA adapter.
