Kwaipilot-KAT-Dev-CPT-LoRA-FFN-Block-Adapter-HyperSwitch
A LoRA adapter fine-tuned from Kwaipilot/KAT-Dev (32B) on the Hyperswitch Rust codebase. It is specialized for payment processing patterns, Hyperswitch architecture, and Rust development practices.
Model Description
This LoRA adapter was trained on samples extracted from the Hyperswitch codebase to improve code understanding, explanation, and generation within the payment processing domain. The adapter uses an FFN-focused approach (MLP blocks only) for efficient fine-tuning.
- Base Model: Kwaipilot/KAT-Dev (32B parameters)
- Training Type: Causal Language Modeling (CLM) with LoRA
- Domain: Payment Processing, Rust Development
- Specialization: Hyperswitch codebase patterns and architecture
- Adapter Focus: Feed-Forward Network (FFN) blocks only
Training Details
LoRA Configuration
r: 64                # LoRA rank
alpha: 128           # LoRA alpha (2*r)
dropout: 0.05        # LoRA dropout
target_modules:      # FFN blocks only
  - gate_proj        # FFN gating projection
  - up_proj          # FFN up projection
  - down_proj        # FFN down projection
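To make the configuration above concrete, here is a minimal sketch of two quantities it determines in standard LoRA: the scaling factor alpha / r applied to the low-rank update, and the number of trainable parameters added per adapted weight matrix. The layer sizes `HIDDEN` and `INTERMEDIATE` are hypothetical placeholders, not the dimensions of KAT-Dev, and the shape mapping assumes the usual gated-MLP layout for these module names.

```python
def lora_scaling(alpha: float, r: int) -> float:
    """Factor applied to the low-rank update B @ A in standard LoRA."""
    return alpha / r


def lora_params_per_matrix(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters added to one (d_out x d_in) weight.

    A has shape (r x d_in) and B has shape (d_out x r).
    """
    return r * (d_in + d_out)


# Values taken from the configuration above.
R, ALPHA = 64, 128

# Hypothetical layer sizes, chosen only for illustration.
HIDDEN, INTERMEDIATE = 4096, 16384

# Assumed gated-MLP shapes as (d_in, d_out) for each target module.
FFN_SHAPES = {
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

scaling = lora_scaling(ALPHA, R)
params_per_layer = sum(
    lora_params_per_matrix(d_in, d_out, R) for d_in, d_out in FFN_SHAPES.values()
)

print(f"scaling factor: {scaling}")
print(f"trainable LoRA parameters per layer: {params_per_layer:,}")
```

Multiplying the per-layer figure by the number of transformer layers gives the total adapter size for a given architecture.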
Training Hyperparameters
- Epochs: 2
- Batch Size: 8 per device (128 effective with 4 GPUs × gradient accumulation)
- Learning Rate: 1e-4 (cosine schedule with 3% warmup)
- Weight Decay: 0.1
- Max Context: 8,192 tokens
- Precision: bfloat16 + TF32
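The batch-size and learning-rate entries above can be unpacked with a little arithmetic. The sketch below derives the gradient accumulation steps implied by the stated numbers (an inference, since the card does not give the value directly) and evaluates a cosine schedule with 3% warmup. It assumes linear warmup from zero and decay to zero; `TOTAL_STEPS` is a hypothetical value, since the actual step count is not stated.

```python
import math


def implied_accumulation_steps(effective_batch: int, per_device_batch: int, num_gpus: int) -> int:
    """Gradient accumulation steps needed to reach the effective batch size."""
    per_step = per_device_batch * num_gpus
    if effective_batch % per_step != 0:
        raise ValueError("effective batch is not a multiple of per-step batch")
    return effective_batch // per_step


def lr_at_step(step: int, total_steps: int, peak_lr: float = 1e-4, warmup_ratio: float = 0.03) -> float:
    """Linear warmup to peak_lr, then cosine decay to zero (assumed shape)."""
    warmup_steps = round(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# 8 per device on 4 GPUs reaching 128 effective, as listed above.
accumulation = implied_accumulation_steps(128, 8, 4)
print(f"implied gradient accumulation steps: {accumulation}")

TOTAL_STEPS = 1000  # hypothetical, for illustration only
for step in (0, 15, 30, 515, 1000):
    print(f"step {step:>4}: lr = {lr_at_step(step, TOTAL_STEPS):.2e}")
```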
Usage
Quick Start
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Kwaipilot/KAT-Dev",
    dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(
    "Kwaipilot/KAT-Dev",
    trust_remote_code=True
)

# Load LoRA adapter
model = PeftModel.from_pretrained(
    base_model,
    "AdityaNarayan/Kwaipilot-KAT-Dev-CPT-LoRA-FFN-Block-Adapter-HyperSwitch"
)

# Generate code
prompt = """// File: hyperswitch/crates/router/src/core/payments.rs
// Task: Complete the payment validation function
pub fn validate_payment_method("""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.2,  # Lower temperature for code generation
    do_sample=True,
    top_p=0.95,
    pad_token_id=tokenizer.eos_token_id
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
Recommended Settings
For Code Generation
generation_config = {
    "max_new_tokens": 256,
    "temperature": 0.2,
    "top_p": 0.95,
    "do_sample": True,
    "repetition_penalty": 1.1
}
For Code Explanation
generation_config = {
    "max_new_tokens": 512,
    "temperature": 0.5,
    "top_p": 0.9,
    "do_sample": True
}
For Documentation Generation
generation_config = {
    "max_new_tokens": 384,
    "temperature": 0.4,
    "top_p": 0.9,
    "do_sample": True
}
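One possible way to organize the three recommended settings is a lookup keyed by task, with optional per-call overrides. This is only a sketch: the names `GENERATION_PRESETS` and `generation_settings` and the task keys are hypothetical and not part of the adapter or of any library.

```python
from typing import Any

# The three recommended settings from this card, keyed by a hypothetical task name.
GENERATION_PRESETS: dict[str, dict[str, Any]] = {
    "code": {
        "max_new_tokens": 256,
        "temperature": 0.2,
        "top_p": 0.95,
        "do_sample": True,
        "repetition_penalty": 1.1,
    },
    "explanation": {
        "max_new_tokens": 512,
        "temperature": 0.5,
        "top_p": 0.9,
        "do_sample": True,
    },
    "documentation": {
        "max_new_tokens": 384,
        "temperature": 0.4,
        "top_p": 0.9,
        "do_sample": True,
    },
}


def generation_settings(task: str, **overrides: Any) -> dict[str, Any]:
    """Return a copy of the preset for `task`, with any overrides applied."""
    if task not in GENERATION_PRESETS:
        raise KeyError(f"unknown task {task!r}; expected one of {sorted(GENERATION_PRESETS)}")
    return {**GENERATION_PRESETS[task], **overrides}


settings = generation_settings("explanation", max_new_tokens=128)
print(settings)
```

With the model and inputs from Quick Start, the resulting dictionary can be unpacked into the call, for example `model.generate(**inputs, **settings)`.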
Technical Specifications
- Context Window: 8,192 tokens
- Precision: bfloat16
- Memory Usage:
  - Inference: ~65 GB VRAM (base model + adapter)
- Inference Speed: Optimized with Flash Attention 2
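As a rough sanity check on the memory figure above, the weights alone can be estimated from the parameter count and the precision. The sketch below uses the nominal 32B count from this card as an approximation and ignores adapter weights, the KV cache, and activations, so it should be read as a lower bound.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2}


def weight_memory_gb(num_params: float, dtype: str = "bfloat16") -> float:
    """Approximate memory for model weights, in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# Nominal parameter count; the exact figure for the base model may differ.
NOMINAL_PARAMS = 32e9

base_weights_gb = weight_memory_gb(NOMINAL_PARAMS, "bfloat16")
print(f"base model weights in bfloat16: ~{base_weights_gb:.0f} GB")
```

Under these assumptions the base weights account for most of the stated ~65 GB, with the remainder going to the adapter and runtime overhead.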
Acknowledgments
- Kwaipilot Team for the excellent KAT-Dev base model
- Juspay/Hyperswitch Team for the open-source payment processing platform
- Hugging Face for the transformers, PEFT, and TRL libraries
Citation
@misc{hyperswitch-kat-dev-ffn-lora-2025,
  title={Kwaipilot-KAT-Dev-CPT-LoRA-FFN-Block-Adapter-HyperSwitch},
  author={Aditya Narayan},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/AdityaNarayan/Kwaipilot-KAT-Dev-CPT-LoRA-FFN-Block-Adapter-HyperSwitch},
}
Note: This is an FFN-block-only LoRA adapter.