Upload folder using huggingface_hub

- .gitattributes +0 -34
- README.md +148 -0
- best_model_run_eif1jakb.pth +3 -0
.gitattributes
CHANGED

```diff
@@ -1,35 +1 @@
-*.7z filter=lfs diff=lfs merge=lfs -text
-*.arrow filter=lfs diff=lfs merge=lfs -text
-*.bin filter=lfs diff=lfs merge=lfs -text
-*.bz2 filter=lfs diff=lfs merge=lfs -text
-*.ckpt filter=lfs diff=lfs merge=lfs -text
-*.ftz filter=lfs diff=lfs merge=lfs -text
-*.gz filter=lfs diff=lfs merge=lfs -text
-*.h5 filter=lfs diff=lfs merge=lfs -text
-*.joblib filter=lfs diff=lfs merge=lfs -text
-*.lfs.* filter=lfs diff=lfs merge=lfs -text
-*.mlmodel filter=lfs diff=lfs merge=lfs -text
-*.model filter=lfs diff=lfs merge=lfs -text
-*.msgpack filter=lfs diff=lfs merge=lfs -text
-*.npy filter=lfs diff=lfs merge=lfs -text
-*.npz filter=lfs diff=lfs merge=lfs -text
-*.onnx filter=lfs diff=lfs merge=lfs -text
-*.ot filter=lfs diff=lfs merge=lfs -text
-*.parquet filter=lfs diff=lfs merge=lfs -text
-*.pb filter=lfs diff=lfs merge=lfs -text
-*.pickle filter=lfs diff=lfs merge=lfs -text
-*.pkl filter=lfs diff=lfs merge=lfs -text
-*.pt filter=lfs diff=lfs merge=lfs -text
 *.pth filter=lfs diff=lfs merge=lfs -text
-*.rar filter=lfs diff=lfs merge=lfs -text
-*.safetensors filter=lfs diff=lfs merge=lfs -text
-saved_model/**/* filter=lfs diff=lfs merge=lfs -text
-*.tar.* filter=lfs diff=lfs merge=lfs -text
-*.tar filter=lfs diff=lfs merge=lfs -text
-*.tflite filter=lfs diff=lfs merge=lfs -text
-*.tgz filter=lfs diff=lfs merge=lfs -text
-*.wasm filter=lfs diff=lfs merge=lfs -text
-*.xz filter=lfs diff=lfs merge=lfs -text
-*.zip filter=lfs diff=lfs merge=lfs -text
-*.zst filter=lfs diff=lfs merge=lfs -text
-*tfevents* filter=lfs diff=lfs merge=lfs -text
```
README.md
ADDED
# Vision Transformer for Face Anti-Spoofing (CelebA Spoof PDA)

This repository contains a fine-tuned **Vision Transformer (ViT-Base-Patch16-224)** model for **face anti-spoofing** on the **CelebA Spoof (PDA)** dataset.
The model was trained on the first 18 splits of the dataset and evaluated on splits **19–21**, following the standard CelebA Spoof partitioning strategy.

---
## Overview

The objective of this project is to develop a robust deep learning–based system capable of distinguishing **live** from **spoofed** faces in real-world conditions.
The model leverages the **ViT architecture**, fine-tuned on GPU-augmented CelebA Spoof data with advanced training techniques, including:

- Focal Loss for class imbalance
- Threshold optimization
- Weighted regularization
- Early stopping
- Hyperparameter tuning (via W&B sweeps)

---
## Dataset

**Dataset:** [CelebA Spoof (PDA)](https://github.com/Davidzhangyuanhan/CelebA-Spoof)

- **Training splits:** 1–18
- **Testing splits:** 19–21
- **Classes:** Binary classification (Live vs Spoof)
- **Total test samples:** 1,747
  - Live: 1,076
  - Spoof: 671

---
## Data Augmentation Pipeline

The augmentation process was GPU-accelerated using **Kornia** and executed on an **NVIDIA RTX A5000** (32 vCPU).
Augmentation was designed to improve model generalization across lighting, pose, and spoof media.

**Augmentation strategy:**

| Class | Augmentations per image | Techniques |
|-------|-------------------------|------------|
| Live  | 8× | Random flip, rotation, color jitter, Gaussian blur/noise, perspective, elastic transform, sharpness adjustment |
| Spoof | 2× | Same set, applied with lower probability |

**Core augmentation methods:**

- Heavy, medium, and light pipelines (with variable transform intensity)
- GPU-based batch processing with Kornia
- Normalization aligned with ViT preprocessing (`mean=[0.485, 0.456, 0.406]`, `std=[0.229, 0.224, 0.225]`)

The complete augmentation logic is implemented in [`augument_data.py`](./augument_data.py).

---
## Model Architecture

The base model is a **ViT-Base-Patch16-224**, initialized with pretrained ImageNet weights and fine-tuned for binary classification.
A custom classification head was added:

```
LayerNorm(embed_dim) → Dropout(0.1) → Linear(512) → GELU → Dropout(0.1) → Linear(2)
```

**Model configuration:**

* Patch size: 16
* Dropout: 0.1
* Optimizer: `AdamW`
* Scheduler: Cosine Annealing with warm-up
* Batch size: 128
* Mixed precision: Enabled (AMP)
* Early stopping and F1-based checkpointing

The full training procedure is implemented in [`train_advanced.py`](./train_advanced.py).

---
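The head schematic above translates directly to PyTorch. This is an illustrative sketch (variable names are ours, not the repository's); `embed_dim` is 768 for ViT-Base:

```python
import torch
import torch.nn as nn

embed_dim = 768  # hidden size of ViT-Base-Patch16-224

# Classification head matching the schematic:
# LayerNorm -> Dropout -> Linear(512) -> GELU -> Dropout -> Linear(2)
head = nn.Sequential(
    nn.LayerNorm(embed_dim),
    nn.Dropout(0.1),
    nn.Linear(embed_dim, 512),
    nn.GELU(),
    nn.Dropout(0.1),
    nn.Linear(512, 2),  # binary live/spoof logits
)

cls_embeddings = torch.randn(4, embed_dim)  # e.g. [CLS] tokens for a batch of 4
logits = head(cls_embeddings)               # shape (4, 2)
```

In practice this module replaces the backbone's original ImageNet classifier, so only the final projection changes while the pretrained transformer blocks are fine-tuned end to end.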
| 77 |
+
|
| 78 |
+
## Training Details
|
| 79 |
+
|
| 80 |
+
| Parameter | Value |
|
| 81 |
+
| ---------------------- | ------------------------------------ |
|
| 82 |
+
| Dataset | Augmented CelebA Spoof (Splits 1–18) |
|
| 83 |
+
| Optimizer | AdamW |
|
| 84 |
+
| Learning Rate | 3e-4 (swept) |
|
| 85 |
+
| Weight Decay | 0.05 |
|
| 86 |
+
| Batch Size | 128 |
|
| 87 |
+
| Epochs | 50 |
|
| 88 |
+
| Loss | Focal Loss (α=0.25, γ=2.0) |
|
| 89 |
+
| Early Stopping | Patience = 10, Δ = 0.001 |
|
| 90 |
+
| Threshold Optimization | Enabled |
|
| 91 |
+
| Scheduler | CosineAnnealingLR |
|
| 92 |
+
| Mixed Precision | True |
|
| 93 |
+
| Device | NVIDIA RTX A5000 |
|
| 94 |
+
|
| 95 |
+
Training and validation metrics were tracked using **Weights & Biases** for all runs.
|
| 96 |
+
|
| 97 |
+
---
|
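For reference, a minimal sketch of a focal loss with the α=0.25, γ=2.0 settings listed above. This is our own illustration; the repository's implementation in `train_advanced.py` may weight α differently:

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    """Focal loss for binary logits of shape (N, 2); targets in {0, 1}.

    gamma down-weights easy examples; alpha weights the positive class.
    """
    ce = F.cross_entropy(logits, targets, reduction="none")
    pt = torch.exp(-ce)  # probability assigned to the true class
    alpha_t = torch.full_like(ce, 1.0 - alpha)
    alpha_t[targets == 1] = alpha
    return (alpha_t * (1.0 - pt) ** gamma * ce).mean()

logits = torch.randn(16, 2)
targets = torch.randint(0, 2, (16,))
loss = focal_loss(logits, targets)  # scalar tensor
```

With γ=0 and α=0.5 this reduces to (half of) plain cross-entropy; increasing γ suppresses the contribution of well-classified samples, which is what helps with the live/spoof imbalance.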
## Testing Procedure

Testing was conducted on **splits 19–21**, following the CelebA Spoof PDA protocol.
The testing pipeline (`test.py`) evaluates the model at the per-image and per-subject levels, generating:

* Accuracy, F1, AUC
* Precision, Recall, Specificity, NPV
* FAR, FRR, and EER
* Confusion Matrix
* ROC Curve

Results and plots are automatically exported to disk during testing.

---
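FAR, FRR, and EER are all derived from per-image liveness scores by sweeping the decision threshold. A sketch of that computation (not the repository's `test.py`, which may use a different convention for score direction):

```python
import numpy as np

def far_frr_eer(scores, labels, n_thresholds=1001):
    """Sweep thresholds over liveness scores (higher = more likely live).

    labels: 1 = live, 0 = spoof.
    FAR: fraction of spoof samples accepted as live.
    FRR: fraction of live samples rejected as spoof.
    EER: operating point where FAR and FRR are (approximately) equal.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    thresholds = np.linspace(scores.min(), scores.max(), n_thresholds)
    live, spoof = scores[labels == 1], scores[labels == 0]
    far = np.array([(spoof >= t).mean() for t in thresholds])
    frr = np.array([(live < t).mean() for t in thresholds])
    i = int(np.argmin(np.abs(far - frr)))  # closest crossing point
    return far[i], frr[i], (far[i] + frr[i]) / 2.0

# Toy example with well-separated classes
far, frr, eer = far_frr_eer([0.9, 0.8, 0.95, 0.1, 0.2], [1, 1, 1, 0, 0])
```

The same threshold sweep underlies the "threshold optimization" step in training: instead of a fixed 0.5 cutoff, the operating threshold is chosen on validation data.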
## Results

### Overall Performance

| Metric | Score |
| ------------ | ---------- |
| **Accuracy** | **83.29%** |
| **AUC-ROC**  | **0.9561** |
| **F1-Score** | **0.8780** |

### Detection Metrics

| Metric | Value |
| --------------- | ------ |
| Precision (PPV) | 0.7974 |
| Recall (TPR)    | 0.9768 |
| Specificity     | 0.6021 |
| NPV             | 0.9417 |

### Error Rates

| Metric | Value |
| --------------------------- | ------ |
| False Acceptance Rate (FAR) | 0.3979 |
| False Rejection Rate (FRR)  | 0.0232 |
| Equal Error Rate (EER)      | 0.1083 |

---
## Confusion Matrix

|                  | Predicted Spoof | Predicted Live |
| ---------------- | --------------- | -------------- |
| **Actual Spoof** | 404             | 267            |
| **Actual Live**  | 25              | 1051           |
best_model_run_eif1jakb.pth
ADDED

```diff
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:af65762843da1a6781b495814ca0784e7368dd3b127b3393830be93b0d9c0c08
+size 1034544191
```