gloriforge committed on
Commit bea6690 · verified · 1 Parent(s): 83aafe5

Upload folder using huggingface_hub

.gitattributes CHANGED
@@ -33,3 +33,6 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ SV_kp.engine filter=lfs diff=lfs merge=lfs -text
+ keypoint filter=lfs diff=lfs merge=lfs -text
+ osnet_model.pth.tar-100 filter=lfs diff=lfs merge=lfs -text
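For reference, these are the patterns that a local `git lfs track` call registers before the large files below are committed (illustrative; the commit itself only ships the resulting `.gitattributes` lines):

```bash
git lfs track "SV_kp.engine" "keypoint" "osnet_model.pth.tar-100"
git add .gitattributes
```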
README.md ADDED
@@ -0,0 +1,92 @@
+ # 🚀 Example Chute for Turbovision 🪂
+
+ This repository demonstrates how to deploy a **Chute** via the **Turbovision CLI**, hosted on the **Hugging Face Hub**.
+ It serves as a minimal example showcasing the required structure and workflow for integrating machine learning models, preprocessing, and orchestration into a reproducible Chute environment.
+
+ ## Repository Structure
+ The following two files **must be present** (in their current locations) for a successful deployment — their content can be modified as needed:
+
+ | File | Purpose |
+ |------|----------|
+ | `miner.py` | Defines the ML model type(s), orchestration, and all pre/postprocessing logic. |
+ | `config.yml` | Specifies machine configuration (e.g., GPU type, memory, environment variables). |
+
+ Other files — e.g., model weights, utility scripts, or dependencies — are **optional** and can be included as needed for your model. Note: any required assets must be defined or contained **within this repo**, which is fully open-source, since all network-related operations (downloading challenge data, weights, etc.) are disabled **inside the Chute**.
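For orientation, this is the shape of the interface that the `miner.py` added in this commit exposes (the class and method signatures below mirror the actual file further down; the bodies are placeholders, not the implementation):

```python
from pathlib import Path
from typing import List, Tuple

from numpy import ndarray
from pydantic import BaseModel


class BoundingBox(BaseModel):
    x1: int
    y1: int
    x2: int
    y2: int
    cls_id: int
    conf: float


class TVFrameResult(BaseModel):
    frame_id: int
    boxes: List[BoundingBox]
    keypoints: List[Tuple[int, int]]


class Miner:
    def __init__(self, path_hf_repo: Path) -> None:
        # Load all weights from files shipped inside this repo (no network access in the Chute).
        ...

    def predict_batch(
        self, batch_images: List[ndarray], offset: int, n_keypoints: int
    ) -> List[TVFrameResult]:
        # One TVFrameResult per input frame, numbered offset..offset+len(batch_images)-1.
        ...
```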
+
+ ## Overview
+
+ Below is a high-level diagram showing the interaction between Hugging Face, Chutes, and Turbovision:
+
+ ![](../images/miner.png)
+
+ ## Local Testing
+ After editing `config.yml` and `miner.py` and saving them into your Hugging Face repo, you will want to test that everything works locally.
+
+ 1. Copy the file `scorevision/chute_tmeplate/turbovision_chute.py.j2` as a Python file called `my_chute.py` and fill in the missing variables:
+ ```python
+ HF_REPO_NAME = "{{ huggingface_repository_name }}"
+ HF_REPO_REVISION = "{{ huggingface_repository_revision }}"
+ CHUTES_USERNAME = "{{ chute_username }}"
+ CHUTE_NAME = "{{ chute_name }}"
+ ```
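Filled in, those variables might look like this (all four values below are hypothetical placeholders):

```python
HF_REPO_NAME = "your-hf-username/example-chute"   # hypothetical Hugging Face repo
HF_REPO_REVISION = "0123456789abcdef"             # hypothetical commit revision
CHUTES_USERNAME = "your-chutes-username"          # hypothetical Chutes account
CHUTE_NAME = "example-chute"                      # hypothetical chute name
```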
+
+ 2. Run the following command to build the chute locally (caution: there are known issues with the Docker location when running this on a Mac):
+ ```bash
+ chutes build my_chute:chute --local --public
+ ```
+
+ 3. Run the Docker image that was just built (its name is the `CHUTE_NAME` you set above) and enter the container:
+ ```bash
+ docker run -p 8000:8000 -e CHUTES_EXECUTION_CONTEXT=REMOTE -it <image-name> /bin/bash
+ ```
+
+ 4. Run the file from within the container:
+ ```bash
+ chutes run my_chute:chute --dev --debug
+ ```
+
+ 5. In another terminal, test the local endpoints to ensure there are no bugs:
+ ```bash
+ curl -X POST http://localhost:8000/health -d '{}'
+ curl -X POST http://localhost:8000/predict -d '{"url": "https://scoredata.me/2025_03_14/35ae7a/h1_0f2ca0.mp4","meta": {}}'
+ ```
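If you prefer scripting the check, the same two endpoints can be exercised from Python (a sketch using the `requests` library; URL and payload match the curl commands above):

```python
import requests

BASE_URL = "http://localhost:8000"  # the chute started in the previous step

# Health check (same empty JSON body as the curl example).
health = requests.post(f"{BASE_URL}/health", json={})
print(health.status_code, health.text)

# Prediction (same payload as the curl example).
payload = {"url": "https://scoredata.me/2025_03_14/35ae7a/h1_0f2ca0.mp4", "meta": {}}
prediction = requests.post(f"{BASE_URL}/predict", json=payload)
print(prediction.status_code, prediction.text)
```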
+
+ ## Live Testing
+ 1. If you have any chute with the same name (i.e. from a previous deployment), ensure you delete it first (or you will get an error when trying to build).
+ ```bash
+ chutes chutes list
+ ```
+ Take note of the chute id that you wish to delete (if any):
+ ```bash
+ chutes chutes delete <chute-id>
+ ```
+
+ You should also delete its associated image:
+ ```bash
+ chutes images list
+ ```
+ Take note of the chute image id:
+ ```bash
+ chutes images delete <chute-image-id>
+ ```
+
+ 2. Use Turbovision's CLI to build, deploy, and commit on-chain. (Note: you can skip the on-chain commit using `--no-commit`. You can also specify a past Hugging Face revision to point to using `--revision` and/or the local files you want to upload to your Hugging Face repo using `--model-path`.)
+ ```bash
+ sv -vv push
+ ```
+
+ 3. When completed, warm up the chute (if it's cold 🧊). (You can confirm its status using `chutes chutes list`, or `chutes chutes get <chute-id>` if you already know its id.) Note: warming up can sometimes take a while, but if the chute runs without errors (it should, if you've tested locally first) and there are sufficient nodes (i.e. machines) available matching the `config.yml` you specified, the chute should become hot 🔥!
+ ```bash
+ chutes warmup <chute-id>
+ ```
+
+ 4. Test the chute's endpoints:
+ ```bash
+ curl -X POST https://<YOUR-CHUTE-SLUG>.chutes.ai/health -d '{}' -H "Authorization: Bearer $CHUTES_API_KEY"
+ curl -X POST https://<YOUR-CHUTE-SLUG>.chutes.ai/predict -d '{"url": "https://scoredata.me/2025_03_14/35ae7a/h1_0f2ca0.mp4","meta": {}}' -H "Authorization: Bearer $CHUTES_API_KEY"
+ ```
+
+ 5. Test what your chute would score on a validator (this also applies any validation/integrity checks, which may fail if you did not use the Turbovision CLI above to deploy the chute):
+ ```bash
+ sv -vv run-once
+ ```
__pycache__/miner.cpython-312.pyc ADDED
Binary file (24.9 kB).
 
__pycache__/pitch.cpython-312.pyc ADDED
Binary file (31.3 kB).
 
config.yml ADDED
@@ -0,0 +1,22 @@
+ Image:
+   from_base: parachutes/python:3.12
+   run_command:
+     - pip install --upgrade setuptools wheel
+     - pip install "ultralytics==8.3.222" "opencv-python-headless" "numpy" "pydantic"
+     - pip install "tensorflow" "torch==2.7.1" "torchvision==0.22.1" "torch-tensorrt==2.7"
+   set_workdir: /app
+
+ NodeSelector:
+   gpu_count: 1
+   min_vram_gb_per_gpu: 16
+   exclude:
+     - "5090"
+     - b200
+     - h200
+     - mi300x
+
+ Chute:
+   timeout_seconds: 900
+   concurrency: 4
+   max_instances: 5
+   scaling_threshold: 0.5
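As a quick sanity check before pushing, the file can be parsed locally with PyYAML (a sketch; the section and key names are exactly those shown above):

```python
import yaml

with open("config.yml") as f:
    cfg = yaml.safe_load(f)

# Inspect the build and scheduling requirements the chute will use.
print(cfg["Image"]["from_base"])         # parachutes/python:3.12
print(cfg["NodeSelector"]["gpu_count"])  # 1
print(cfg["Chute"]["timeout_seconds"])   # 900
```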
hrnetv2_w48.yaml ADDED
@@ -0,0 +1,35 @@
+ MODEL:
+   IMAGE_SIZE: [960, 540]
+   NUM_JOINTS: 58
+   PRETRAIN: ''
+   EXTRA:
+     FINAL_CONV_KERNEL: 1
+     STAGE1:
+       NUM_MODULES: 1
+       NUM_BRANCHES: 1
+       BLOCK: BOTTLENECK
+       NUM_BLOCKS: [4]
+       NUM_CHANNELS: [64]
+       FUSE_METHOD: SUM
+     STAGE2:
+       NUM_MODULES: 1
+       NUM_BRANCHES: 2
+       BLOCK: BASIC
+       NUM_BLOCKS: [4, 4]
+       NUM_CHANNELS: [48, 96]
+       FUSE_METHOD: SUM
+     STAGE3:
+       NUM_MODULES: 4
+       NUM_BRANCHES: 3
+       BLOCK: BASIC
+       NUM_BLOCKS: [4, 4, 4]
+       NUM_CHANNELS: [48, 96, 192]
+       FUSE_METHOD: SUM
+     STAGE4:
+       NUM_MODULES: 3
+       NUM_BRANCHES: 4
+       BLOCK: BASIC
+       NUM_BLOCKS: [4, 4, 4, 4]
+       NUM_CHANNELS: [48, 96, 192, 384]
+       FUSE_METHOD: SUM
+
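This config is consumed by the HRNet builder in `pitch.py`; `miner.py` below loads it the same way when constructing the keypoint model (a condensed sketch of that load path, where the `keypoint` checkpoint is the LFS file added further down):

```python
import yaml
import torch
from pitch import get_cls_net

device = "cuda" if torch.cuda.is_available() else "cpu"

with open("hrnetv2_w48.yaml") as f:
    cfg_kp = yaml.safe_load(f)

model = get_cls_net(cfg_kp)  # HRNetV2-W48 head with NUM_JOINTS=58 output channels
model.load_state_dict(torch.load("keypoint", map_location=device))
model.to(device)
model.eval()
```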
keypoint ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7ea78fa76aaf94976a8eca428d6e3c59697a93430cba1a4603e20284b61f5113
+ size 264964645
miner.py ADDED
@@ -0,0 +1,561 @@
1
+ from pathlib import Path
2
+ from typing import List, Tuple, Dict
3
+ import sys
4
+ import os
5
+
6
+ from numpy import ndarray
7
+ import numpy as np
8
+ from pydantic import BaseModel
9
+ import cv2
10
+
11
+ import onnxruntime as ort
12
+ from torchvision.ops import batched_nms
13
+ from ultralytics import YOLO
14
+ from team_cluster import TeamClassifier
15
+ from utils import (
16
+ BoundingBox,
17
+ Constants,
18
+ suppress_small_contained_boxes,
19
+ classify_teams_batch,
20
+ )
21
+
22
+ sys.path.append(os.path.dirname(os.path.abspath(__file__)))
23
+
24
+ os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"
25
+ os.environ["OMP_NUM_THREADS"] = "16"
26
+ os.environ["TF_NUM_INTRAOP_THREADS"] = "16"
27
+ os.environ["TF_NUM_INTEROP_THREADS"] = "2"
28
+ os.environ["CUDA_LAUNCH_BLOCKING"] = "0"
29
+ os.environ["ORT_LOGGING_LEVEL"] = "3"
30
+ os.environ["TF_ENABLE_ONEDNN_OPTS"] = "0"
31
+
32
+ import logging
33
+ import tensorflow as tf
34
+ from tensorflow.keras import mixed_precision
35
+ import torch._dynamo
36
+ import torch
37
+ # import torch_tensorrt
38
+ import gc
39
+ from ultralytics import YOLO
40
+ from pitch import process_batch_input, get_cls_net
41
+ import yaml
42
+
43
+ logging.getLogger("tensorflow").setLevel(logging.ERROR)
44
+ tf.config.threading.set_intra_op_parallelism_threads(16)
45
+ tf.config.threading.set_inter_op_parallelism_threads(2)
46
+ tf.compat.v1.logging.set_verbosity(tf.compat.v1.logging.ERROR)
47
+ tf.get_logger().setLevel("ERROR")
48
+ tf.autograph.set_verbosity(0)
49
+ mixed_precision.set_global_policy("mixed_float16")
50
+ tf.config.optimizer.set_jit(True)
51
+ torch._dynamo.config.suppress_errors = True
52
+
53
+
54
+ class BoundingBox(BaseModel):
55
+ x1: int
56
+ y1: int
57
+ x2: int
58
+ y2: int
59
+ cls_id: int
60
+ conf: float
61
+
62
+
63
+ class TVFrameResult(BaseModel):
64
+ frame_id: int
65
+ boxes: List[BoundingBox]
66
+ keypoints: List[Tuple[int, int]]
67
+
68
+
69
+ class Miner:
70
+ QUASI_TOTAL_IOA: float = 0.90
71
+ SMALL_CONTAINED_IOA: float = 0.85
72
+ SMALL_RATIO_MAX: float = 0.50
73
+ SINGLE_PLAYER_HUE_PIVOT: float = 90.0
74
+
75
+ # Use constants from utils
76
+ SMALL_CONTAINED_IOA = Constants.SMALL_CONTAINED_IOA
77
+ SMALL_RATIO_MAX = Constants.SMALL_RATIO_MAX
78
+ SINGLE_PLAYER_HUE_PIVOT = Constants.SINGLE_PLAYER_HUE_PIVOT
79
+ CORNER_INDICES = Constants.CORNER_INDICES
80
+ KEYPOINTS_CONFIDENCE = Constants.KEYPOINTS_CONFIDENCE
81
+ CORNER_CONFIDENCE = Constants.CORNER_CONFIDENCE
82
+ GOALKEEPER_POSITION_MARGIN = Constants.GOALKEEPER_POSITION_MARGIN
83
+ MIN_SAMPLES_FOR_FIT = 16 # Minimum player crops needed before fitting TeamClassifier
84
+ MAX_SAMPLES_FOR_FIT = 500 # Maximum samples to avoid overfitting
85
+
86
+ def __init__(self, path_hf_repo: Path) -> None:
87
+ try:
88
+ device = "cuda" if torch.cuda.is_available() else "cpu"
89
+ print(device)
90
+
91
+ providers = [
92
+ 'CUDAExecutionProvider',
93
+ 'CPUExecutionProvider'
94
+ ]
95
+ model_path = path_hf_repo / "objdetect.onnx"
96
+ session = ort.InferenceSession(model_path, providers=providers)
97
+
98
+ input_name = session.get_inputs()[0].name
99
+ height = width = 640
100
+ dummy = np.zeros((1, 3, height, width), dtype=np.float32)
101
+ session.run(None, {input_name: dummy})
102
+ model = session
103
+ self.bbox_model = model
104
+
105
+ print("BBox Model Loaded")
106
+
107
+ team_model_path = path_hf_repo / "osnet_model.pth.tar-100"
108
+ self.team_classifier = TeamClassifier(
109
+ device=device,
110
+ batch_size=32,
111
+ model_name=str(team_model_path)
112
+ )
113
+ print("Team Classifier Loaded")
114
+
115
+ # Team classification state
116
+ self.team_classifier_fitted = False
117
+ self.player_crops_for_fit = []
118
+
119
+ model_kp_path = path_hf_repo / 'keypoint'
120
+ config_kp_path = path_hf_repo / 'hrnetv2_w48.yaml'
121
+ cfg_kp = yaml.safe_load(open(config_kp_path, 'r'))
122
+
123
+ loaded_state_kp = torch.load(model_kp_path, map_location=device)
124
+ model = get_cls_net(cfg_kp)
125
+ model.load_state_dict(loaded_state_kp)
126
+ model.to(device)
127
+ model.eval()
128
+
129
+ # @torch.inference_mode()
130
+ # def run_inference(model, input_tensor: torch.Tensor):
131
+ # input_tensor = input_tensor.to(device).to(memory_format=torch.channels_last)
132
+ # output = model.module().forward(input_tensor)
133
+ # return output
134
+
135
+ # run_inference(model_kp, torch.randn(8, 3, 540, 960, device=device, dtype=torch.float32))
136
+ self.keypoints_model = model
137
+ self.kp_threshold = 0.1
138
+ self.pitch_batch_size = 8
139
+ self.health = "✅ Miner initialized successfully"
140
+ print("✅ Keypoints Model Loaded")
141
+ except Exception as e:
142
+ self.health = "❌ Miner initialization failed: " + str(e)
143
+ print(self.health)
144
+
145
+ def __repr__(self) -> str:
146
+ return (
147
+ f"BBox Model: {type(self.bbox_model).__name__}\n"
148
+ f"Keypoints Model: {type(self.keypoints_model).__name__}"
149
+ )
150
+
151
+ def _handle_multiple_goalkeepers(self, boxes: List[BoundingBox]) -> List[BoundingBox]:
152
+ """
153
+ Handle goalkeeper detection issues:
154
+ 1. Fix misplaced goalkeepers (standing in middle of field)
155
+ 2. Limit to maximum 2 goalkeepers (one from each team)
156
+
157
+ Returns:
158
+ Filtered list of boxes with corrected goalkeepers
159
+ """
160
+ # Step 1: Fix misplaced goalkeepers first
161
+ # Convert goalkeepers in middle of field to regular players
162
+ boxes = self._fix_misplaced_goalkeepers(boxes)
163
+
164
+ # Step 2: Handle multiple goalkeepers (after fixing misplaced ones)
165
+ gk_idxs = [i for i, bb in enumerate(boxes) if int(bb.cls_id) == 1]
166
+ if len(gk_idxs) <= 2:
167
+ return boxes
168
+
169
+ # Sort goalkeepers by confidence (highest first)
170
+ gk_idxs_sorted = sorted(gk_idxs, key=lambda i: boxes[i].conf, reverse=True)
171
+ keep_gk_idxs = set(gk_idxs_sorted[:2]) # Keep top 2 goalkeepers
172
+
173
+ # Create new list keeping only top 2 goalkeepers
174
+ filtered_boxes = []
175
+ for i, box in enumerate(boxes):
176
+ if int(box.cls_id) == 1:
177
+ # Only keep the top 2 goalkeepers by confidence
178
+ if i in keep_gk_idxs:
179
+ filtered_boxes.append(box)
180
+ # Skip extra goalkeepers
181
+ else:
182
+ # Keep all non-goalkeeper boxes
183
+ filtered_boxes.append(box)
184
+
185
+ return filtered_boxes
186
+
187
+ def _fix_misplaced_goalkeepers(self, boxes: List[BoundingBox]) -> List[BoundingBox]:
188
+ """
189
+ """
190
+ gk_idxs = [i for i, bb in enumerate(boxes) if int(bb.cls_id) == 1]
191
+ player_idxs = [i for i, bb in enumerate(boxes) if int(bb.cls_id) == 2]
192
+
193
+ if len(gk_idxs) == 0 or len(player_idxs) < 2:
194
+ return boxes
195
+
196
+ updated_boxes = boxes.copy()
197
+
198
+ for gk_idx in gk_idxs:
199
+ if boxes[gk_idx].conf < 0.3:
200
+ updated_boxes[gk_idx].cls_id = 2
201
+
202
+ return updated_boxes
203
+
204
+
205
+ def _pre_process_img(self, frames: List[np.ndarray], scale: float = 640.0) -> np.ndarray:
206
+ """
207
+ Preprocess images for ONNX inference.
208
+
209
+ Args:
210
+ frames: List of BGR frames
211
+ scale: Target scale for resizing
212
+
213
+ Returns:
214
+ Preprocessed numpy array ready for ONNX inference
215
+ """
216
+ imgs = np.stack([cv2.resize(frame, (int(scale), int(scale))) for frame in frames])
217
+ imgs = imgs.transpose(0, 3, 1, 2) # BHWC to BCHW
218
+ imgs = imgs.astype(np.float32) / 255.0 # Normalize to [0, 1]
219
+ return imgs
220
+
221
+ def _post_process_output(self, outputs: np.ndarray, x_scale: float, y_scale: float,
222
+ conf_thresh: float = 0.6, nms_thresh: float = 0.55) -> List[List[Tuple]]:
223
+ """
224
+ Post-process ONNX model outputs to get detections.
225
+
226
+ Args:
227
+ outputs: Raw ONNX model outputs
228
+ x_scale: X-axis scaling factor
229
+ y_scale: Y-axis scaling factor
230
+ conf_thresh: Confidence threshold
231
+ nms_thresh: NMS threshold
232
+
233
+ Returns:
234
+ List of detections for each frame: [(box, conf, class_id), ...]
235
+ """
236
+ B, C, N = outputs.shape
237
+ outputs = torch.from_numpy(outputs)
238
+ outputs = outputs.permute(0, 2, 1) # B,C,N -> B,N,C
239
+
240
+ boxes = outputs[..., :4]
241
+ class_scores = 1 / (1 + torch.exp(-outputs[..., 4:])) # Sigmoid activation
242
+ conf, class_id = class_scores.max(dim=2)
243
+
244
+ mask = conf > conf_thresh
245
+
246
+ # Special handling for balls - keep best one even with lower confidence
247
+ for i in range(class_id.shape[0]): # loop over batch
248
+ # Find detections that are balls
249
+ ball_mask = class_id[i] == 0
250
+ ball_idx = ball_mask.nonzero(as_tuple=True)[0]
251
+ if ball_idx.numel() > 0:
252
+ # Pick the one with the highest confidence
253
+ best_ball_idx = ball_idx[conf[i, ball_idx].argmax()]
254
+ if conf[i, best_ball_idx] >= 0.55: # apply confidence threshold
255
+ mask[i, best_ball_idx] = True
256
+
257
+ batch_idx, pred_idx = mask.nonzero(as_tuple=True)
258
+
259
+ if len(batch_idx) == 0:
260
+ return [[] for _ in range(B)]
261
+
262
+ boxes = boxes[batch_idx, pred_idx]
263
+ conf = conf[batch_idx, pred_idx]
264
+ class_id = class_id[batch_idx, pred_idx]
265
+
266
+ # Convert from center format to xyxy format
267
+ x, y, w, h = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
268
+ x1 = (x - w / 2) * x_scale
269
+ y1 = (y - h / 2) * y_scale
270
+ x2 = (x + w / 2) * x_scale
271
+ y2 = (y + h / 2) * y_scale
272
+ boxes_xyxy = torch.stack([x1, y1, x2, y2], dim=1)
273
+
274
+ # Apply batched NMS
275
+ max_coord = 1e4
276
+ offset = batch_idx.to(boxes_xyxy) * max_coord
277
+ boxes_for_nms = boxes_xyxy + offset[:, None]
278
+
279
+ keep = batched_nms(boxes_for_nms, conf, batch_idx, nms_thresh)
280
+
281
+ boxes_final = boxes_xyxy[keep]
282
+ conf_final = conf[keep]
283
+ class_final = class_id[keep]
284
+ batch_final = batch_idx[keep]
285
+
286
+ # Group results by batch
287
+ results = [[] for _ in range(B)]
288
+ for b in range(B):
289
+ mask_b = batch_final == b
290
+ if mask_b.sum() == 0:
291
+ continue
292
+ results[b] = list(zip(boxes_final[mask_b].numpy(),
293
+ conf_final[mask_b].numpy(),
294
+ class_final[mask_b].numpy()))
295
+ return results
296
+
297
+ def _ioa(self, a: BoundingBox, b: BoundingBox) -> float:
298
+ inter = self._intersect_area(a, b)
299
+ aa = self._area(a)
300
+ if aa <= 0:
301
+ return 0.0
302
+ return inter / aa
303
+
304
+ def suppress_small_contained(self, boxes: List[BoundingBox]) -> List[BoundingBox]:
305
+ if len(boxes) <= 1:
306
+ return boxes
307
+ keep = [True] * len(boxes)
308
+ areas = [self._area(bb) for bb in boxes]
309
+ for i in range(len(boxes)):
310
+ if not keep[i]:
311
+ continue
312
+ for j in range(len(boxes)):
313
+ if i == j or not keep[j]:
314
+ continue
315
+ ai, aj = areas[i], areas[j]
316
+ if ai == 0 or aj == 0:
317
+ continue
318
+ if ai <= aj:
319
+ ratio = ai / aj
320
+ if ratio <= self.SMALL_RATIO_MAX:
321
+ ioa_i_in_j = self._ioa(boxes[i], boxes[j])
322
+ if ioa_i_in_j >= self.SMALL_CONTAINED_IOA:
323
+ keep[i] = False
324
+ break
325
+ else:
326
+ ratio = aj / ai
327
+ if ratio <= self.SMALL_RATIO_MAX:
328
+ ioa_j_in_i = self._ioa(boxes[j], boxes[i])
329
+ if ioa_j_in_i >= self.SMALL_CONTAINED_IOA:
330
+ keep[j] = False
331
+ return [bb for bb, k in zip(boxes, keep) if k]
332
+
333
+ def _detect_objects_batch(self, batch_images: List[ndarray], offset: int) -> Dict[int, List[BoundingBox]]:
334
+ """
335
+ Phase 1: Object detection for all frames in batch.
336
+ Returns detected objects with players still having class_id=2 (before team classification).
337
+
338
+ Args:
339
+ batch_images: List of images to process
340
+ offset: Frame offset for numbering
341
+
342
+ Returns:
343
+ Dictionary mapping frame_id to list of detected boxes
344
+ """
345
+ bboxes: Dict[int, List[BoundingBox]] = {}
346
+
347
+ if len(batch_images) == 0:
348
+ return bboxes
349
+
350
+ print(f"Processing batch of {len(batch_images)} images")
351
+
352
+ # Get original image dimensions for scaling
353
+ height, width = batch_images[0].shape[:2]
354
+ scale = 640.0
355
+ x_scale = width / scale
356
+ y_scale = height / scale
357
+
358
+ # Memory optimization: Process smaller batches if needed
359
+ max_batch_size = 32 # Reduce batch size further to prevent memory issues
360
+ if len(batch_images) > max_batch_size:
361
+ print(f"Large batch detected ({len(batch_images)} images), splitting into smaller batches of {max_batch_size}")
362
+ # Process in smaller chunks
363
+ all_bboxes = {}
364
+ for chunk_start in range(0, len(batch_images), max_batch_size):
365
+ chunk_end = min(chunk_start + max_batch_size, len(batch_images))
366
+ chunk_images = batch_images[chunk_start:chunk_end]
367
+ chunk_offset = offset + chunk_start
368
+ print(f"Processing chunk {chunk_start//max_batch_size + 1}: images {chunk_start}-{chunk_end-1}")
369
+ chunk_bboxes = self._detect_objects_batch(chunk_images, chunk_offset)
370
+ all_bboxes.update(chunk_bboxes)
371
+ return all_bboxes
372
+
373
+ # Preprocess images for ONNX inference
374
+ imgs = self._pre_process_img(batch_images, scale)
375
+ actual_batch_size = len(batch_images)
376
+
377
+ # Handle batch size mismatch - pad if needed
378
+ model_batch_size = self.bbox_model.get_inputs()[0].shape[0]
379
+ print(f"Model input shape: {self.bbox_model.get_inputs()[0].shape}, batch_size: {model_batch_size}")
380
+
381
+ if model_batch_size is not None:
382
+ try:
383
+ # Handle dynamic batch size (None, -1, 'None')
384
+ if str(model_batch_size) in ['None', '-1'] or model_batch_size == -1:
385
+ model_batch_size = None
386
+ else:
387
+ model_batch_size = int(model_batch_size)
388
+ except (ValueError, TypeError):
389
+ model_batch_size = None
390
+
391
+ print(f"Processed model_batch_size: {model_batch_size}, actual_batch_size: {actual_batch_size}")
392
+
393
+ if model_batch_size and actual_batch_size < model_batch_size:
394
+ padding_size = model_batch_size - actual_batch_size
395
+ dummy_img = np.zeros((1, 3, int(scale), int(scale)), dtype=np.float32)
396
+ padding = np.repeat(dummy_img, padding_size, axis=0)
397
+ imgs = np.vstack([imgs, padding])
398
+
399
+ # ONNX inference with error handling
400
+ try:
401
+ input_name = self.bbox_model.get_inputs()[0].name
402
+ import time
403
+ start_time = time.time()
404
+ outputs = self.bbox_model.run(None, {input_name: imgs})[0]
405
+ inference_time = time.time() - start_time
406
+ print(f"Inference time: {inference_time:.3f}s for {actual_batch_size} images")
407
+
408
+ # Remove padded results if we added padding
409
+ if model_batch_size and isinstance(model_batch_size, int) and actual_batch_size < model_batch_size:
410
+ outputs = outputs[:actual_batch_size]
411
+
412
+ # Post-process outputs to get detections
413
+ raw_results = self._post_process_output(np.array(outputs), x_scale, y_scale)
414
+
415
+ except Exception as e:
416
+ print(f"Error during ONNX inference: {e}")
417
+ return bboxes
418
+
419
+ if not raw_results:
420
+ return bboxes
421
+
422
+ # Convert raw results to BoundingBox objects and apply processing
423
+ for frame_idx_in_batch, frame_detections in enumerate(raw_results):
424
+ if not frame_detections:
425
+ continue
426
+
427
+ # Convert to BoundingBox objects
428
+ boxes: List[BoundingBox] = []
429
+ for box, conf, cls_id in frame_detections:
430
+ x1, y1, x2, y2 = box
431
+ if int(cls_id) < 4:
432
+ boxes.append(
433
+ BoundingBox(
434
+ x1=int(x1),
435
+ y1=int(y1),
436
+ x2=int(x2),
437
+ y2=int(y2),
438
+ cls_id=int(cls_id),
439
+ conf=float(conf),
440
+ )
441
+ )
442
+
443
+ # Handle footballs - keep only the best one
444
+ footballs = [bb for bb in boxes if int(bb.cls_id) == 0]
445
+ if len(footballs) > 1:
446
+ best_ball = max(footballs, key=lambda b: b.conf)
447
+ boxes = [bb for bb in boxes if int(bb.cls_id) != 0]
448
+ boxes.append(best_ball)
449
+
450
+ # Remove overlapping small boxes
451
+ boxes = suppress_small_contained_boxes(boxes, self.SMALL_CONTAINED_IOA, self.SMALL_RATIO_MAX)
452
+
453
+ # Handle goalkeeper detection issues:
454
+ # 1. Fix misplaced goalkeepers (convert to players if standing in middle)
455
+ # 2. Allow up to 2 goalkeepers maximum (one from each team)
456
+ # Goalkeepers remain class_id = 1 (no team assignment)
457
+ boxes = self._handle_multiple_goalkeepers(boxes)
458
+
459
+ # Store results (players still have class_id=2, will be classified in phase 2)
460
+ frame_id = offset + frame_idx_in_batch
461
+ bboxes[frame_id] = boxes
462
+
463
+ return bboxes
464
+
465
+ def predict_batch(self, batch_images: List[ndarray], offset: int, n_keypoints: int) -> List[TVFrameResult]:
466
+ bboxes: Dict[int, List[BoundingBox]] = {}
467
+
468
+ bboxes = self._detect_objects_batch(batch_images, offset)
469
+ if bboxes:
470
+ bboxes, self.team_classifier_fitted, self.player_crops_for_fit = classify_teams_batch(
471
+ self.team_classifier,
472
+ self.team_classifier_fitted,
473
+ self.player_crops_for_fit,
474
+ batch_images,
475
+ bboxes,
476
+ offset,
477
+ self.MIN_SAMPLES_FOR_FIT,
478
+ self.MAX_SAMPLES_FOR_FIT,
479
+ self.SINGLE_PLAYER_HUE_PIVOT
480
+ )
481
+ self.team_classifier_fitted = False
482
+ self.player_crops_for_fit = []
483
+
484
+ pitch_batch_size = min(self.pitch_batch_size, len(batch_images))
485
+ keypoints: Dict[int, List[Tuple[int, int]]] = {}
486
+ while True:
487
+ # try:
488
+ gc.collect()
489
+ if torch.cuda.is_available():
490
+ tf.keras.backend.clear_session()
491
+ torch.cuda.empty_cache()
492
+ torch.cuda.synchronize()
493
+ device_str = "cuda" if torch.cuda.is_available() else "cpu"
494
+ keypoints_result = process_batch_input(
495
+ batch_images,
496
+ self.keypoints_model,
497
+ self.kp_threshold,
498
+ device_str,
499
+ batch_size=pitch_batch_size,
500
+ )
501
+ if keypoints_result is not None and len(keypoints_result) > 0:
502
+ for frame_number_in_batch, kp_dict in enumerate(keypoints_result):
503
+ if frame_number_in_batch >= len(batch_images):
504
+ break
505
+ frame_keypoints: List[Tuple[int, int]] = []
506
+ try:
507
+ height, width = batch_images[frame_number_in_batch].shape[:2]
508
+ if kp_dict is not None and isinstance(kp_dict, dict):
509
+ for idx in range(32):
510
+ x, y = 0, 0
511
+ kp_idx = idx + 1
512
+ if kp_idx in kp_dict:
513
+ try:
514
+ kp_data = kp_dict[kp_idx]
515
+ if isinstance(kp_data, dict) and "x" in kp_data and "y" in kp_data:
516
+ x = int(kp_data["x"] * width)
517
+ y = int(kp_data["y"] * height)
518
+ except (KeyError, TypeError, ValueError):
519
+ pass
520
+ frame_keypoints.append((x, y))
521
+ except (IndexError, ValueError, AttributeError):
522
+ frame_keypoints = [(0, 0)] * 32
523
+ if len(frame_keypoints) < n_keypoints:
524
+ frame_keypoints.extend([(0, 0)] * (n_keypoints - len(frame_keypoints)))
525
+ else:
526
+ frame_keypoints = frame_keypoints[:n_keypoints]
527
+ keypoints[offset + frame_number_in_batch] = frame_keypoints
528
+ print("✅ Keypoints predicted")
529
+ break
530
+ # except RuntimeError as e:
531
+ # print(self.pitch_batch_size)
532
+ # print(e)
533
+ # if "out of memory" in str(e):
534
+ # if self.pitch_batch_size == 1:
535
+ # break
536
+ # self.pitch_batch_size = self.pitch_batch_size // 2 if self.pitch_batch_size > 1 else 1
537
+ # pitch_batch_size = min(self.pitch_batch_size, len(batch_images))
538
+ # else:
539
+ # break
540
+ # except Exception as e:
541
+ # print(f"❌ Error during keypoints prediction: {e}")
542
+ # break
543
+
544
+ results: List[TVFrameResult] = []
545
+ for frame_number in range(offset, offset + len(batch_images)):
546
+ frame_boxes = bboxes.get(frame_number, [])
547
+ frame_keypoints = keypoints.get(frame_number, [(0, 0) for _ in range(n_keypoints)])
548
+ result = TVFrameResult(
549
+ frame_id=frame_number,
550
+ boxes=frame_boxes,
551
+ keypoints=frame_keypoints,
552
+ )
553
+ results.append(result)
554
+
555
+ gc.collect()
556
+ if torch.cuda.is_available():
557
+ tf.keras.backend.clear_session()
558
+ torch.cuda.empty_cache()
559
+ torch.cuda.synchronize()
560
+
561
+ return results
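A brief sketch of how this class is driven end to end; the frame source here is illustrative (a hypothetical local video), while the constructor argument, the `predict_batch` signature, and the 32-keypoint count all mirror the code above:

```python
from pathlib import Path

import cv2
from miner import Miner

miner = Miner(path_hf_repo=Path("."))  # weights live alongside miner.py in this repo

# Illustrative frame source: grab a small batch of frames from a local video.
cap = cv2.VideoCapture("match.mp4")  # hypothetical file
frames = []
while len(frames) < 16:
    ok, frame = cap.read()
    if not ok:
        break
    frames.append(frame)
cap.release()

results = miner.predict_batch(frames, offset=0, n_keypoints=32)
for r in results:
    print(r.frame_id, len(r.boxes), len(r.keypoints))
```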
objdetect.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7b51470cb703f5a9a789df38674b67d4bbe7f8f31846d69dbc97ce484f790cf9
+ size 10245169
osnet_ain.pyc ADDED
Binary file (24.2 kB).
 
osnet_model.pth.tar-100 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:64873ef0e8abf28df31facd113f27634e2d085a2dcf8d19123409b1d0e2566c8
+ size 36189526
pitch.py ADDED
@@ -0,0 +1,688 @@
1
+ from __future__ import absolute_import
2
+ from __future__ import division
3
+ from __future__ import print_function
4
+
5
+ import os
6
+ import sys
7
+ import time
8
+ from typing import List, Optional, Tuple
9
+
10
+ import cv2
11
+ import numpy as np
12
+ import torch
13
+ import torch.nn as nn
14
+ import torch.nn.functional as F
15
+ import torchvision.transforms as T
16
+ import torchvision.transforms.functional as f
17
+ from pydantic import BaseModel
18
+
19
+ import logging
20
+ logger = logging.getLogger(__name__)
21
+
22
+
23
+ class BoundingBox(BaseModel):
24
+ x1: int
25
+ y1: int
26
+ x2: int
27
+ y2: int
28
+ cls_id: int
29
+ conf: float
30
+
31
+
32
+ class TVFrameResult(BaseModel):
33
+ frame_id: int
34
+ boxes: list[BoundingBox]
35
+ keypoints: list[tuple[int, int]]
36
+
37
+ BatchNorm2d = nn.BatchNorm2d
38
+ BN_MOMENTUM = 0.1
39
+
40
+ def conv3x3(in_planes, out_planes, stride=1):
41
+ """3x3 convolution with padding"""
42
+ return nn.Conv2d(in_planes, out_planes, kernel_size=3,
43
+ stride=stride, padding=1, bias=False)
44
+
45
+
46
+ class BasicBlock(nn.Module):
47
+ expansion = 1
48
+
49
+ def __init__(self, inplanes, planes, stride=1, downsample=None):
50
+ super(BasicBlock, self).__init__()
51
+ self.conv1 = conv3x3(inplanes, planes, stride)
52
+ self.bn1 = BatchNorm2d(planes, momentum=BN_MOMENTUM)
53
+ self.relu = nn.ReLU(inplace=True)
54
+ self.conv2 = conv3x3(planes, planes)
55
+ self.bn2 = BatchNorm2d(planes, momentum=BN_MOMENTUM)
56
+ self.downsample = downsample
57
+ self.stride = stride
58
+
59
+ def forward(self, x):
60
+ residual = x
61
+
62
+ out = self.conv1(x)
63
+ out = self.bn1(out)
64
+ out = self.relu(out)
65
+
66
+ out = self.conv2(out)
67
+ out = self.bn2(out)
68
+
69
+ if self.downsample is not None:
70
+ residual = self.downsample(x)
71
+
72
+ out += residual
73
+ out = self.relu(out)
74
+
75
+ return out
76
+
77
+
78
+ class Bottleneck(nn.Module):
79
+ expansion = 4
80
+
81
+ def __init__(self, inplanes, planes, stride=1, downsample=None):
82
+ super(Bottleneck, self).__init__()
83
+ self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False)
84
+ self.bn1 = BatchNorm2d(planes, momentum=BN_MOMENTUM)
85
+ self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride,
86
+ padding=1, bias=False)
87
+ self.bn2 = BatchNorm2d(planes, momentum=BN_MOMENTUM)
88
+ self.conv3 = nn.Conv2d(planes, planes * self.expansion, kernel_size=1,
89
+ bias=False)
90
+ self.bn3 = BatchNorm2d(planes * self.expansion,
91
+ momentum=BN_MOMENTUM)
92
+ self.relu = nn.ReLU(inplace=True)
93
+ self.downsample = downsample
94
+ self.stride = stride
95
+
96
+ def forward(self, x):
97
+ residual = x
98
+
99
+ out = self.conv1(x)
100
+ out = self.bn1(out)
101
+ out = self.relu(out)
102
+
103
+ out = self.conv2(out)
104
+ out = self.bn2(out)
105
+ out = self.relu(out)
106
+
107
+ out = self.conv3(out)
108
+ out = self.bn3(out)
109
+
110
+ if self.downsample is not None:
111
+ residual = self.downsample(x)
112
+
113
+ out += residual
114
+ out = self.relu(out)
115
+
116
+ return out
117
+
118
+
119
+ class HighResolutionModule(nn.Module):
120
+ def __init__(self, num_branches, blocks, num_blocks, num_inchannels,
121
+ num_channels, fuse_method, multi_scale_output=True):
122
+ super(HighResolutionModule, self).__init__()
123
+ self._check_branches(
124
+ num_branches, blocks, num_blocks, num_inchannels, num_channels)
125
+
126
+ self.num_inchannels = num_inchannels
127
+ self.fuse_method = fuse_method
128
+ self.num_branches = num_branches
129
+
130
+ self.multi_scale_output = multi_scale_output
131
+
132
+ self.branches = self._make_branches(
133
+ num_branches, blocks, num_blocks, num_channels)
134
+ self.fuse_layers = self._make_fuse_layers()
135
+ self.relu = nn.ReLU(inplace=True)
136
+
137
+ def _check_branches(self, num_branches, blocks, num_blocks,
138
+ num_inchannels, num_channels):
139
+ if num_branches != len(num_blocks):
140
+ error_msg = 'NUM_BRANCHES({}) <> NUM_BLOCKS({})'.format(
141
+ num_branches, len(num_blocks))
142
+ logger.error(error_msg)
143
+ raise ValueError(error_msg)
144
+
145
+ if num_branches != len(num_channels):
146
+ error_msg = 'NUM_BRANCHES({}) <> NUM_CHANNELS({})'.format(
147
+ num_branches, len(num_channels))
148
+ logger.error(error_msg)
149
+ raise ValueError(error_msg)
150
+
151
+ if num_branches != len(num_inchannels):
152
+ error_msg = 'NUM_BRANCHES({}) <> NUM_INCHANNELS({})'.format(
153
+ num_branches, len(num_inchannels))
154
+ logger.error(error_msg)
155
+ raise ValueError(error_msg)
156
+
157
+ def _make_one_branch(self, branch_index, block, num_blocks, num_channels,
158
+ stride=1):
159
+ downsample = None
160
+ if stride != 1 or \
161
+ self.num_inchannels[branch_index] != num_channels[branch_index] * block.expansion:
162
+ downsample = nn.Sequential(
163
+ nn.Conv2d(self.num_inchannels[branch_index],
164
+ num_channels[branch_index] * block.expansion,
165
+ kernel_size=1, stride=stride, bias=False),
166
+ BatchNorm2d(num_channels[branch_index] * block.expansion,
167
+ momentum=BN_MOMENTUM),
168
+ )
169
+
170
+ layers = []
171
+ layers.append(block(self.num_inchannels[branch_index],
172
+ num_channels[branch_index], stride, downsample))
173
+ self.num_inchannels[branch_index] = \
174
+ num_channels[branch_index] * block.expansion
175
+ for i in range(1, num_blocks[branch_index]):
176
+ layers.append(block(self.num_inchannels[branch_index],
177
+ num_channels[branch_index]))
178
+
179
+ return nn.Sequential(*layers)
180
+
181
+ def _make_branches(self, num_branches, block, num_blocks, num_channels):
182
+ branches = []
183
+
184
+ for i in range(num_branches):
185
+ branches.append(
186
+ self._make_one_branch(i, block, num_blocks, num_channels))
187
+
188
+ return nn.ModuleList(branches)
189
+
190
+ def _make_fuse_layers(self):
191
+ if self.num_branches == 1:
192
+ return None
193
+
194
+ num_branches = self.num_branches
195
+ num_inchannels = self.num_inchannels
196
+ fuse_layers = []
197
+ for i in range(num_branches if self.multi_scale_output else 1):
198
+ fuse_layer = []
199
+ for j in range(num_branches):
200
+ if j > i:
201
+ fuse_layer.append(nn.Sequential(
202
+ nn.Conv2d(num_inchannels[j],
203
+ num_inchannels[i],
204
+ 1,
205
+ 1,
206
+ 0,
207
+ bias=False),
208
+ BatchNorm2d(num_inchannels[i], momentum=BN_MOMENTUM)))
209
+ # nn.Upsample(scale_factor=2**(j-i), mode='nearest')))
210
+ elif j == i:
211
+ fuse_layer.append(None)
212
+ else:
213
+ conv3x3s = []
214
+ for k in range(i - j):
215
+ if k == i - j - 1:
216
+ num_outchannels_conv3x3 = num_inchannels[i]
217
+ conv3x3s.append(nn.Sequential(
218
+ nn.Conv2d(num_inchannels[j],
219
+ num_outchannels_conv3x3,
220
+ 3, 2, 1, bias=False),
221
+ BatchNorm2d(num_outchannels_conv3x3, momentum=BN_MOMENTUM)))
222
+ else:
223
+ num_outchannels_conv3x3 = num_inchannels[j]
224
+ conv3x3s.append(nn.Sequential(
225
+ nn.Conv2d(num_inchannels[j],
226
+ num_outchannels_conv3x3,
227
+ 3, 2, 1, bias=False),
228
+ BatchNorm2d(num_outchannels_conv3x3,
229
+ momentum=BN_MOMENTUM),
230
+ nn.ReLU(inplace=True)))
231
+ fuse_layer.append(nn.Sequential(*conv3x3s))
232
+ fuse_layers.append(nn.ModuleList(fuse_layer))
233
+
234
+ return nn.ModuleList(fuse_layers)
235
+
236
+ def get_num_inchannels(self):
237
+ return self.num_inchannels
238
+
239
+ def forward(self, x):
240
+ if self.num_branches == 1:
241
+ return [self.branches[0](x[0])]
242
+
243
+ for i in range(self.num_branches):
244
+ x[i] = self.branches[i](x[i])
245
+
246
+ x_fuse = []
247
+ for i in range(len(self.fuse_layers)):
248
+ y = x[0] if i == 0 else self.fuse_layers[i][0](x[0])
249
+ for j in range(1, self.num_branches):
250
+ if i == j:
251
+ y = y + x[j]
252
+ elif j > i:
253
+ y = y + F.interpolate(
254
+ self.fuse_layers[i][j](x[j]),
255
+ size=[x[i].shape[2], x[i].shape[3]],
256
+ mode='bilinear')
257
+ else:
258
+ y = y + self.fuse_layers[i][j](x[j])
259
+ x_fuse.append(self.relu(y))
260
+
261
+ return x_fuse
262
+
263
+
264
+ blocks_dict = {
265
+ 'BASIC': BasicBlock,
266
+ 'BOTTLENECK': Bottleneck
267
+ }
268
+
269
+
270
+ class HighResolutionNet(nn.Module):
271
+
272
+ def __init__(self, config, **kwargs):
273
+ self.inplanes = 64
274
+ extra = config['MODEL']['EXTRA']
275
+ super(HighResolutionNet, self).__init__()
276
+
277
+ # stem net
278
+ self.conv1 = nn.Conv2d(3, self.inplanes, kernel_size=3, stride=2, padding=1,
279
+ bias=False)
280
+ self.bn1 = BatchNorm2d(self.inplanes, momentum=BN_MOMENTUM)
281
+ self.conv2 = nn.Conv2d(self.inplanes, self.inplanes, kernel_size=3, stride=2, padding=1,
282
+ bias=False)
283
+ self.bn2 = BatchNorm2d(self.inplanes, momentum=BN_MOMENTUM)
284
+ self.relu = nn.ReLU(inplace=True)
285
+ self.sf = nn.Softmax(dim=1)
286
+ self.layer1 = self._make_layer(Bottleneck, 64, 64, 4)
287
+
288
+ self.stage2_cfg = extra['STAGE2']
289
+ num_channels = self.stage2_cfg['NUM_CHANNELS']
290
+ block = blocks_dict[self.stage2_cfg['BLOCK']]
291
+ num_channels = [
292
+ num_channels[i] * block.expansion for i in range(len(num_channels))]
293
+ self.transition1 = self._make_transition_layer(
294
+ [256], num_channels)
295
+ self.stage2, pre_stage_channels = self._make_stage(
296
+ self.stage2_cfg, num_channels)
297
+
298
+ self.stage3_cfg = extra['STAGE3']
299
+ num_channels = self.stage3_cfg['NUM_CHANNELS']
300
+ block = blocks_dict[self.stage3_cfg['BLOCK']]
301
+ num_channels = [
302
+ num_channels[i] * block.expansion for i in range(len(num_channels))]
303
+ self.transition2 = self._make_transition_layer(
304
+ pre_stage_channels, num_channels)
305
+ self.stage3, pre_stage_channels = self._make_stage(
306
+ self.stage3_cfg, num_channels)
307
+
308
+ self.stage4_cfg = extra['STAGE4']
309
+ num_channels = self.stage4_cfg['NUM_CHANNELS']
310
+ block = blocks_dict[self.stage4_cfg['BLOCK']]
311
+ num_channels = [
312
+ num_channels[i] * block.expansion for i in range(len(num_channels))]
313
+ self.transition3 = self._make_transition_layer(
314
+ pre_stage_channels, num_channels)
315
+ self.stage4, pre_stage_channels = self._make_stage(
316
+ self.stage4_cfg, num_channels, multi_scale_output=True)
317
+
318
+ self.upsample = nn.Upsample(scale_factor=2, mode='nearest')
319
+ final_inp_channels = sum(pre_stage_channels) + self.inplanes
320
+
321
+ self.head = nn.Sequential(nn.Sequential(
322
+ nn.Conv2d(
323
+ in_channels=final_inp_channels,
324
+ out_channels=final_inp_channels,
325
+ kernel_size=1),
326
+ BatchNorm2d(final_inp_channels, momentum=BN_MOMENTUM),
327
+ nn.ReLU(inplace=True),
328
+ nn.Conv2d(
329
+ in_channels=final_inp_channels,
330
+ out_channels=config['MODEL']['NUM_JOINTS'],
331
+ kernel_size=extra['FINAL_CONV_KERNEL']),
332
+ nn.Softmax(dim=1)))
333
+
334
+
335
+
336
+ def _make_head(self, x, x_skip):
337
+ x = self.upsample(x)
338
+ x = torch.cat([x, x_skip], dim=1)
339
+ x = self.head(x)
340
+
341
+ return x
342
+
343
+ def _make_transition_layer(
344
+ self, num_channels_pre_layer, num_channels_cur_layer):
345
+ num_branches_cur = len(num_channels_cur_layer)
346
+ num_branches_pre = len(num_channels_pre_layer)
347
+
348
+ transition_layers = []
349
+ for i in range(num_branches_cur):
350
+ if i < num_branches_pre:
351
+ if num_channels_cur_layer[i] != num_channels_pre_layer[i]:
352
+ transition_layers.append(nn.Sequential(
353
+ nn.Conv2d(num_channels_pre_layer[i],
354
+ num_channels_cur_layer[i],
355
+ 3,
356
+ 1,
357
+ 1,
358
+ bias=False),
359
+ BatchNorm2d(
360
+ num_channels_cur_layer[i], momentum=BN_MOMENTUM),
361
+ nn.ReLU(inplace=True)))
362
+ else:
363
+ transition_layers.append(None)
364
+ else:
365
+ conv3x3s = []
366
+ for j in range(i + 1 - num_branches_pre):
367
+ inchannels = num_channels_pre_layer[-1]
368
+ outchannels = num_channels_cur_layer[i] \
369
+ if j == i - num_branches_pre else inchannels
370
+ conv3x3s.append(nn.Sequential(
371
+ nn.Conv2d(
372
+ inchannels, outchannels, 3, 2, 1, bias=False),
373
+ BatchNorm2d(outchannels, momentum=BN_MOMENTUM),
374
+ nn.ReLU(inplace=True)))
375
+ transition_layers.append(nn.Sequential(*conv3x3s))
376
+
377
+ return nn.ModuleList(transition_layers)
378
+
379
+ def _make_layer(self, block, inplanes, planes, blocks, stride=1):
380
+ downsample = None
381
+ if stride != 1 or inplanes != planes * block.expansion:
382
+ downsample = nn.Sequential(
383
+ nn.Conv2d(inplanes, planes * block.expansion,
384
+ kernel_size=1, stride=stride, bias=False),
385
+ BatchNorm2d(planes * block.expansion, momentum=BN_MOMENTUM),
386
+ )
387
+
388
+ layers = []
389
+ layers.append(block(inplanes, planes, stride, downsample))
390
+ inplanes = planes * block.expansion
391
+ for i in range(1, blocks):
392
+ layers.append(block(inplanes, planes))
393
+
394
+ return nn.Sequential(*layers)
395
+
396
+ def _make_stage(self, layer_config, num_inchannels,
397
+ multi_scale_output=True):
398
+ num_modules = layer_config['NUM_MODULES']
399
+ num_branches = layer_config['NUM_BRANCHES']
400
+ num_blocks = layer_config['NUM_BLOCKS']
401
+ num_channels = layer_config['NUM_CHANNELS']
402
+ block = blocks_dict[layer_config['BLOCK']]
403
+ fuse_method = layer_config['FUSE_METHOD']
404
+
405
+ modules = []
406
+ for i in range(num_modules):
407
+ # multi_scale_output is only used last module
408
+ if not multi_scale_output and i == num_modules - 1:
409
+ reset_multi_scale_output = False
410
+ else:
411
+ reset_multi_scale_output = True
412
+ modules.append(
413
+ HighResolutionModule(num_branches,
414
+ block,
415
+ num_blocks,
416
+ num_inchannels,
417
+ num_channels,
418
+ fuse_method,
419
+ reset_multi_scale_output)
420
+ )
421
+ num_inchannels = modules[-1].get_num_inchannels()
422
+
423
+ return nn.Sequential(*modules), num_inchannels
424
+
425
+ def forward(self, x):
426
+ # h, w = x.size(2), x.size(3)
427
+ x = self.conv1(x)
428
+ x_skip = x.clone()
429
+ x = self.bn1(x)
430
+ x = self.relu(x)
431
+ x = self.conv2(x)
432
+ x = self.bn2(x)
433
+ x = self.relu(x)
434
+ x = self.layer1(x)
435
+
436
+ x_list = []
437
+ for i in range(self.stage2_cfg['NUM_BRANCHES']):
438
+ if self.transition1[i] is not None:
439
+ x_list.append(self.transition1[i](x))
440
+ else:
441
+ x_list.append(x)
442
+ y_list = self.stage2(x_list)
443
+
444
+ x_list = []
445
+ for i in range(self.stage3_cfg['NUM_BRANCHES']):
446
+ if self.transition2[i] is not None:
447
+ x_list.append(self.transition2[i](y_list[-1]))
448
+ else:
449
+ x_list.append(y_list[i])
450
+ y_list = self.stage3(x_list)
451
+
452
+ x_list = []
453
+ for i in range(self.stage4_cfg['NUM_BRANCHES']):
454
+ if self.transition3[i] is not None:
455
+ x_list.append(self.transition3[i](y_list[-1]))
456
+ else:
457
+ x_list.append(y_list[i])
458
+ x = self.stage4(x_list)
459
+
460
+ # Head Part
461
+ height, width = x[0].size(2), x[0].size(3)
462
+ x1 = F.interpolate(x[1], size=(height, width), mode='bilinear', align_corners=False)
463
+ x2 = F.interpolate(x[2], size=(height, width), mode='bilinear', align_corners=False)
464
+ x3 = F.interpolate(x[3], size=(height, width), mode='bilinear', align_corners=False)
465
+ x = torch.cat([x[0], x1, x2, x3], 1)
466
+ x = self._make_head(x, x_skip)
467
+
468
+ return x
469
+
470
+ def init_weights(self, pretrained=''):
471
+ for m in self.modules():
472
+ if isinstance(m, nn.Conv2d):
473
+ nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
474
+ #nn.init.normal_(m.weight, std=0.001)
475
+ #nn.init.constant_(m.bias, 0)
476
+ elif isinstance(m, nn.BatchNorm2d):
477
+ nn.init.constant_(m.weight, 1)
478
+ nn.init.constant_(m.bias, 0)
479
+ if pretrained != '':
480
+ if os.path.isfile(pretrained):
481
+ pretrained_dict = torch.load(pretrained)
482
+ model_dict = self.state_dict()
483
+ pretrained_dict = {k: v for k, v in pretrained_dict.items()
484
+ if k in model_dict.keys()}
485
+ model_dict.update(pretrained_dict)
486
+ self.load_state_dict(model_dict)
487
+ else:
488
+ sys.exit(f'Weights {pretrained} not found.')
489
+
490
+
491
+ def get_cls_net(config, pretrained='', **kwargs):
492
+ """Create keypoint detection model with softmax activation"""
493
+ model = HighResolutionNet(config, **kwargs)
494
+ model.init_weights(pretrained)
495
+ return model
496
+
497
+
498
+ def get_cls_net_l(config, pretrained='', **kwargs):
499
+ """Create line detection model with sigmoid activation"""
500
+ model = HighResolutionNet(config, **kwargs)
501
+ model.init_weights(pretrained)
502
+
503
+ # After loading weights, replace just the activation function
504
+ # The saved model expects the nested Sequential structure
505
+ inner_seq = model.head[0]
506
+ # Replace softmax (index 4) with sigmoid
507
+ model.head[0][4] = nn.Sigmoid()
508
+
509
+ return model
510
+
511
+ # Simplified utility functions - removed complex Gaussian generation functions
512
+ # These were mainly used for training data generation, not inference
513
+
514
+
515
+
516
+ # generate_gaussian_array_vectorized_dist_l function removed - not used in current implementation
517
+ @torch.inference_mode()
518
+ def run_inference(model, input_tensor: torch.Tensor, device):
519
+ input_tensor = input_tensor.to(device).to(memory_format=torch.channels_last)
520
+ output = model.module().forward(input_tensor)
521
+ return output
522
+
523
+ def preprocess_batch_fast(frames):
524
+ """Ultra-fast batch preprocessing using optimized tensor operations"""
525
+ target_size = (540, 960) # H, W format for model input
526
+ batch = []
527
+ for i, frame in enumerate(frames):
528
+ frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
529
+ img = cv2.resize(frame_rgb, (target_size[1], target_size[0]))
530
+ img = img.astype(np.float32) / 255.0
531
+ img = np.transpose(img, (2, 0, 1)) # HWC -> CHW
532
+ batch.append(img)
533
+ batch = torch.from_numpy(np.stack(batch)).float()
534
+
535
+ return batch
536
+
537
+ def extract_keypoints_from_heatmap(heatmap: torch.Tensor, scale: int = 2, max_keypoints: int = 1):
538
+ """Optimized keypoint extraction from heatmaps"""
539
+ batch_size, n_channels, height, width = heatmap.shape
540
+
541
+ # Find local maxima using max pooling (keep on GPU)
542
+ kernel = 3
543
+ pad = 1
544
+ max_pooled = F.max_pool2d(heatmap, kernel, stride=1, padding=pad)
545
+ local_maxima = (max_pooled == heatmap)
546
+ heatmap = heatmap * local_maxima
547
+
548
+ # Get top keypoints (keep on GPU longer)
549
+ scores, indices = torch.topk(heatmap.view(batch_size, n_channels, -1), max_keypoints, sorted=False)
550
+ y_coords = torch.div(indices, width, rounding_mode="floor")
551
+ x_coords = indices % width
552
+
553
+ # Optimized tensor operations
554
+ x_coords = x_coords * scale
555
+ y_coords = y_coords * scale
556
+
557
+ # Create result tensor directly on GPU
558
+ results = torch.stack([x_coords.float(), y_coords.float(), scores], dim=-1)
559
+
560
+ return results
561
+
562
+
563
+ def extract_keypoints_from_heatmap_fast(heatmap: torch.Tensor, scale: int = 2, max_keypoints: int = 1):
564
+ """Ultra-fast keypoint extraction optimized for speed"""
565
+ batch_size, n_channels, height, width = heatmap.shape
566
+
567
+ # Simplified local maxima detection (faster but slightly less accurate)
568
+ max_pooled = F.max_pool2d(heatmap, 3, stride=1, padding=1)
569
+ local_maxima = (max_pooled == heatmap)
570
+
571
+ # Apply mask and get top keypoints in one go
572
+ masked_heatmap = heatmap * local_maxima
573
+ flat_heatmap = masked_heatmap.view(batch_size, n_channels, -1)
574
+ scores, indices = torch.topk(flat_heatmap, max_keypoints, dim=-1, sorted=False)
575
+
576
+ # Vectorized coordinate calculation
577
+ y_coords = torch.div(indices, width, rounding_mode="floor") * scale
578
+ x_coords = (indices % width) * scale
579
+
580
+ # Stack results efficiently
581
+ results = torch.stack([x_coords.float(), y_coords.float(), scores], dim=-1)
582
+ return results
583
+
584
+
585
+ def process_keypoints_vectorized(kp_coords, kp_threshold, w, h, batch_size):
586
+ """Ultra-fast vectorized keypoint processing"""
587
+ batch_results = []
588
+
589
+ # Convert to numpy once for faster CPU operations
590
+ kp_np = kp_coords.cpu().numpy()
591
+
592
+ for batch_idx in range(batch_size):
593
+ kp_dict = {}
594
+ # Vectorized threshold check
595
+ valid_kps = kp_np[batch_idx, :, 0, 2] > kp_threshold
596
+ valid_indices = np.where(valid_kps)[0]
597
+
598
+ for ch_idx in valid_indices:
599
+ x = float(kp_np[batch_idx, ch_idx, 0, 0]) / w
600
+ y = float(kp_np[batch_idx, ch_idx, 0, 1]) / h
601
+ p = float(kp_np[batch_idx, ch_idx, 0, 2])
602
+ kp_dict[ch_idx + 1] = {'x': x, 'y': y, 'p': p}
603
+
604
+ batch_results.append(kp_dict)
605
+
606
+ return batch_results
607
+
608
+ def inference_batch(frames, model, kp_threshold, device, batch_size=8):
609
+ """Optimized batch inference for multiple frames"""
610
+ results = []
611
+ num_frames = len(frames)
612
+
613
+ # Get the device from the model itself
614
+ model_device = next(model.parameters()).device
615
+ print(model_device)
616
+
617
+ # Process all frames in optimally-sized batches
618
+ for i in range(0, num_frames, batch_size):
619
+ current_batch_size = min(batch_size, num_frames - i)
620
+ batch_frames = frames[i:i + current_batch_size]
621
+
622
+ # Fast preprocessing - create on CPU first
623
+ batch = preprocess_batch_fast(batch_frames)
624
+ b, c, h, w = batch.size()
625
+
626
+ # Move batch to model device
627
+ batch = batch.to(model_device)
628
+
629
+ with torch.no_grad():
630
+ heatmaps = model(batch)
631
+
632
+ # Ultra-fast keypoint extraction
633
+ kp_coords = extract_keypoints_from_heatmap_fast(heatmaps[:,:-1,:,:], scale=2, max_keypoints=1)
634
+
635
+ # Vectorized batch processing - no loops
636
+ batch_results = process_keypoints_vectorized(kp_coords, kp_threshold, 960, 540, current_batch_size)
637
+ results.extend(batch_results)
638
+
639
+ # Minimal cleanup
640
+ del heatmaps, kp_coords, batch
641
+
642
+ return results
643
+
644
+ # Keypoint mapping from detection indices to standard football pitch keypoint IDs
645
+ map_keypoints = {
646
+ 1: 1, 2: 14, 3: 25, 4: 2, 5: 10, 6: 18, 7: 26, 8: 3, 9: 7, 10: 23,
647
+ 11: 27, 20: 4, 21: 8, 22: 24, 23: 28, 24: 5, 25: 13, 26: 21, 27: 29,
648
+ 28: 6, 29: 17, 30: 30, 31: 11, 32: 15, 33: 19, 34: 12, 35: 16, 36: 20,
649
+ 45: 9, 50: 31, 52: 32, 57: 22
650
+ }
651
+
652
+ def get_mapped_keypoints(kp_points):
653
+ """Apply keypoint mapping to detection results"""
654
+ mapped_points = {}
655
+ for key, value in kp_points.items():
656
+ if key in map_keypoints:
657
+ mapped_key = map_keypoints[key]
658
+ mapped_points[mapped_key] = value
659
+ # else:
660
+ # Keep unmapped keypoints with original key
661
+ # mapped_points[key] = value
662
+ return mapped_points
663
+
664
+ def process_batch_input(frames, model, kp_threshold, device, batch_size=8):
665
+ """Process multiple input images in batch"""
666
+ # Batch inference
667
+ kp_results = inference_batch(frames, model, kp_threshold, device, batch_size)
668
+ kp_results = [get_mapped_keypoints(kp) for kp in kp_results]
669
+ # Draw results and save
670
+ # for i, (frame, kp_points, input_path) in enumerate(zip(frames, kp_results, valid_paths)):
671
+ # height, width = frame.shape[:2]
672
+
673
+ # # Apply mapping to get standard keypoint IDs
674
+ # mapped_kp_points = get_mapped_keypoints(kp_points)
675
+
676
+ # for key, value in mapped_kp_points.items():
677
+ # x = int(value['x'] * width)
678
+ # y = int(value['y'] * height)
679
+ # cv2.circle(frame, (x, y), 5, (0, 255, 0), -1) # Green circles
680
+ # cv2.putText(frame, str(key), (x+10, y), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 255, 255), 2)
681
+
682
+ # # Save result
683
+ # output_path = input_path.replace('.png', '_result.png').replace('.jpg', '_result.jpg')
684
+ # cv2.imwrite(output_path, frame)
685
+
686
+ # print(f"Batch processing complete. Processed {len(frames)} images.")
687
+
688
+ return kp_results
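For reference, `map_keypoints` renumbers raw HRNet channel indices into the standard pitch keypoint IDs (1..32) that `miner.py` reads back out; a small illustration of `get_mapped_keypoints` with made-up coordinates:

```python
# Hypothetical raw detections keyed by HRNet channel index (1-based).
raw = {
    1: {"x": 0.12, "y": 0.08, "p": 0.93},   # channel 1  -> pitch keypoint 1
    5: {"x": 0.47, "y": 0.10, "p": 0.88},   # channel 5  -> pitch keypoint 10
    40: {"x": 0.50, "y": 0.50, "p": 0.75},  # channel 40 has no mapping and is dropped
}

mapped = get_mapped_keypoints(raw)
print(mapped)  # {1: {...}, 10: {...}}
```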
team_cluster.pyc ADDED
Binary file (7.62 kB).
 
utils.pyc ADDED
Binary file (20.6 kB).