Gary Simmons
add YouTube video analysis tools and audio transcription capabilities, including documentation and test scripts
5e5f9d1
---
title: Template Final Assignment
emoji: 🕵🏻‍♂️
colorFrom: indigo
colorTo: indigo
sdk: gradio
sdk_version: 5.25.2
app_file: app.py
pinned: false
hf_oauth: true
# optional, default duration is 8 hours/480 minutes. Max duration is 30 days/43200 minutes.
hf_oauth_expiration_minutes: 480
---

# Agent with YouTube Video Analysis

This agent analyzes YouTube videos, using yt-dlp to download them and OpenCV to extract frames for analysis.

## Features

### YouTube Video Analysis Tools

The agent provides two YouTube video analysis tools:

#### 1. `analyze_youtube_video(video_url, max_frames=6, interval_seconds=45.0)`

- **Purpose**: Downloads a YouTube video and extracts frames at regular intervals for detailed analysis
- **Parameters**:
  - `video_url`: YouTube video URL (e.g., https://www.youtube.com/watch?v=VIDEO_ID)
  - `max_frames`: Maximum number of frames to extract (1-10, default: 6)
  - `interval_seconds`: Time in seconds between extracted frames (minimum: 10, default: 45)
- **Returns**: JSON with video metadata, frame timestamps, and a detailed description of each frame
- **Use cases**: Content analysis, scene detection, video summarization, accessibility descriptions

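The parameter limits above can be sketched as follows. This is a minimal illustration, not the agent's actual code: `plan_frame_timestamps` is a hypothetical helper, and both starting at 0 seconds and dropping timestamps past the end of the video are assumptions.

```python
def plan_frame_timestamps(duration_seconds, max_frames=6, interval_seconds=45.0):
    """Return the timestamps (in seconds) at which frames would be sampled."""
    # Clamp the arguments to the documented ranges:
    # 1-10 frames, at least 10 seconds apart.
    max_frames = max(1, min(10, int(max_frames)))
    interval_seconds = max(10.0, float(interval_seconds))

    timestamps = []
    t = 0.0  # assumption: sampling starts at the beginning of the video
    while len(timestamps) < max_frames and t < duration_seconds:
        timestamps.append(t)
        t += interval_seconds
    return timestamps


if __name__ == "__main__":
    # A 200-second video with the default settings.
    print(plan_frame_timestamps(200))
```

With the defaults, a 200-second video yields five timestamps (0, 45, 90, 135, and 180 seconds), because the sixth would fall past the end of the video.
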
#### 2. `get_youtube_video_info(video_url)`

- **Purpose**: Retrieves video metadata without downloading the video
- **Parameters**:
  - `video_url`: YouTube video URL
- **Returns**: JSON with title, duration, uploader, view count, description, and resolution
- **Use cases**: Video verification, content filtering, metadata collection

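A metadata-only lookup of this kind might look like the sketch below. It assumes yt-dlp's `YoutubeDL.extract_info(url, download=False)` call; the helper names `summarize_info` and `fetch_video_info` are hypothetical, and the agent's real tool may select or format fields differently.

```python
import json


def summarize_info(info):
    """Pick the documented metadata fields out of a yt-dlp info dict."""
    fields = ("title", "duration", "uploader", "view_count", "description", "resolution")
    # Fields that yt-dlp did not report come back as None.
    return {name: info.get(name) for name in fields}


def fetch_video_info(video_url):
    """Fetch metadata only; the video itself is not downloaded."""
    import yt_dlp  # imported lazily so summarize_info works without yt-dlp installed

    options = {"quiet": True, "skip_download": True}
    with yt_dlp.YoutubeDL(options) as ydl:
        info = ydl.extract_info(video_url, download=False)
    return json.dumps(summarize_info(info))


if __name__ == "__main__":
    # Offline demonstration with a hand-written info dict (not real data).
    sample = {"title": "Example", "duration": 120, "format_id": "18"}
    print(json.dumps(summarize_info(sample), indent=2))
```

Calling `fetch_video_info` requires network access and the `yt-dlp` package; the offline demonstration only exercises the field selection.
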
### Technical Implementation

- **Video Processing**: yt-dlp downloads the YouTube video
- **Frame Extraction**: OpenCV extracts and processes frames
- **Image Processing**: PIL and numpy handle frame manipulation and encoding
- **Analysis Ready**: Frames are resized and base64 encoded so they can be passed to image analysis
- **Error Handling**: Network issues, invalid URLs, and processing failures are caught and reported

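The extract, resize, and encode steps listed above can be sketched as below. This is an illustration under assumptions rather than the agent's implementation: the function names, the 512-pixel size limit, and the JPEG format are all hypothetical choices, and `video_path` is assumed to point at a file already downloaded with yt-dlp.

```python
import base64
import io


def scale_to_fit(width, height, max_side=512):
    """Return (width, height) scaled down so the longer side is at most max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


def extract_frame_base64(video_path, timestamp_seconds, max_side=512):
    """Read one frame at the given timestamp and return it as base64 JPEG text."""
    import cv2  # opencv-python, imported lazily
    from PIL import Image

    capture = cv2.VideoCapture(video_path)
    try:
        # Seek by time (milliseconds), then decode the frame at that position.
        capture.set(cv2.CAP_PROP_POS_MSEC, timestamp_seconds * 1000.0)
        ok, frame_bgr = capture.read()
    finally:
        capture.release()
    if not ok:
        raise ValueError(f"no frame could be read at {timestamp_seconds}s")

    # OpenCV returns BGR channel order; PIL expects RGB.
    frame_rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
    image = Image.fromarray(frame_rgb)
    image = image.resize(scale_to_fit(image.width, image.height, max_side))

    buffer = io.BytesIO()
    image.save(buffer, format="JPEG")
    return base64.b64encode(buffer.getvalue()).decode("ascii")
```

Raising `ValueError` when no frame can be read is one way to surface a processing failure to the caller; the agent may report such errors differently.
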
### Example Usage

The agent can answer questions like:

- "Analyze this YouTube video and tell me what happens: [URL]"
- "Extract 5 frames from this video every 60 seconds: [URL]"
- "What is the title and duration of this video: [URL]"
- "Describe the visual content of this tutorial video: [URL]"

### Dependencies

- yt-dlp: YouTube video downloading
- opencv-python: Computer vision and frame extraction
- PIL (Pillow): Image processing
- numpy: Numerical operations

| Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference |