# YouTube Video Analysis Tools Documentation

## Overview

This project now includes powerful YouTube video analysis capabilities that allow the agent to:

1. **Extract metadata** from YouTube videos without downloading them
2. **Download videos** and extract frames at specified intervals
3. **Analyze visual content** of video frames
4. **Provide timestamped descriptions** of video content

## Tools Available

### 1. `get_youtube_video_info(video_url)`

**Purpose**: Quick metadata retrieval without downloading the video.

**Parameters**:
- `video_url` (str): YouTube video URL

**Returns**: JSON string containing:
- Video title, duration, uploader
- View count, upload date
- Resolution and description excerpt
- Status (success/error)

**Example Usage**:
```python
result = get_youtube_video_info("https://www.youtube.com/watch?v=VIDEO_ID")
```

### 2. `analyze_youtube_video(video_url, max_frames=6, interval_seconds=45.0)`

**Purpose**: Full video analysis with frame extraction and description.

**Parameters**:
- `video_url` (str): YouTube video URL
- `max_frames` (int): Maximum frames to extract (1-10, default: 6)
- `interval_seconds` (float): Time between extractions (min: 10s, default: 45s)

**Returns**: JSON string containing:
- Video metadata
- Frame analyses with timestamps
- Detailed descriptions of visual content
- Extraction summary

**Example Usage**:
```python
result = analyze_youtube_video(
    "https://www.youtube.com/watch?v=VIDEO_ID",
    max_frames=5,
    interval_seconds=30.0
)
```

## Agent Integration

The tools are integrated into the `BasicAgent` and can be used through natural language queries:

### Example Queries

1. **Video Information**:
   - "What is the title and duration of this video: [URL]?"
   - "Get information about this YouTube video: [URL]"
   - "How many views does this video have: [URL]?"

2. **Content Analysis**:
   - "Analyze this YouTube video and tell me what happens: [URL]"
   - "Describe the visual content of this tutorial: [URL]"
   - "What can you see in this video: [URL]?"

3. **Frame Extraction**:
   - "Extract 5 frames from this video every 60 seconds: [URL]"
   - "Show me frames from the beginning, middle, and end of this video: [URL]"
   - "Analyze key moments in this video: [URL]"

## Technical Details

### Dependencies

- **yt-dlp**: YouTube video downloading
- **opencv-python**: Frame extraction and processing
- **PIL (Pillow)**: Image processing and encoding
- **numpy**: Numerical operations for image arrays

### Processing Pipeline

1. **Video Download**: yt-dlp downloads video in optimal quality (≤720p)
2. **Frame Extraction**: OpenCV extracts frames at specified intervals
3. **Image Processing**: Frames are resized (512px width) and converted to base64
4. **Analysis Ready**: Frames prepared for image analysis models

### Performance Considerations

- **Download Limits**: Videos are limited to ≤720p to reduce bandwidth
- **Frame Limits**: Maximum 10 frames to control processing time
- **Interval Limits**: Minimum 10 seconds between frames to avoid redundancy
- **Timeout Handling**: Robust error handling for network issues

### Error Handling

- Invalid YouTube URLs
- Network connectivity issues
- Video download failures
- Processing errors
- Unsupported video formats

## Usage Examples

### In Agent Conversations

**User**: "Can you analyze this YouTube video and tell me what it's about? https://www.youtube.com/watch?v=dQw4w9WgXcQ"

**Agent Response**: The agent will:
1. First get video metadata to understand duration and title
2. Extract frames at intervals throughout the video
3. Analyze each frame for visual content
4. Provide a comprehensive summary with timestamps

### Sample Output Structure

```json
{
  "status": "success",
  "video_info": {
    "title": "Video Title",
    "duration": "3:33",
    "uploader": "Channel Name"
  },
  "analysis_summary": "Analyzed 6 frames from 'Video Title' (Duration: 3:33) at 30s intervals.",
  "frames_extracted": 6,
  "frame_analyses": [
    {
      "timestamp_seconds": 0,
      "timestamp_formatted": "0:00",
      "description": "Description of what's visible in the frame"
    }
  ]
}
```

## Best Practices

1. **Start with video info** for unknown videos to check duration and content
2. **Use appropriate intervals** - shorter for action videos, longer for static content
3. **Limit frame count** for long videos to avoid excessive processing
4. **Handle errors gracefully** - network issues are common with video downloads

## Limitations

- Requires internet connection for video access
- Processing time depends on video length and quality
- Geographic restrictions may apply to some videos
- Rate limiting may occur with excessive usage

## Future Enhancements

Potential improvements could include:
- Integration with image analysis models for automated descriptions
- Audio transcription combined with visual analysis
- Scene change detection for intelligent frame selection
- Batch processing for multiple videos
- Caching mechanisms for frequently accessed videos
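
## Sketch: Frame Sampling Limits

The frame and interval limits listed under Performance Considerations (1-10 frames, at least 10 seconds apart) can be illustrated with a small planning helper. This is a minimal sketch, not the project's implementation: the names `clamp_analysis_params`, `plan_timestamps`, and `format_timestamp` are hypothetical, and it assumes out-of-range arguments are clamped rather than rejected and that sampling starts at 0 seconds, as the first frame in the sample output suggests.

```python
def clamp_analysis_params(max_frames=6, interval_seconds=45.0):
    """Clamp arguments to the documented limits (assumed behavior: clamp, not reject)."""
    frames = max(1, min(10, int(max_frames)))      # documented range: 1-10
    interval = max(10.0, float(interval_seconds))  # documented minimum: 10 s
    return frames, interval


def plan_timestamps(duration_seconds, max_frames=6, interval_seconds=45.0):
    """Return the timestamps (in seconds) at which frames would be sampled."""
    frames, interval = clamp_analysis_params(max_frames, interval_seconds)
    timestamps = []
    t = 0.0  # assumption: the first frame is taken at the very start
    while t < duration_seconds and len(timestamps) < frames:
        timestamps.append(t)
        t += interval
    return timestamps


def format_timestamp(seconds):
    """Format seconds as M:SS, matching the "0:00" style in the sample output."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}:{secs:02d}"


if __name__ == "__main__":
    # A 3:33 video sampled every 30 seconds, up to 6 frames.
    for t in plan_timestamps(213, max_frames=6, interval_seconds=30.0):
        print(format_timestamp(t))
```

Planning the timestamps before downloading makes it easy to see how much of a long video a given `max_frames` and `interval_seconds` combination actually covers.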
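
## Sketch: URL Validation and Download Options

The first pipeline step (a yt-dlp download capped at 720p) and the "Invalid YouTube URLs" error case could look roughly like the following. This is a sketch under stated assumptions: `extract_video_id`, `build_download_options`, and `download_video` are hypothetical names, the accepted host list is illustrative, and the option values are one plausible way to express the ≤720p cap, not necessarily what the project uses.

```python
from urllib.parse import urlparse, parse_qs

# Assumption: only these hosts are treated as YouTube URLs in this sketch.
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def extract_video_id(video_url):
    """Return the video ID for a recognised YouTube URL, or None if invalid."""
    parsed = urlparse(video_url)
    host = (parsed.hostname or "").lower()
    if parsed.scheme not in ("http", "https") or host not in YOUTUBE_HOSTS:
        return None
    if host == "youtu.be":
        # Short links carry the ID in the path: https://youtu.be/<id>
        return parsed.path.lstrip("/") or None
    if parsed.path == "/watch":
        # Standard links carry the ID in the "v" query parameter.
        return parse_qs(parsed.query).get("v", [None])[0]
    return None


def build_download_options(output_dir):
    """Build a yt-dlp options dict that caps the download at 720p."""
    return {
        "format": "best[height<=720]",               # single file, at most 720p
        "outtmpl": f"{output_dir}/%(id)s.%(ext)s",   # file named after the video ID
        "quiet": True,
        "noplaylist": True,
    }


def download_video(video_url, output_dir):
    """Download the video and return the local file path (requires yt-dlp)."""
    import yt_dlp  # imported lazily so the helpers above work without yt-dlp

    if extract_video_id(video_url) is None:
        raise ValueError(f"not a recognised YouTube URL: {video_url}")
    with yt_dlp.YoutubeDL(build_download_options(output_dir)) as ydl:
        info = ydl.extract_info(video_url, download=True)
        return ydl.prepare_filename(info)
```

Validating the URL before calling yt-dlp lets the tool return a clear error without making a network request.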
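
## Sketch: Frame Extraction and Encoding

Steps 2 and 3 of the processing pipeline (extracting a frame with OpenCV, resizing it to a 512px width, and converting it to base64) might be sketched as below. The function names are hypothetical. The sketch assumes the aspect ratio is preserved and that every frame is scaled to exactly 512px wide, which the documentation does not specify. It also encodes with `cv2.imencode` for brevity, whereas the dependency list suggests the project uses Pillow for that step.

```python
import base64


def target_size(width, height, target_width=512):
    """Return (width, height) scaled to target_width, keeping the aspect ratio."""
    if width <= 0 or height <= 0:
        raise ValueError("frame dimensions must be positive")
    scale = target_width / width
    return target_width, max(1, round(height * scale))


def read_frame_at(video_path, timestamp_seconds):
    """Return the frame nearest to the timestamp as a BGR array, or None."""
    import cv2  # imported lazily; requires opencv-python

    capture = cv2.VideoCapture(video_path)
    try:
        # Seek by position in milliseconds, then decode one frame.
        capture.set(cv2.CAP_PROP_POS_MSEC, timestamp_seconds * 1000.0)
        ok, frame = capture.read()
        return frame if ok else None
    finally:
        capture.release()


def frame_to_base64(frame):
    """Resize a BGR frame to 512px width and return it as base64-encoded JPEG."""
    import cv2  # imported lazily; requires opencv-python

    height, width = frame.shape[:2]
    resized = cv2.resize(
        frame, target_size(width, height), interpolation=cv2.INTER_AREA
    )
    ok, buffer = cv2.imencode(".jpg", resized)
    if not ok:
        raise RuntimeError("JPEG encoding failed")
    return base64.b64encode(buffer.tobytes()).decode("ascii")
```

Seeking by timestamp avoids decoding every frame, which matters when only a handful of frames are needed from a long video.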
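
## Sketch: Consuming the Tool Output

Because both tools return a JSON string with a `status` field, callers need to parse the result and check the status before using it. The helper below is a minimal sketch based on the sample output structure. The name `summarize_result` is hypothetical, and the `message` key read on failure is an assumption, since the documentation does not show what an error payload looks like.

```python
import json


def summarize_result(result_json):
    """Turn an analyze_youtube_video result into a list of readable lines."""
    try:
        result = json.loads(result_json)
    except (TypeError, ValueError):
        return ["error: tool did not return valid JSON"]

    if not isinstance(result, dict) or result.get("status") != "success":
        # Assumption: error payloads carry a human-readable "message" field.
        message = "unknown error"
        if isinstance(result, dict):
            message = result.get("message", message)
        return [f"error: {message}"]

    lines = [result.get("analysis_summary", "")]
    for frame in result.get("frame_analyses", []):
        stamp = frame.get("timestamp_formatted", "?")
        lines.append(f"[{stamp}] {frame.get('description', '')}")
    return lines


if __name__ == "__main__":
    sample = json.dumps({
        "status": "success",
        "analysis_summary": "Analyzed 1 frame.",
        "frame_analyses": [
            {"timestamp_seconds": 0, "timestamp_formatted": "0:00",
             "description": "Opening title card"}
        ],
    })
    print("\n".join(summarize_result(sample)))
```

Returning error lines instead of raising keeps the agent's reply path simple, in line with the "Handle errors gracefully" best practice.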