Example Data - Dallas
This directory contains a complete example analysis for demonstration purposes.
Dataset: Dallas - Austin Prediction
Prompt: "The capital of state containing Dallas is"
Target: " Austin"
Model: Gemma-2-2B with Cross-Layer Transcoders (CLT)
Features: 55 features analyzed
Files Included
Stage 1: Graph Generation
- clt-hp-the-capital-of-201020250035-20251020-003525.json - Complete attribution graph
- selected_features_with_nodes.json - Selected features for analysis
Stage 2: Probe Prompts
- prompts.json - Semantic concepts used for probing
- 2025-10-21T07-40_export_ENRICHED.csv - Activation analysis results
- activations_dump (2).json - Raw activation data
Stage 3: Node Grouping
- node_grouping_final_20251027_173744.csv - Final classification and naming
- node_grouping_summary_20251027_173749.json - Summary statistics
- node_grouping_step1_20251027_180825.csv - Token classification
- node_grouping_step2_20251027_180821.csv - Feature classification
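Before loading anything into the app, it can be useful to confirm that all of the files listed above are present. The following is a minimal sketch, not part of the project itself: the helper name `find_missing_files` and the `EXPECTED_FILES` mapping are illustrative, and the file names are simply copied from the lists above.

```python
from pathlib import Path

# File names copied from the lists above, grouped by pipeline stage.
EXPECTED_FILES = {
    "Stage 1: Graph Generation": [
        "clt-hp-the-capital-of-201020250035-20251020-003525.json",
        "selected_features_with_nodes.json",
    ],
    "Stage 2: Probe Prompts": [
        "prompts.json",
        "2025-10-21T07-40_export_ENRICHED.csv",
        "activations_dump (2).json",
    ],
    "Stage 3: Node Grouping": [
        "node_grouping_final_20251027_173744.csv",
        "node_grouping_summary_20251027_173749.json",
        "node_grouping_step1_20251027_180825.csv",
        "node_grouping_step2_20251027_180821.csv",
    ],
}


def find_missing_files(example_dir):
    """Return {stage: [missing file names]} for files not found in example_dir."""
    base = Path(example_dir)
    missing = {}
    for stage, names in EXPECTED_FILES.items():
        absent = [name for name in names if not (base / name).is_file()]
        if absent:
            missing[stage] = absent
    return missing


if __name__ == "__main__":
    # Assumes the script is run from inside this example directory.
    for stage, names in find_missing_files(".").items():
        print(f"{stage}: missing {names}")
```

An empty result means every file named in this README was found in the given directory.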
How to Use
- Navigate to each stage page in the Streamlit app
- Use the "Load Example" or file upload options
- Load the corresponding files from this directory
- Explore the visualizations and results
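To get a feel for a file before uploading it, its structure can also be inspected outside the app. The sketch below makes no assumption about the schema of any file in this directory: it only reports the top-level JSON keys or the CSV header and row count. The function name `describe_file` is hypothetical and not part of the Streamlit app.

```python
import csv
import json
from pathlib import Path


def describe_file(path):
    """Return a short, schema-agnostic description of a .json or .csv file."""
    path = Path(path)
    if path.suffix == ".json":
        with path.open(encoding="utf-8") as f:
            data = json.load(f)
        if isinstance(data, dict):
            return {"type": "object", "keys": list(data.keys())}
        if isinstance(data, list):
            return {"type": "array", "length": len(data)}
        return {"type": type(data).__name__}
    if path.suffix == ".csv":
        with path.open(newline="", encoding="utf-8") as f:
            reader = csv.reader(f)
            header = next(reader, [])  # first row is treated as the header
            row_count = sum(1 for _ in reader)  # remaining rows are data
        return {"type": "csv", "columns": header, "rows": row_count}
    raise ValueError(f"Unsupported file type: {path.suffix}")
```

For example, `describe_file("prompts.json")` would show whether that file is a JSON object or an array, without relying on any particular field names.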
Results Summary
The analysis identified:
- Semantic (Dictionary) features: Tokens like "Dallas", "Texas", "Austin"
- Semantic (Concept) features: Related concepts about cities and states
- Say "X" features: Output prediction mechanisms
- Relationship features: Connections between geographical entities
Together, these results demonstrate the complete pipeline for automated sparse feature interpretation.
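One way to check the category breakdown summarized above is to tally the rows of a node grouping CSV by its classification column. This README does not document the column names, so the sketch below takes the column as a parameter and raises an error if it is absent. The function name `count_by_column` is illustrative only.

```python
import csv
from collections import Counter


def count_by_column(csv_path, column):
    """Count how many rows share each value of `column` in a CSV file."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames or []
        if column not in fieldnames:
            # The column name is an assumption of the caller; fail loudly.
            raise KeyError(f"{column!r} not found; available columns: {fieldnames}")
        return Counter(row[column] for row in reader)
```

A call might look like `count_by_column("node_grouping_final_20251027_173744.csv", "category")`, where `"category"` is a placeholder: inspect the file's header first and substitute the real column name.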