┌─────────────────────────────────────────────────────────────────────┐
│  USER INTERFACE                                                     │
│                                                                     │
│  SP404ChatAgent (sp404_chat.py)                                     │
│  • Natural language chat interface                                  │
│  • Commands: /help, /metrics, /history, /exit                       │
│  • Streaming responses with Rich formatting                         │
└──────────────────────┬──────────────────────────────────────────────┘
                       │
                       │ User Input: "Find me J Dilla style samples"
                       ↓
┌─────────────────────────────────────────────────────────────────────┐
│  INTELLIGENCE LAYER (NEW!)                                          │
│                                                                     │
│  ┌──────────────────────────────────────────────────────────────┐   │
│  │  Intelligent Context Manager                                 │   │
│  │                                                              │   │
│  │ ┌────────────┐ ┌────────────┐ ┌────────────┐ ┌────────────┐  │   │
│  │ │   Tier 1   │ │   Tier 2   │ │   Tier 3   │ │   Tier 4   │  │   │
│  │ │ Immediate  │ │  Working   │ │ Reference  │ │ Background │  │   │
│  │ │ 500-1200t  │ │ 800-2000t  │ │ 500-1500t  │ │ 300-1000t  │  │   │
│  │ │            │ │            │ │            │ │            │  │   │
│  │ │• Recent    │ │• Musical   │ │• Heuristics│ │• Thinking  │  │   │
│  │ │  convo     │ │  intent    │ │• Tool      │ │  protocol  │  │   │
│  │ │• Current   │ │• Search    │ │  registry  │ │• Examples  │  │   │
│  │ │  task      │ │  results   │ │• Guides    │ │• Artist    │  │   │
│  │ │            │ │• Samples   │ │            │ │  refs      │  │   │
│  │ └────────────┘ └────────────┘ └────────────┘ └────────────┘  │   │
│  │                                                              │   │
│  │  Budget: 4000 tokens (soft) / 5000 tokens (hard)             │   │
│  │  Current: ~3,800 tokens (40% savings vs naive loading)       │   │
│  └──────────────────────────────────────────────────────────────┘   │
│                                 ↓                                   │
│  ┌──────────────────────────────────────────────────────────────┐   │
│  │  Thinking Protocols (Priority 1)                             │   │
│  │                                                              │   │
│  │  5-Step Vibe Analysis            4-Step Query Generation     │   │
│  │  ────────────────────            ───────────────────────     │   │
│  │  1. Analyze characteristics      1. Decode intent            │   │
│  │  2. Consider era/context         2. Optimize platform        │   │
│  │  3. Identify mood                3. Apply formulas           │   │
│  │  4. Determine use case           4. Validate queries         │   │
│  │  5. Find compatibility                                       │   │
│  │                                                              │   │
│  │  Agent reasons BEFORE answering, not after                   │   │
│  └──────────────────────────────────────────────────────────────┘   │
│                                 ↓                                   │
│  ┌──────────────────────────────────────────────────────────────┐   │
│  │  Heuristics Engine (Priority 3)                              │   │
│  │                                                              │   │
│  │  <heuristic name="Detect Search Intent">                     │   │
│  │    <when>User message received</when>                        │   │
│  │    <consider>                                                │   │
│  │      • Action verbs (find, get, search)                      │   │
│  │      • Musical terms (boom bap, soul, 90 BPM)                │   │
│  │      • Artist references (J Dilla, Madlib)                   │   │
│  │    </consider>                                               │   │
│  │    <generally>Trigger if action + music terms</generally>    │   │
│  │    <unless>Question or URL provided</unless>                 │   │
│  │  </heuristic>                                                │   │
│  │                                                              │   │
│  │  Flexible guidelines, not rigid if/else rules                │   │
│  └──────────────────────────────────────────────────────────────┘   │
│                                 ↓                                   │
│  ┌──────────────────────────────────────────────────────────────┐   │
│  │  Tool Registry (Priority 2)                                  │   │
│  │                                                              │   │
│  │  youtube_search                                              │   │
│  │  ├─ Triggers: ["find", "search", "discover"]                 │   │
│  │  ├─ Anti-triggers: ["analyze", "http://"]                    │   │
│  │  ├─ Parameters: query, max_results, filter                   │   │
│  │  └─ Workflows: Discovery → Analysis → Download               │   │
│  │                                                              │   │
│  │  timestamp_extractor                                         │   │
│  │  ├─ Triggers: ["youtube.com", "youtu.be"]                    │   │
│  │  ├─ Returns: timestamps, descriptions, types                 │   │
│  │  └─ Workflows: URL Analysis → Extract → Download             │   │
│  │                                                              │   │
│  │  Complete documentation for when/how to use each tool        │   │
│  └──────────────────────────────────────────────────────────────┘   │
│                                 ↓                                   │
│  ┌──────────────────────────────────────────────────────────────┐   │
│  │  Example Libraries (Priority 6)                              │   │
│  │                                                              │   │
│  │  J Dilla      → "boom bap 85-95 BPM vinyl sample pack"       │   │
│  │  Madlib       → "dusty jazz soul breaks vinyl rare"          │   │
│  │  Metro Boomin → "dark trap 808 sub bass hi-hat rolls"        │   │
│  │  Alchemist    → "sample-heavy soul loops dramatic strings"   │   │
│  │                                                              │   │
│  │  12+ artists with style → query translations                 │   │
│  └──────────────────────────────────────────────────────────────┘   │
└──────────────────────┬──────────────────────────────────────────────┘
                       │
                       │ Intelligent Context: ~3,800 tokens
                       │ "J Dilla" triggers boom bap protocol
                       ↓
┌─────────────────────────────────────────────────────────────────────┐
│  AGENT LAYER                                                        │
│                                                                     │
│  CollectorAgent                                                     │
│  ├─ Uses query generation protocol                                  │
│  ├─ Generates: "j dilla drum breaks vinyl sample pack"              │
│  └─ Calls youtube_search tool                                       │
│                                                                     │
│  VibeAnalysisAgent (enhanced with protocols)                        │
│  ├─ Uses 5-step vibe analysis protocol                              │
│  ├─ Analyzes: BPM 93, D minor, warm spectrum                        │
│  └─ Returns: "reflective, grounded, 1970s soul/funk"                │
│                                                                     │
│  GrooveAnalyst                                                      │
│  └─ Rhythm and swing analysis                                       │
│                                                                     │
│  EraExpert                                                          │
│  └─ Historical context and techniques                               │
│                                                                     │
│  TimestampExtractor                                                 │
│  └─ Video timestamp parsing                                         │
└──────────────────────┬──────────────────────────────────────────────┘
                       │
                       │ Search queries + analysis
                       ↓
┌─────────────────────────────────────────────────────────────────────┐
│  TOOLS LAYER                                                        │
│                                                                     │
│  youtube_search      → Returns 8 video results                      │
│  timestamp_extractor → Parses video descriptions                    │
│  audio_analysis      → Extracts BPM, key, spectrum                  │
│  database            → Stores metadata                              │
└──────────────────────┬──────────────────────────────────────────────┘
                       │
                       │ Results
                       ↓
┌─────────────────────────────────────────────────────────────────────┐
│  RESPONSE                                                           │
│                                                                     │
│   ┌────────────────────────────────────────────────────────────┐    │
│   │  Found 8 samples matching your request:                    │    │
│   │                                                            │    │
│   │  #  Title                           Platform  Quality      │    │
│   │  1  J Dilla Drum Kit Vol 2          YouTube   92%          │    │
│   │  2  Boom Bap Breaks - Dilla Style   YouTube   88%          │    │
│   │  3  MPC Sample Pack - Donuts        YouTube   85%          │    │
│   │  ...                                                       │    │
│   │                                                            │    │
│   │  Type 'download 1-3' to download                           │    │
│   └────────────────────────────────────────────────────────────┘    │
└─────────────────────────────────────────────────────────────────────┘
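
The heuristic shown in the Intelligence Layer is plain XML, so it can be loaded with the standard library. The following is a minimal sketch, not the project's actual loader: the function name `parse_heuristic` and the returned dictionary shape are assumptions made for illustration, and the XML is copied from the diagram rather than from `search_intent_detection.xml`.

```python
import xml.etree.ElementTree as ET

# Heuristic text copied from the diagram; the real file layout may differ.
HEURISTIC_XML = """
<heuristic name="Detect Search Intent">
  <when>User message received</when>
  <consider>
    • Action verbs (find, get, search)
    • Musical terms (boom bap, soul, 90 BPM)
    • Artist references (J Dilla, Madlib)
  </consider>
  <generally>Trigger if action + music terms</generally>
  <unless>Question or URL provided</unless>
</heuristic>
"""


def parse_heuristic(xml_text: str) -> dict:
    """Turn one <heuristic> element into a plain dict (hypothetical helper)."""
    root = ET.fromstring(xml_text.strip())
    consider = root.findtext("consider", default="")
    # Each bullet line inside <consider> becomes one factor string.
    factors = [
        line.strip().lstrip("•").strip()
        for line in consider.splitlines()
        if line.strip()
    ]
    return {
        "name": root.get("name"),
        "when": root.findtext("when"),
        "consider": factors,
        "generally": root.findtext("generally"),
        "unless": root.findtext("unless"),
    }


heuristic = parse_heuristic(HEURISTIC_XML)
print(heuristic["name"])
for factor in heuristic["consider"]:
    print(" -", factor)
```

Keeping the `<generally>` and `<unless>` fields as free text, rather than compiling them into conditions, matches the diagram's point that heuristics are guidelines for the agent and not if/else rules.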
Step 1: CONTEXT BUILDING
├─ Task Detection: "sample_search" detected
├─ Tier 1 Load: Recent conversation (800 tokens)
├─ Tier 2 Load: Musical intent parameters (1,500 tokens)
├─ Tier 3 Load: search_intent_detection.xml, query_generation.xml (900 tokens)
└─ Tier 4 Load: J Dilla example, boom bap protocol (600 tokens)
Total: 3,800 tokens (vs 6,500 without intelligent loading)
Step 2: REASONING (Thinking Protocol)
├─ Agent reads: "J Dilla → boom bap, 85-95 BPM, MPC-style, vinyl samples"
├─ Agent applies: Query generation 4-step protocol
│  ├─ Step 1: Intent = "boom bap drums in J Dilla style"
│  ├─ Step 2: Platform = YouTube sample packs likely
│  ├─ Step 3: Formula = [artist + style + instrument + quality]
│  └─ Step 4: Queries validated
└─ Generated:
   1. "j dilla drum breaks vinyl sample pack"
   2. "boom bap 90 bpm drums mpc donuts"
   3. "dilla style drum kit free"
Step 3: TOOL SELECTION (Heuristics)
├─ Heuristic: Detect Search Intent = HIGH confidence (0.95)
├─ Tool Registry: youtube_search matches triggers
├─ Anti-triggers: No URL, not a question ✓
└─ Decision: Use youtube_search tool
Step 4: EXECUTION
├─ CollectorAgent.execute(queries)
├─ youtube_search(query="j dilla drum breaks...")
├─ Quality filter: Remove tutorials, gameplay
├─ Score results: Era match, sample pack indicators
└─ Return: Top 8 results
Step 5: CONTEXT UPDATE
├─ Store search results in Tier 2
├─ Add discovered samples
└─ Update metrics: +1 search, 3,800 tokens used
Step 6: RESPONSE
└─ Format table → User sees results
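
Step 3 (tool selection) can be sketched as a lookup over the triggers and anti-triggers listed in the Tool Registry diagram. This is an illustrative sketch only: `TOOL_REGISTRY` mirrors the diagram and not the real `tool_registry.json` schema, and `select_tool` is a hypothetical name. Matching here is naive substring matching, so a real implementation would likely combine it with the heuristic confidence score.

```python
from typing import Optional

# Trigger lists copied from the Tool Registry diagram; the dict layout is assumed.
TOOL_REGISTRY = {
    "youtube_search": {
        "triggers": ["find", "search", "discover"],
        "anti_triggers": ["analyze", "http://"],
    },
    "timestamp_extractor": {
        "triggers": ["youtube.com", "youtu.be"],
        "anti_triggers": [],
    },
}


def select_tool(message: str, registry: dict = TOOL_REGISTRY) -> Optional[str]:
    """Return the first tool whose triggers match and whose anti-triggers do not."""
    text = message.lower()
    for name, spec in registry.items():
        if any(term in text for term in spec["anti_triggers"]):
            continue  # an anti-trigger vetoes this tool
        if any(term in text for term in spec["triggers"]):
            return name
    return None  # no tool applies; the agent answers conversationally


print(select_tool("Find me J Dilla style samples"))
print(select_tool("Analyze https://youtu.be/abc123"))
print(select_tool("What is boom bap?"))
```

Checking anti-triggers before triggers is a deliberate choice in this sketch: a message such as "analyze this search result" should not start a new search just because it contains the word "search".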
WITHOUT INTELLIGENCE LAYER (Before):
User: "Find me J Dilla samples"
↓
Agent: Searches YouTube for "J Dilla samples"
↓
Returns: Generic results (tutorials, interviews, gameplay)
↓
Quality: Low (60% relevant)
Tokens: 6,500 (everything loaded)
Reasoning: None visible
WITH INTELLIGENCE LAYER (After):
User: "Find me J Dilla samples"
↓
Context Manager: Loads J Dilla profile from examples
↓
Thinking Protocol: Analyzes "boom bap, 85-95 BPM, vinyl, MPC-style"
↓
Heuristics: High confidence sample search intent
↓
Tool Registry: Select youtube_search with optimized parameters
↓
Agent: Generates 3 targeted queries
↓
Returns: High-quality sample packs (tutorials filtered out)
↓
Quality: High (85% relevant)
Tokens: 3,800 (intelligent loading, 40% savings)
Reasoning: Fully visible and traceable
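
The difference between the two flows comes down to how the search query is built. The sketch below applies the `[artist + style + instrument + quality]` formula from the query generation protocol to a stored artist profile, and falls back to the raw message when no profile matches, which is what the "before" flow does. The profile fields and the function name `generate_queries` are assumptions for illustration; the real example library may store artists differently.

```python
# Hypothetical artist profile, loosely based on the Example Libraries diagram.
ARTIST_PROFILES = {
    "j dilla": {
        "style": "boom bap",
        "instrument": "drum breaks",
        "quality": "vinyl sample pack",
    },
}


def generate_queries(message: str, profiles: dict = ARTIST_PROFILES) -> list:
    """Build targeted queries from a matching artist profile."""
    text = message.lower().strip()
    queries = []
    for artist, profile in profiles.items():
        if artist in text:
            parts = [artist, profile["style"], profile["instrument"], profile["quality"]]
            queries.append(" ".join(parts))      # full formula
            queries.append(" ".join(parts[1:]))  # same formula without the artist name
    # Validation step: drop empty strings and duplicates, keep order.
    unique = list(dict.fromkeys(q for q in queries if q))
    # No profile matched: fall back to the raw message (the "before" behaviour).
    return unique or [text]


print(generate_queries("Find me J Dilla samples"))
print(generate_queries("Find me jazz loops"))
```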
┌──────────────────────────────────────────────────────┐
│  Context Manager                                     │
│  "What context does the agent need?"                 │
└─────────────┬────────────────────────────────────────┘
              │ Provides context
              ↓
┌──────────────────────────────────────────────────────┐
│  Thinking Protocols                                  │
│  "How should the agent reason?"                      │
└─────────────┬────────────────────────────────────────┘
              │ Guides reasoning
              ↓
┌──────────────────────────────────────────────────────┐
│  Heuristics                                          │
│  "What decision should the agent make?"              │
└─────────────┬────────────────────────────────────────┘
              │ Informs decision
              ↓
┌──────────────────────────────────────────────────────┐
│  Tool Registry                                       │
│  "How should the agent use tools?"                   │
└─────────────┬────────────────────────────────────────┘
              │ Executes with context
              ↓
┌──────────────────────────────────────────────────────┐
│  Examples                                            │
│  "What does good output look like?"                  │
└──────────────────────────────────────────────────────┘
NAIVE LOADING (Before):
████████████████████████████████████████  6,500 tokens
│                                      │
└─ Everything loaded every time        │

INTELLIGENT LOADING (After):
████████████████████████  3,800 tokens
│                      │
│ Tier 1: ████         │    800t (Immediate)
│ Tier 2: ████████     │  1,500t (Working)
│ Tier 3: ████         │    900t (Reference)
│ Tier 4: ███          │    600t (Background)
│                      │
└─ Task-based, prioritized loading

SAVINGS: 2,700 tokens (40% reduction)
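
The arithmetic behind the token comparison can be checked with a short sketch of budgeted tier loading. The `Tier` class and `load_tiers` function are hypothetical and not taken from `context_tiers.py`; the sketch also assumes that tier number equals loading priority and that a tier is skipped once it would exceed the hard budget. The token counts and budgets are the ones shown in the diagrams.

```python
from dataclasses import dataclass

SOFT_BUDGET = 4000   # tokens, from the Context Manager diagram
HARD_BUDGET = 5000   # tokens, from the Context Manager diagram
NAIVE_TOKENS = 6500  # everything loaded every time


@dataclass
class Tier:
    name: str
    tokens: int
    priority: int  # assumption: 1 is loaded first


def load_tiers(tiers, hard_budget=HARD_BUDGET):
    """Load tiers in priority order, skipping any that would break the hard budget."""
    loaded, total = [], 0
    for tier in sorted(tiers, key=lambda t: t.priority):
        if total + tier.tokens > hard_budget:
            continue
        loaded.append(tier)
        total += tier.tokens
    return loaded, total


tiers = [
    Tier("Immediate", 800, 1),
    Tier("Working", 1500, 2),
    Tier("Reference", 900, 3),
    Tier("Background", 600, 4),
]
loaded, total = load_tiers(tiers)
savings = NAIVE_TOKENS - total
percent = savings / NAIVE_TOKENS * 100

print(f"loaded {len(loaded)} tiers, {total} tokens")
print(f"savings: {savings} tokens ({percent:.1f}%)")
```

With the figures above, the four tiers sum to 3,800 tokens, which stays under the soft budget, and the savings work out to 2,700 tokens or about 41.5%, which the diagrams round to 40%.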
INPUT: Sample with BPM 93, D minor, 1150 Hz spectrum
STEP 1: Analyze Characteristics
├─ BPM 93 = mid-tempo (85-100 range)
├─ D minor = serious, emotional
└─ 1150 Hz = warm, analog character
↓
STEP 2: Era & Production
├─ Warm spectrum + mid-tempo = likely 1970s
├─ D minor at 93 BPM = soul/funk era
└─ Inference: Vinyl-sourced, tape saturation
↓
STEP 3: Mood
├─ D minor + mid-tempo = serious but not sad
├─ Warm = comfortable, organic
└─ Result: "purposeful, groovy, vintage"
↓
STEP 4: Use Case
├─ Classic breakbeat → drum foundation
├─ 93 BPM → boom bap, lo-fi, neo-soul
└─ Recommendation: Verse drums, loop foundation
↓
STEP 5: Compatibility
├─ Seek: 90-96 BPM, D minor/F major
├─ Avoid: Bright digital, fast tempo
└─ Best matches: Jazz/soul/funk same era
OUTPUT: {
"mood": ["purposeful", "groovy", "vintage"],
"era": "1970s",
"genre": "soul/funk",
"best_use": "drum foundation"
}
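
The five steps above can be approximated with a few explicit rules, which is a useful way to see what the protocol asks the agent to reason about. This is a deliberately small sketch, not the VibeAnalysisAgent: the function name `analyze_vibe`, the 1500 Hz warmth threshold, and the ±3 BPM compatibility window are assumptions chosen so that the worked example reproduces the output shown. In the real system the agent reasons in natural language and does not evaluate fixed rules.

```python
def analyze_vibe(bpm: float, key: str, spectrum_hz: float,
                 warm_below_hz: float = 1500.0) -> dict:
    """Rule-of-thumb version of the 5-step vibe analysis (illustrative only)."""
    # Step 1: characteristics
    mid_tempo = 85 <= bpm <= 100          # range taken from the walkthrough
    minor = key.strip().lower().endswith("minor")
    warm = spectrum_hz < warm_below_hz    # assumed threshold

    # Step 5: compatibility window (assumed to be +/- 3 BPM)
    compatible_bpm = (bpm - 3, bpm + 3)

    # Steps 2-4: era, mood and use case for the one case the walkthrough covers
    if mid_tempo and minor and warm:
        return {
            "mood": ["purposeful", "groovy", "vintage"],
            "era": "1970s",
            "genre": "soul/funk",
            "best_use": "drum foundation",
            "compatible_bpm": compatible_bpm,
        }
    # Anything else is left undecided here and is not guessed.
    return {
        "mood": [],
        "era": "unknown",
        "genre": "unknown",
        "best_use": "unknown",
        "compatible_bpm": compatible_bpm,
    }


result = analyze_vibe(93, "D minor", 1150)
print(result)
```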
sp404mk2-sample-agent/
│
├── 📄 ARCHITECTURE_OVERVIEW.md
├── 📄 CURRENT_FUNCTIONALITY.md
├── 📄 ARCHITECTURE_VISUAL.md ← YOU ARE HERE
│
├── docs/
│ └── ARCHITECTURE.md (Original system design)
│
├── .claude/
│ ├── IMPLEMENTATION_PROGRESS.md
│ ├── thinking_protocols/ [Priority 1]
│ │ ├── vibe_analysis_protocol.md
│ │ └── search_query_generation_protocol.md
│ ├── tools/ [Priority 2]
│ │ ├── youtube_search.md
│ │ ├── timestamp_extractor.md
│ │ └── tool_registry.json
│ ├── heuristics/ [Priority 3]
│ │ ├── search_intent_detection.xml
│ │ ├── query_generation.xml
│ │ └── sample_quality_assessment.xml
│ ├── context/ [Priority 4]
│ │ └── tier_config.json
│ └── examples/ [Priority 6]
│ ├── vibe_analysis/
│ ├── search_queries/
│ └── musical_translation/
│
├── src/
│ ├── context/ [NEW Intelligence Layer]
│ │ ├── README.md
│ │ ├── intelligent_manager.py
│ │ ├── context_tiers.py
│ │ └── metrics.py
│ ├── utils/
│ │ └── heuristics_loader.py
│ └── agents/
│ └── vibe_analysis.py (enhanced)
│
└── sp404_chat.py (Main interface)
This visual guide complements ARCHITECTURE_OVERVIEW.md with diagrams and flow charts.