Real-time conversation monitoring and intelligent classification that captures every development session with zero data loss.
The LSL system provides intelligent session capture with automatic content routing:
- Real-Time Monitoring - Captures every Claude conversation as it happens
- Intelligent Classification - 5-layer system determines content routing (LOCAL vs CODING)
- Zero Data Loss - 4-layer monitoring architecture ensures reliability
- Multi-Project Support - Handles multiple projects simultaneously with foreign session tracking
- Security Redaction - Automatic sanitization of API keys, passwords, and credentials
The LSL system consists of multiple layers working together to ensure reliable, real-time session logging with intelligent content classification and routing.
The LSL system uses a 4-layer failsafe monitoring architecture (distinct from the classification system):
Layer 1: System-Level Watchdog (monitoring/global-monitor-watchdog.js)
- Top-level supervisor monitoring all monitors
- Restarts failed monitoring processes
- Logs health status every 60 seconds
Layer 2: Global Coordinator (monitoring/global-transcript-monitor-coordinator.js)
- Manages monitoring across all projects
- Ensures exactly one monitor per project
- Coordinates monitor lifecycle and health checks
Layer 3: Monitoring Verifier (monitoring/transcript-monitor-verifier.js)
- Verifies monitor health for all active projects
- Detects suspicious activity (stuck processes)
- Health checks every 30 seconds
Layer 4: Service-Level Self-Monitoring (Enhanced Transcript Monitor)
- Each monitor generates a `.transcript-monitor-health` file
- Real-time process metrics (memory, CPU, uptime)
- Exchange count and activity tracking
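The Layer 1 restart behavior can be sketched as a supervise-and-restart pass (a minimal illustration — the monitor objects and `restart` callback here are hypothetical; see `monitoring/global-monitor-watchdog.js` for the real logic):

```javascript
// Hypothetical sketch of the watchdog's periodic health pass: any monitor
// that is no longer alive gets restarted, and the event is logged.
function makeWatchdogTick(monitors, restart, log = console.log) {
  return function tick() {
    for (const m of monitors) {
      if (m.isAlive()) continue;            // healthy monitor: nothing to do
      log(`monitor ${m.name} is down - restarting`);
      restart(m);                           // Layer 1 responsibility: bring it back
    }
  };
}

// The real watchdog would schedule this roughly once per minute, e.g.:
// setInterval(makeWatchdogTick(monitors, restartMonitor), 60_000);
```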
Enhanced Transcript Monitor (scripts/enhanced-transcript-monitor.js):
- Monitors Claude transcript files (`.jsonl`) in real-time
- Detects session boundaries and organizes exchanges into time windows
- Generates health files with process metrics
- Routes classified content to appropriate destinations
ReliableCodingClassifier (src/live-logging/ReliableCodingClassifier.js):
- 5-layer classification system (Layer 0-4)
- Determines if content is LOCAL (project-specific) or CODING (infrastructure)
- Progressive fallback from fast pattern matching to deep semantic analysis
Security Redaction (src/live-logging/ConfigurableRedactor.js):
- 13 redaction pattern types covering all common secrets
- Automatic sanitization before writing to disk
- Configurable patterns with severity levels
- <5ms overhead per exchange
Classification Logger (scripts/classification-logger.js):
- Comprehensive decision tracking across all 5 layers
- JSONL logs for programmatic analysis
- Markdown reports with clickable navigation
- Bidirectional routing (LOCAL stays local, CODING goes to coding repo)
Every project started via coding/bin/coding gets its own Enhanced Transcript Monitor:
- Exchange Detection: Monitor detects new exchanges in Claude transcript files
- Periodic Refresh: Every 60s, switches to newest transcript file
- Health Reporting: Generates `.transcript-monitor-health` with process metrics
- Suspicious Activity: Identifies stuck processes and processing issues
Note: This is the 5-layer classification system (Layers 0-4), distinct from the 4-layer monitoring architecture.
Progressive Fallback Strategy: Each layer attempts classification; if confident, decision is final. Otherwise, falls through to next layer.
Layer 0: Session Filter - Conversation context and bias tracking
- Maintains sliding window of recent classifications
- Calculates conversation bias (CODING vs LOCAL)
- Handles follow-up prompts ("continue", "looks good")
- Activates when: bias strength ≥ 0.65 and neutral prompt detected
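The sliding-window bias tracking can be sketched as follows (a minimal illustration using the `window_size` and `bias_threshold` values from `config/live-logging-config.json`; the real Layer 0 implementation may differ):

```javascript
// Sliding-window bias tracking as described for Layer 0. Constants mirror
// config/live-logging-config.json; the class shape is illustrative.
const WINDOW_SIZE = 5;
const BIAS_THRESHOLD = 0.65;

class SessionFilter {
  constructor() {
    this.recent = []; // most recent classification labels
  }

  record(label) { // label: 'CODING' | 'LOCAL'
    this.recent.push(label);
    if (this.recent.length > WINDOW_SIZE) this.recent.shift();
  }

  bias() { // fraction of recent decisions that were CODING
    if (this.recent.length === 0) return 0.5;
    return this.recent.filter((l) => l === 'CODING').length / this.recent.length;
  }

  // Only fires on neutral follow-ups; null means "fall through to Layer 1".
  classifyFollowUp(prompt) {
    const neutral = /^(continue|looks good|ok|yes)\b/i.test(prompt.trim());
    if (!neutral) return null;
    const b = this.bias();
    if (b >= BIAS_THRESHOLD) return 'CODING';
    if (b <= 1 - BIAS_THRESHOLD) return 'LOCAL';
    return null;
  }
}
```

After four CODING decisions and one LOCAL decision, the bias is 0.8, so a bare "continue" inherits the CODING classification instead of triggering deeper analysis.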
Layer 1: PathAnalyzer - File operation pattern matching
- Analyzes file paths for direct coding infrastructure detection
- Two-step artifact checking (local → coding repo)
- Prevents false positives from ambiguous paths
- Response time: <1ms
Layer 2: KeywordMatcher - Fast keyword-based classification
- Intelligent keyword analysis for coding-related terms
- Immediate classification for clear coding infrastructure content
- Response time: <10ms
Layer 3: EmbeddingClassifier - Semantic vector similarity
- Native JavaScript embeddings (Transformers.js)
- Model: `Xenova/all-MiniLM-L6-v2` (384-dimensional)
- Qdrant vector database with HNSW indexing
- Searches against 183 indexed coding infrastructure files
- Similarity threshold: 0.65 (configurable)
- Response time: ~50ms
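Layer 3's decision reduces to a cosine-similarity comparison against the threshold; a sketch (the embedding generation and Qdrant search are omitted, and `embeddingDecision` is an illustrative name):

```javascript
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

const SIMILARITY_THRESHOLD = 0.65; // matches config/live-logging-config.json

// Decide from the best-matching indexed file's score; null means
// "inconclusive, fall through to Layer 4 (SemanticAnalyzer)".
function embeddingDecision(bestMatchScore) {
  return bestMatchScore >= SIMILARITY_THRESHOLD ? 'CODING' : null;
}
```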
Layer 4: SemanticAnalyzer - LLM-powered deep understanding
- Direct Groq API calls (llama-3.3-70b, qwen-2.5)
- Used when embedding classification is inconclusive
- Temperature: 0.1 for consistent decisions
- Response time: <10ms with caching
Early Exit Optimization: Classification stops at first confident decision, minimizing cost and latency.
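The early-exit chain can be sketched as a loop over layers ordered fastest-first (the layer interface and single confidence threshold below are assumptions for illustration; the real ReliableCodingClassifier API may differ):

```javascript
// Hypothetical layer interface: each layer returns { label, confidence } or
// null when it cannot decide.
const CONFIDENCE_THRESHOLD = 0.65; // assumed single threshold for illustration

function classifyWithFallback(exchange, layers) {
  for (const layer of layers) {
    const result = layer.classify(exchange);
    if (result && result.confidence >= CONFIDENCE_THRESHOLD) {
      // Early exit: the first confident layer's decision is final.
      return { ...result, decidedBy: layer.name };
    }
  }
  return { label: 'LOCAL', confidence: 0, decidedBy: 'default' };
}

// Toy stand-ins for the real layers, ordered fastest-first:
const layers = [
  { name: 'PathAnalyzer',
    classify: (e) => (e.text.includes('scripts/') ? { label: 'CODING', confidence: 0.9 } : null) },
  { name: 'KeywordMatcher',
    classify: (e) => (/transcript monitor|classifier/.test(e.text) ? { label: 'CODING', confidence: 0.8 } : null) },
  { name: 'SemanticAnalyzer',
    classify: () => ({ label: 'LOCAL', confidence: 0.7 }) },
];
```

Because a path match decides immediately, the expensive layers only run for the minority of exchanges the cheap layers cannot resolve.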
How It Works:
- Per-Project Monitors: Each project gets its own Enhanced Transcript Monitor
- Global Coordination: Coordinator ensures no duplicate monitors
- Watchdog Supervision: System-level watchdog supervises all monitors
- Intelligent Routing: Classification determines content destination
- Foreign Session Tracking: CODING content automatically redirected to coding repo
Example Flow:
User works in curriculum-alignment →
Monitor detects CODING content →
Routes to: coding/.specstory/history/
YYYY-MM-DD_HHMM-HHMM_<hash>_from-curriculum-alignment.md
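The destination filename in the flow above can be derived from the exchange timestamp, the user hash, and the origin project; a sketch (illustrative only — the monitor's actual naming code may differ):

```javascript
// Build an LSL filename like 2025-12-20_1000-1100_g9b30a_from-<project>.md.
// Windows are assumed hourly here for simplicity.
function lslFilename(date, userHash, fromProject = null) {
  const pad = (n) => String(n).padStart(2, '0');
  const day = `${date.getFullYear()}-${pad(date.getMonth() + 1)}-${pad(date.getDate())}`;
  const start = date.getHours();
  const window = `${pad(start)}00-${pad(start + 1)}00`;  // hourly time slot
  const suffix = fromProject ? `_from-${fromProject}` : ''; // foreign-session marker
  return `${day}_${window}_${userHash}${suffix}.md`;
}

console.log(lslFilename(new Date(2025, 11, 20, 10, 15), 'g9b30a', 'curriculum-alignment'));
// → 2025-12-20_1000-1100_g9b30a_from-curriculum-alignment.md
```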
LOCAL Content (Project-Specific):
- Stored in: `project/.specstory/history/`
- Format: `YYYY-MM-DD_HHMM-HHMM_<userhash>.md`
- Classification logs: Stay in project
CODING Content (Infrastructure):
- Redirected to: `coding/.specstory/history/`
- Format: `YYYY-MM-DD_HHMM-HHMM_<userhash>_from-<project>.md`
- Classification logs: Copied to coding repo
Classification Logs:
- JSONL format: `YYYY-MM-DD_HHMM-HHMM_<userhash>.jsonl`
- Markdown reports: Separate for LOCAL and CODING decisions
- Status files: Aggregate statistics with clickable navigation
Before any content reaches storage:
- ConfigurableRedactor processes all exchanges
- 13 pattern types detect secrets:
- API keys and tokens
- Passwords and credentials
- URLs with embedded passwords
- Email addresses
- Corporate user IDs
- Matches replaced with `<SECRET_REDACTED>`
- Applied in both:
- Live monitoring (before storage)
- Post-session fallback (safety net)
Example Redaction:

```text
// Before redaction
ANTHROPIC_API_KEY=sk-ant-1234567890abcdef
"openaiApiKey": "sk-abcdef1234567890"

// After redaction
ANTHROPIC_API_KEY=<SECRET_REDACTED>
"openaiApiKey": "<SECRET_REDACTED>"
```

Performance: <5ms overhead per exchange, minimal impact on real-time processing.
Claude Conversation Transcripts (.jsonl format):
- Real-time exchange detection
- Periodic refresh to newest transcript (60s)
- Session boundary detection
Session Boundaries:
- Time-based windows (e.g., 1800-1900)
- Automatic session detection
- Window-based file organization
Important: LSL file writes are asynchronous and retroactive. Understanding this behavior prevents confusion about file modification timestamps.
How It Works:
- Time-Slot Naming: LSL files are named by their time window (e.g., `2025-12-20_1000-1100_g9b30a.md` for the 10:00-11:00 slot)
- Retroactive Writing: The transcript monitor can write to a time-slot file after that time period has ended. This happens because:
  - Transcript processing is asynchronous - content is buffered and written in batches
  - JSONL transcript files may not be processed immediately when they're created
  - Session end triggers a final flush of all pending content to the appropriate time-slot files
- Modification Timestamps: A file named `1000-1100` may show a modification time of 12:36 if:
  - The session ended at 12:36 and flushed remaining buffered content
  - The transcript monitor processed queued exchanges from that period
  - Batch recovery/reprocessing was performed later
Example Scenario:

```text
Session starts at 09:30
User works until 11:15
Session ends at 12:36

Files affected:
- 2025-12-20_0900-1000_g9b30a.md (09:30-10:00 content)
- 2025-12-20_1000-1100_g9b30a.md (10:00-11:00 content) ← May be modified at 12:36
- 2025-12-20_1100-1200_g9b30a.md (11:00-11:15 content)
```

All files may show 12:36 as their modification time if content was flushed at session end.
Why This Design:
- Reduces I/O overhead during active sessions (batched writes)
- Ensures no data loss at session boundaries
- Allows recovery of content from crashed monitors
- Enables retroactive reprocessing of transcripts
Git Implications:
- Files may appear as modified in git status even for "past" time slots
- This is expected behavior - commit normally
- The content reflects activity during that time period, regardless of when it was written to disk
Process Monitoring:

```json
{
  "metrics": {
    "memoryMB": 9,
    "cpuUser": 7481974,
    "uptimeSeconds": 925,
    "processId": 78406
  }
}
```

Activity Tracking:

```json
{
  "activity": {
    "lastExchange": "82da8b2a-6a30-45eb-b0c7-5e1e2b2d54ee",
    "exchangeCount": 10,
    "isSuspicious": false
  }
}
```

Suspicious Activity Detection:
- Stale monitors (no activity for extended period)
- Stuck processes (exchange count not increasing)
- High memory usage
- Processing issues
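A sketch of how these checks could be expressed against the health-file fields shown above (the thresholds and the `lastUpdated`/`checkedAt` fields are assumptions, not the verifier's actual values):

```javascript
const STALE_AFTER_MS = 5 * 60 * 1000; // assumed: 5 minutes without activity
const MAX_MEMORY_MB = 200;            // assumed memory ceiling

// health: current health-file contents; previous: the last snapshot we saw.
function isSuspicious(health, previous, now = Date.now()) {
  const stale = now - health.lastUpdated > STALE_AFTER_MS;           // stale monitor
  const stuck = Boolean(previous)
    && health.activity.exchangeCount === previous.activity.exchangeCount
    && now - previous.checkedAt > STALE_AFTER_MS;                    // stuck process
  const highMemory = health.metrics.memoryMB > MAX_MEMORY_MB;        // memory runaway
  return stale || stuck || highMemory;
}
```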
Core System:
- `scripts/enhanced-transcript-monitor.js` - Core monitoring process
- `src/live-logging/ReliableCodingClassifier.js` - 5-layer classification
- `src/live-logging/ConfigurableRedactor.js` - Security redaction
- `scripts/classification-logger.js` - Decision tracking
Monitoring Infrastructure:
- `monitoring/global-monitor-watchdog.js` - System-level watchdog
- `monitoring/global-transcript-monitor-coordinator.js` - Global coordinator
- `monitoring/transcript-monitor-verifier.js` - Monitoring verifier
Configuration:
- `config/live-logging-config.json` - Classification thresholds and layer settings
- `.specstory/config/redaction-patterns.json` - Redaction pattern definitions
File: `config/live-logging-config.json`

```json
{
  "session_filter": {
    "enabled": true,
    "bias_threshold": 0.65,
    "window_size": 5
  },
  "path_analyzer": {
    "enabled": true,
    "check_local_artifacts": true
  },
  "embedding_classifier": {
    "enabled": true,
    "similarity_threshold": 0.65,
    "model": "Xenova/all-MiniLM-L6-v2"
  },
  "semantic_analyzer": {
    "enabled": true,
    "provider": "groq",
    "model": "llama-3.3-70b",
    "temperature": 0.1
  }
}
```

File: `.specstory/config/redaction-patterns.json`
13 configurable pattern types:
- Environment variables (API_KEY=value)
- JSON API keys ("apiKey": "sk-...")
- sk- prefixed keys (OpenAI, Anthropic)
- xai- prefixed keys (XAI/Grok)
- Bearer tokens
- JWT tokens
- MongoDB connection strings
- PostgreSQL connection strings
- MySQL connection strings
- Generic URLs with credentials
- Email addresses
- Corporate user IDs
Each pattern includes:
- `id`: Unique identifier
- `name`: Human-readable name
- `pattern`: Regex pattern
- `severity`: low | medium | high | critical
- `enabled`: Toggle on/off
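A pattern entry following this schema might look like the following (the values are illustrative, not copied from the shipped configuration):

```json
{
  "id": "sk-prefixed-key",
  "name": "sk- prefixed API key (OpenAI, Anthropic)",
  "pattern": "sk-[A-Za-z0-9_-]{8,}",
  "severity": "critical",
  "enabled": true
}
```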
Related Systems:
- Health System - Monitors LSL service health
- Constraints - Code quality enforcement
- Knowledge Management - Online knowledge extraction from LSLs
- Trajectories - Real-time trajectory analysis from LSL data
Monitor not starting?
```bash
# Check if monitor is running
ps aux | grep enhanced-transcript-monitor

# Check health file
cat .health/coding-transcript-monitor-health.json

# Restart monitor via coding command
coding --restart-monitor
```

LSL files not being generated?
```bash
# Verify monitor is processing exchanges
tail -50 .logs/transcript-monitor-test.log

# Check if LSL files exist for today
ls -la .specstory/history/ | grep "$(date +%Y-%m-%d)"

# Recover missing LSL files from transcripts
PROJECT_PATH=/Users/q284340/Agentic/coding CODING_REPO=/Users/q284340/Agentic/coding \
node scripts/batch-lsl-processor.js from-transcripts ~/.claude/projects/-Users-q284340-Agentic-coding
```

Classification not working?
```bash
# Check classification logs
ls -la .specstory/logs/classification/

# Verify configuration
cat config/live-logging-config.json | jq '.embedding_classifier'

# Test embedding generator
node src/knowledge-management/EmbeddingGenerator.js --test
```

Foreign sessions not routing correctly?
```bash
# Check classification status
cat .specstory/logs/classification/classification-status_<userhash>.md

# Verify CODING content in coding repo
ls -la /path/to/coding/.specstory/history/*_from-<project>.md
```

When LSL files are missing due to monitor issues, use the batch processor to recover from transcripts:
```bash
# Recover all LSL files from transcripts for this project
PROJECT_PATH=/path/to/project CODING_REPO=/Users/q284340/Agentic/coding \
node /Users/q284340/Agentic/coding/scripts/batch-lsl-processor.js from-transcripts \
~/.claude/projects/-Users-q284340-Agentic-<project-name>

# Recover specific date range
PROJECT_PATH=/path/to/project CODING_REPO=/Users/q284340/Agentic/coding \
node /Users/q284340/Agentic/coding/scripts/batch-lsl-processor.js retroactive 2024-12-01 2024-12-03
```

The health system now includes transcript monitor health as a verification rule:
```bash
# Check monitor health via centralized health file
cat .health/coding-transcript-monitor-health.json | jq '{status, metrics, activity}'

# View in health dashboard
open http://localhost:3032
```

This README provides comprehensive LSL documentation, including the classification architecture and multi-project handling.



