Merged
77 commits
f2ca750
Add MCP Database Discovery Agent (initial commit)
renecannao Jan 13, 2026
9d6a217
Enhance Rich CLI with configurable LLM chat path and better tracing
renecannao Jan 13, 2026
01c182c
Add stdio MCP bridge for Claude Code integration
renecannao Jan 13, 2026
4491f3c
Add debug logging to MCP bridge for troubleshooting
renecannao Jan 13, 2026
fc6b462
Fix: unwrap ProxySQL nested response format
renecannao Jan 13, 2026
6d83ff1
Fix: unwrap ProxySQL response format in MCP tools and fix config syntax
renecannao Jan 13, 2026
edac8eb
Fix: Add verbose logging and fix stdout buffering issue in MCP stdio …
renecannao Jan 13, 2026
f560698
Fix: Replace stdout with truly unbuffered wrapper to prevent response…
renecannao Jan 13, 2026
55dd5ba
Debug: Add detailed stdout write logging to troubleshoot Claude Code …
renecannao Jan 13, 2026
2b51346
Fix: Wrap tool results in TextContent format for MCP protocol compliance
renecannao Jan 13, 2026
ad54f92
Revert: Simplify tool handlers back to original pass-through
renecannao Jan 13, 2026
f4a4af8
Fix: Write directly to stdout.buffer to bypass TextIOWrapper issues
renecannao Jan 13, 2026
23e5efc
Test: Don't redirect sys.stderr, write logs directly to file
renecannao Jan 13, 2026
a47567f
Revert: Restore original bridge completely
renecannao Jan 13, 2026
77099f7
Debug: Add minimal logging to track stdout writes and tool calls
renecannao Jan 13, 2026
9b4aea0
Fix: Wrap tools/call responses in MCP-compliant content format
renecannao Jan 13, 2026
49e964b
Fix: Make ProxySQL MCP server return MCP-compliant tool responses
renecannao Jan 13, 2026
2ceaac0
docs: Add logging section to bridge README
renecannao Jan 13, 2026
606fe2e
Fix: Address code review feedback from gemini-code-assist
renecannao Jan 13, 2026
304bb5a
Merge pull request #11 from ProxySQL/v3.1-MCP1_discovery_2
renecannao Jan 13, 2026
1d04614
Fix: Address code review feedback from coderabbitai and gemini-code-a…
renecannao Jan 13, 2026
f852900
Fix: Correct MCP catalog JSON parsing to handle special characters
renecannao Jan 13, 2026
14de472
Add multi-agent database discovery system
renecannao Jan 14, 2026
d73ce0c
Add headless database discovery scripts
renecannao Jan 14, 2026
b627f83
Refactor: Reorganize headless discovery scripts to dedicated directory
renecannao Jan 14, 2026
fdee58a
Add comprehensive database discovery outputs and enhance headless dis…
renecannao Jan 16, 2026
d9346fe
feat: Add AI features manager foundation
renecannao Jan 16, 2026
147a059
feat: Add NL2SQL converter with hybrid LLM support
renecannao Jan 16, 2026
bc4fff1
feat: Add NL2SQL query interception in MySQL_Session
renecannao Jan 16, 2026
6dd2613
Move discovery docs to examples directory
renecannao Jan 16, 2026
4f45c25
docs: Add comprehensive doxygen comments to NL2SQL headers and LLM_Cl…
renecannao Jan 16, 2026
af68f34
fix: Add missing verbosity level to proxy_debug call in Anomaly_Detector
renecannao Jan 16, 2026
a61f709
test: Add comprehensive TAP unit tests for NL2SQL
renecannao Jan 16, 2026
aee9c31
test: Add E2E test script for NL2SQL
renecannao Jan 16, 2026
e2d71ec
docs: Add comprehensive NL2SQL user and developer documentation
renecannao Jan 16, 2026
6d2b0ab
test: Fix vector keyword conflict in NL2SQL unit tests
renecannao Jan 16, 2026
eccb2bf
test: Add integration tests for NL2SQL
renecannao Jan 16, 2026
83c3983
chore: Remove stale database discovery report files from root
renecannao Jan 16, 2026
3f44229
feat: Add MCP AI Tool Handler for NL2SQL with test script
renecannao Jan 16, 2026
52a70b0
feat: Implement AI-based Anomaly Detection for ProxySQL
renecannao Jan 16, 2026
0be9715
test: Add comprehensive tests and documentation for Anomaly Detection
renecannao Jan 16, 2026
fec7d64
feat: Implement NL2SQL vector cache with GenAI embedding generation
renecannao Jan 16, 2026
f226c0e
feat: Implement embedding-based threat similarity for Anomaly Detection
renecannao Jan 16, 2026
1c7cd8c
fix: Correct PROXY_DEBUG constant from AI_GENERIC to GENAI
renecannao Jan 16, 2026
4b0cb9d
test: Add vector features unit test
renecannao Jan 16, 2026
f5c18fd
scripts: Add threat pattern documentation script
renecannao Jan 16, 2026
782f6cb
feat: Implement threat pattern management and improve statistics
renecannao Jan 16, 2026
637b2a6
feat: Implement NL2SQL vector cache and complete Anomaly threat patte…
renecannao Jan 16, 2026
1a8b406
fix: Correct SQL query for AI variables in vector features test
renecannao Jan 16, 2026
3b7033f
Add vector features verification script
renecannao Jan 16, 2026
c5a7fc3
Add external LLM setup guide and live testing script
renecannao Jan 16, 2026
897d306
Refactor: Simplify NL2SQL to use only generic providers
renecannao Jan 16, 2026
36b1122
feat: Improve SQL validation with multi-factor scoring
renecannao Jan 16, 2026
40b2608
feat: Add configuration validation to AI_Features_Manager
renecannao Jan 16, 2026
45e592b
feat: Add structured error messages with context to NL2SQL
renecannao Jan 16, 2026
d0dc36a
feat: Add structured logging with timing and request IDs
renecannao Jan 16, 2026
8f38b8a
feat: Add exponential backoff retry for transient LLM failures
renecannao Jan 16, 2026
49092e9
test: Add unit tests for AI configuration validation
renecannao Jan 16, 2026
8a6b748
docs: Update NL2SQL documentation for v0.2.0 features
renecannao Jan 16, 2026
3032dff
test: Add NL2SQL internal functionality unit tests
renecannao Jan 16, 2026
ae4200d
Enhance AI features with improved validation, memory safety, error ha…
renecannao Jan 16, 2026
7665b3b
Merge pull request #12 from ProxySQL/v3.1-MCP1_discovery_cc1
renecannao Jan 16, 2026
2888ee3
Fix gemini-code-assist recommendations and implement comprehensive an…
renecannao Jan 16, 2026
3f25fb6
Enhance anomaly detector unit tests with additional edge case coverage
renecannao Jan 16, 2026
527bfed
fix: Migrate AI variables to GenAI module for proper architecture
renecannao Jan 16, 2026
a7dac5e
feat: Make NL2SQL use async GenAI path instead of blocking calls
renecannao Jan 16, 2026
349320a
docs: Fix NL2SQL documentation with genai variables and async archite…
renecannao Jan 17, 2026
51fd51e
fix: Add missing GenAI_Thread.h include and fix variables reference
renecannao Jan 17, 2026
1ea6790
fix: Populate runtime_global_variables for GenAI variables on startup
renecannao Jan 17, 2026
4018a0a
fix: Follow MCP pattern for GenAI variables runtime table population
renecannao Jan 17, 2026
6ffb59b
fix: Use db parameter instead of hardcoded admindb in GenAI database_…
renecannao Jan 17, 2026
1eb42c5
fix: Add GenAI variables to runtime_global_variables population
renecannao Jan 17, 2026
3fe8a48
Fix genai variable handling and add API key masking
renecannao Jan 17, 2026
a3f0bad
feat: Convert NL2SQL to generic LLM bridge
renecannao Jan 17, 2026
5afb71c
docs: Rename NL2SQL documentation to LLM Bridge
renecannao Jan 17, 2026
1193a55
docs: Remove Version History section from LLM Bridge README
renecannao Jan 17, 2026
61ad3c4
Merge pull request #13 from ProxySQL/v3.1-MCP1_genAI
renecannao Jan 17, 2026
600 changes: 600 additions & 0 deletions doc/ANOMALY_DETECTION/API.md

Large diffs are not rendered by default.

509 changes: 509 additions & 0 deletions doc/ANOMALY_DETECTION/ARCHITECTURE.md

Large diffs are not rendered by default.

296 changes: 296 additions & 0 deletions doc/ANOMALY_DETECTION/README.md
@@ -0,0 +1,296 @@
# Anomaly Detection - Security Threat Detection for ProxySQL

## Overview

The Anomaly Detection module provides real-time security threat detection for ProxySQL using a multi-stage analysis pipeline. It identifies SQL injection attacks, unusual query patterns, rate limiting violations, and statistical anomalies.

## Features

- **Multi-Stage Detection Pipeline**: 5-layer analysis for comprehensive threat detection
- **SQL Injection Pattern Detection**: Regex-based and keyword-based detection
- **Query Normalization**: Advanced normalization for pattern matching
- **Rate Limiting**: Per-user and per-host query rate tracking
- **Statistical Anomaly Detection**: Z-score based outlier detection
- **Configurable Blocking**: Auto-block or log-only modes
- **Prometheus Metrics**: Native monitoring integration

## Quick Start

### 1. Enable Anomaly Detection

```sql
-- Via admin interface
SET genai-anomaly_enabled='true';
```

### 2. Configure Detection

```sql
-- Set risk threshold (0-100)
SET genai-anomaly_risk_threshold='70';

-- Set rate limit (queries per minute)
SET genai-anomaly_rate_limit='100';

-- Enable auto-blocking
SET genai-anomaly_auto_block='true';

-- Keep log-only mode off so that auto-blocking applies
-- (set to 'true' to log anomalies without blocking)
SET genai-anomaly_log_only='false';
```

### 3. Monitor Detection Results

```sql
-- Check statistics
SHOW STATUS LIKE 'ai_detected_anomalies';
SHOW STATUS LIKE 'ai_blocked_queries';

-- Prometheus metrics are served over HTTP; run this from a shell, not the admin interface:
-- curl http://localhost:4200/metrics | grep proxysql_ai
```

## Configuration

### Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `genai-anomaly_enabled` | true | Enable/disable anomaly detection |
| `genai-anomaly_risk_threshold` | 70 | Risk score threshold (0-100) for blocking |
| `genai-anomaly_rate_limit` | 100 | Max queries per minute per user/host |
| `genai-anomaly_similarity_threshold` | 85 | Similarity threshold for embedding matching (0-100) |
| `genai-anomaly_auto_block` | true | Automatically block suspicious queries |
| `genai-anomaly_log_only` | false | Log anomalies without blocking |

### Status Variables

| Variable | Description |
|----------|-------------|
| `ai_detected_anomalies` | Total number of anomalies detected |
| `ai_blocked_queries` | Total number of queries blocked |

## Detection Methods

### 1. SQL Injection Pattern Detection

Detects common SQL injection patterns using regex and keyword matching:

**Patterns Detected:**
- OR/AND tautologies: `OR 1=1`, `AND 1=1`
- Quote sequences: `'' OR ''=''`
- UNION SELECT: `UNION SELECT`
- DROP TABLE: `DROP TABLE`
- Comment injection: `--`, `/* */`
- Hex encoding: `0x414243`
- CONCAT attacks: `CONCAT(0x41, 0x42)`
- File operations: `INTO OUTFILE`, `LOAD_FILE`
- Timing attacks: `SLEEP()`, `BENCHMARK()`

**Example:**
```sql
-- This query will be blocked:
SELECT * FROM users WHERE username='admin' OR 1=1--' AND password='xxx'
```
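The pattern categories listed above can be sketched as a small table of regular expressions. This is only an illustration: the names in `INJECTION_PATTERNS` and the regexes themselves are assumptions made for this example, not the patterns ProxySQL actually ships.

```python
import re

# Hypothetical pattern table; names and regexes are illustrative only.
INJECTION_PATTERNS = {
    # OR/AND followed by a number compared with itself, e.g. "OR 1=1"
    "tautology": re.compile(r"\b(?:or|and)\s+(\d+)\s*=\s*\1\b", re.IGNORECASE),
    "union_select": re.compile(r"\bunion\s+(?:all\s+)?select\b", re.IGNORECASE),
    "drop_table": re.compile(r"\bdrop\s+table\b", re.IGNORECASE),
    "comment": re.compile(r"--|/\*.*?\*/", re.DOTALL),
    "hex_literal": re.compile(r"\b0x[0-9a-f]+\b", re.IGNORECASE),
    "file_op": re.compile(r"\binto\s+outfile\b|\bload_file\s*\(", re.IGNORECASE),
    "timing": re.compile(r"\b(?:sleep|benchmark)\s*\(", re.IGNORECASE),
}


def match_injection_patterns(query: str) -> list[str]:
    """Return the names of all patterns that match the query, in table order."""
    return [name for name, rx in INJECTION_PATTERNS.items() if rx.search(query)]
```

A query that matches nothing returns an empty list, so a caller could treat any non-empty result as a signal that feeds into the risk score.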

### 2. Query Normalization

Normalizes queries for consistent pattern matching:
- Case normalization
- Comment removal
- Literal replacement
- Whitespace normalization

**Example:**
```sql
-- Input:
SELECT * FROM users WHERE name='John' -- comment

-- Normalized:
select * from users where name=?
```
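The four normalization steps can be approximated with a few regex passes. The function below is a minimal sketch, assuming a simple regex-based approach; `normalize_query` is a hypothetical name, and a real implementation would need a proper tokenizer to handle comment markers that appear inside string literals.

```python
import re


def normalize_query(query: str) -> str:
    """Approximate the normalization steps: comments, literals, whitespace, case."""
    # 1. Comment removal: /* ... */ blocks, then "--" to end of line.
    #    Limitation: a "--" inside a string literal is also treated as a comment.
    q = re.sub(r"/\*.*?\*/", " ", query, flags=re.DOTALL)
    q = re.sub(r"--[^\n]*", " ", q)
    # 2. Literal replacement: quoted strings first, then standalone numbers.
    q = re.sub(r"'(?:[^'\\]|\\.|'')*'", "?", q)
    q = re.sub(r"\b\d+(?:\.\d+)?\b", "?", q)
    # 3. Whitespace normalization and 4. case normalization.
    return re.sub(r"\s+", " ", q).strip().lower()
```

Applied to the input shown above, this sketch yields the same normalized form, `select * from users where name=?`.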

### 3. Rate Limiting

Tracks query rates per user and host:
- Time window: 1 hour
- Tracks: Query count, last query time
- Action: Block when limit exceeded

**Configuration:**
```sql
SET genai-anomaly_rate_limit='100';
```
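One way to picture per-user, per-host tracking is a sliding-window counter keyed by `(user, host)`. The class below is a sketch under that assumption; `RateLimiter` and its parameters are hypothetical names, and the window length is left configurable because this document describes limits per minute while also listing a 1-hour time window.

```python
from collections import defaultdict, deque


class RateLimiter:
    """Sliding-window limiter keyed by (user, host). Illustrative only."""

    def __init__(self, limit: int, window_seconds: float = 60.0):
        self.limit = limit
        self.window = window_seconds
        self._events = defaultdict(deque)  # (user, host) -> query timestamps

    def allow(self, user: str, host: str, now: float) -> bool:
        """Record a query at time `now`; return False if the limit is exceeded."""
        events = self._events[(user, host)]
        # Drop timestamps that have fallen out of the window.
        while events and now - events[0] >= self.window:
            events.popleft()
        if len(events) >= self.limit:
            return False
        events.append(now)
        return True
```

Passing `now` explicitly keeps the sketch deterministic; a caller would normally supply `time.monotonic()`.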

### 4. Statistical Anomaly Detection

Uses Z-score analysis to detect outliers:
- Query execution time
- Result set size
- Query frequency
- Schema access patterns

**Example:**
```sql
-- Unusually large result set:
SELECT * FROM huge_table -- May trigger statistical anomaly
```
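A Z-score measures how many standard deviations a value lies from the mean of past observations: z = (x - mean) / std. The sketch below applies that to any one of the metrics listed above; the function names and the cutoff of 3.0 are assumptions for illustration, not values taken from ProxySQL.

```python
import statistics


def z_score(value: float, history: list[float]) -> float:
    """Return (value - mean) / std over history; 0.0 if it cannot be computed."""
    if len(history) < 2:
        return 0.0
    mean = statistics.fmean(history)
    std = statistics.pstdev(history)  # population standard deviation
    if std == 0:
        return 0.0
    return (value - mean) / std


def is_outlier(value: float, history: list[float], threshold: float = 3.0) -> bool:
    """Flag values more than `threshold` standard deviations from the mean."""
    return abs(z_score(value, history)) > threshold
```

For example, with a history of result-set sizes around 10 rows, a query returning 5000 rows would be flagged, while one returning 11 rows would not.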

### 5. Embedding-based Similarity

Detects similarity to known threat patterns using vector embeddings. (Framework for future implementation.)
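Embedding-based matching is commonly built on cosine similarity between a query vector and stored threat vectors. The sketch below assumes that approach and assumes that `genai-anomaly_similarity_threshold` (0-100) maps to a cosine similarity percentage; both are assumptions, and the function names are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def matches_known_threat(query_vec: list[float],
                         threat_vecs: list[list[float]],
                         threshold_pct: float = 85) -> bool:
    """True if any stored threat vector is at least threshold_pct percent similar."""
    return any(cosine_similarity(query_vec, t) * 100 >= threshold_pct
               for t in threat_vecs)
```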

## Examples

### SQL Injection Detection

```sql
-- Blocked: OR 1=1 tautology
mysql> SELECT * FROM users WHERE username='admin' OR 1=1--';
ERROR 1313 (HY000): Query blocked: SQL injection pattern detected

-- Blocked: UNION SELECT
mysql> SELECT name FROM products WHERE id=1 UNION SELECT password FROM users;
ERROR 1313 (HY000): Query blocked: SQL injection pattern detected

-- Blocked: Comment injection
mysql> SELECT * FROM users WHERE id=1-- AND password='xxx';
ERROR 1313 (HY000): Query blocked: SQL injection pattern detected
```

### Rate Limiting

```sql
-- Set low rate limit for testing
SET genai-anomaly_rate_limit='10';

-- After 10 queries in 1 minute:
mysql> SELECT 1;
ERROR 1313 (HY000): Query blocked: Rate limit exceeded for user 'app_user'
```

### Statistical Anomaly

```sql
-- Unusual query pattern detected
mysql> SELECT * FROM users CROSS JOIN orders CROSS JOIN products;
-- May trigger: Statistical anomaly detected (high result count)
```

## Log-Only Mode

For monitoring without blocking:

```sql
-- Enable log-only mode
SET genai-anomaly_log_only='true';
SET genai-anomaly_auto_block='false';

-- Queries will be logged but not blocked
-- Monitor via:
SHOW STATUS LIKE 'ai_detected_anomalies';
```
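The interaction between the risk threshold, auto-block, and log-only settings can be summarized as a small decision function. This is a sketch of one plausible precedence (log-only overrides blocking, and scores at or above the threshold count as anomalies); the actual precedence inside ProxySQL is not stated here, so treat `decide` and its rules as assumptions.

```python
def decide(risk_score: int,
           risk_threshold: int = 70,
           auto_block: bool = True,
           log_only: bool = False) -> str:
    """Return "allow", "log", or "block" for a query with the given risk score."""
    if risk_score < risk_threshold:
        return "allow"   # below threshold: not treated as an anomaly
    if log_only or not auto_block:
        return "log"     # anomaly is recorded, but the query still runs
    return "block"       # anomaly is recorded and the query is rejected
```

With the settings shown above (`log_only` true, `auto_block` false), a query scoring 90 would be logged rather than blocked under these assumed rules.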

## Monitoring

### Prometheus Metrics

```bash
# View AI metrics
curl http://localhost:4200/metrics | grep proxysql_ai

# Output includes:
# proxysql_ai_detected_anomalies_total
# proxysql_ai_blocked_queries_total
```
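For scripted checks, the same metrics can be pulled out of the Prometheus text output. The helper below is a minimal sketch: `parse_ai_metrics` is a hypothetical name, it assumes simple `name value` lines without trailing timestamps, and the sample values in the usage note are made up for illustration.

```python
def parse_ai_metrics(text: str, prefix: str = "proxysql_ai") -> dict[str, float]:
    """Extract metrics whose name starts with `prefix` from Prometheus text format."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        name, _, value = line.rpartition(" ")
        if not name.startswith(prefix):
            continue
        try:
            metrics[name] = float(value)
        except ValueError:
            continue  # ignore lines that do not end in a number
    return metrics
```

Given the text `proxysql_ai_detected_anomalies_total 3` followed by `proxysql_ai_blocked_queries_total 1`, the function returns a dictionary with those two names mapped to `3.0` and `1.0`.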

### Admin Interface

```sql
-- Check detection statistics
SELECT * FROM stats_mysql_global WHERE variable_name LIKE 'ai_%';

-- View current configuration
SELECT * FROM runtime_global_variables WHERE variable_name LIKE 'genai-anomaly_%';
```

## Troubleshooting

### Queries Being Blocked Incorrectly

1. **Check if legitimate queries match patterns**:
   - Review the SQL injection patterns list
   - Consider log-only mode for testing

2. **Adjust risk threshold**:
   ```sql
   SET genai-anomaly_risk_threshold='80'; -- Higher threshold
   ```

3. **Adjust rate limit**:
   ```sql
   SET genai-anomaly_rate_limit='200'; -- Higher limit
   ```

### False Positives

If legitimate queries are being flagged:

1. Enable log-only mode to investigate:
   ```sql
   SET genai-anomaly_log_only='true';
   SET genai-anomaly_auto_block='false';
   ```

2. Check logs for specific patterns:
   ```bash
   tail -f proxysql.log | grep "Anomaly:"
   ```

3. Adjust configuration based on findings

### No Anomalies Detected

If detection seems inactive:

1. Verify anomaly detection is enabled:
   ```sql
   SELECT * FROM runtime_global_variables WHERE variable_name='genai-anomaly_enabled';
   ```

2. Check logs for errors:
   ```bash
   tail -f proxysql.log | grep "Anomaly:"
   ```

3. Verify AI features are initialized:
   ```bash
   grep "AI_Features" proxysql.log
   ```

## Security Considerations

1. **Defense in Depth**: Anomaly detection complements, but does not replace, proper security practices
2. **Pattern Evasion Possible**: Attackers may evolve techniques; regular updates needed
3. **Performance Impact**: Detection adds minimal overhead (~1-2ms per query)
4. **Log Monitoring**: Regular review of anomaly logs recommended
5. **Tune for Your Workload**: Adjust thresholds based on your query patterns

## Performance

- **Detection Overhead**: ~1-2ms per query
- **Memory Usage**: ~100KB for user statistics
- **CPU Usage**: Minimal (regex-based detection)

## API Reference

See `API.md` for complete API documentation.

## Architecture

See `ARCHITECTURE.md` for detailed architecture information.

## Testing

See `TESTING.md` for testing guide and examples.