Open-source, embedded-ML network detection and response system protecting critical infrastructure from ransomware and DDoS attacks.
Living contracts: Protobuf schema · Pipeline configs · RAG API
Active development branch: feature/phase3-hardening
For current state, see that branch. main is tagged v0.3.0-plugin-integrity.
ML Defender (aRGus NDR) is documented in a peer-reviewed preprint published on arXiv cs.CR (April 2026).
ML Defender (aRGus NDR): An Open-Source Embedded ML NIDS for Botnet and Anomalous Traffic Detection in Resource-Constrained Organizations, by Alonso Isidoro Román
arXiv: arXiv:2604.04952 [cs.CR] · DOI: https://doi.org/10.48550/arXiv.2604.04952 · Published: 3 April 2026 · Draft v15 · MIT license · Code: https://github.com/alonsoir/argus
Democratize enterprise-grade cybersecurity for hospitals, schools, and small organizations that cannot afford commercial solutions. Built to last decades with scientific honesty and methodical development.
Philosophy: Via Appia Quality. Systems built like Roman roads, designed to endure.
ML Defender stops ransomware propagation. What comes next is detecting infiltration.
ML Defender is a Network Detection and Response (NDR) system. Its guiding principle is network surveillance. Every component operates on network traffic: packet capture, flow-level feature extraction, ML classification, firewall response.
Physical and removable-media vectors are explicitly out of scope by conscious design decision. File system activity, USB-borne payloads, and removable storage are not monitored. This is an architectural boundary, not an oversight. USB ports in the DMZ should be physically or firmware-disabled by the IT team; internal policy should prohibit removable media on monitored hosts (CIS Controls v8).
Complementary mode with Wazuh: for organizations requiring file integrity monitoring, ML Defender is designed to operate alongside battle-tested tools like Wazuh. The two systems are architecturally orthogonal: ML Defender defends the network perimeter; Wazuh defends the host state. Integration via raw TCP event streaming is on the roadmap (FEAT-INT-1).
| Metric | Value | Notes |
|---|---|---|
| F1-score (CTU-13 Neris) | 0.9985 | Stable across 4 replay runs |
| Precision | 0.9969 | |
| Recall | 1.0000 | Zero missed attacks (FN=0) |
| True Positives | 646 | Malicious flows from host 147.32.84.165 |
| False Positives | 2 | VirtualBox multicast/broadcast artifacts; absent in bare-metal |
| True Negatives | 12,075 | |
| FPR (ML, Neris evaluation) | 0.0002 (≈0.017%) | FP / (FP + TN) = 2 / 12,077 |
| FPR (Fast Detector, bigFlows) | 6.61% | DEBT-FD-001, Path B thresholds |
| FP reduction (Fast → ML) | ~500× | ML reduces production blocks to zero on bigFlows |
| Inference latency | 0.24–1.06 μs | Per-class, embedded C++20 |
| Throughput ceiling (virtualized) | ~33–38 Mbps | VirtualBox NIC limit, not pipeline |
| Stress test | 2,374,845 packets, 0 drops, 0 errors | 100 Mbps requested, loop=3 bigFlows |
| RAM (full pipeline) | ~1.28 GB | Stable under load |
| Pipeline components | 6/6 RUNNING | Reproducible from vagrant destroy |
| Plugin Loader | ADR-023 PHASE 2 COMPLETE (5/5) | 2a+2b+2c+2d+2e; 12/12 INTEG tests PASSED |
| Plugin Integrity | ADR-025 MERGED → v0.3.0-plugin-integrity | Ed25519 + TOCTOU-safe dlopen, 7/7 SIGN tests |
| Test suite | 25/25 + 4a 3/3 + 4b + 4c 3/3 + 4d 3/3 + 4e 3/3 + SIGN-1..7 | DAY 114 |
| PHASE 3 CI gate | TEST-PROVISION-1 PASSED 5/5 | DAY 115 |
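The headline detection metrics follow directly from the confusion-matrix counts in the table. The sketch below recomputes them as a sanity check; it is not the project's `scripts/calculate_f1_neris.py`, and the names `ConfusionCounts` and `detection_metrics` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Confusion-matrix counts (hypothetical helper type, not from the repo)."""
    tp: int
    fp: int
    tn: int
    fn: int


def detection_metrics(c: ConfusionCounts) -> dict:
    """Derive precision, recall, F1 and false-positive rate from raw counts."""
    precision = c.tp / (c.tp + c.fp)
    recall = c.tp / (c.tp + c.fn)
    f1 = 2 * precision * recall / (precision + recall)
    fpr = c.fp / (c.fp + c.tn)  # expressed as a fraction, not a percentage
    return {"precision": precision, "recall": recall, "f1": f1, "fpr": fpr}


# Counts taken from the CTU-13 Neris rows of the table above.
neris = ConfusionCounts(tp=646, fp=2, tn=12_075, fn=0)
metrics = detection_metrics(neris)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
# precision: 0.9969
# recall: 1.0000
# f1: 0.9985
# fpr: 0.0002
```

Note that the false-positive rate is printed as a fraction; as a percentage it is roughly 0.017%.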
┌────────────────────────────────────────────────────────────────────┐
│                        ML Defender Pipeline                        │
├────────────────────────────────────────────────────────────────────┤
│  Network Traffic                                                   │
│           │                                                        │
│  ┌──────────────────┐                                              │
│  │ sniffer (C++20)  │  eBPF/XDP zero-copy packet capture           │
│  │                  │  ShardedFlowManager (16 shards)              │
│  │                  │  Fast Detector (rule-based heuristics)       │
│  │                  │  plugin-loader PHASE 2c: NORMAL              │
│  └──────────────────┘                                              │
│           │ ZeroMQ (ChaCha20-Poly1305 encrypted)                   │
│  ┌──────────────────┐                                              │
│  │ ml-detector      │  4× Embedded RandomForest classifiers        │
│  │ (C++20)          │  DDoS: 0.24 μs | Ransomware: 1.06 μs         │
│  │                  │  Maximum Threat Wins                         │
│  │                  │  plugin-loader PHASE 2d: post-inference      │
│  └──────────────────┘                                              │
│           │ ZeroMQ (encrypted)                                     │
│  ┌──────────────────┐                                              │
│  │ etcd-server      │  Component registration + JSON config        │
│  │ (C++20)          │  HMAC key management + crypto seeds          │
│  └──────────────────┘                                              │
│           │                                                        │
│  ┌──────────────────┐                                              │
│  │ firewall-acl     │  Autonomous blocking via ipset/iptables      │
│  │ agent (C++20)    │  plugin-loader PHASE 2a: NORMAL              │
│  └──────────────────┘                                              │
│           │                                                        │
│  ┌──────────────────┐                                              │
│  │ rag-ingester     │  FAISS + SQLite event ingestion              │
│  │ (C++20)          │  plugin-loader PHASE 2b: READONLY            │
│  │                  │  Anti-poisoning trust model (ADR-028)        │
│  └──────────────────┘                                              │
│           │                                                        │
│  ┌──────────────────┐                                              │
│  │ rag-security     │  TinyLlama natural language interface        │
│  │ (C++20+LLM)      │  Local inference, no cloud exfiltration      │
│  │                  │  plugin-loader PHASE 2e: READONLY (ADR-029)  │
│  └──────────────────┘                                              │
└────────────────────────────────────────────────────────────────────┘
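The "Maximum Threat Wins" rule in the ml-detector stage can be read as taking the highest-scoring verdict across the four classifiers. The following is a minimal sketch under that reading, not the C++20 implementation: the function name, the score scale, and the labels `detector_c` and `detector_d` are hypothetical (the diagram names only the DDoS and ransomware classifiers).

```python
def maximum_threat_wins(scores: dict) -> tuple:
    """Return the (label, score) pair with the highest threat score.

    `scores` maps a classifier label to its threat score. Ties resolve to
    the first label in insertion order, which is an assumption of this
    sketch rather than documented behaviour.
    """
    if not scores:
        raise ValueError("at least one classifier score is required")
    label = max(scores, key=scores.get)
    return label, scores[label]


# Hypothetical per-classifier scores for a single flow.
flow_scores = {
    "ddos": 0.12,
    "ransomware": 0.91,
    "detector_c": 0.05,  # placeholder label, not from the source
    "detector_d": 0.30,  # placeholder label, not from the source
}
verdict = maximum_threat_wins(flow_scores)
print(verdict)  # ('ransomware', 0.91)
```

A single aggregated verdict keeps the downstream firewall decision simple: one label and one score per flow, regardless of how many classifiers ran.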
ML Defender is composable, not monolithic. All external integrations use the same transport stack: raw TCP + Protocol Buffers + ChaCha20-Poly1305. No HTTP, no Kafka, no WebSocket. Four reasons:
- Deterministic latency (<10ms; no HTTP/Kafka jitter)
- Attack surface (no HTTP parsers = no CVE surface; >90% reduction)
- No broker = no SPOF (Kafka/Redis incompatible with $150–200 single-node target)
- Minimal footprint (no librdkafka, no libcurl, no boost.asio)
FEAT-INT-1 (planned): Wazuh agents emit events via raw TCP → protobuf → ZeroMQ → rag-ingester.
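Raw TCP is a byte stream, so any integration on this stack needs some way to delimit messages. The source does not specify the wire format; the sketch below assumes a 4-byte big-endian length prefix in front of each opaque payload (in practice the encrypted protobuf bytes). `encode_frame`, `decode_frames`, and `MAX_FRAME` are hypothetical names and values.

```python
import struct

HEADER = struct.Struct(">I")  # assumed: 4-byte big-endian length prefix
MAX_FRAME = 64 * 1024         # hypothetical cap to reject oversized frames


def encode_frame(payload: bytes) -> bytes:
    """Prefix an opaque payload with its length."""
    if len(payload) > MAX_FRAME:
        raise ValueError("payload exceeds MAX_FRAME")
    return HEADER.pack(len(payload)) + payload


def decode_frames(buffer: bytearray) -> list:
    """Pop every complete frame from the front of `buffer`.

    Incomplete trailing bytes stay in `buffer` until more data arrives.
    """
    frames = []
    while len(buffer) >= HEADER.size:
        (length,) = HEADER.unpack_from(buffer, 0)
        if length > MAX_FRAME:
            raise ValueError("declared frame length exceeds MAX_FRAME")
        end = HEADER.size + length
        if len(buffer) < end:
            break  # wait for the rest of this frame
        frames.append(bytes(buffer[HEADER.size:end]))
        del buffer[:end]
    return frames


# Simulate a TCP read that ends in the middle of the second frame's header.
second = encode_frame(b"bc")
stream = bytearray(encode_frame(b"a") + second[:3])
first_batch = decode_frames(stream)   # only the first frame is complete
stream += second[3:]                  # the remaining bytes arrive later
second_batch = decode_frames(stream)
print(first_batch, second_batch)      # [b'a'] [b'bc']
```

Rejecting an oversized declared length before buffering is what keeps a malformed or hostile peer from forcing unbounded allocation.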
| Property | Status |
|---|---|
| ChaCha20-Poly1305 AEAD encryption | ✅ All inter-component transport |
| HKDF-SHA256 channel-scoped key derivation | ✅ Distinct tx/rx subkeys per channel |
| libsodium 1.0.19 (compiled from source) | ✅ SHA-256 verified |
| HMAC-SHA256 log integrity | ✅ All CSV logs |
| Autonomous blocking (ipset/iptables) | ✅ Millisecond response |
| Fail-closed design (std::terminate) | ✅ All 6 main() functions |
| Async-signal-safe handlers (DEBT-SIGNAL-001) | ✅ write(STDERR_FILENO), verified via objdump |
| std::atomic shutdown_called_ (DEBT-SIGNAL-002) | ✅ DAY 114 |
| D8-pre bidirectional (FIX-C + FIX-D) | ✅ NORMAL+nullptr → terminate, 64KB hard limit |
| Plugin Loader PHASE 2a–2e (all 6 components) | ✅ 12/12 INTEG tests PASSED |
| Plugin integrity Ed25519 (ADR-025) | ✅ MERGED main → v0.3.0-plugin-integrity |
| Plugin signing key rotation (DEBT-SIGN-AUTO) | ✅ provision.sh check-plugins dev/prod modes |
| Dev plugins blocked from production (DEBT-HELLO-001) | ✅ BUILD_DEV_PLUGINS=OFF + validate-prod-configs gate |
| systemd hardening (PHASE 3) | ✅ Restart=always, LD_PRELOAD=unset, min capabilities |
| CI gate TEST-PROVISION-1 (5 checks) | ✅ pipeline-start dependency, DAY 115 |
| ADR-028 RAG Ingestion Trust Model | ✅ FAISS anti-poisoning |
| ADR-024 Noise_IKpsk3 (OQs 5..8 closed) | ✅ Design complete, implementation post-PHASE 3 |
| provision.sh --reset (key rotation) | ⏳ DEBT-ADR025-D11, deadline 18 Apr 2026 |
| AppArmor profiles (6 components) | ⏳ PHASE 3 Item 5: complain → enforce |
| ADR-032 Plugin Distribution Chain (HSM) | ⏳ APPROVED, YubiKey OpenPGP Ed25519 |
| ADR-033 TPM Measured Boot | ⏳ PROPOSED |
| Dynamic group key agreement (ADR-024 impl) | ⏳ Post-PHASE 3 |
| provision.sh reproducible (destroy → 6/6) | ✅ DAY 108 |
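The "distinct tx/rx subkeys per channel" property in the table can be illustrated with HKDF-SHA256 as defined in RFC 5869. This is a standard-library sketch, not the project's libsodium-based code: the `info` label format, the channel name, and the function names are assumptions made for illustration.

```python
import hashlib
import hmac

HASH_LEN = 32  # SHA-256 output size in bytes


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) extract-then-expand using HMAC-SHA256."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested length too large for HKDF-SHA256")
    # Extract: an empty salt defaults to HASH_LEN zero bytes per the RFC.
    prk = hmac.new(salt or b"\x00" * HASH_LEN, ikm, hashlib.sha256).digest()
    # Expand: chain HMAC blocks until enough output material is produced.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def channel_subkeys(seed: bytes, channel: str) -> tuple:
    """Derive separate tx and rx keys for one channel.

    The "<channel>|tx" and "<channel>|rx" labels are hypothetical.
    """
    tx = hkdf_sha256(seed, salt=b"", info=f"{channel}|tx".encode())
    rx = hkdf_sha256(seed, salt=b"", info=f"{channel}|rx".encode())
    return tx, rx


seed = bytes(range(32))  # placeholder seed for the example only
tx_key, rx_key = channel_subkeys(seed, "sniffer->ml-detector")
other_tx, _ = channel_subkeys(seed, "ml-detector->firewall")
print(len(tx_key), tx_key != rx_key, tx_key != other_tx)  # 32 True True
```

Under this scheme the two ends of a channel swap roles: one side's tx key is the other side's rx key, so a key compromised in one direction or on one channel does not expose the others.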
- ADR-024 OQ-5..8 closed: Noise_IKpsk3 design complete, implementation unblocked
- PHASE 3 Item 1: 6 systemd units (Restart=always, LD_PRELOAD=unset, build-active profiles)
- PHASE 3 Item 2: DEBT-SIGN-AUTO, provision.sh check-plugins (dev sign / prod verify-only)
- PHASE 3 Item 3: DEBT-HELLO-001, BUILD_DEV_PLUGINS=OFF + production JSONs cleaned (bug: 4 components had active:true)
- PHASE 3 Item 4: TEST-PROVISION-1 CI gate, 5 checks, pipeline-start dependency
- ADR-025: Plugin Integrity Ed25519 + TOCTOU-safe dlopen, MERGED to main
- Tag: v0.3.0-plugin-integrity
- TEST-INTEG-4d: ml-detector PHASE 2d, 3/3 PASSED
- DEBT-SIGNAL-001/002: async-signal-safe handlers + atomic
- arXiv Replace v15 submitted
- ADR-032: Plugin Distribution Chain (YubiKey HSM), APPROVED
- PHASE 3 branch opened: feature/phase3-hardening
- FIX-C/D: D8-pre bidirectional + MAX_PLUGIN_PAYLOAD_SIZE
- TEST-INTEG-4c/4e: 3/3 PASSED
- PHASE 2d/2e: ml-detector + rag-security plugin integration
- arXiv:2604.04952 [cs.CR] PUBLISHED
- DEBT-ADR025-D11: provision.sh --reset (key rotation without auto-signing), deadline 18 Apr
- TEST-PROVISION-1 checks 6+7: file permissions + JSON/plugin consistency
- AppArmor profiles: 6 components, complain → audit → enforce
- ADR-024 Noise_IKpsk3 implementation (OQs closed, ready to build)
- DEBT-TOOLS-001: synthetic injectors + PluginLoader integration
- Stress test CTU-13 Neris with real pipeline (F1=0.9985 with plugins active)
- ADR-032 Phase A: manifest JSON format + multi-key loader + revocation
- ADR-032 Phase B: YubiKey OpenPGP signing (hardware acquisition)
- ADR-030 activation: AppArmor enforcing + Raspberry Pi hardware
- etcd legacy refactoring (etcd = config distribution + heartbeat only)
- ADR-031 spike: seL4/Genode (2–3 weeks)
- BARE-METAL stress test
git clone https://github.com/alonsoir/argus.git
cd argus
make up
make all
make pipeline-start
make pipeline-status

make pipeline-stop && make logs-lab-clean && make pipeline-start && sleep 15
make test-replay-neris
python3 scripts/calculate_f1_neris.py logs/lab/sniffer.log --total-events 19135

make test-provision-1 # 5 checks: keys, plugin sigs, prod configs, symlinks, systemd units
make validate-prod-configs # ensure no dev plugins in production JSON configs

Seven large language models serve as intellectual co-reviewers across all development phases:
Claude (Anthropic) · Grok (xAI) · ChatGPT (OpenAI) · DeepSeek · Qwen (Alibaba) · Gemini (Google) · Parallel.ai
Methodology: structured disagreement. Problems must be demonstrated with compilable tests or mathematics before fixes are proposed. Documented in the preprint §6 (Consejo de Sabios / Test-Driven Hardening).
- ✅ DAY 106: Paper Draft v11 + arXiv SUBMITTED
- ✅ DAY 107: MAC failure root cause resolved
- ✅ DAY 108: provision.sh reproducible · ADR-026/027 committed
- ✅ DAY 109: PHASE 2b CLOSED · TEST-INTEG-4b · Paper v12 · ADR-028 APPROVED
- ✅ DAY 110: PluginMode + PHASE 2c CLOSED · Paper v13 · 6/6 RUNNING
- ✅ DAY 111: arXiv:2604.04952 PUBLISHED · FIX-C/D · PHASE 2d · ADR-029
- ✅ DAY 112: PHASE 2e CLOSED · TEST-INTEG-4e 3/3 · ADR-030/031 documented
- ✅ DAY 113: ADR-025 IMPLEMENTED · 11/11 tests · Paper v14
- ✅ DAY 114: ADR-025 MERGED → v0.3.0-plugin-integrity · TEST-INTEG-4d · Signal safety · arXiv v15 · ADR-032 APPROVED
- ✅ DAY 115: PHASE 3 Items 1-4 DONE · ADR-024 OQs 5..8 closed · TEST-PROVISION-1 CI gate · DEBT-HELLO-001 (bug: 4× active:true fixed)
MIT License. See LICENSE
Via Appia Quality: built to last decades.