ML Defender (aRGus NDR)

Open-source, embedded-ML network detection and response system protecting critical infrastructure from ransomware and DDoS attacks.


📜 Living contracts: Protobuf schema · Pipeline configs · RAG API


⚠️ Active development branch: feature/phase3-hardening. For the current state, see that branch; main is tagged v0.3.0-plugin-integrity.

📄 Preprint

ML Defender (aRGus NDR) is documented in a preprint published on arXiv (cs.CR, April 2026).

ML Defender (aRGus NDR): An Open-Source Embedded ML NIDS for Botnet and Anomalous Traffic Detection in Resource-Constrained Organizations, by Alonso Isidoro Román

arXiv: 2604.04952 [cs.CR]
DOI: https://doi.org/10.48550/arXiv.2604.04952
Published: 3 April 2026 · Draft v15 · MIT license
Code: https://github.com/alonsoir/argus


🎯 Mission

Democratize enterprise-grade cybersecurity for hospitals, schools, and small organizations that cannot afford commercial solutions. Built to last decades with scientific honesty and methodical development.

Philosophy: Via Appia Quality. Systems built like Roman roads, designed to endure.

ML Defender stops ransomware propagation. What comes next is detecting infiltration.


🛡️ Threat Model Scope

ML Defender is a Network Detection and Response (NDR) system. Its guiding principle is network surveillance: every component operates on network traffic, from packet capture and flow-level feature extraction to ML classification and firewall response.

Physical and removable-media vectors are explicitly out of scope by conscious design decision. File system activity, USB-borne payloads, and removable storage are not monitored. This is an architectural boundary, not an oversight. USB ports in the DMZ should be physically or firmware-disabled by the IT team; internal policy should prohibit removable media on monitored hosts (CIS Controls v8).

Complementary mode with Wazuh: for organizations requiring file integrity monitoring, ML Defender is designed to operate alongside battle-tested tools like Wazuh. The two systems are architecturally orthogonal: ML Defender defends the network perimeter, while Wazuh defends host state. Integration via raw TCP event streaming is on the roadmap (FEAT-INT-1).


📊 Validated Results (DAY 115, 12 April 2026)

| Metric | Value | Notes |
|---|---|---|
| F1-score (CTU-13 Neris) | 0.9985 | Stable across 4 replay runs |
| Precision | 0.9969 | |
| Recall | 1.0000 | Zero missed attacks (FN=0) |
| True Positives | 646 | Malicious flows from host 147.32.84.165 |
| False Positives | 2 | VirtualBox multicast/broadcast artifacts, absent in bare-metal |
| True Negatives | 12,075 | |
| FPR (ML, Neris evaluation) | 0.0002 (≈0.017%) | 2 FP out of 12,077 benign flows |
| FPR (Fast Detector, bigFlows) | 6.61% | DEBT-FD-001, Path B thresholds |
| FP reduction (Fast → ML) | ~500× | ML reduces production blocks to zero on bigFlows |
| Inference latency | 0.24–1.06 μs | Per-class, embedded C++20 |
| Throughput ceiling (virtualized) | ~33–38 Mbps | VirtualBox NIC limit, not pipeline |
| Stress test | 2,374,845 packets, 0 drops, 0 errors | 100 Mbps requested, loop=3 bigFlows |
| RAM (full pipeline) | ~1.28 GB | Stable under load |
| Pipeline components | 6/6 RUNNING | Reproducible from vagrant destroy |
| Plugin Loader ADR-023 | PHASE 2 COMPLETE (5/5) | 2a+2b+2c+2d+2e, 12/12 INTEG tests PASSED |
| Plugin Integrity ADR-025 | MERGED, v0.3.0-plugin-integrity | Ed25519 + TOCTOU-safe dlopen, 7/7 SIGN tests |
| Test suite | 25/25 + 4a 3/3 + 4b + 4c 3/3 + 4d 3/3 + 4e 3/3 + SIGN-1..7 | DAY 114 |
| PHASE 3 CI gate | TEST-PROVISION-1 PASSED 5/5 | DAY 115 |
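
The headline metrics follow directly from the confusion-matrix counts reported above. The short Python sketch below recomputes them; the function name `summarize` is illustrative and is not taken from the repository's `calculate_f1_neris.py`, whose internals are not shown in this README.

```python
def summarize(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Derive precision, recall, F1 and false-positive rate from raw counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    fpr = fp / (fp + tn)  # fraction of benign flows that were flagged
    return {"precision": precision, "recall": recall, "f1": f1, "fpr": fpr}


# Counts reported for the CTU-13 Neris replay.
metrics = summarize(tp=646, fp=2, tn=12_075, fn=0)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these counts the false-positive rate is 2 / 12,077, which rounds to 0.0002 as a fraction, or roughly 0.017% of benign flows.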

🏗️ Architecture

┌────────────────────────────────────────────────────────────────────┐
│                        ML Defender Pipeline                        │
├────────────────────────────────────────────────────────────────────┤
│  Network Traffic                                                   │
│         ↓                                                          │
│  ┌──────────────────┐                                              │
│  │  sniffer (C++20) │  eBPF/XDP zero-copy packet capture           │
│  │                  │  ShardedFlowManager (16 shards)              │
│  │                  │  Fast Detector (rule-based heuristics)       │
│  │                  │  plugin-loader PHASE 2c ✅ NORMAL            │
│  └──────────────────┘                                              │
│         ↓  ZeroMQ (ChaCha20-Poly1305 encrypted)                    │
│  ┌──────────────────┐                                              │
│  │  ml-detector     │  4× Embedded RandomForest classifiers        │
│  │  (C++20)         │  DDoS: 0.24 μs | Ransomware: 1.06 μs         │
│  │                  │  Maximum Threat Wins                         │
│  │                  │  plugin-loader PHASE 2d ✅ post-inference    │
│  └──────────────────┘                                              │
│         ↓  ZeroMQ (encrypted)                                      │
│  ┌──────────────────┐                                              │
│  │  etcd-server     │  Component registration + JSON config        │
│  │  (C++20)         │  HMAC key management + crypto seeds          │
│  └──────────────────┘                                              │
│         ↓                                                          │
│  ┌──────────────────┐                                              │
│  │ firewall-acl     │  Autonomous blocking via ipset/iptables      │
│  │ agent (C++20)    │  plugin-loader PHASE 2a ✅ NORMAL            │
│  └──────────────────┘                                              │
│         ↓                                                          │
│  ┌──────────────────┐                                              │
│  │  rag-ingester    │  FAISS + SQLite event ingestion              │
│  │  (C++20)         │  plugin-loader PHASE 2b ✅ READONLY          │
│  │                  │  Anti-poisoning trust model (ADR-028)        │
│  └──────────────────┘                                              │
│         ↓                                                          │
│  ┌──────────────────┐                                              │
│  │  rag-security    │  TinyLlama natural language interface        │
│  │  (C++20+LLM)     │  Local inference, no cloud exfiltration      │
│  │                  │  plugin-loader PHASE 2e ✅ READONLY (ADR-029)│
│  └──────────────────┘                                              │
└────────────────────────────────────────────────────────────────────┘
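
The diagram names a "Maximum Threat Wins" rule for combining the four classifiers. One plausible reading is that the verdict is the class with the highest score, sketched below in Python for brevity (the pipeline itself is C++20). The labels `detector_3` and `detector_4`, the 0-to-1 score range, and the 0.5 threshold are assumptions made for illustration, not values from the project.

```python
from typing import Mapping, Tuple


def maximum_threat_wins(scores: Mapping[str, float],
                        threshold: float = 0.5) -> Tuple[str, float]:
    """Return the label and score of the most alarming classifier.

    Falls back to ("benign", top_score) when no score reaches the
    (hypothetical) threshold.
    """
    label, score = max(scores.items(), key=lambda item: item[1])
    if score < threshold:
        return "benign", score
    return label, score


# Hypothetical per-class scores for a single flow.
flow_scores = {"ddos": 0.12, "ransomware": 0.91,
               "detector_3": 0.40, "detector_4": 0.05}
print(maximum_threat_wins(flow_scores))  # ('ransomware', 0.91)
```

Taking the maximum rather than an average means a single confident classifier is enough to raise an alert, which favors recall over precision.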

Integration Philosophy

ML Defender is composable, not monolithic. All external integrations use the same transport stack: raw TCP + Protocol Buffers + ChaCha20-Poly1305. No HTTP, no Kafka, no WebSocket. Four reasons:

  1. Deterministic latency (<10 ms; no HTTP/Kafka jitter)
  2. Smaller attack surface (no HTTP parsers means no HTTP-parser CVEs; >90% reduction)
  3. No broker, so no single point of failure (Kafka/Redis are incompatible with the $150–200 single-node target)
  4. Minimal footprint (no librdkafka, no libcurl, no boost.asio)

FEAT-INT-1 (planned): Wazuh agents emit events via raw TCP → protobuf → ZeroMQ → rag-ingester.
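
A raw TCP transport has to define its own message boundaries, because TCP delivers a byte stream rather than discrete messages. The README does not describe the wire format, so the following is only a sketch of one common approach: a 4-byte big-endian length prefix in front of each opaque payload (here the payload would be an encrypted, serialized protobuf message). The function names and the 1 MiB cap are hypothetical.

```python
import struct
from typing import List, Tuple

HEADER = struct.Struct("!I")  # 4-byte unsigned length, network byte order
MAX_FRAME = 1 << 20           # hypothetical 1 MiB cap, not taken from the project


def encode_frame(payload: bytes) -> bytes:
    """Prefix a payload with its length so the receiver can find its end."""
    if len(payload) > MAX_FRAME:
        raise ValueError("payload too large")
    return HEADER.pack(len(payload)) + payload


def decode_frames(buffer: bytes) -> Tuple[List[bytes], bytes]:
    """Split a receive buffer into complete payloads plus leftover bytes."""
    frames: List[bytes] = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        (length,) = HEADER.unpack_from(buffer, offset)
        if length > MAX_FRAME:
            raise ValueError("declared frame length exceeds cap")
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # frame is incomplete; wait for more bytes
        frames.append(buffer[offset + HEADER.size:end])
        offset = end
    return frames, buffer[offset:]


# Two complete frames followed by a truncated third one.
stream = (encode_frame(b"event-1") + encode_frame(b"event-2")
          + encode_frame(b"event-3")[:5])
frames, rest = decode_frames(stream)
print(frames, len(rest))  # [b'event-1', b'event-2'] 5
```

Rejecting an oversized declared length before allocating anything keeps a malformed or hostile peer from forcing large buffers.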


🔐 Security Properties

| Property | Status |
|---|---|
| ChaCha20-Poly1305 AEAD encryption | ✅ All inter-component transport |
| HKDF-SHA256 channel-scoped key derivation | ✅ Distinct tx/rx subkeys per channel |
| libsodium 1.0.19 (compiled from source) | ✅ SHA-256 verified |
| HMAC-SHA256 log integrity | ✅ All CSV logs |
| Autonomous blocking (ipset/iptables) | ✅ Millisecond response |
| Fail-closed design (std::terminate) | ✅ All 6 main() functions |
| Async-signal-safe handlers (DEBT-SIGNAL-001) | ✅ write(STDERR_FILENO), verified via objdump |
| std::atomic shutdown_called_ (DEBT-SIGNAL-002) | ✅ DAY 114 |
| D8-pre bidirectional (FIX-C + FIX-D) | ✅ NORMAL+nullptr→terminate, 64 KB hard limit |
| Plugin Loader PHASE 2a–2e (all 6 components) | ✅ 12/12 INTEG tests PASSED |
| Plugin integrity Ed25519 (ADR-025) | ✅ MERGED into main, v0.3.0-plugin-integrity |
| Plugin signing key rotation (DEBT-SIGN-AUTO) | ✅ provision.sh check-plugins dev/prod modes |
| Dev plugins blocked from production (DEBT-HELLO-001) | ✅ BUILD_DEV_PLUGINS=OFF + validate-prod-configs gate |
| systemd hardening (PHASE 3) | ✅ Restart=always, LD_PRELOAD=unset, min capabilities |
| CI gate TEST-PROVISION-1 (5 checks) | ✅ pipeline-start dependency, DAY 115 |
| ADR-028 RAG Ingestion Trust Model | ✅ FAISS anti-poisoning |
| ADR-024 Noise_IKpsk3, OQs 5..8 closed | ✅ Design complete, implementation post-PHASE 3 |
| provision.sh --reset (key rotation) | ⏳ DEBT-ADR025-D11, deadline 18 Apr 2026 |
| AppArmor profiles (6 components) | ⏳ PHASE 3 item 5, complain→enforce |
| ADR-032 Plugin Distribution Chain (HSM) | ⏳ APPROVED, YubiKey OpenPGP Ed25519 |
| ADR-033 TPM Measured Boot | ⏳ PROPOSED |
| Dynamic group key agreement (ADR-024 impl) | ⏳ Post-PHASE 3 |
| provision.sh reproducible (destroy→6/6) | ✅ DAY 108 |
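
To illustrate the "distinct tx/rx subkeys per channel" property: HKDF (RFC 5869) lets one seed yield independent keys by varying the `info` string. The sketch below implements HKDF-SHA256 with Python's standard `hmac` module rather than libsodium. The channel name, the `<channel>/tx` label format, the empty salt, and the seed are all hypothetical; the project's actual labels are not given in this README.

```python
import hashlib
import hmac
from typing import Tuple

HASH_LEN = hashlib.sha256().digest_size  # 32 bytes


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract: an empty salt is replaced by HASH_LEN zero bytes."""
    return hmac.new(salt or bytes(HASH_LEN), ikm, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF-Expand: chain HMAC blocks until `length` bytes are produced."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output too long for HKDF-SHA256")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_channel_keys(seed: bytes, channel: str) -> Tuple[bytes, bytes]:
    """Derive separate 32-byte tx and rx keys for one named channel."""
    prk = hkdf_extract(b"", seed)
    tx = hkdf_expand(prk, f"{channel}/tx".encode())  # hypothetical label
    rx = hkdf_expand(prk, f"{channel}/rx".encode())  # hypothetical label
    return tx, rx


seed = bytes(range(32))  # placeholder seed, for demonstration only
tx_key, rx_key = derive_channel_keys(seed, "sniffer->ml-detector")
print(len(tx_key), tx_key != rx_key)  # 32 True
```

Because each direction has its own key, a nonce used once in each direction never repeats under the same key, which matters for an AEAD such as ChaCha20-Poly1305.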

🗺️ Roadmap

✅ DONE: DAY 115 (12 Apr 2026)

  • ADR-024 OQ-5..8 closed: Noise_IKpsk3 design complete, implementation unblocked
  • PHASE 3 item 1: 6 systemd units (Restart=always, LD_PRELOAD=unset, build-active profiles)
  • PHASE 3 item 2: DEBT-SIGN-AUTO, provision.sh check-plugins (dev sign / prod verify-only)
  • PHASE 3 item 3: DEBT-HELLO-001, BUILD_DEV_PLUGINS=OFF + production JSONs cleaned (bug: 4 components had active:true)
  • PHASE 3 item 4: TEST-PROVISION-1 CI gate with 5 checks, pipeline-start dependency

✅ DONE: DAY 114 (11 Apr 2026)

  • ADR-025: Plugin Integrity Ed25519 + TOCTOU-safe dlopen, MERGED into main 🎉
  • Tag: v0.3.0-plugin-integrity
  • TEST-INTEG-4d: ml-detector PHASE 2d, 3/3 PASSED
  • DEBT-SIGNAL-001/002: async-signal-safe handlers + atomic
  • arXiv replacement v15 submitted
  • ADR-032: Plugin Distribution Chain (YubiKey HSM), APPROVED
  • PHASE 3 branch opened: feature/phase3-hardening

✅ DONE: DAY 111–113

  • FIX-C/D: D8-pre bidirectional + MAX_PLUGIN_PAYLOAD_SIZE
  • TEST-INTEG-4c/4e: 3/3 PASSED
  • PHASE 2d/2e: ml-detector + rag-security plugin integration
  • arXiv:2604.04952 [cs.CR] PUBLISHED 🎉

🔜 NEXT: PHASE 3 remaining (feature/phase3-hardening)

  • DEBT-ADR025-D11: provision.sh --reset (key rotation without auto-signing), deadline 18 Apr
  • TEST-PROVISION-1 checks 6+7: file permissions + JSON/plugin consistency
  • AppArmor profiles: 6 components, complain → audit → enforce

P3: Post-PHASE 3

  • ADR-024 Noise_IKpsk3 implementation (OQs closed, ready to build)
  • DEBT-TOOLS-001: synthetic injectors + PluginLoader integration
  • Stress test on CTU-13 Neris with the real pipeline (confirm F1=0.9985 with plugins active)
  • ADR-032 Phase A: manifest JSON format + multi-key loader + revocation
  • ADR-032 Phase B: YubiKey OpenPGP signing (hardware acquisition)
  • ADR-030 activation: AppArmor enforcing + Raspberry Pi hardware
  • etcd legacy refactoring (etcd = config distribution + heartbeat only)
  • ADR-031 spike: seL4/Genode (2–3 weeks)
  • Bare-metal stress test

🚀 Quick Start

git clone https://github.com/alonsoir/argus.git
cd argus
make up
make all
make pipeline-start
make pipeline-status

F1 Validation

make pipeline-stop && make logs-lab-clean && make pipeline-start && sleep 15
make test-replay-neris
python3 scripts/calculate_f1_neris.py logs/lab/sniffer.log --total-events 19135

CI Gate (PHASE 3)

make test-provision-1   # 5 checks: keys, plugin sigs, prod configs, symlinks, systemd units
make validate-prod-configs   # ensure no dev plugins in production JSON configs
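
The `validate-prod-configs` gate is described here only by its purpose: no development plugin may be active in a production JSON config. The config schema is not shown in this README, so the Python sketch below assumes a hypothetical layout (a top-level `plugins` list whose entries carry `name` and `active` fields) and a hypothetical set of dev-only plugin names. It shows the shape of such a check, not the project's actual script.

```python
import json
from typing import List

# Hypothetical dev-only plugin names; the real list is not given in the README.
DEV_PLUGINS = {"hello"}


def active_dev_plugins(config_text: str) -> List[str]:
    """Return the names of dev plugins marked active in one JSON config."""
    config = json.loads(config_text)
    return [
        plugin["name"]
        for plugin in config.get("plugins", [])
        if plugin.get("active") and plugin.get("name") in DEV_PLUGINS
    ]


# Assumed schema: {"plugins": [{"name": ..., "active": ...}, ...]}
bad = '{"plugins": [{"name": "hello", "active": true}, {"name": "x", "active": true}]}'
good = '{"plugins": [{"name": "hello", "active": false}, {"name": "x", "active": true}]}'

print(active_dev_plugins(bad))   # ['hello']
print(active_dev_plugins(good))  # []
```

A CI gate built on this idea would exit non-zero whenever the returned list is non-empty for any production config.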

🧠 Consejo de Sabios (Council of Wise Ones): Multi-Model Peer Review

Seven large language models serve as intellectual co-reviewers across all development phases:

Claude (Anthropic) · Grok (xAI) · ChatGPT (OpenAI) · DeepSeek · Qwen (Alibaba) · Gemini (Google) · Parallel.ai

Methodology: structured disagreement. Problems must be demonstrated with compilable tests or mathematics before fixes are proposed. Documented in §6 of the preprint (Consejo de Sabios / Test-Driven Hardening).


🗺️ Milestones

  • ✅ DAY 106: Paper Draft v11 + arXiv SUBMITTED
  • ✅ DAY 107: MAC failure root cause resolved
  • ✅ DAY 108: provision.sh reproducible · ADR-026/027 committed
  • ✅ DAY 109: PHASE 2b CLOSED · TEST-INTEG-4b · Paper v12 · ADR-028 APPROVED
  • ✅ DAY 110: PluginMode + PHASE 2c CLOSED · Paper v13 · 6/6 RUNNING
  • ✅ DAY 111: arXiv:2604.04952 PUBLISHED 🎉 · FIX-C/D · PHASE 2d · ADR-029
  • ✅ DAY 112: PHASE 2e CLOSED · TEST-INTEG-4e 3/3 · ADR-030/031 documented
  • ✅ DAY 113: ADR-025 IMPLEMENTED · 11/11 tests · Paper v14
  • ✅ DAY 114: ADR-025 MERGED, v0.3.0-plugin-integrity 🎉 · TEST-INTEG-4d · Signal safety · arXiv v15 · ADR-032 APPROVED
  • ✅ DAY 115: PHASE 3 items 1-4 DONE 🎉 · ADR-024 OQs 5..8 closed · TEST-PROVISION-1 CI gate · DEBT-HELLO-001 (bug: 4× active:true fixed)

📄 License

MIT License. See LICENSE.

Via Appia Quality 🏛️: built to last decades.

About

Distributed C++20 microservices architecture for real-time detection and correlation of DDoS and ransomware activity. Deterministic ingestion, idempotent replay, FAISS-based semantic indexing, and zero-coordination incident correlation via derived trace identifiers.
