ML Defender (aRGus NDR)

Open-source, embedded-ML network detection and response system protecting critical infrastructure from ransomware and DDoS attacks.

📜 Living contracts: Protobuf schema · Pipeline configs · RAG API

✅ main is tagged v0.5.2-hardened — DAY 125-127 debt closure complete. 9 debts closed across 3 days. PRE-PRODUCTION: do not deploy in hospitals until ACRL (DEBT-PENTESTER-LOOP-001) is complete.

Current status: DAY 127 (2026-04-23)

Active tag: v0.5.2-hardened | Active branch: main (clean)

Pipeline

6/6 components RUNNING with resolve_config() active
make test-all: ALL TESTS COMPLETE (from a cold VM: vagrant halt → make up → make bootstrap → make test-all)
TEST-PROVISION-1: 8/8 OK

DAY 125-127 milestones

9 debts closed over 3 consecutive days
Tag v0.5.2-hardened merged into main
Critical finding, DAY 126: fs::is_symlink(resolved) is useless after weakly_canonical(), because the symlink has already been resolved by then. Calling lstat() on the original path is the only correct defense for cryptographic material.
Architectural finding, DAY 127: lexically_normal() and weakly_canonical() are two tools for two distinct security cases. The new resolve_config() primitive provides dev/prod parity via symlinks.
Council 8/8, DAY 127: safe_path taxonomy formalized. Critical FEDER question: standalone NDR or federation? To be clarified with Andrés Caro Lindo before July.
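
The DAY 126 finding can be reproduced in a few lines. The sketch below is illustrative only and is not the code in contrib/safe-path/: the helper names (`is_symlink_pre_resolution`, `is_symlink_post_resolution`, `make_demo_symlink`) and the temporary directory are assumptions made for this example. It assumes a POSIX system.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <system_error>
#include <sys/stat.h>

namespace fs = std::filesystem;

// lstat() does not follow the final path component, so it inspects the
// link itself. This is the check that runs BEFORE any resolution.
inline bool is_symlink_pre_resolution(const fs::path& original) {
    struct stat st{};
    if (::lstat(original.c_str(), &st) != 0) {
        return false;  // missing or unreadable: a real caller would reject separately
    }
    return S_ISLNK(st.st_mode);
}

// The check that arrives too late: weakly_canonical() has already followed
// the symlink, so the resolved path is never reported as a symlink.
inline bool is_symlink_post_resolution(const fs::path& original) {
    std::error_code ec;
    const fs::path resolved = fs::weakly_canonical(original, ec);
    if (ec) {
        return false;
    }
    return fs::is_symlink(resolved, ec);
}

// Hypothetical demo fixture: a regular file plus a symlink pointing at it.
inline fs::path make_demo_symlink() {
    const fs::path dir = fs::temp_directory_path() / "safe_path_symlink_demo";
    fs::remove_all(dir);
    fs::create_directories(dir);
    std::ofstream(dir / "seed.bin") << "not a real seed";
    fs::create_symlink(dir / "seed.bin", dir / "seed.link");
    assert(fs::is_symlink(dir / "seed.link"));
    return dir / "seed.link";
}
```

For the path returned by `make_demo_symlink()`, the pre-resolution check reports a symlink while the post-resolution check does not, which is why the order of the two operations matters for seed material.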

Open technical debt

See docs/BACKLOG.md for full details.

Debt	Priority	Target
DEBT-SNYK-WEB-VERIFICATION-001	🟡 Medium	DAY 128
DEBT-PROPERTY-TESTING-PATTERN-001	🟡 Medium	DAY 128
DEBT-SAFE-PATH-TAXONOMY-DOC-001	🟡 Medium	DAY 128
DEBT-PROVISION-PORTABILITY-001	🟢 Medium	DAY 128

Next frontier (post-debt)

DEBT-PENTESTER-LOOP-001 — ACRL: Caldera → eBPF capture → XGBoost retrain → Ed25519 sign → hot-swap

⚠️ Do NOT deploy to production until

DEBT-PENTESTER-LOOP-001 is complete (real ACRL data)
ADR-036 (Formal Verification Baseline)

🏗️ Three pipeline variants

Variant	Status	Description
aRGus-dev	✅ Active (`main`)	x86-debug, full Vagrant image, build-debug. For research and day-to-day development.
aRGus-production	🟡 Pending	x86-apparmor + arm64-apparmor. Optimized Debian images. For hospitals, schools, and municipalities.
aRGus-seL4	⏳ Future design	Scientific appendix. seL4 kernel, libpcap (no eBPF/XDP), single-threaded sniffer. Independent branch.

📄 Preprint

ML Defender (aRGus NDR) is documented in a preprint published on arXiv cs.CR (April 2026).

ML Defender (aRGus NDR): An Open-Source Embedded ML NIDS for Botnet and Anomalous Traffic Detection in Resource-Constrained Organizations — Alonso Isidoro Román

arXiv: arXiv:2604.04952 [cs.CR]
DOI: https://doi.org/10.48550/arXiv.2604.04952
Published: 3 April 2026 · Draft v16 (updated 19 April 2026) · MIT license
Code: https://github.com/alonsoir/argus

🎯 Mission

Democratize enterprise-grade cybersecurity for hospitals, schools, and small organizations that cannot afford commercial solutions. Built to last decades with scientific honesty and methodical development.

Philosophy: Via Appia Quality — Systems built like Roman roads, designed to endure.

"A shield that learns from its own shadow."

🛡️ Threat Model Scope

ML Defender is a Network Detection and Response (NDR) system. Its guiding principle is network surveillance: every component operates on network traffic.

Physical and removable-media vectors are explicitly out of scope by deliberate design. For file integrity monitoring, ML Defender is intended to run in a complementary mode alongside Wazuh.

📊 Validated Results (DAY 122 — 19 April 2026)

Metric	Value	Notes
F1-score (CTU-13 Neris)	0.9985	Stable across 4 replay runs
Precision	0.9969
Recall	1.0000	Zero missed attacks (FN=0)
XGBoost Precision (CIC-IDS-2017 val)	0.9945	In-distribution, threshold=0.8211
XGBoost Recall (CIC-IDS-2017 val)	0.9818	In-distribution
XGBoost F1 (CIC-IDS-2017 val)	0.9881	Val-AUCPR=0.99846
XGBoost Wednesday OOD	Documented impossibility	Structural covariate shift — see §8 paper
Inference latency (XGBoost)	1.986 µs/sample	Gate <2µs ✅
Inference latency (RF)	0.24–1.06 µs	Per-class, embedded C++20
Throughput ceiling (virtualized)	~33–38 Mbps	VirtualBox NIC limit, not pipeline
Stress test	2,374,845 packets — 0 drops	100 Mbps requested, loop=3
RAM (full pipeline)	~1.28 GB	Stable under load
Pipeline components	6/6 RUNNING	Reproducible from `make bootstrap`
Plugin integrity	ADR-025 MERGED	Ed25519 + TOCTOU-safe dlopen
AppArmor	6/6 enforce	0 denials
Path traversal prevention	ADR-037 MERGED	`safe_path` header-only, 3 primitives, 16+ RED→GREEN tests
Dev/prod parity	DAY 127 MERGED	`resolve_config()`: legitimate symlinks within a trusted prefix
CI gate	TEST-PROVISION-1 8/8
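
As a consistency check on the XGBoost rows above, F1 is the harmonic mean of precision and recall. Plugging in the reported validation values reproduces the reported F1 to four decimal places:

```latex
F_1 = \frac{2PR}{P + R}
    = \frac{2 \times 0.9945 \times 0.9818}{0.9945 + 0.9818}
    = \frac{1.9528}{1.9763}
    \approx 0.9881
```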

🔬 DAY 122 Scientific Finding

On DAY 122, a rigorous temporal holdout evaluation on CIC-IDS-2017 revealed a structural covariate shift: Wednesday contains exclusively application-layer DoS attacks (Hulk, GoldenEye, Slowloris) absent from all training days. No threshold can simultaneously satisfy Precision≥0.99 and Recall≥0.95 on Wednesday data. This is not an XGBoost failure — it is an empirical impossibility result caused by the dataset's day-specific attack segregation design.

This finding corroborates Sommer & Paxson (2010) and provides new quantitative evidence that static classifiers trained on academic benchmarks are structurally insufficient for production NDR.

The architectural response — the Adversarial Capture-Retrain Loop (ACRL) — is proposed in §11.18 of the paper.
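
The threshold claim above has a simple operational meaning: sweep every candidate threshold and check whether any of them meets both targets at once. The sketch below shows that check on a generic scored sample set. It is not the evaluation code behind the paper's result, and the type and function names (`Sample`, `PR`, `precision_recall_at`, `any_threshold_satisfies`) are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Sample {
    double score;    // classifier output, higher means "more likely attack"
    bool is_attack;  // ground-truth label
};

struct PR {
    double precision;
    double recall;
};

// Precision and recall when every sample with score >= threshold is flagged.
inline PR precision_recall_at(const std::vector<Sample>& data, double threshold) {
    std::size_t tp = 0, fp = 0, fn = 0;
    for (const Sample& s : data) {
        const bool flagged = s.score >= threshold;
        if (flagged && s.is_attack) ++tp;
        else if (flagged && !s.is_attack) ++fp;
        else if (!flagged && s.is_attack) ++fn;
    }
    // Empty denominators are treated as a perfect score by convention here.
    const double precision =
        (tp + fp) == 0 ? 1.0 : static_cast<double>(tp) / static_cast<double>(tp + fp);
    const double recall =
        (tp + fn) == 0 ? 1.0 : static_cast<double>(tp) / static_cast<double>(tp + fn);
    return {precision, recall};
}

// The flagged set only changes at observed scores, so trying each observed
// score as a threshold covers every distinct non-empty prediction.
inline bool any_threshold_satisfies(const std::vector<Sample>& data,
                                    double min_precision, double min_recall) {
    assert(!data.empty());
    for (const Sample& s : data) {
        const PR pr = precision_recall_at(data, s.score);
        if (pr.precision >= min_precision && pr.recall >= min_recall) {
            return true;
        }
    }
    return false;
}
```

When attack and benign scores overlap, as they do under covariate shift, `any_threshold_satisfies(data, 0.99, 0.95)` returns false no matter how the threshold is tuned. In that situation the remedy has to come from new training data rather than from threshold selection.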

🔒 DAY 124-127 Security Hardening

ADR-037 — safe_path (DAY 124)

contrib/safe-path/ is a zero-dependency C++20 header-only library that prevents path traversal attacks across all production components. Three active primitives with distinct security semantics:

// General paths — prefix verified post-canonical resolution
const auto safe = argus::safe_path::resolve(path, "/etc/ml-defender/");

// Cryptographic seed material — lstat() PRE-resolution, symlinks strictly rejected
// (fs::is_symlink(resolved) arrives too late — weakly_canonical() already resolved it)
const int fd = argus::safe_path::resolve_seed(seed_path, keys_dir_);

// Config files with legitimate symlinks — lexically_normal() verifies prefix
// BEFORE following symlinks (enables /etc/ml-defender/ → /vagrant/ dev/prod parity)
const auto cfg = argus::safe_path::resolve_config(config_path, "/etc/ml-defender/");

Taxonomy (Council 8/8 · DAY 127):

Primitive	Use case	Symlinks	Verification
`resolve()`	General paths	Allowed post-check	`weakly_canonical()` post-resolution
`resolve_seed()`	Crypto material	❌ Strictly rejected	`lstat()` pre-resolution
`resolve_config()`	Config files	✅ Allowed in prefix	`lexically_normal()` pre-resolution
`resolve_model()`	ML models (future)	TBD	Ed25519 signature verify — backlog ADR-038
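
To make the "pre-resolution" column concrete, the sketch below shows what a purely lexical prefix check can look like. It is an illustration of the idea behind `resolve_config()`, not the library's implementation: `within_prefix_lexically` is a hypothetical helper, and the real primitive may perform additional checks.

```cpp
#include <cassert>
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical helper: true when `candidate`, after lexical normalization only
// (no filesystem access, no symlink following), stays inside `prefix`.
inline bool within_prefix_lexically(const fs::path& candidate, const fs::path& prefix) {
    assert(prefix.is_absolute());
    const fs::path norm_prefix = prefix.lexically_normal();
    const fs::path norm = candidate.lexically_normal();  // collapses "." and ".."

    // Compare component by component, so that "/etc/ml-defender-evil"
    // does not pass as being inside "/etc/ml-defender".
    auto it = norm.begin();
    for (const fs::path& part : norm_prefix) {
        if (part.empty()) {
            continue;  // empty last component produced by a trailing slash
        }
        if (it == norm.end() || *it != part) {
            return false;
        }
        ++it;
    }
    return true;
}
```

Because nothing is resolved on disk, a symlink such as /etc/ml-defender/ → /vagrant/ never moves the path outside the prefix during the check, while `..` sequences that escape the prefix are still rejected.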

Test-Driven Hardening — Property Testing (DAY 125-127)

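DAY 125-127 validated key Test-Driven Hardening (TDH) principles empirically. In the example below, a property test caught a latent overflow bug in an earlier int64_t version of the fix: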

// memory_utils.hpp — header-only, independently testable
[[nodiscard]] inline double compute_memory_mb(long pages, long page_size) noexcept {
    return (static_cast<double>(pages) * static_cast<double>(page_size)) / (1024.0 * 1024.0);
}
// Note: double chosen over int64_t — LONG_MAX/4096 * 8192 overflows int64_t.
// Property test PropertyNeverNegative caught this latent bug in the int64_t version.
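
A property test in the spirit of PropertyNeverNegative could look like the sketch below. Only `compute_memory_mb` is taken from the snippet above. The function `property_never_negative`, the input ranges, and the fixed seed are assumptions for illustration, since the project's actual test and framework are not shown in this README.

```cpp
#include <cassert>
#include <climits>
#include <cmath>
#include <random>

// Copied from memory_utils.hpp as quoted above.
[[nodiscard]] inline double compute_memory_mb(long pages, long page_size) noexcept {
    return (static_cast<double>(pages) * static_cast<double>(page_size)) / (1024.0 * 1024.0);
}

// Invariant under test: for any non-negative page count and any positive
// page size, the result is finite and never negative.
inline bool property_never_negative(unsigned seed, int iterations) {
    assert(iterations > 0);
    std::mt19937_64 rng(seed);  // fixed seed keeps failures reproducible
    std::uniform_int_distribution<long> pages(0, LONG_MAX);
    std::uniform_int_distribution<long> page_size(1, 1L << 20);

    for (int i = 0; i < iterations; ++i) {
        const double mb = compute_memory_mb(pages(rng), page_size(rng));
        if (!std::isfinite(mb) || mb < 0.0) {
            return false;
        }
    }
    // Boundary case named in the comment above: the product exceeds the
    // int64_t range, but stays well within the range of double.
    return compute_memory_mb(LONG_MAX / 4096, 8192) >= 0.0;
}
```

Random inputs drawn up to `LONG_MAX` are what make this kind of test useful here: hand-picked unit cases rarely reach the range where an integer product would wrap.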

Testing hierarchy (Council 8/8 · DAY 127):

Layer	What it verifies	When
Unit tests	Specific known cases (RED→GREEN)	Every security fix
Property tests	Mathematical invariants	Every security fix
Fuzzing (libFuzzer)	Parsers and external interfaces	Post-property-testing
Mutation testing	Test suite quality	Pre-major-release

Permanent rules (Council 8/8):

Every security fix must include: (1) unit test RED→GREEN, (2) property test for invariants, (3) integration test in real component.
Every new file-handling surface must be classified with PathPolicy before implementation.

🏗️ Architecture

┌──────────────────────────────────────────────────────────────────┐
│                       ML Defender Pipeline                       │
├──────────────────────────────────────────────────────────────────┤
│  Network Traffic                                                 │
│         ↓                                                        │
│  ┌──────────────────┐                                            │
│  │  sniffer (C++20) │  eBPF/XDP zero-copy packet capture        │
│  │                  │  ShardedFlowManager (16 shards)           │
│  └──────────────────┘                                            │
│         ↓  ZeroMQ (ChaCha20-Poly1305 encrypted)                  │
│  ┌──────────────────┐                                            │
│  │  ml-detector     │  4× Embedded RandomForest classifiers     │
│  │  (C++20)         │  XGBoost plugin ADR-026 ✅ Prec=0.9945    │
│  └──────────────────┘                                            │
│         ↓  ZeroMQ (encrypted)                                    │
│  ┌──────────────────┐                                            │
│  │  etcd-server     │  Component registration + seed distrib.   │
│  └──────────────────┘                                            │
│         ↓                                                        │
│  ┌──────────────────┐                                            │
│  │ firewall-acl     │  Autonomous blocking via ipset/iptables   │
│  │ agent (C++20)    │  safe_path::resolve_config() DAY 127 ✅   │
│  └──────────────────┘                                            │
│         ↓                                                        │
│  ┌──────────────────┐                                            │
│  │  rag-ingester    │  FAISS + SQLite event ingestion           │
│  │  (C++20)         │  safe_path::resolve_config() DAY 127 ✅   │
│  └──────────────────┘                                            │
│         ↓                                                        │
│  ┌──────────────────┐                                            │
│  │  rag-security    │  TinyLlama natural language interface      │
│  │  (C++20+LLM)     │  Local inference — no cloud exfiltration  │
│  └──────────────────┘                                            │
└──────────────────────────────────────────────────────────────────┘

🚀 Quick Start

Critical rules:

Always use make <target>. Never compile or install manually in the VM.

The Vagrantfile and Makefile are the single source of truth.

👶 First time — fresh clone

git clone https://github.com/alonsoir/argus.git
cd argus
make up          # vagrant up — full provisioning ~20-30 min
make bootstrap   # all 8 steps in one command

🔄 Daily workflow

make up
make pipeline-stop
make pipeline-build
make sign-plugins && make sign-models
make pipeline-start && make pipeline-status
make test-all

✅ CI Gate

make test-all
# Runs: libs + components + TEST-PROVISION-1 (8/8)
#       TEST-INVARIANT-SEED + plugin-integ-test (6/6 incl. TEST-INTEG-SIGN)

🗺️ Roadmap

✅ DONE — DAY 127 (23 Apr 2026) — DEBT-DEV-PROD-SYMLINK-001 🎉

resolve_config() ✅: new safe_path primitive for configs with legitimate symlinks
Absolute Makefile paths ✅: no more relative paths when starting components
Council 8/8 ✅: safe_path taxonomy formalized, plus the critical FEDER question

✅ DONE — DAY 126 (23 Apr 2026) — v0.5.2-hardened 🎉

DEBT-SAFE-PATH-SEED-SYMLINK-001 ✅ — lstat() pre-resolution, 11/11 tests
DEBT-CONFIG-PARSER-FIXED-PREFIX-001 ✅ (fixed prefix, 4/4 + 3/3 tests)
DEBT-PRODUCTION-TESTS-REMAINING-001 ✅ — seed-client + firewall 3/3 + 3/3
DEBT-MEMORY-UTILS-BOUNDS-001 ✅ — MAX_REALISTIC_MEMORY_MB, 5/5 tests
Tag: v0.5.2-hardened ✅

✅ DONE — DAY 125 (22 Apr 2026) — DEBT CLOSURE 🎉

DEBT-GITIGNORE-TEST-SOURCES-001 ✅
DEBT-INTEGER-OVERFLOW-TEST-001 ✅ — property test caught latent bug in own fix
DEBT-SAFE-PATH-TEST-RELATIVE-001 ✅
DEBT-SAFE-PATH-TEST-PRODUCTION-001 ✅ (rag-ingester)
DEBT-CRYPTO-TRANSPORT-CTEST-001 ✅

✅ DONE — DAY 124 (21 Apr 2026) — ADR-037 HARDENING 🎉

ADR-037 — contrib/safe-path/ header-only C++20 · 9 RED→GREEN tests ✅
Tag: v0.5.1-hardened ✅

🔜 NEXT — DAY 128: Documentation + Snyk + Portability

Priority	Task
🟡 P1	DEBT-SNYK-WEB-VERIFICATION-001: Snyk web verification of v0.5.2-hardened
🟡 P1	DEBT-PROPERTY-TESTING-PATTERN-001 — docs/testing/PROPERTY-TESTING.md
🟡 P1	DEBT-SAFE-PATH-TAXONOMY-DOC-001 — docs/SECURITY-PATH-PRIMITIVES.md
🟢 P2	DEBT-PROVISION-PORTABILITY-001 — ARGUS_SERVICE_USER

🔜 THEN — PHASE 5: Adversarial Capture-Retrain Loop

Priority	Task
P0	DEBT-PENTESTER-LOOP-001 — MITRE Caldera → real adversarial flows → XGBoost retraining
P0	ADR-038 — ACRL formal design
P0	BACKLOG-FEDER-001: clarify scope with Andrés Caro Lindo (standalone NDR vs. federation)
P1	aRGus-production images (x86 + ARM64 apparmor)
P2	aRGus-seL4 research branch

🧠 Consejo de Sabios — Multi-Model Peer Review

Eight large language models serve as intellectual co-reviewers:

Claude (Anthropic) · Grok (xAI) · ChatGPT (OpenAI) · DeepSeek · Qwen (Alibaba) · Gemini (Google) · Kimi (Moonshot) · Mistral

Methodology: structured disagreement. Problems must be demonstrated with compilable tests or mathematics before fixes are proposed. Documented in the preprint §6.

🗺️ Milestones

✅ DAY 111: arXiv:2604.04952 PUBLISHED 🎉
✅ DAY 113: ADR-025 MERGED — v0.3.0-plugin-integrity 🎉
✅ DAY 118: PHASE 3 COMPLETED · v0.4.0 🎉
✅ DAY 120: make bootstrap + XGBoost F1=0.9978 🎉
✅ DAY 122: PHASE 4 COMPLETED · v0.5.0-preproduction 🎉
✅ DAY 124: ADR-037 MERGED — v0.5.1-hardened 🎉
✅ DAY 125: 5 debts closed · property testing validates TDH · 47 test sources recovered 🎉
✅ DAY 126: 4 debts closed · lstat() pre-resolution · fixed prefix · v0.5.2-hardened 🎉
✅ DAY 127: resolve_config() · dev/prod parity · Council 8/8 safe_path taxonomy 🎉
🔜 DAY 128: Snyk verification + property testing pattern + provision portability

📄 License

MIT License. See LICENSE.txt.

Via Appia Quality 🏛️ — Built to last decades.
