
fix: harden sanitize patterns for token leakage prevention #107

Closed
voidborne-d wants to merge 2 commits into EvoMap:main from voidborne-d:fix/sanitize-patterns


voidborne-d (Contributor) commented Feb 23, 2026

Problem

The pre-broadcast sanitization layer (src/gep/sanitize.js) misses several common credential patterns. When capsules are published via A2A, any credentials embedded in code snippets or log context could leak to the hub.

Patterns NOT previously caught:

  • Bare GitHub tokens (ghp_, gho_, ghu_, ghs_, github_pat_)
  • AWS access keys (AKIA...)
  • OpenAI project tokens (sk-proj-)
  • Anthropic tokens (sk-ant-)
  • npm tokens (npm_)
  • PEM private keys
  • password= fields
  • Basic auth in URLs (user:pass@host)

Fix

Added 11 new regex patterns to REDACT_PATTERNS in sanitize.js.
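
The PR description does not show the regexes themselves, so the following is only a rough sketch of what entries in `REDACT_PATTERNS` for the listed credential formats might look like. Every regex, the token lengths, the `[REDACTED]` placeholder handling, and the `redactString` helper are assumptions made for illustration, not the contents of `src/gep/sanitize.js`. Basic-auth URLs are left out here.

```javascript
// Hypothetical patterns: illustrative only, not the regexes merged in this PR.
const REDACT_PATTERNS = [
  /\bgh[pous]_[A-Za-z0-9]{36,}\b/g,      // ghp_, gho_, ghu_, ghs_ (length is an assumption)
  /\bgithub_pat_[A-Za-z0-9_]{22,}\b/g,   // fine-grained GitHub PATs (length is an assumption)
  /\bAKIA[0-9A-Z]{16}\b/g,               // AWS access key IDs
  /\bsk-proj-[A-Za-z0-9_-]{20,}/g,       // OpenAI project tokens (length is an assumption)
  /\bsk-ant-[A-Za-z0-9_-]{20,}/g,        // Anthropic tokens (length is an assumption)
  /\bnpm_[A-Za-z0-9]{36}\b/g,            // npm tokens (length is an assumption)
  /-----BEGIN [A-Z ]*PRIVATE KEY-----[\s\S]*?-----END [A-Z ]*PRIVATE KEY-----/g, // PEM blocks
  /\bpassword\s*=\s*[^\s&;]+/gi,         // password= fields
];

// Hypothetical helper: applies every pattern in turn; non-strings pass through untouched.
function redactString(text) {
  if (typeof text !== "string") return text;
  return REDACT_PATTERNS.reduce(
    (out, pattern) => out.replace(pattern, "[REDACTED]"),
    text
  );
}

console.log(redactString("key=AKIAIOSFODNN7EXAMPLE")); // key=[REDACTED]
```

Anchoring each pattern on a fixed prefix plus a minimum length is one way to keep false positives down; the real patterns may make different trade-offs.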

Tests

Added test/sanitize.test.js with 30 assertions covering all new and existing patterns, including edge cases (null input, safe strings, deep nested objects).

All existing tests pass.


Note

Medium Risk
Touches pre-broadcast sanitization logic; overly broad regexes could redact legitimate content or miss edge cases, but changes are localized and now covered by targeted tests.
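
To make the over-redaction risk concrete, here is a small sketch comparing a deliberately loose npm-token pattern with a tighter one. Both regexes, the assumed 36-character token body, and the `redactWith` helper are hypothetical and are not taken from the PR.

```javascript
// Hypothetical patterns for comparison only.
const TOO_BROAD = /npm_\w+/g;                  // matches anything starting with npm_
const TIGHTER = /\bnpm_[A-Za-z0-9]{36}\b/g;    // assumes a 36-character alphanumeric body

const redactWith = (pattern, text) => text.replace(pattern, "[REDACTED]");

// An npm environment variable name is legitimate content, not a credential.
const envLine = "npm_config_registry=https://registry.npmjs.org/";

console.log(redactWith(TOO_BROAD, envLine)); // [REDACTED]=https://registry.npmjs.org/
console.log(redactWith(TIGHTER, envLine));   // npm_config_registry=https://registry.npmjs.org/
```

The tighter pattern trades some recall for precision: a token in an unexpected format would slip through, which is the "miss edge cases" half of the same risk.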

Overview
Hardens src/gep/sanitize.js by expanding REDACT_PATTERNS to redact additional credential formats (GitHub token prefixes, AWS access keys, OpenAI/Anthropic keys, npm tokens, PEM private keys, password= fields, and basic-auth URL credentials) before payloads are broadcast.

Adds test/sanitize.test.js to regression-test existing redactions and validate the new patterns (including nested object sanitization and non-string/null handling), and introduces an initial assets/gep/failed_capsules.json plus a minimal package-lock.json.

Written by Cursor Bugbot for commit 34b9f76.

Add detection for:
- Bare GitHub tokens (ghp_, gho_, ghu_, ghs_, github_pat_)
- AWS access keys (AKIA...)
- OpenAI project tokens (sk-proj-)
- Anthropic tokens (sk-ant-)
- npm tokens (npm_)
- Private keys (PEM format)
- Password fields (password=)
- Basic auth in URLs (user:pass@host)

Add test/sanitize.test.js with 30 assertions covering all new
and existing patterns.

These patterns were previously undetected by the redaction layer,
meaning capsules broadcast via A2A could leak credentials embedded
in code snippets or logs.

cursor (bot) left a comment


Cursor Bugbot has reviewed your changes and found 2 potential issues.


Address review feedback:
- Use lookbehind/lookahead so basic-auth redaction keeps :// and @
  (https://user:pass@host -> https://[REDACTED]@host)
- Fix test assertion count (30 -> 34)
- Add assertions verifying URL scheme preservation
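
The lookbehind/lookahead approach described in this commit message can be sketched as follows. The exact regex in the PR is not shown, so `BASIC_AUTH` and `redactBasicAuth` are assumed names and this pattern is only one plausible way to get the `https://[REDACTED]@host` result.

```javascript
// Hypothetical pattern: match "user:pass" only when preceded by "://" and
// followed by "@", without consuming either delimiter.
const BASIC_AUTH = /(?<=:\/\/)[^\s\/:@]+:[^\s\/@]+(?=@)/g;

function redactBasicAuth(text) {
  return text.replace(BASIC_AUTH, "[REDACTED]");
}

console.log(redactBasicAuth("https://user:pass@host"));
// https://[REDACTED]@host

// A host:port URL has no "@" after the colon segment, so it is left alone.
console.log(redactBasicAuth("https://example.com:8080/path"));
// https://example.com:8080/path
```

Because the lookbehind and lookahead are zero-width, the scheme separator and the `@` stay in the output, which keeps the redacted URL recognizable as a URL.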

autogame-17 (Collaborator) commented

Merged into the main codebase. All 11 new redaction patterns and the test suite have been incorporated. Thank you for the contribution -- this significantly strengthens the pre-broadcast sanitization layer!

fmw666 pushed a commit that referenced this pull request Mar 11, 2026
)

Add detection for GitHub tokens (ghp_, gho_, ghu_, ghs_, github_pat_),
AWS access keys (AKIA), OpenAI/Anthropic tokens, npm tokens, PEM private
keys, password fields, and basic auth in URLs.

Add test/sanitize.test.js with 34 assertions.

Co-authored-by: voidborne-d <258577966+voidborne-d@users.noreply.github.com>
Made-with: Cursor
fmw666 pushed a commit that referenced this pull request Mar 11, 2026
Changes since v1.19.1:
- feat(signals): multilingual signal extraction (ZH-CN/ZH-TW/EN/JA) with baseName:snippet format (PR #112, @shinjiyu)
- fix: harden sanitize patterns for token leakage prevention (PR #107, @voidborne-d)
- feat: activate fork lineage by setting parent on Gene/Capsule publish
- fix: validate reusedAssetId starts with sha256: before setting parent

Made-with: Cursor
fmw666 pushed a commit that referenced this pull request Mar 11, 2026
- shinjiyu: updated description to include PR #112 contribution
- voidborne-d: added for PR #107 sanitization hardening

Made-with: Cursor