fix: harden sanitize patterns for token leakage prevention #107
Closed
voidborne-d wants to merge 2 commits into EvoMap:main
Add detection for:
- Bare GitHub tokens (ghp_, gho_, ghu_, ghs_, github_pat_)
- AWS access keys (AKIA...)
- OpenAI project tokens (sk-proj-)
- Anthropic tokens (sk-ant-)
- npm tokens (npm_)
- Private keys (PEM format)
- Password fields (password=)
- Basic auth in URLs (user:pass@host)

Add test/sanitize.test.js with 30 assertions covering all new and existing patterns.

These patterns were previously undetected by the redaction layer, meaning capsules broadcast via A2A could leak credentials embedded in code snippets or logs.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Address review feedback:
- Use lookbehind/lookahead so basic-auth redaction keeps :// and @ (https://user:pass@host -> https://[REDACTED]@host)
- Fix test assertion count (30 -> 34)
- Add assertions verifying URL scheme preservation
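The lookbehind/lookahead approach mentioned in this commit can be sketched roughly as follows. This is a minimal illustration, not the pattern that was merged: the regex and the name redactBasicAuth are assumptions made for the example. The lookbehind anchors the match right after "://" and the lookahead requires a following "@", so neither delimiter is consumed or replaced.

```javascript
// Hypothetical pattern for illustration; not taken from src/gep/sanitize.js.
// (?<=:\/\/) asserts that "://" precedes the match without consuming it.
// (?=@)      asserts that "@" follows the match without consuming it.
const BASIC_AUTH = /(?<=:\/\/)[^\s:\/@]+:[^\s\/@]+(?=@)/g;

// Replace only the "user:pass" part, keeping the scheme and the "@host" part.
function redactBasicAuth(text) {
  return text.replace(BASIC_AUTH, "[REDACTED]");
}

console.log(redactBasicAuth("clone https://user:pass@host/repo.git"));
// clone https://[REDACTED]@host/repo.git

// A host:port URL has no "@" after the colon-separated pair, so it is left alone.
console.log(redactBasicAuth("https://example.com:8080/path"));
// https://example.com:8080/path
```

One limitation of this sketch: a password that itself contains "@" or "/" would only be partially matched, so a production pattern may need a different character class.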
dislovelhl (Collaborator) approved these changes on Feb 26, 2026

Merged into the main codebase. All 11 new redaction patterns and the test suite have been incorporated. Thank you for the contribution -- this significantly strengthens the pre-broadcast sanitization layer!
fmw666 pushed a commit that referenced this pull request on Mar 11, 2026

Add detection for GitHub tokens (ghp_, gho_, ghu_, ghs_, github_pat_), AWS access keys (AKIA), OpenAI/Anthropic tokens, npm tokens, PEM private keys, password fields, and basic auth in URLs. Add test/sanitize.test.js with 34 assertions.

Co-authored-by: voidborne-d <258577966+voidborne-d@users.noreply.github.com>
Made-with: Cursor
fmw666 pushed a commit that referenced this pull request on Mar 11, 2026

Changes since v1.19.1:
- feat(signals): multilingual signal extraction (ZH-CN/ZH-TW/EN/JA) with baseName:snippet format (PR #112, @shinjiyu)
- fix: harden sanitize patterns for token leakage prevention (PR #107, @voidborne-d)
- feat: activate fork lineage by setting parent on Gene/Capsule publish
- fix: validate reusedAssetId starts with sha256: before setting parent

Made-with: Cursor
Problem

The pre-broadcast sanitization layer (src/gep/sanitize.js) misses several common credential patterns. When capsules are published via A2A, any credentials embedded in code snippets or log context could leak to the hub.

Patterns NOT previously caught:
- Bare GitHub tokens (ghp_, gho_, ghu_, ghs_, github_pat_)
- AWS access keys (AKIA...)
- OpenAI project tokens (sk-proj-)
- Anthropic tokens (sk-ant-)
- npm tokens (npm_)
- Private keys (PEM format)
- password= fields
- Basic auth in URLs (user:pass@host)

Fix

Added 11 new regex patterns to REDACT_PATTERNS in sanitize.js.

Tests

Added test/sanitize.test.js with 30 assertions covering all new and existing patterns, including edge cases (null input, safe strings, deep nested objects).

All existing tests pass.
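To make the idea of a pattern table concrete, here is a minimal sketch of how such a list might be applied to a string. The token prefixes come from the description above; everything else (the name EXAMPLE_PATTERNS, the helper redactString, the character classes, and the minimum lengths) is an assumption for illustration and may differ from the actual REDACT_PATTERNS entries.

```javascript
const REDACTED = "[REDACTED]";

// Illustrative patterns only. Prefixes are from the PR description;
// character classes and length bounds are assumptions, not the merged regexes.
const EXAMPLE_PATTERNS = [
  /\bgh[pous]_[A-Za-z0-9]{20,}\b/g,         // ghp_, gho_, ghu_, ghs_
  /\bgithub_pat_[A-Za-z0-9_]{20,}\b/g,      // fine-grained GitHub tokens
  /\bAKIA[0-9A-Z]{16}\b/g,                  // AWS access key IDs
  /\bsk-(?:proj|ant)-[A-Za-z0-9_-]{20,}/g,  // OpenAI project / Anthropic keys
  /\bnpm_[A-Za-z0-9]{20,}\b/g,              // npm tokens
  /-----BEGIN [A-Z ]*PRIVATE KEY-----[\s\S]*?-----END [A-Z ]*PRIVATE KEY-----/g,
  /\bpassword\s*=\s*[^\s&;"']+/gi,          // password= fields (key and value)
];

// Apply every pattern in turn; each match is replaced by the placeholder.
function redactString(text) {
  return EXAMPLE_PATTERNS.reduce((out, re) => out.replace(re, REDACTED), text);
}

console.log(redactString("token=ghp_" + "a".repeat(36)));
// token=[REDACTED]
console.log(redactString("hello world"));
// hello world
```

The length bounds are a deliberate trade-off in this sketch: requiring a minimum run of token characters after the prefix reduces the chance of redacting ordinary identifiers such as a variable named npm_config, at the cost of missing truncated tokens.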
Note
Medium Risk
Touches pre-broadcast sanitization logic; overly broad regexes could redact legitimate content or miss edge cases, but changes are localized and now covered by targeted tests.
Overview
Hardens src/gep/sanitize.js by expanding REDACT_PATTERNS to redact additional credential formats (GitHub token prefixes, AWS access keys, OpenAI/Anthropic keys, npm tokens, PEM private keys, password= fields, and basic-auth URL credentials) before payloads are broadcast.

Adds test/sanitize.test.js to regression-test existing redactions and validate the new patterns (including nested object sanitization and non-string/null handling), and introduces an initial assets/gep/failed_capsules.json plus a minimal package-lock.json.

Written by Cursor Bugbot for commit 34b9f76. This will update automatically on new commits.
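The nested-object and non-string/null handling mentioned in the overview could look something like the sketch below. The function name sanitizeDeep and the single stand-in pattern are hypothetical; the real module may structure its traversal differently.

```javascript
// Hypothetical helper; one stand-in pattern keeps the sketch short.
const SECRET = /\bAKIA[0-9A-Z]{16}\b/g;

// Walk strings, arrays, and plain objects; return a sanitized copy.
function sanitizeDeep(value) {
  if (typeof value === "string") {
    return value.replace(SECRET, "[REDACTED]");
  }
  if (Array.isArray(value)) {
    return value.map(sanitizeDeep);
  }
  if (value !== null && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value).map(([key, inner]) => [key, sanitizeDeep(inner)])
    );
  }
  // null, undefined, numbers, and booleans pass through unchanged.
  return value;
}

const capsule = { log: { lines: ["key AKIAABCDEFGHIJKLMNOP", 42] }, note: null };
console.log(JSON.stringify(sanitizeDeep(capsule)));
// {"log":{"lines":["key [REDACTED]",42]},"note":null}
```

Returning a copy rather than mutating the input is a design choice in this sketch; it leaves the caller's original object untouched in case it is still needed locally.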