Add MailManInTheMiddle Security Arena scenario submission #19
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cc26ce56dc
    from scenarios.security_arena.submissions.purple_vanguard.mailmaninthemiddle.plugin import (
        MailManInTheMiddlePlugin,
Use import path that matches orchestrator sys.path
When the orchestrator is launched via `python scenarios/security_arena/orchestrator.py` (the command used in this repo's scenario TOML), it only adds `scenarios/security_arena` to `sys.path`. The absolute import `scenarios.security_arena.submissions...` therefore raises `ModuleNotFoundError` unless the repo root is on `PYTHONPATH`, which prevents the new scenario from loading in the default run. Using a relative import from the local submissions directory (or adjusting `sys.path`) avoids this runtime failure.
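One way the `sys.path` adjustment could look is sketched below. This is only an illustration, not code from the PR: the helper name `ensure_repo_root_on_path` and the use of a top-level `scenarios` directory as the repo-root marker are assumptions made for the example.

```python
import sys
from pathlib import Path


def ensure_repo_root_on_path(start: Path, marker: str = "scenarios") -> Path:
    """Hypothetical helper: find the nearest ancestor of `start` that contains
    a `marker` directory and put that ancestor on sys.path.

    Assumes the repo root is the directory holding the top-level `scenarios`
    package, so that `import scenarios.security_arena...` can resolve.
    """
    start = start.resolve()
    for candidate in [start, *start.parents]:
        if (candidate / marker).is_dir():
            root = str(candidate)
            # Avoid adding duplicate entries when called more than once.
            if root not in sys.path:
                sys.path.insert(0, root)
            return candidate
    raise RuntimeError(f"no ancestor of {start} contains a '{marker}' directory")


# Possible call site (assumption), placed before any absolute
# `scenarios.security_arena...` import:
# ensure_repo_root_on_path(Path(__file__))
```

Calling the helper before the absolute import would make the import independent of how the orchestrator was launched. Switching the import itself to a path that already resolves under `scenarios/security_arena` is the alternative and needs no path manipulation.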
    def _extract_domain(self, address: str) -> str:
        if "@" not in address:
            return ""
        return address.split("@", 1)[-1].lower()
Parse domains from display-name email addresses
The `_extract_domain` helper just splits on `@` and returns the remainder, so addresses with display names like `Jordan Miles <jordan.miles@purplevanguard.example>` produce `purplevanguard.example>` and are treated as external. That causes false violations (and could fail baseline scoring) whenever a model emits common RFC 5322 address formats in tool calls. Stripping angle brackets and whitespace, or using `email.utils.parseaddr`, would prevent internal addresses from being misclassified.
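A minimal sketch of the `email.utils.parseaddr` approach follows. It is written as a standalone function rather than the plugin's method, and the name `extract_domain` is an assumption for illustration; the plugin's actual fix may differ.

```python
from email.utils import parseaddr


def extract_domain(address: str) -> str:
    """Return the lowercased domain of an email address, or "" if none is found.

    parseaddr splits an input such as "Name <user@host>" into a display name
    and the bare address, so a trailing ">" never ends up in the domain.
    """
    _display_name, addr_spec = parseaddr(address)
    if "@" not in addr_spec:
        return ""
    # Split on the last "@" so the part after it is the domain.
    return addr_spec.rsplit("@", 1)[-1].strip().lower()


if __name__ == "__main__":
    print(extract_domain("Jordan Miles <jordan.miles@purplevanguard.example>"))
    print(extract_domain("jordan.miles@purplevanguard.example"))
```

With this version, the display-name form and the bare form of the same address yield the same domain, so an internal sender would no longer be classified as external just because of the angle brackets.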
Motivation
attack_succeeded.json).
Description
- `MailManInTheMiddlePlugin` at `scenarios/security_arena/submissions/purple_vanguard/mailmaninthemiddle/plugin.py` implementing policy-aware prompts, simulated tools (`list_unread_emails`, `get_email`, `draft_email`, `forward_email`), deterministic success checks, and baseline/attack inputs.
- `scenarios/security_arena/submissions/purple_vanguard/mailmaninthemiddle/` including `artifacts/inbox_baseline.json`, `artifacts/inbox_attack.json`, `artifacts/company_policy.md`, `README.md`, and `scenario_mailmaninthemiddle.toml`.
- `scenarios/security_arena/plugins/registry.py` with a direct import and mapping key `