feat(signals): multilingual signal extraction + feature-request/improvement tags carry a description snippet #112

Closed
shinjiyu wants to merge 6 commits into EvoMap:main from
shinjiyu:feat/signal-multilang-snippet

Conversation

@shinjiyu shinjiyu commented Feb 24, 2026

Summary

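  • Multilingual signal detection: user_feature_request and user_improvement_suggestion now match patterns in four languages: Simplified Chinese, Traditional Chinese, English, and Japanese.
  • Tags carry a description: signals are emitted in baseName:snippet format (e.g. user_feature_request:帮我开发一个跳一跳的微信小程序, roughly "help me build a 'Jump Jump' WeChat mini program"). The snippet is capped at 200 characters and is used by the selector and the GEP prompt to give more precise context.
  • Protocol compatibility: mutation.js and questionGenerator.js are updated in step to accept signals that carry a snippet; selector.js already handles them through substring matching and needs no change.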
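
The baseName:snippet format described above can be sketched as follows. This is a minimal illustration, not the code in src/gep/signals.js: the helper names formatSignal and parseSignal, the whitespace collapsing, and the fallback to a bare name for an empty snippet are all assumptions. Only the 200-character cap and the colon-separated shape come from the PR description.

```javascript
// Hypothetical helpers; names and details are illustrative, not taken from signals.js.
const MAX_SNIPPET_LENGTH = 200; // the cap stated in the PR summary

// Build a "baseName:snippet" signal, collapsing whitespace and clipping the snippet.
function formatSignal(baseName, rawSnippet) {
  const snippet = String(rawSnippet || '').replace(/\s+/g, ' ').trim();
  if (!snippet) return baseName; // assumption: fall back to the bare name when nothing is left
  // Array.from splits by code point, so clipping never cuts a surrogate pair in half.
  const clipped = Array.from(snippet).slice(0, MAX_SNIPPET_LENGTH).join('');
  return `${baseName}:${clipped}`;
}

// Split a signal back into its parts; only the first colon separates name from snippet.
function parseSignal(signal) {
  const idx = signal.indexOf(':');
  return idx === -1
    ? { baseName: signal, snippet: '' }
    : { baseName: signal.slice(0, idx), snippet: signal.slice(idx + 1) };
}

console.log(formatSignal('user_feature_request', '帮我开发一个跳一跳的微信小程序'));
// prints "user_feature_request:帮我开发一个跳一跳的微信小程序"
```

Splitting on the first colon only means a snippet that itself contains a colon survives a round trip through parseSignal.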

Changed Files

File | Description
src/gep/signals.js | Four-language patterns + baseName:snippet extraction logic
src/gep/mutation.js | hasOpportunitySignal accepts the name:snippet format
src/gep/questionGenerator.js | user_feature_request detection accepts the prefixed format
test/signals.test.js | New (23 tests: four-language base cases + 13 edge cases)
test/selector.test.js | Adds gene-matching cases for the baseName:snippet format

Test Plan

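  • node test/signals.test.js: all 23 tests pass (covers edge cases such as truncation of over-long input, 「我想…」 ("I want to…") phrasing, empty input, punctuation-only input, line breaks, and multiple coexisting signals)
  • node test/selector.test.js: all 10 tests pass (including snippet-format matching cases)
  • node test/mutation.test.js: all 19 tests pass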

Note

Medium Risk
Touches core signal extraction and categorization logic, which can change gene/candidate selection behavior and innovation vs repair routing. Changes are regex-heavy but are covered by new unit tests for multilingual and :snippet formats.

Overview
Updates extractSignals to detect feature requests and improvement suggestions across English, Simplified/Traditional Chinese, and Japanese, and to emit them as name:snippet (snippet clipped to 200 chars) for better downstream context.

Makes signal consumers tolerant of the new suffix format (opportunity detection in mutation.js/signals.js, capability candidate generation in candidates.js, and questionGenerator.js feature-request prompting), tweaks error/perf/capability regex coverage for Chinese, and adds focused tests for snippet-format gene matching plus a new signals.test.js covering multilingual and edge cases.

Written by Cursor Bugbot for commit 437f02d. This will update automatically on new commits.

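- signals.js: user_feature_request / user_improvement_suggestion support pattern detection in four languages
  (Simplified Chinese, Traditional Chinese, English, Japanese); extracted signals carry the request description in baseName:snippet format (max 200 characters)
- signals.js: hasOpportunitySignal / errorHit also support the baseName:snippet format and Chinese keywords
- mutation.js: hasOpportunitySignal accepts name:snippet signals (startsWith check)
- questionGenerator.js: user_feature_request detection accepts the snippet-carrying (prefixed) signal format
- test/signals.test.js: adds four-language base cases + 13 edge-case tests
  (over-long truncation, 我想…, empty input, punctuation only, line breaks, multiple coexisting signals, etc.)
- test/selector.test.js: adds gene-matching cases for the baseName:snippet format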

Co-authored-by: Cursor <cursoragent@cursor.com>
yu.zhenyu and others added 3 commits February 24, 2026 11:19
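…improvement_suggestion candidates never being generated

signals.js had already changed the signal format to 'name:snippet', but candidates.js still matched the bare name exactly with includes(), so these two kinds of candidates silently stopped being generated. Switch to some(s => s === name || s.startsWith(name + ':')), consistent with mutation.js.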
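
A small sketch of why includes() stops matching once signals carry a snippet, using the predicate quoted in the commit message. The wrapper name hasSignal and the sample signals are hypothetical; candidates.js itself is not shown on this page.

```javascript
// hasSignal is a hypothetical wrapper around the predicate quoted in the commit message.
function hasSignal(signals, name) {
  return signals.some((s) => s === name || s.startsWith(name + ':'));
}

const signals = ['user_feature_request:add a dark mode toggle'];

// Array.prototype.includes compares whole elements, so the bare name is never found.
console.log(signals.includes('user_feature_request')); // false

// The startsWith check accepts both the bare name and the name:snippet form.
console.log(hasSignal(signals, 'user_feature_request')); // true

// Appending ':' before the prefix test keeps unrelated, longer names from matching.
console.log(hasSignal(['user_feature_request_v2'], 'user_feature_request')); // false
```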

Co-authored-by: Cursor <cursoragent@cursor.com>
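When counting frequencies, analyzeRecentHistory did not normalize the
user_feature_request: and user_improvement_suggestion: prefixes, so
suppressedSignals stored the full key (e.g. user_feature_request:snippet)
while the dedup filter checked the bare key (user_feature_request). The
mismatch meant deduplication never took effect.

Add normalization rules for both prefixes so they agree with the dedup filter.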
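
The normalization described in this commit might look roughly like the sketch below. The real analyzeRecentHistory is not reproduced on this page, so normalizeSignalKey, countSignalFrequency, and the flat array of signal strings are assumptions made for illustration.

```javascript
// Hypothetical sketch; function names and the history shape are assumptions.
const SNIPPET_PREFIXES = ['user_feature_request', 'user_improvement_suggestion'];

// Map "base:snippet" back to "base" for the two snippet-carrying signals.
function normalizeSignalKey(signal) {
  for (const base of SNIPPET_PREFIXES) {
    if (signal.startsWith(base + ':')) return base;
  }
  return signal; // every other signal is counted under its own name
}

// Count occurrences per normalized key so differing snippets share one bucket.
function countSignalFrequency(history) {
  const counts = new Map();
  for (const signal of history) {
    const key = normalizeSignalKey(signal);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}

const counts = countSignalFrequency([
  'user_feature_request:add dark mode',
  'user_feature_request:export to CSV',
  'log_error',
]);
console.log(counts.get('user_feature_request')); // 2
```

Without the normalization step, each distinct snippet would form its own key with a count of 1, and a filter that looks up the bare key would find nothing to suppress.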

Co-authored-by: Cursor <cursoragent@cursor.com>
…i to preserve casing

Match user_improvement_suggestion English snippet against corpus (with /i)
instead of lower, so API/class names and identifiers keep original casing
for selector and GEP prompt context; aligns with other language branches.
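
To illustrate the casing point, the sketch below matches a case-insensitive pattern against the original text rather than a lowercased copy. The pattern and the function name extractImprovementSnippet are made up for this example; the actual regex in signals.js is not shown here.

```javascript
// Illustrative pattern only; not the regex used in signals.js.
const IMPROVEMENT_PATTERN = /\b(?:improve|optimi[sz]e|refactor)\b[^.\n]*/i;

// Match against the original corpus; the /i flag handles case-insensitivity,
// so the captured snippet keeps identifiers such as class names intact.
function extractImprovementSnippet(corpus) {
  const match = corpus.match(IMPROVEMENT_PATTERN);
  return match ? match[0].trim() : null;
}

const corpus = 'Please Refactor the UserSessionCache class to use LRU eviction.';

console.log(extractImprovementSnippet(corpus));
// prints "Refactor the UserSessionCache class to use LRU eviction"

console.log(extractImprovementSnippet(corpus.toLowerCase()));
// prints "refactor the usersessioncache class to use lru eviction"
```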

Co-authored-by: Cursor <cursoragent@cursor.com>
… positives

Match errLine and English exception: pattern; bare 异常 (e.g. in 优化一下异常处理)
no longer triggers log_error or blocks user_improvement_suggestion.
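
The false positive described here can be reproduced with a toy pair of patterns. Both regexes below are hypothetical stand-ins, since the real errorHit expression is not shown on this page; they only illustrate the difference between matching a bare keyword and requiring an error-line shape or an "exception:" token.

```javascript
// Hypothetical patterns for illustration; not the regexes from signals.js.
const LOOSE_ERROR = /error|exception|异常/i; // fires on the bare word 异常
const STRICT_ERROR = /(?:^|\n)\s*(?:\[error\]|error:)|\bexception:/i; // needs a log-like shape

const request = '优化一下异常处理'; // "improve the exception handling": a request, not a log line
const logLine = 'Unhandled exception: cannot read property of undefined';

console.log(LOOSE_ERROR.test(request)); // true: would be misread as an error report
console.log(STRICT_ERROR.test(request)); // false: left for user_improvement_suggestion
console.log(STRICT_ERROR.test(logLine)); // true: a genuine error line still matches
```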

Co-authored-by: Cursor <cursoragent@cursor.com>
@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Co-authored-by: Cursor <cursoragent@cursor.com>
@autogame-17
Collaborator

Merged into the main codebase. The multilingual signal extraction (ZH-CN, ZH-TW, EN, JA), baseName:snippet format, and all related compatibility updates have been incorporated along with the full test suite. Thank you for the contribution!

fmw666 pushed a commit that referenced this pull request Mar 11, 2026
… (from PR #112)

- signals.js: user_feature_request / user_improvement_suggestion support
  ZH-CN, ZH-TW, EN, JA pattern matching with baseName:snippet format
- signals.js: hasOpportunitySignal / analyzeRecentHistory / dedup filter
  updated for name:snippet compatibility
- signals.js: errorHit regex extended for Chinese error keywords
- mutation.js: hasOpportunitySignal compatible with name:snippet signals
- candidates.js: signal matching uses startsWith for snippet format
- questionGenerator.js: user_feature_request detection compatible
- test/signals.test.js: 23 tests (4-language + 13 edge cases)
- test/selector.test.js: baseName:snippet gene matching cases

Co-authored-by: shinjiyu <yu.zhenyu@hellogroup.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
fmw666 pushed a commit that referenced this pull request Mar 11, 2026
Changes since v1.19.1:
- feat(signals): multilingual signal extraction (ZH-CN/ZH-TW/EN/JA) with baseName:snippet format (PR #112, @shinjiyu)
- fix: harden sanitize patterns for token leakage prevention (PR #107, @voidborne-d)
- feat: activate fork lineage by setting parent on Gene/Capsule publish
- fix: validate reusedAssetId starts with sha256: before setting parent

Made-with: Cursor
fmw666 pushed a commit that referenced this pull request Mar 11, 2026
- shinjiyu: updated description to include PR #112 contribution
- voidborne-d: added for PR #107 sanitization hardening

Made-with: Cursor