
feat(assessors): add docstring-signature consistency assessor#9

Closed
kami619 wants to merge 16 commits into main from ambient/session-1773863385

Conversation

@kami619
Owner

@kami619 kami619 commented Mar 18, 2026

Summary

  • Adds a new DocstringConsistencyAssessor (Tier 2 Critical, 2% weight) that detects mismatches between function docstrings and their actual signatures
  • Directly addresses the gaming vulnerability reported in ambient-code/agentready#340 ([AUDIT] Agent Ready audit by Ugo Giordano), where a repo scored 99.7/100 Platinum with deliberately misleading docstrings
  • Includes 16 unit tests covering Google, Sphinx, and NumPy docstring styles, phantom params, async functions, *args/**kwargs, and edge cases

What it catches

The audit repo (ugiordan/agentready-audit) has functions like:

def __init__(self):
    """Initialize with database session, cache, and event bus.

    Args:
        session: SQLAlchemy async session (injected).  # NOT in signature
        cache: Redis client (injected).                # NOT in signature
        event_bus: Domain event publisher (injected).   # NOT in signature
    """
    pass

This assessor flags all three parameters as "documented but not in signature", causing the repo to fail this check instead of getting a free pass.
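
As a rough illustration of the check described above, the sketch below parses a snippet with `ast`, reads a Google-style `Args:` section, and reports names that are documented but absent from the signature. It is not the assessor's actual code: `documented_params`, `phantom_params`, and the `Repo` wrapper class are hypothetical names, and only the Google style is handled here.

```python
import ast
import re

SOURCE = '''
class Repo:
    def __init__(self):
        """Initialize with database session, cache, and event bus.

        Args:
            session: SQLAlchemy async session (injected).
            cache: Redis client (injected).
            event_bus: Domain event publisher (injected).
        """
        pass
'''

# Assumed pattern: "name:" or "name (type):" at the start of an Args entry.
_ARG_LINE = re.compile(r"^\s*(\*{0,2}\w+)\s*(?:\([^)]*\))?:")


def documented_params(docstring: str) -> list[str]:
    """Return names listed under a Google-style 'Args:' header (sketch only)."""
    names, in_args = [], False
    for line in docstring.splitlines():
        stripped = line.strip()
        if stripped in ("Args:", "Arguments:"):
            in_args = True
            continue
        if in_args:
            if not stripped:  # a blank line ends the section in this sketch
                break
            match = _ARG_LINE.match(line)
            if match:
                names.append(match.group(1).lstrip("*"))
    return names


def phantom_params(source: str) -> dict[str, list[str]]:
    """Map each function name to documented params missing from its signature."""
    result = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            a = node.args
            actual = {p.arg for p in (*a.posonlyargs, *a.args, *a.kwonlyargs)}
            actual |= {v.arg for v in (a.vararg, a.kwarg) if v is not None}
            doc = ast.get_docstring(node) or ""
            missing = [n for n in documented_params(doc) if n not in actual]
            if missing:
                result[node.name] = missing
    return result


print(phantom_params(SOURCE))
# → {'__init__': ['session', 'cache', 'event_bus']}
```

Here `self` is in the signature but not the docstring, which is expected; the three documented names have no counterpart in the signature and are reported.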

Weight allocation

To maintain the 100% weight sum, 1% each was taken from inline_documentation and concise_documentation (3% -> 2% each), giving the new assessor 2%.
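
The reallocation can be sanity-checked with a few lines. This is only an excerpt-style sketch: the attribute ID `docstring_consistency` is an assumed name for the new assessor, and the other weights in the real table are omitted because they are unchanged.

```python
# Weights in whole percent; only the entries touched by this PR are shown.
before = {"inline_documentation": 3, "concise_documentation": 3}
after = {
    "inline_documentation": 2,
    "concise_documentation": 2,
    "docstring_consistency": 2,  # assumed ID for the new assessor
}

# The affected entries sum to the same value, so the overall total stays at 100%.
assert sum(before.values()) == sum(after.values()) == 6
```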

Test plan

  • 16 unit tests pass (all styles: Google, Sphinx, NumPy)
  • Existing assessor tests unaffected (94 tests pass)
  • Black, isort, ruff clean
  • CI pipeline validation

Fixes ambient-code#340

Generated with Claude Code

kami619 and others added 15 commits March 4, 2026 13:28
… in standard_layout check (ambient-code#322)

* fix(assessors): support project-named directories and test-only repos in standard_layout

Enhances StandardLayoutAssessor to recognize multiple valid Python project
structures instead of rigidly requiring src/:

- Project-named directories (e.g., pandas/pandas/) now pass the check
- Test-only repositories now receive NOT_APPLICABLE instead of failing
- Parses pyproject.toml for project name (PEP 621 and Poetry formats)
- Falls back to detecting any root directory with __init__.py
- Adds blocklist to exclude non-source directories (tests, docs, utils, etc.)
- Updates remediation to present both src/ and flat layout as valid options

Fixes ambient-code#246
Fixes ambient-code#305

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(assessors): address PR ambient-code#322 review feedback

Changes based on code review:

Critical:
- Remove setup.cfg from test_indicators (false positive risk)
- Sort directory iteration for deterministic cross-platform behavior

Significant:
- Limit Strategy 3 (fallback detection) to repos with pyproject.toml
- Use word-boundary regex for test-only repo name detection
- Expand _NON_SOURCE_DIRS blocklist with migrations, config, etc.

Minor:
- Remove redundant has_tests check (caller already verifies)
- Replace empty strings in commands with comment separators

Updated test to expect failure for project-named directory without
pyproject.toml, matching the new stricter detection behavior.
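
To illustrate why a boundary-aware match matters for test-only repo name detection, here is a hedged sketch. The exact pattern used in the PR is not shown in this thread; the one below is an assumption, and it uses lookarounds rather than `\b` because `\b` treats `_` as a word character and would miss names like `my_tests`.

```python
import re

# Assumed pattern: "test", "tests", or "testing" not adjacent to a letter or digit.
_TEST_WORD = re.compile(
    r"(?<![A-Za-z0-9])(?:tests?|testing)(?![A-Za-z0-9])", re.IGNORECASE
)


def looks_test_only(repo_name: str) -> bool:
    """Return True when the repo name contains 'test(s)/testing' as its own word."""
    return _TEST_WORD.search(repo_name) is not None


print(looks_test_only("opendatahub-tests"))  # → True
print(looks_test_only("my_tests"))           # → True
print(looks_test_only("latest-release"))     # → False
print(looks_test_only("pytest"))             # → False
```

A plain substring check would wrongly flag `latest-release` and `pytest`, which is the kind of false positive the review feedback was aimed at.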

All 16 tests pass.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(assessors): address PR ambient-code#322 second review feedback

Fixes based on code review comment #3948171727:

Bug fixes:
- pytest.ini/conftest.py only indicate test-only repos when pyproject.toml
  is absent; mixed projects typically have pyproject.toml so these files
  alone are not reliable test-only indicators
- Remove celery, middleware, alembic from blocklist since these are
  legitimate package names for their respective projects (Celery, etc.)

Design improvement:
- Strategy 3 (heuristic fallback) now returns type "heuristic" and evidence
  shows "— verify" suffix to flag that this is a best-guess match, not an
  exact name match from pyproject.toml

Style cleanup:
- Remove self-referential "PR ambient-code#322 feedback:" comments; replaced with
  direct rationale where still needed

Added 3 new tests:
- test_pytest_ini_with_pyproject_is_not_test_only
- test_celery_directory_not_blocked
- test_heuristic_match_shows_verify_in_evidence

All 19 tests pass.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(assessors): address PR ambient-code#322 third review feedback

High Priority Fixes:
- Add TypedDict (SourceDirectoryInfo) for _find_source_directory return type
  to provide type safety and prevent runtime key typos
- Fix Strategy 3 scoping bug: heuristic fallback now runs whenever
  pyproject.toml exists, not just when a package name is found. This allows
  repos with pyproject.toml containing only [build-system] to still benefit
  from heuristic source directory detection.

Medium Priority (Tests):
- Add test for pyproject.toml without name field (only [build-system])
- Add test for project-named directory without __init__.py (namespace packages)
  verifying Strategy 2 falls through to Strategy 3 correctly
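
For readers unfamiliar with the pattern, a `TypedDict` return type along these lines gives static checkers a fixed set of keys. The field names and the `"src"` / `"project-named"` literals are assumptions; only the `SourceDirectoryInfo` name, the `"heuristic"` type, and the verify suffix come from the commit messages.

```python
from typing import Literal, TypedDict


class SourceDirectoryInfo(TypedDict):
    """Assumed shape of the value returned by _find_source_directory."""

    path: str
    type: Literal["src", "project-named", "heuristic"]


def describe(info: SourceDirectoryInfo) -> str:
    """Build an evidence string, flagging best-guess matches for review."""
    # "\u2014 verify" is the suffix quoted in the earlier commit message.
    suffix = " \u2014 verify" if info["type"] == "heuristic" else ""
    return f"{info['path']}/ ({info['type']}){suffix}"


print(describe({"path": "src", "type": "src"}))  # → src/ (src)
```

With a plain `dict`, a typo such as `info["typ"]` only fails at runtime; with the `TypedDict`, a type checker such as mypy can report it before the code runs.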

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* style: fix black formatting in structure.py

Add missing blank line after TypedDict class definition.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(assessors): externalize non-source dirs blocklist to Python.arsrc

Addresses PR ambient-code#322 review Comment 1:
- Create Python.arsrc config file with gitignore-like format
- Add _load_arsrc_file() with @lru_cache for efficient config loading
- Add UserWarning when config file is missing (helps detect packaging issues)
- Update pyproject.toml to include *.arsrc in package-data
- Add 7 new tests covering config loading, parsing, and packaging

The blocklist is now externalized following the reviewer's suggestion to
adopt a github/gitignore-like pattern for managing language conventions.
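
The loader described above could be sketched as follows. This assumes the "gitignore-like format" means one entry per line with `#` comments and blank lines ignored; the function names here are illustrative stand-ins for `_load_arsrc_file`, whose real signature is not shown in this thread.

```python
import warnings
from functools import lru_cache
from pathlib import Path


def parse_arsrc(text: str) -> frozenset[str]:
    """Parse one-entry-per-line text, skipping blanks and '#' comments."""
    entries = set()
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        entries.add(line.rstrip("/"))  # treat "tests/" and "tests" alike
    return frozenset(entries)


@lru_cache(maxsize=None)
def load_arsrc_file(path: str) -> frozenset[str]:
    """Load and cache a blocklist; warn (once per path) if the file is missing."""
    file = Path(path)
    if not file.is_file():
        warnings.warn(f"Config file not found: {path}", UserWarning, stacklevel=2)
        return frozenset()
    return parse_arsrc(file.read_text(encoding="utf-8"))


print(sorted(parse_arsrc("# non-source dirs\ntests/\n\ndocs\n")))
# → ['docs', 'tests']
```

Returning a `frozenset` keeps the cached value immutable, so callers cannot accidentally modify the shared blocklist.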

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Refactor package data entries in pyproject.toml

Remove conflicting entries and standardize package data.

---------

Co-authored-by: Ambient Code Bot <bot@ambient-code.local>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
…e#331)

Bumps [docker/setup-buildx-action](https://github.com/docker/setup-buildx-action) from 3 to 4.
- [Release notes](https://github.com/docker/setup-buildx-action/releases)
- [Commits](docker/setup-buildx-action@v3...v4)

---
updated-dependencies:
- dependency-name: docker/setup-buildx-action
  dependency-version: '4'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [docker/login-action](https://github.com/docker/login-action) from 3 to 4.
- [Release notes](https://github.com/docker/login-action/releases)
- [Commits](docker/login-action@v3...v4)

---
updated-dependencies:
- dependency-name: docker/login-action
  dependency-version: '4'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
## [2.29.6](ambient-code/agentready@v2.29.5...v2.29.6) (2026-03-05)

### Bug Fixes

* **assessors:** support project-named directories and test-only repos in standard_layout check ([ambient-code#322](ambient-code#322)) ([2fbb733](ambient-code@2fbb733)), closes [ambient-code#246](ambient-code#246) [ambient-code#305](ambient-code#305)
…mbient-code#336)

* fix: update CLAUDE.md with accurate codebase info, add BOOKMARKS.md

Fixes ambient-code#327 - CLAUDE.md was outdated with multiple inaccuracies:
- Tier weights corrected to 55/27/15/3 (was 50/30/15/5)
- Python version corrected to >=3.12 (was 3.11+)
- CI/CD section updated to reflect 16 workflows (was "manual")
- Bootstrap/Align shown as implemented commands (were "planned")
- Architecture tree now includes all directories (fixers, github, utils)
- Removed phantom GITHUB_ISSUES.md reference
- Corrected contracts path to specs/001-agentready-scorer/contracts/
- Removed incorrect "9/31 stub assessors" claim (all 25 implemented)

Optimized CLAUDE.md from 457 to 131 lines (71% reduction) for agent
context window efficiency. Detailed navigation moved to BOOKMARKS.md.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address agent review feedback for PR 336

* review feedback fixes

Signed-off-by: Kamesh Akella <kakella@redhat.com>

---------

Signed-off-by: Kamesh Akella <kakella@redhat.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
# [2.30.0](ambient-code/agentready@v2.29.6...v2.30.0) (2026-03-06)

### Bug Fixes

* update CLAUDE.md with accurate codebase info, add BOOKMARKS.md ([ambient-code#336](ambient-code#336)) ([99ab035](ambient-code@99ab035)), closes [ambient-code#327](ambient-code#327)

### Features

* add opendatahub-io/opendatahub-tests to leaderboard ([ambient-code#332](ambient-code#332)) ([b92713b](ambient-code@b92713b))
Bumps [docker/metadata-action](https://github.com/docker/metadata-action) from 5 to 6.
- [Release notes](https://github.com/docker/metadata-action/releases)
- [Commits](docker/metadata-action@v5...v6)

---
updated-dependencies:
- dependency-name: docker/metadata-action
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…-code#335)

When a user submits a leaderboard entry from a fork that hasn't been synced
with upstream, the workflow was using the fork's outdated schema for
validation. This caused valid submissions to fail with errors like
"attributes_total: 25 was expected" even though PR ambient-code#312 had already fixed
the schema to allow 10-25 attributes.

This fix changes the workflow to:
1. Checkout upstream main for validation tools and schema
2. Fetch only the submission file from the PR branch
3. Validate using upstream's agentready installation

This ensures validation always uses the latest schema while still validating
the actual submitted file from the PR.

Fixes the validation failure on PR ambient-code#332.
See: ambient-code#312

Co-authored-by: Ambient Code Bot <bot@ambient-code.local>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
## [2.30.1](ambient-code/agentready@v2.30.0...v2.30.1) (2026-03-09)

### Bug Fixes

* **leaderboard:** use upstream schema for fork PR validation ([ambient-code#335](ambient-code#335)) ([5939b59](ambient-code@5939b59)), closes [ambient-code#312](ambient-code#312)
)

Bumps [docker/build-push-action](https://github.com/docker/build-push-action) from 6 to 7.
- [Release notes](https://github.com/docker/build-push-action/releases)
- [Commits](docker/build-push-action@v6...v7)

---
updated-dependencies:
- dependency-name: docker/build-push-action
  dependency-version: '7'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [black](https://github.com/psf/black) from 25.11.0 to 26.3.1.
- [Release notes](https://github.com/psf/black/releases)
- [Changelog](https://github.com/psf/black/blob/main/CHANGES.md)
- [Commits](psf/black@25.11.0...26.3.1)

---
updated-dependencies:
- dependency-name: black
  dependency-version: 26.3.1
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [pyjwt](https://github.com/jpadilla/pyjwt) from 2.10.1 to 2.12.0.
- [Release notes](https://github.com/jpadilla/pyjwt/releases)
- [Changelog](https://github.com/jpadilla/pyjwt/blob/master/CHANGELOG.rst)
- [Commits](jpadilla/pyjwt@2.10.1...2.12.0)

---
updated-dependencies:
- dependency-name: pyjwt
  dependency-version: 2.12.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Add a new Tier 2 assessor that detects mismatches between function
docstrings and their actual signatures. This directly addresses the
gaming vulnerability demonstrated in issue ambient-code#340, where a repo scored
99.7/100 Platinum by having docstrings that described parameters the
functions didn't actually accept.

The assessor:
- Parses Python files with AST to extract function signatures
- Extracts documented parameters from Google, Sphinx, and NumPy style docstrings
- Flags phantom params (documented but not in signature) and undocumented params
- Scores proportionally based on consistency rate (threshold: 80%)
- Excludes self/cls, handles *args/**kwargs and keyword-only args
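
One plausible reading of "scores proportionally based on consistency rate (threshold: 80%)" is sketched below. The formula is an assumption (full marks at or above the threshold, linear scaling below it); the commit message does not spell out the exact calculation, and `consistency_score` is a hypothetical name.

```python
def consistency_score(
    consistent: int, total: int, threshold: float = 0.80
) -> tuple[float, bool]:
    """Return (score out of 100, passed) for a given consistency rate."""
    if total <= 0:
        raise ValueError("no documented functions to assess")
    rate = consistent / total
    # Assumed scaling: reaching the threshold earns 100; below it, scale linearly.
    score = min(100.0, rate / threshold * 100.0)
    return round(score, 1), rate >= threshold


print(consistency_score(8, 10))  # → (100.0, True)
print(consistency_score(4, 10))  # → (50.0, False)
```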

Weight allocation: 2% (Tier 2 Critical), taken from inline_documentation
and concise_documentation (3% -> 2% each) to maintain 100% total.

Fixes ambient-code#340

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions

Leaderboard Validation

FAILED

Claimed: 83.5/100
Verified: N/A/100
Diff: N/A points (±2 tolerance)

@github-actions

github-actions bot commented Mar 18, 2026

📈 Test Coverage Report

Branch Coverage
This PR 67.3%
Main 66.4%
Diff ✅ +0.9%

Coverage calculated from unit tests only

…y check

Add posonlyargs to _get_signature_params so that positional-only
parameters (def f(x, /, y)) are correctly included in the
signature-docstring comparison. Without this, they would be
incorrectly flagged as phantom params.

Refs ambient-code#340

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions

Leaderboard Validation

FAILED

Claimed: 83.5/100
Verified: N/A/100
Diff: N/A points (±2 tolerance)

@github-actions

github-actions bot commented Apr 9, 2026

👋 This pull request has been inactive for 21 days and will be closed in 7 days if there is no further activity.

If you plan to continue work on this PR, please:

  • Push new commits or add a comment
  • Remove the stale label
  • Add the work-in-progress label to prevent future stale marking

Thank you for your contributions to AgentReady!

@github-actions github-actions bot added the stale label Apr 9, 2026
@github-actions

🔒 This pull request has been automatically closed due to 28 days of inactivity.

If you'd like to continue this work, please reopen the PR or create a new one.

@github-actions github-actions bot closed this Apr 17, 2026

Development

Successfully merging this pull request may close these issues.

[AUDIT] Agent Ready audit by Ugo Giordano

3 participants