fix: sort YOLO class names by numeric keys #2296

Merged
Borda merged 9 commits into roboflow:develop from Bortlesboat:fix/yolo-numeric-string-class-names
Jun 7, 2026

Conversation

@Bortlesboat

Contributor

Summary

  • Sort YOLO data.yaml mapping keys numerically when they are numeric-like strings.
  • Add a regression test for quoted numeric class keys such as "10", which previously sorted before "2".

Root cause

_extract_class_names sorted mapping keys directly, so quoted numeric YAML keys used lexicographic order instead of class index order.
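
The ordering difference is easy to reproduce in plain Python. The snippet below is a minimal sketch, not the library's code: the `names` mapping is a hypothetical stand-in for what a data.yaml with quoted class indices might load as.

```python
# Hypothetical mapping, standing in for a data.yaml "names" section
# whose class indices were written as quoted strings.
names = {"0": "cat", "2": "dog", "10": "bird"}

# Sorting the string keys directly compares them character by character,
# so "10" lands before "2".
lexicographic = [names[k] for k in sorted(names)]

# Comparing the keys as integers restores class index order.
numeric = [names[k] for k in sorted(names, key=int)]

print(lexicographic)  # ['cat', 'bird', 'dog']
print(numeric)        # ['cat', 'dog', 'bird']
```

With the direct sort, class index 1 would resolve to "bird" instead of "dog", which silently mislabels every annotation that uses that index.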

Tests

  • uv run pytest tests/dataset/formats/test_yolo.py::test_extract_class_names_sorts_numeric_string_keys -q
  • uv run pytest tests/dataset/formats -q
  • uv run pytest -q
  • uv run pre-commit run --files src/supervision/dataset/formats/yolo.py tests/dataset/formats/test_yolo.py
  • git diff --check

@Bortlesboat Bortlesboat requested a review from SkalskiP as a code owner June 7, 2026 14:44
@CLAassistant

CLAassistant commented Jun 7, 2026

CLA assistant check
All committers have signed the CLA.

@codecov

codecov Bot commented Jun 7, 2026

Codecov Report

❌ Patch coverage is 89.47368% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 79%. Comparing base (3410d92) to head (fbe20cf).

Additional details and impacted files
@@           Coverage Diff           @@
##           develop   #2296   +/-   ##
=======================================
  Coverage       79%     79%           
=======================================
  Files           66      66           
  Lines         8622    8640   +18     
=======================================
+ Hits          6846    6863   +17     
- Misses        1776    1777    +1     

Copilot AI left a comment

Contributor

Pull request overview

This PR fixes YOLO data.yaml class-name extraction to sort names mapping keys numerically when they represent class indices, preventing lexicographic mis-ordering (e.g., "10" before "2"), and adds a regression test to lock the behavior.

Changes:

  • Update _extract_class_names to sort dict keys using numeric ordering when possible.
  • Add a regression test covering quoted numeric YAML keys ('10' vs '2').

Assessment (n/5)

  • Code quality: 3/5
  • Testing: 4/5
  • Documentation: 5/5

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/supervision/dataset/formats/yolo.py | Adjusts sorting logic for names dict keys to avoid lexicographic ordering for numeric-like keys. |
| tests/dataset/formats/test_yolo.py | Adds a regression test to ensure quoted numeric class keys are ordered by numeric index. |

Comment thread src/supervision/dataset/formats/yolo.py
Borda and others added 7 commits June 7, 2026 09:38
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
`stripped.lstrip("-").isdigit()` stripped ALL leading hyphens, so "--1"
passed _is_int_like() but int("--1") raised ValueError at sort time.
Changed to `stripped.isdigit()` which is consistent with int() and also
rejects negative class indices (non-negative by YOLO spec).

---
Co-authored-by: Claude Code <noreply@anthropic.com>
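
The mismatch described in this commit message can be checked directly. This is a sketch under assumptions: the helper names `is_int_like_buggy`, `is_int_like_fixed`, and `int_accepts` are hypothetical and only mirror the two expressions quoted in the message.

```python
def is_int_like_buggy(value: str) -> bool:
    # lstrip("-") removes every leading hyphen, not just the first one.
    return value.strip().lstrip("-").isdigit()


def is_int_like_fixed(value: str) -> bool:
    # Plain isdigit() rejects any sign, so negative indices are excluded too.
    return value.strip().isdigit()


def int_accepts(value: str) -> bool:
    # True when int() can parse the string without raising.
    try:
        int(value)
    except ValueError:
        return False
    return True


for candidate in ["7", "-1", "--1"]:
    print(
        candidate,
        is_int_like_buggy(candidate),
        is_int_like_fixed(candidate),
        int_accepts(candidate),
    )
# 7 True True True
# -1 True False True
# --1 True False False
```

The last row is the failure case: the old check accepts "--1" while int() rejects it, so the error would only surface later, at sort time.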
When a data.yaml names dict has both int-like keys (e.g. "0", "2") and
non-int-like keys (e.g. "foo"), the previous code silently fell back to
lexicographic sort, producing wrong class-index order for numeric keys.
Now raises ValueError with a clear message listing example keys from each
category, consistent with the function's existing validation style.

---
Co-authored-by: Claude Code <noreply@anthropic.com>
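
The guard described above could look roughly like the sketch below. It is an illustration rather than the merged code: `order_keys` and its error message are hypothetical, and it handles only str and int keys.

```python
def order_keys(keys):
    """Return keys in class index order, or lexicographic order if none are numeric."""

    def is_int_like(key):
        # Native ints and digit-only strings count as class indices.
        if isinstance(key, int):
            return True
        return isinstance(key, str) and key.strip().isdigit()

    int_like = [k for k in keys if is_int_like(k)]
    other = [k for k in keys if not is_int_like(k)]

    # Mixed keys have no single sensible order, so fail loudly
    # instead of falling back to a lexicographic sort.
    if int_like and other:
        raise ValueError(
            f"names mixes int-like keys (e.g. {int_like[:3]}) "
            f"with non-int-like keys (e.g. {other[:3]})"
        )
    if int_like:
        return sorted(int_like, key=lambda k: int(k))
    return sorted(other)


print(order_keys(["10", "0", "2"]))  # ['0', '2', '10']
print(order_keys(["dog", "cat"]))    # ['cat', 'dog']
```

Calling `order_keys(["0", "2", "foo"])` raises ValueError under this sketch, which matches the behavior the commit message describes.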
Convert the one-off numeric-string test to a parametrized suite covering:
- Quoted string numeric keys (original bug scenario, '10' after '2')
- Native int keys (most common real YOLO format from Ultralytics/Roboflow)
- Non-numeric string keys (lexicographic fallback path)
- Empty names dict (vacuously correct edge case)
- Mixed numeric/non-numeric keys (ValueError from new mixed-key guard)

---
Co-authored-by: Claude Code <noreply@anthropic.com>
…able

sorted(..., key=lambda key: int(key)) — the lambda parameter shadows the
outer `keys` list variable name. Using `k` makes the scope boundary clear.

---
Co-authored-by: Claude Code <noreply@anthropic.com>
Without this guard, isinstance(True, int) == True causes YAML `true`/`false`
keys to be treated as class indices 1 and 0.

---
Co-authored-by: Claude Code <noreply@anthropic.com>
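
The bool pitfall comes from Python's type hierarchy, where bool is a subclass of int. A minimal sketch of the kind of guard the message describes follows; `is_class_index` is a hypothetical helper name, not the function in yolo.py.

```python
def is_class_index(key: object) -> bool:
    # bool must be rejected before the int check, because
    # isinstance(True, int) is True in Python.
    if isinstance(key, bool):
        return False
    return isinstance(key, int)


print(isinstance(True, int))   # True
print(int(True), int(False))   # 1 0
print(is_class_index(True))    # False
print(is_class_index(1))       # True
```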
Documents the two-branch sort logic (numeric vs lexicographic), the bool
exclusion invariant, and the new mixed-key ValueError — non-obvious after
the recent changes.

---
Co-authored-by: Claude Code <noreply@anthropic.com>
Borda
Borda previously approved these changes Jun 7, 2026

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

Comment thread tests/dataset/formats/test_yolo.py Outdated
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@Borda Borda merged commit 35006d7 into roboflow:develop Jun 7, 2026
26 checks passed
@Borda Borda mentioned this pull request Jun 11, 2026

4 participants