fix: add missing requires_gpu and requires_heavy_ram markers to Huggi…#621

Closed
planetf1 wants to merge 1 commit into generative-computing:main from planetf1:fix/hf-metrics-test-markers-620

Conversation

@planetf1
Contributor

@planetf1 planetf1 commented Mar 11, 2026

Misc PR

Type of PR

  • Bug Fix
  • New Feature
  • Documentation
  • Other

Description

test_huggingface_token_metrics_integration was added in #563 without requires_gpu or requires_heavy_ram markers. All other HuggingFace tests carry both markers, which are enforced by conftest.py at collection time — requires_heavy_ram skips on systems with < 48 GB RAM, requires_gpu skips without a GPU. Without them the test ran unconditionally on any machine with OpenTelemetry installed, triggering a full model download and load that consumed 10–15 minutes and exhausted available memory.

On my 32 GB M1 MacBook this stalled the system for 15 minutes before I aborted it. Very high memory pressure (>40 GB) caused audio breakup and stuttering, and the run finally had to be terminated.
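
For readers unfamiliar with marker-driven auto-skipping, the behaviour described above could look roughly like the following. This is a minimal sketch, not the project's actual conftest.py: the helper names `skip_reason`, `total_ram_gb`, and `gpu_available` are hypothetical, GPU detection is reduced to a CUDA check through an optional torch import, and only the 48 GB threshold and the two marker names come from the description.

```python
import os

HEAVY_RAM_THRESHOLD_GB = 48  # threshold quoted in the PR description


def skip_reason(marker_names, ram_gb, has_gpu):
    """Return a skip reason for a test carrying these markers, or None to run it."""
    if "requires_gpu" in marker_names and not has_gpu:
        return "requires a GPU"
    if "requires_heavy_ram" in marker_names and ram_gb < HEAVY_RAM_THRESHOLD_GB:
        return f"requires >= {HEAVY_RAM_THRESHOLD_GB} GB RAM, found {ram_gb:.0f} GB"
    return None


def total_ram_gb():
    """Best-effort physical RAM in GB (hypothetical helper, POSIX only)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1024**3


def gpu_available():
    """Hypothetical probe: CUDA only, and only if torch is installed."""
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()


def pytest_collection_modifyitems(config, items):
    """pytest hook run after collection; marks unsuitable tests as skipped."""
    import pytest  # imported lazily so the pure logic above is usable alone

    ram, has_gpu = total_ram_gb(), gpu_available()
    for item in items:
        names = {marker.name for marker in item.iter_markers()}
        reason = skip_reason(names, ram, has_gpu)
        if reason:
            item.add_marker(pytest.mark.skip(reason=reason))
```

Under these assumptions, a test marked `requires_heavy_ram` on a 32 GB machine is skipped at collection time, while an unmarked test always runs, which matches the failure mode reported here.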

Testing

  • Tests added to the respective file if code was changed
  • New code has 100% coverage if code was added
  • Ensure existing tests and github automation passes (a maintainer will kick off the github automation when the rest of the PR is populated)

…ngFace metrics test

test_huggingface_token_metrics_integration was missing the markers that
trigger conftest auto-skip logic on systems without sufficient GPU or RAM
(threshold: 48GB). Without them the test runs unconditionally, loading a
full HF model and consuming excessive time and memory.

Fixes generative-computing#620
@github-actions
Contributor

The PR description has been updated. Please fill out the template for your PR to be reviewed.

@mergify

mergify Bot commented Mar 11, 2026

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert|release)(?:\(.+\))?:

@planetf1 planetf1 marked this pull request as ready for review March 11, 2026 15:06
@planetf1 planetf1 requested a review from a team as a code owner March 11, 2026 15:06
@planetf1 planetf1 requested a review from ajbozarth March 11, 2026 15:07
Contributor

@ajbozarth ajbozarth left a comment

Details below, but I don't think this is a necessary fix. If it's failing for you (or more people) we should identify why because it may be a different issue.

Comment on lines +302 to +305
@pytest.mark.skipif(
    int(os.environ.get("CICD", 0)) == 1,
    reason="Skipping HuggingFace metrics test in CI - requires GPU and model download",
)

this is handled below by

if gh_run:
    pytest.skip("Skipping in CI - requires model download")

Should we replace that code with this decorator? Is the decorator the new standard way to skip in CI?
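
To make the comparison concrete, here is a small sketch of the two skip styles under discussion. The helper `running_in_ci` is hypothetical, and it assumes `gh_run` is derived from the same `CICD` environment variable that the decorator reads, which this thread does not confirm. The pytest usage is shown in comments so the helper stays importable on its own.

```python
import os


def running_in_ci(environ=None):
    """True when CICD is set to 1, mirroring the condition in the decorator."""
    env = os.environ if environ is None else environ
    return int(env.get("CICD", 0)) == 1


# Decorator form: the condition is evaluated when the test module is imported
# during collection, so the test is skipped before any of its fixtures run.
#
#     @pytest.mark.skipif(running_in_ci(), reason="requires model download")
#     def test_example(): ...
#
# In-body form: the check runs inside the test, after its fixtures have
# already been set up, so an expensive fixture would still execute.
#
#     def test_example():
#         if running_in_ci():
#             pytest.skip("requires model download")
```

One practical difference: if a fixture loads a model, only the decorator form avoids that work in CI. Note also that `int(...)` raises `ValueError` for values such as `"true"`, so both forms rely on `CICD` being numeric.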

Comment on lines +300 to +301
@pytest.mark.requires_gpu
@pytest.mark.requires_heavy_ram

I considered adding these, but in all my testing they didn't seem to apply. I saw these regularly take 30sec per run as long as the required model was already downloaded. I also made sure to choose a model that was used in other tests so these tests wouldn't download a new model only to use here

@ajbozarth
Contributor

So I tested this and found that while the tests behave as I expect when running just the telemetry tests, they do hang, as @planetf1 found, when running the entire suite.

After trying out #621 to see if it addressed the problem, I found that it actually makes things worse without addressing the underlying issue. Rather than just skipping, I looked for the cause: without the test isolation used in other HF tests (which they need because of their high RAM/GPU requirements), these tests were loading the model into memory multiple times and never clearing it. I have a better fix that adds a fixture to address this, which I'll open in a bit after some more testing.
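
The follow-up fix itself is not shown in this thread. As a rough illustration of the idea only, a cleanup fixture that frees the model after each test might be shaped like the sketch below. The names `release_model` and `isolated_model` are hypothetical, and the torch calls are optional and guarded; in a real conftest.py the generator would be decorated with `@pytest.fixture`.

```python
import gc


def release_model(holder):
    """Drop the references kept in `holder`, then force a garbage-collection pass."""
    holder.clear()
    gc.collect()
    try:
        import torch  # optional: only needed where the HF tests actually run
    except ImportError:
        return
    if torch.cuda.is_available():
        torch.cuda.empty_cache()  # release cached GPU memory held by the allocator


def isolated_model(load):
    """Generator in pytest yield-fixture shape: load, hand over, always clean up."""
    holder = {"model": load()}
    try:
        yield holder["model"]
    finally:
        release_model(holder)
```

The cleanup only frees memory if the test does not keep its own reference to the model after it finishes, so a fixture like this works best when tests receive the model as an argument rather than storing it in module-level state.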

@planetf1
Contributor Author

Thanks @ajbozarth your fix is more appropriate. Closing.

@planetf1 planetf1 closed this Mar 11, 2026

Development

Successfully merging this pull request may close these issues.

test_huggingface_token_metrics_integration is too heavy

2 participants