Summary
Add an extension that fetches per-test historical duration data from Azure DevOps Test Analytics and uses it to:
Lower the per-test "still running after N seconds" threshold for tests with known short historical runtime, surfacing potential hangs much earlier than the static default.
Decorate the emitted line with the historical p95/p99 so investigators have immediate context — no need to leave the CI log to confirm "is this slow normal for this test?".
Example output:
[slow] still running after 5s: Foo (historical p95 = 2s, p99 = 3s, samples = 120)
Hooks into the IProgressEnricher surface introduced by #9139 (silence-driven heartbeat renderer).
Motivation
Today's static threshold (e.g. 60s in #9139) is a lowest-common-denominator. A test that historically completes in 2 seconds is interesting at 5s, not 60s. AzDO already collects per-test duration history; we can leverage that for a dramatically better signal in the most common .NET CI host.
This realises the history-aware threshold engine extension point sketched in #3495 and the design discussion on #9125, without putting that intelligence into core (per Marco's earlier guidance on #3495).
Proposed design
Where it lives
Two options to decide before implementation:
Option A — extend the existing Microsoft.Testing.Extensions.AzureDevOpsReport package with a new opt-in feature.
Option B — new standalone Microsoft.Testing.Extensions.AzureDevOpsTestHistory package (keeps Report focused on reporting; this is data-fetching).
Default recommendation: Option B, because the new package needs different permissions (read access to Analytics) and depends on extra HTTP plumbing we don't want polluting the Report package.
Data source
Azure DevOps Test Analytics OData, TestResults entity (per-run granularity).
⚠ Spike needed: confirm whether TestResultsDaily (cheap, aggregate-only — likely just gives mean) is sufficient, or whether we need TestResults (per-run, more rows, true distribution required for p95/p99). Web research suggests percentile metrics are NOT exposed directly — we must fetch durations and compute client-side.
Endpoint shape: GET https://analytics.dev.azure.com/{org}/{project}/_odata/v3.0-preview/TestResults?$filter=...&$select=AutomatedTestName,Duration,Date.
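Auth: System.AccessToken, exposed to the process as the SYSTEM_ACCESSTOKEN env var (requires the "Allow scripts to access OAuth token" toggle in classic pipelines, or an explicit env mapping of $(System.AccessToken) in YAML pipelines).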
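As a rough illustration of the endpoint shape and auth described above, the request could be assembled as follows. This is a sketch in Python for brevity (the extension itself would be C#); `build_history_url` and `auth_headers` are hypothetical names, the filter expression is passed in as an opaque string because its fields are still to be defined, and the Bearer scheme and the `SYSTEM_ACCESSTOKEN` variable name are assumptions to confirm during the spike.

```python
from typing import Dict, Mapping, Optional
from urllib.parse import quote, urlencode

SELECT_FIELDS = "AutomatedTestName,Duration,Date"  # fields named in the proposal


def build_history_url(org: str, project: str, odata_filter: str) -> str:
    """Build the Analytics OData URL for per-run test durations."""
    base = (
        f"https://analytics.dev.azure.com/{quote(org)}/{quote(project)}"
        "/_odata/v3.0-preview/TestResults"
    )
    query = urlencode(
        {"$filter": odata_filter, "$select": SELECT_FIELDS},
        safe="$,",        # keep the OData option names and field list unescaped
        quote_via=quote,  # encode spaces as %20 rather than '+'
    )
    return f"{base}?{query}"


def auth_headers(env: Mapping[str, str]) -> Optional[Dict[str, str]]:
    """Return request headers, or None when no token is available (no-op case)."""
    token = env.get("SYSTEM_ACCESSTOKEN")  # assumed env var name
    if not token:
        return None
    return {"Authorization": f"Bearer {token}"}
```

Returning `None` from `auth_headers` gives the caller a single check for the "token missing" case, so the extension can skip the fetch without raising.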
Scope of the history fetch
Same pipeline (build definition ID from BUILD_DEFINITIONID).
Default branch only (configurable). Cross-branch comparisons can mislead — different test environments.
Window: N days back (configurable; default 30).
Minimum sample size before emitting p95/p99 (default 10).
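Since percentiles would be computed client-side from the fetched durations, a minimal sketch of that step is shown here. It assumes the nearest-rank definition of a percentile (the proposal does not pick one) and uses hypothetical names `DurationStats` and `summarize`; Python is used only for illustration.

```python
import math
from typing import NamedTuple, Optional, Sequence


class DurationStats(NamedTuple):
    p95: float
    p99: float
    samples: int


def nearest_rank(sorted_values: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest value covering pct% of the samples."""
    rank = math.ceil(pct * len(sorted_values) / 100)
    return sorted_values[max(rank, 1) - 1]


def summarize(durations: Sequence[float], min_samples: int = 10) -> Optional[DurationStats]:
    """Return p95/p99 for one test, or None when there is too little history."""
    if len(durations) < min_samples:
        return None  # below the minimum sample size: emit nothing
    ordered = sorted(durations)
    return DurationStats(nearest_rank(ordered, 95), nearest_rank(ordered, 99), len(ordered))
```

With exactly 10 samples both percentiles collapse to the maximum observed duration, which is one reason a higher minimum may be worth considering.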
Bootstrap & failure modes
API unavailable / token missing / non-AzDO host / fork PR with no token: extension silently no-ops (verbose log only). Never fail the run.
Async fetch at session start; never block test execution. If fetch takes longer than the first test, the first few slow-test emissions use static thresholds.
Cache per-process: don't refetch within a run.
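The bootstrap behaviour above (fetch once in the background, never block, fall back to static thresholds on any failure) might look roughly like this. It is a sketch only: `HistoryCache` is a hypothetical name, the fetch function is injected so no real HTTP call is implied, and a background thread stands in for whatever async mechanism the real extension would use.

```python
import threading
from typing import Callable, Dict, Optional


class HistoryCache:
    """Fetches history once per process in the background; lookups never block."""

    def __init__(self, fetch: Callable[[], Dict[str, float]]) -> None:
        self._fetch = fetch
        self._data: Optional[Dict[str, float]] = None
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        """Kick off the single fetch at session start."""
        self._thread.start()

    def _run(self) -> None:
        try:
            self._data = self._fetch()
        except Exception:
            # Never fail the run; a real extension would log this at verbose level.
            self._data = None

    def join(self, timeout: Optional[float] = None) -> None:
        """Wait for the fetch to finish (useful in tests only)."""
        self._thread.join(timeout)

    def p99_for(self, test_name: str) -> Optional[float]:
        """Return the cached p99, or None while the fetch is pending or failed."""
        data = self._data
        return None if data is None else data.get(test_name)
```

A `None` result tells the caller to use the static threshold, which covers both the "fetch still running" and the "fetch failed" cases with the same code path.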
Hook usage
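IProgressEnricher.OnSlowTestThreshold(test) → return min(staticDefault, p99 * multiplier) if we have history, else null (uses static).
IProgressEnricher.OnSlowTestEmit(test, currentDuration) → returns the (historical p95 = 2s, p99 = 3s, samples = 120) suffix.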
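To make the threshold rule concrete, the following sketch works through it together with the decorated suffix from the example output. The function names are hypothetical and the code is Python for illustration; it does not reflect the actual IProgressEnricher signatures.

```python
from typing import Optional


def slow_test_threshold(
    static_default: float, p99: Optional[float], multiplier: float = 3.0
) -> Optional[float]:
    """Return the history-aware threshold in seconds, or None to keep the static one."""
    if p99 is None:
        return None  # no usable history: caller falls back to the static default
    return min(static_default, p99 * multiplier)


def slow_test_suffix(p95: float, p99: float, samples: int) -> str:
    """Format the historical context appended to the slow-test line."""
    return f"(historical p95 = {p95:g}s, p99 = {p99:g}s, samples = {samples})"
```

For a test with p99 = 3s and a 60s static default, the threshold becomes min(60, 3 * 3.0) = 9s. A test whose p99 is already 30s stays at 60s, because the history-aware value is never allowed to exceed the static default.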
Test identity matching
⚠ Open question / spike: AzDO AutomatedTestName may not match MTP TestNode.Uid exactly for parameterised tests. Need an investigation spike to define the lookup key. Possible approaches:
Match on FQN namespace.class.method and rely on parameter-set aggregation.
Add an extension-supplied mapper hook.
Treat parameterised variants as a single bucket (less precise but simpler).
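The "single bucket" approach could be as simple as stripping the parameter list before lookup. The sketch below assumes parameterised variants are reported as `Namespace.Class.Method(args)`, which is exactly what the spike needs to verify; `bucket_key` and `bucket_durations` are hypothetical helpers.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def bucket_key(test_name: str) -> str:
    """Drop a trailing parameter list so all variants share one history bucket."""
    head, _, _ = test_name.partition("(")  # assumes "(" only starts the argument list
    return head.strip()


def bucket_durations(rows: Iterable[Tuple[str, float]]) -> Dict[str, List[float]]:
    """Group (test name, duration) rows by their parameter-free key."""
    buckets: Dict[str, List[float]] = defaultdict(list)
    for name, duration in rows:
        buckets[bucket_key(name)].append(duration)
    return dict(buckets)
```

The trade-off is visible here: a variant that is legitimately much slower than its siblings widens the bucket's p99 and so delays the warning for the fast variants.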
Knobs
--report-azdo-test-history (on/off) — or auto-on when --report-azdo-progress is set and access token is available.
--report-azdo-test-history-window-days N (default 30).
--report-azdo-test-history-min-sample N (default 10).
--report-azdo-test-history-multiplier X (default 3.0: emit the warning at p99 * 3, so a test that normally takes about 2s and has p99 = 3s triggers at 9s, not 60s).
--report-azdo-test-history-branch <name> (default: main / repo default branch).
Privacy / safety
Don't log p95/p99 when sample size < minimum.
Don't log historical failure rates (out of scope; could leak flakiness data; separate feature).
Fork PRs typically have no token → extension no-ops, no leakage.
Touchpoints
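New package: src/Platform/Microsoft.Testing.Extensions.AzureDevOpsTestHistory/ mirroring the AzureDevOpsReport layout.
AzDoTestHistoryFetcher.cs: OData query + percentile compute.
AzDoTestHistorySlowTestEnricher.cs: implements IProgressEnricher.
AzDoTestHistoryCommandLineProvider.cs: the options above.
Unit tests for percentile math + caching + fallback behaviour.
Acceptance test for graceful no-op outside AzDO.
test/IntegrationTests/Microsoft.Testing.Platform.Acceptance.IntegrationTests/HelpInfoAllExtensionsTests.cs
test/IntegrationTests/Microsoft.Testing.Platform.Acceptance.IntegrationTests/MSBuild.KnownExtensionRegistration.cs
dotnet msbuild ... /t:UpdateXlf.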
Open questions
TestResults vs TestResultsDaily — what's the minimum we can query? (Spike.)
Test identity matching for parameterised tests. (Spike.)
Cross-branch fallback policy (when running on a branch with no history, fall back to default branch's history? or just static threshold?).
Should the extension also report duration regressions ("this test is 3x slower than last week") as warnings? (Probably out of scope; flag as follow-up.)
Out of scope
Other CI hosts (GitHub Actions doesn't have a comparable first-class test analytics API; would need a different data source, possibly cached artifacts).
Flakiness prediction / quarantine recommendations — separate feature.
Writing data back to AzDO — rely on AzDO's existing test result ingestion.
Related
IProgressEnricher hook surface (prerequisite).
slowest tests feature request; this realises the extension layer of that design.