
Dotnet - Add support for Foundry Adaptive evals#6267

Open
alliscode wants to merge 10 commits into microsoft:main from alliscode:dotnet-adaptive-evals

Conversation

@alliscode
Member

This pull request adds support for consuming pre-existing Azure AI Foundry rubric (adaptive) evaluators in the .NET agent framework, enabling per-dimension scoring and CI gating on those dimensions. It introduces new core types for rubric evaluators, updates the Foundry evaluation pipeline to accept and process rubric references, and provides a comprehensive end-to-end sample (Evaluation_FoundryRubric) demonstrating usage. Additional documentation and environment setup instructions are included.

Rubric Evaluator Support

  • Introduced new core types in Microsoft.Agents.AI for rubric evaluators, including RubricScore, GeneratedEvaluatorRef, and per-dimension breakdowns in EvalScoreResult.Dimensions. Added assertion methods for CI gating on dimension scores.
  • Updated Foundry evaluation wiring to accept FoundryEvaluatorSpec (either built-in names or rubric refs), emit the correct wire shape for rubric evaluators, and preserve rubric refs through the evaluation pipeline. Rubric evaluators are skipped for ground-truth checks. [1] [2] [3] [4] [5]

Sample and Documentation

  • Added a new sample project Evaluation_FoundryRubric that demonstrates connecting to a pre-existing Foundry agent and rubric evaluator, mixing rubric and built-in evaluators, reading per-dimension scores, and enforcing CI quality gates. [1] [2] [3] [4]
  • Updated related sample READMEs to reference the new rubric evaluator sample. [1] [2]
  • Added .env.example for environment variable setup and expanded documentation on required endpoints and evaluator/agent distinctions. [1] [2]

Other Improvements

  • Clarified comments and parameter docs in the Foundry evaluation converter to reflect support for rubric evaluators and their data mapping/tool definitions.

This update enables robust integration with custom rubric evaluators in Azure AI Foundry, supporting advanced evaluation scenarios and CI gating on custom dimensions.

Contribution Checklist

  • The code builds clean without any errors or warnings
  • The PR follows the Contribution Guidelines
  • All unit tests pass, and I have added new tests where possible
  • Is this a breaking change? If yes, add "[BREAKING]" prefix to the title of the PR.

Copilot AI review requested due to automatic review settings June 2, 2026 15:10
@moonbox3 moonbox3 added the documentation, python, and .NET labels Jun 2, 2026
@github-actions github-actions Bot changed the title from "Dotnet - Add support for Foundry Adaptive evals" to ".NET: Dotnet - Add support for Foundry Adaptive evals" Jun 2, 2026
@github-actions github-actions Bot changed the title from ".NET: Dotnet - Add support for Foundry Adaptive evals" to "Python: Dotnet - Add support for Foundry Adaptive evals" Jun 2, 2026
@github-actions

github-actions Bot commented Jun 2, 2026

Python Test Coverage

Python Test Coverage Report

File | Stmts | Miss | Cover | Missing
packages/foundry/agent_framework_foundry | | | |
   _foundry_evals.py | 336 | 8 | 97% | 471–472, 507–508, 665, 670, 853, 920
TOTAL | 37782 | 4395 | 88% |

Python Unit Test Overview

Tests Skipped Failures Errors Time
7517 34 💤 0 ❌ 0 🔥 2m 0s ⏱️

Contributor

Copilot AI left a comment


Pull request overview

Adds cross-language support (Python + .NET) for consuming pre-existing Azure AI Foundry rubric/adaptive evaluators by reference (name/version), surfacing per-dimension rubric scores, and providing assertion helpers for CI gating.

Changes:

  • Introduces rubric evaluator core types (GeneratedEvaluatorRef, RubricScore, per-dimension score breakdowns) and CI assertion helpers.
  • Extends Foundry eval wiring to accept mixed evaluator specs (built-ins + rubric refs), emit the correct wire shape (incl. evaluator version), and parse per-dimension scores from result samples.
  • Adds end-to-end samples and documentation updates showing how to run rubric-based evals and gate on rubric dimensions.
File Description
python/uv.lock Lockfile update for dependency specifier ordering.
python/samples/05-end-to-end/evaluation/foundry_evals/README.md Documents how to reference rubric evaluators and gate on dimensions.
python/samples/05-end-to-end/evaluation/foundry_evals/evaluate_with_rubric_sample.py New runnable sample using a rubric evaluator + dimension gating.
python/samples/05-end-to-end/evaluation/foundry_evals/.env.example Adds env vars for agent + rubric evaluator refs.
python/packages/foundry/tests/test_foundry_evals.py Adds unit tests for rubric refs and rubric dimension extraction.
python/packages/foundry/agent_framework_foundry/_foundry_evals.py Adds GeneratedEvaluatorRef support and rubric dimension parsing into results.
python/packages/foundry/agent_framework_foundry/__init__.py Exports GeneratedEvaluatorRef.
python/packages/core/tests/core/test_local_eval.py Adds tests for per-dimension rubric assertion helpers.
python/packages/core/agent_framework/foundry/__init__.pyi Exposes GeneratedEvaluatorRef in stubs.
python/packages/core/agent_framework/foundry/__init__.py Lazy-export mapping for GeneratedEvaluatorRef.
python/packages/core/agent_framework/_evaluation.py Adds RubricScore, EvalScoreResult.dimensions, and rubric assertion helpers.
python/packages/core/agent_framework/__init__.py Exports RubricScore at top-level.
dotnet/tests/Microsoft.Agents.AI.UnitTests/EvaluationTests.cs Adds tests for rubric types + new assertion helpers.
dotnet/tests/Microsoft.Agents.AI.Foundry.UnitTests/FoundryEvalsTests.cs Updates tests to use FoundryEvaluatorSpec and adds rubric parsing tests.
dotnet/tests/Microsoft.Agents.AI.Foundry.UnitTests/FoundryEvalConverterTests.cs Adds tests ensuring rubric refs emit correct testing criteria wire shape.
dotnet/src/Microsoft.Agents.AI/Evaluation/RubricScore.cs New core type representing a rubric dimension score.
dotnet/src/Microsoft.Agents.AI/Evaluation/GeneratedEvaluatorRef.cs New core type referencing a provider-registered rubric evaluator.
dotnet/src/Microsoft.Agents.AI/Evaluation/EvalItemResult.cs Adds EvalScoreResult.Dimensions to carry per-dimension rubric breakdown.
dotnet/src/Microsoft.Agents.AI/Evaluation/AgentEvaluationResults.cs Adds score/dimension assertion helpers for CI gating (incl. recursion into sub-results).
dotnet/src/Microsoft.Agents.AI.Foundry/Evaluation/FoundryEvalWireModels.cs Adds wire model support for evaluator_version.
dotnet/src/Microsoft.Agents.AI.Foundry/Evaluation/FoundryEvaluatorSpec.cs New discriminated spec (built-in name vs rubric ref) with implicit conversions.
dotnet/src/Microsoft.Agents.AI.Foundry/Evaluation/FoundryEvals.cs Accepts evaluator specs, preserves rubric refs through filtering, parses rubric dimensions from samples.
dotnet/src/Microsoft.Agents.AI.Foundry/Evaluation/FoundryEvalConverter.cs Emits correct testing criteria for rubric refs (name/version + mapping) and skips rubric refs for ground-truth checks.
dotnet/samples/05-end-to-end/Evaluation/Evaluation_FoundryRubric/README.md New sample documentation for Foundry rubric evaluation + gating.
dotnet/samples/05-end-to-end/Evaluation/Evaluation_FoundryRubric/Program.cs New end-to-end sample program mixing rubric + built-ins and gating on a dimension.
dotnet/samples/05-end-to-end/Evaluation/Evaluation_FoundryRubric/Evaluation_FoundryRubric.csproj New sample project.
dotnet/samples/05-end-to-end/Evaluation/Evaluation_FoundryRubric/.env.example New sample env template for rubric evaluation.
dotnet/samples/02-agents/Evaluation/Evaluation_Multimodal/README.md Adds link to the new rubric evaluation sample.
dotnet/samples/02-agents/Evaluation/Evaluation_ExpectedOutputs/README.md Adds link to the new rubric evaluation sample.
dotnet/agent-framework-dotnet.slnx Adds the new rubric evaluation sample project to the solution.
docs/decisions/0023-foundry-evals-integration.md Records the follow-up decision and design notes for rubric evaluator consumption.

Copilot's findings

  • Files reviewed: 30/31 changed files
  • Comments generated: 3

Comment thread dotnet/src/Microsoft.Agents.AI.Foundry/Evaluation/FoundryEvals.cs
Comment thread dotnet/src/Microsoft.Agents.AI.Foundry/Evaluation/FoundryEvals.cs

@github-actions github-actions Bot left a comment


Automated Code Review

Reviewers: 4 | Confidence: 90%

✓ Correctness

No actionable issues found in this dimension.

✓ Security Reliability

No actionable issues found in this dimension.

✓ Test Coverage

The PR adds comprehensive .NET test coverage for the new rubric evaluator functionality (AssertScoreAtLeast, AssertDimensionScoreAtLeast, AssertNoFailedItems, ParseRubricScores, BuildTestingCriteria with rubric refs, FilterToolEvaluators with rubric refs, FindMissingGroundTruthEvaluators skipping rubric refs). The Python tests exercise assert_dimension_score_at_least thoroughly and cover BuildTestingCriteria, FilterToolEvaluators, and ParseRubricScores. However, the Python assert_score_at_least and assert_no_failed_items methods have no test coverage despite their non-trivial logic (recursion into sub_results, offender formatting, threshold comparisons).

✗ Design Approach

I found two design issues in the new rubric support. The new sample advertises a CI quality gate but catches and suppresses the failure instead of returning a failing exit code, so the sample’s main scenario does not actually gate CI. Separately, the .NET Foundry evaluator path still auto-appends ToolCallAccuracy whenever tools are present, even when the caller explicitly provided a rubric-only evaluator list; that overrides explicit configuration in a way the Python implementation in this repo avoids.

I found one design issue in the new rubric-score extraction path: the helper says it defensively handles SDK shape variation, but its top-level fallback only works for dict samples. A typed SDK sample object that exposes dimension_scores or rubric_scores directly on the sample instance is silently treated as a non-rubric evaluator, so per-dimension scores disappear from EvalScoreResult.dimensions.

Flagged Issues

  • dotnet/src/Microsoft.Agents.AI.Foundry/Evaluation/FoundryEvals.cs:155-159 unconditionally appends ToolCallAccuracy when tools are present, overriding an explicit rubric-only evaluator list. The Python implementation only auto-adds tool evaluators when evaluators is None.
  • python/packages/foundry/agent_framework_foundry/_foundry_evals.py:534-555 — _extract_rubric_scores() only searches top-level rubric keys for dict samples. A typed SDK sample object exposing dimension_scores/rubric_scores directly (no properties wrapper) is silently treated as non-rubric, losing per-dimension scores.
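The first flagged issue comes down to when defaults are applied. A minimal Python sketch of the behavior the review describes for the Python side follows; the function name, the default list, and the evaluator name strings are all hypothetical and are not taken from the repository.

```python
# Hypothetical names for illustration only; the real evaluator identifiers
# and defaults live in the repository and may differ.
DEFAULT_EVALUATORS = ["relevance", "coherence"]
TOOL_EVALUATOR = "tool_call_accuracy"


def resolve_evaluators(evaluators, has_tools):
    """Return the evaluator list to run.

    An explicit list (even a rubric-only one) is returned unchanged.
    Tool evaluators are auto-added only when the caller passed None.
    """
    if evaluators is not None:
        return list(evaluators)
    resolved = list(DEFAULT_EVALUATORS)
    if has_tools:
        resolved.append(TOOL_EVALUATOR)
    return resolved
```

With this shape, a caller who passes only a rubric reference gets exactly that reference back, regardless of whether the agent has tools.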

Automated review by alliscode's agents

Comment thread python/packages/foundry/agent_framework_foundry/_foundry_evals.py
Comment thread python/packages/core/agent_framework/_evaluation.py
Comment thread python/packages/core/agent_framework/_evaluation.py
@alliscode alliscode force-pushed the dotnet-adaptive-evals branch from a7cef93 to 5e6bf02 Compare June 2, 2026 15:28
alliscode added a commit to alliscode/agent-framework that referenced this pull request Jun 2, 2026
Address PR microsoft#6267 review comments on the .NET FoundryEvals integration:

- Add source-compat overloads accepting `string[] evaluators` for `FoundryEvals` ctor, `EvaluateTracesAsync`, and `EvaluateFoundryTargetAsync` so existing callers passing string arrays keep compiling unchanged. New overloads forward via a private `ToSpecs` helper that wraps each name through the implicit `string -> FoundryEvaluatorSpec` conversion.

- Guard against `default(FoundryEvaluatorSpec)` entries (both `BuiltinName` and `GeneratedRef` null) that would NRE the downstream converter. Adds `FoundryEvaluatorSpec.IsValid` / `EnsureValid` plus an internal `EnsureAllSpecsValid` helper, wired into the main ctor and both static evaluation entry points.

- Add 6 unit tests covering the new validation surface.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
alliscode added a commit to alliscode/agent-framework that referenced this pull request Jun 2, 2026
PR microsoft#6267 review comment: the FoundryRubric sample swallowed the AssertDimensionScoreAtLeast failure, so a CI run that included it as a quality gate would still exit 0 even when the rubric regressed. Set `System.Environment.ExitCode = 1` in the catch so CI fails while still letting the rest of the sample's logging complete cleanly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
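The exit-code fix above is a general pattern: report the gate failure, then make sure the process exits non-zero. A rough Python analogue is sketched below. `run_gate` is a hypothetical helper, and the assumption that the gate raises `AssertionError` is mine, not something the PR states.

```python
import sys


def run_gate(gate) -> int:
    """Run a quality-gate callable and translate failure into an exit code.

    Assumes (hypothetically) that the gate signals failure by raising
    AssertionError. Returns 0 on success and 1 on failure.
    """
    try:
        gate()
    except AssertionError as exc:
        # Log the failure but do not swallow it: the caller gets a non-zero code.
        print(f"Quality gate failed: {exc}", file=sys.stderr)
        return 1
    return 0
```

A script would typically finish with `sys.exit(run_gate(my_gate))` so that CI sees the failure.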
alliscode added a commit to alliscode/agent-framework that referenced this pull request Jun 2, 2026
PR microsoft#6267 review comment: `_extract_rubric_scores` only searched the `properties` dict when the sample exposed one. When the Azure AI Projects typed SDK returns a Sample object that puts `dimension_scores` / `rubric_scores` directly on the instance (no `properties` wrapper), we missed them and surfaced no per-dimension scores.

Add an `else: containers.append(sample)` branch so non-dict typed samples are also inspected for the score keys. Covered by two new tests: one with `dimension_scores` directly on a typed Sample without a `properties` wrapper, and one with the legacy `rubric_scores` key in the same shape.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
alliscode added a commit to alliscode/agent-framework that referenced this pull request Jun 2, 2026
PR microsoft#6267 review comments: both assertion helpers shipped without unit tests. Add `TestAssertScoreAtLeast` (above threshold, below w/ offenders, evaluator filter, sub_results recursion) and `TestAssertNoFailedItems` (all passing, failed/errored statuses, sub_results recursion) with a shared `_score_results` fixture builder.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
alliscode and others added 5 commits June 2, 2026 10:15
Adds the core rubric-evaluator surface that mirrors the Python work in
PR microsoft#6101 (commit e45b934). Provider-agnostic types only, with no
Foundry coupling. Subsequent commits will wire these into FoundryEvals.

- RubricScore: per-dimension score record (Id, Score?, Applicable, Weight, Reason).
- EvalScoreResult.Dimensions: optional init-only list of RubricScore.
  Null for non-rubric (built-in) evaluators.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds the provider-agnostic surface for referencing a pre-existing rubric
evaluator and gating CI on per-item / per-dimension thresholds. Mirrors
Python PR microsoft#6101 commits e5830dd (ref type) and 4bc6046 (asserts).

- GeneratedEvaluatorRef: name + optional version/display-name, plus a
  Latest(name) factory for versionless refs (discouraged for CI; consumers
  should warn at run time).
- AgentEvaluationResults.AssertScoreAtLeast: walks DetailedItems[].Scores,
  optionally filtered by evaluator name, recurses into SubResults.
- AgentEvaluationResults.AssertDimensionScoreAtLeast: walks each score's
  Dimensions list, skips non-applicable dimensions by default, supports
  requireApplicable to flip that, recurses into SubResults.
- AgentEvaluationResults.AssertNoFailedItems: walks DetailedItems for
  fail/error statuses, recurses into SubResults.

All helpers throw InvalidOperationException (matches existing AssertAllPassed).
Truncates offender lists to the first 5 with a '+N more' suffix to keep
CI output readable, mirroring the Python helpers.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
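The dimension-gating walk described in this commit message can be sketched in Python as follows. This is an illustrative stand-in, not the repository's implementation: the dataclasses, the `AssertionError` exception type, and the treatment of `require_applicable` (counting non-applicable dimensions as failures) are assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class RubricScore:  # field names follow the commit message; the types are assumed
    id: str
    score: float | None
    applicable: bool = True
    weight: float = 1.0
    reason: str | None = None


@dataclass
class ScoreResult:  # hypothetical stand-in for EvalScoreResult
    evaluator: str
    dimensions: list[RubricScore] | None = None


@dataclass
class ItemResult:  # hypothetical stand-in for a detailed item with sub-results
    item_id: str
    scores: list[ScoreResult] = field(default_factory=list)
    sub_results: list[ItemResult] = field(default_factory=list)


def assert_dimension_score_at_least(items, dimension_id, threshold,
                                    require_applicable=False, max_listed=5):
    """Raise if any matching dimension scores below the threshold."""
    offenders = []

    def walk(item):
        for score in item.scores:
            for dim in score.dimensions or []:  # None means a built-in evaluator
                if dim.id != dimension_id:
                    continue
                if not dim.applicable:
                    # Skipped by default; assumed to count as a failure when required.
                    if require_applicable:
                        offenders.append(f"{item.item_id}: not applicable")
                    continue
                if dim.score is None or dim.score < threshold:
                    offenders.append(f"{item.item_id}: {dim.score}")
        for sub in item.sub_results:  # recurse into nested results
            walk(sub)

    for item in items:
        walk(item)

    if offenders:
        shown = ", ".join(offenders[:max_listed])
        extra = len(offenders) - max_listed
        suffix = f" (+{extra} more)" if extra > 0 else ""
        raise AssertionError(
            f"Dimension '{dimension_id}' below {threshold}: {shown}{suffix}")
```

Truncating the offender list keeps a failing CI log short while still reporting how many items were affected.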
Adds FoundryEvaluatorSpec, a readonly-struct union with implicit conversions
from both string and GeneratedEvaluatorRef so call sites can mix built-in
evaluator names with rubric evaluator references:

    var evals = new FoundryEvals(
        projectClient, model,
        new GeneratedEvaluatorRef("policy-rubric", "3"),
        FoundryEvals.Relevance,
        FoundryEvals.Coherence);

FoundryEvals constructors (3 overloads), EvaluateTracesAsync, and
EvaluateFoundryTargetAsync now take FoundryEvaluatorSpec[]/params instead of
string[]/params. Existing call sites using string literals or string[] keep
working unchanged via implicit conversion.

FoundryEvalConverter.BuildTestingCriteria emits the documented Foundry wire
format for rubric refs:
  {
    "type": "azure_ai_evaluator",
    "name": <DisplayName ?? Name>,
    "evaluator_name": <Name>,
    "evaluator_version": <Version>,   // omitted when null
    "initialization_parameters": { "deployment_name": <model> },
    "data_mapping": { conversation arrays, optional tool_definitions }
  }

WireTestingCriterion gains an optional EvaluatorVersion field. Rubric refs
are preserved through FilterToolEvaluators (tool-aware but not tool-required)
and ignored by FindMissingGroundTruthEvaluators. A versionless ref emits a
Trace.TraceWarning at criterion-build time so CI authors notice the floating
version (mirrors the Python warning).
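For readers following the Python side, the wire shape above could be assembled roughly as below. This is a sketch only: `build_rubric_criterion` is a hypothetical name, the caller supplies `data_mapping` because its exact keys are not spelled out here, and `warnings.warn` stands in for whatever warning mechanism the real code uses.

```python
import warnings


def build_rubric_criterion(name, model, data_mapping, version=None, display_name=None):
    """Build a testing-criterion dict for a rubric evaluator reference.

    Mirrors the field list in the commit message; evaluator_version is
    omitted entirely when no version is pinned.
    """
    if version is None:
        # A floating version can change between CI runs, so surface it loudly.
        warnings.warn(
            f"Rubric evaluator '{name}' has no pinned version; results may drift.",
            stacklevel=2,
        )
    criterion = {
        "type": "azure_ai_evaluator",
        "name": display_name or name,
        "evaluator_name": name,
        "initialization_parameters": {"deployment_name": model},
        "data_mapping": dict(data_mapping),
    }
    if version is not None:
        criterion["evaluator_version"] = version
    return criterion
```

Leaving the key out, rather than sending a null, matches the "omitted when null" note in the wire format.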

Adds 6 new Foundry unit tests (3 BuildTestingCriteria rubric paths, 1
FindMissingGroundTruthEvaluators, 1 FilterToolEvaluators preservation, 1
mixed-order). 369/369 Foundry tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…core

Adds FoundryEvals.ParseRubricScores, called per result inside ParseDetailedItem.
Each EvalScoreResult now populates Dimensions when the evaluator's sample carries
a rubric breakdown.

Accepts three shapes for forward compatibility with provider SDK iterations:

  1. sample.properties.dimension_scores  (canonical Foundry runtime shape)
  2. sample.properties.rubric_scores     (preview/legacy key)
  3. top-level sample.dimension_scores / sample.rubric_scores  (defensive fallback)

Entries missing 'id', 'weight', or 'applicable' are skipped without invalidating
well-formed siblings. Non-applicable dimensions may omit 'score' (parsed as null).

Adds 6 unit tests covering canonical and legacy keys, top-level fallback, no-match
returns null, malformed-entry skipping, and the non-applicable null-score path.
375/375 Foundry tests pass.
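The three accepted shapes and the skip-malformed rule can be illustrated with a small Python sketch. The function names and the returned dict layout are assumptions for illustration; the lookup order follows the numbered list above, and typed (non-dict) samples are handled via attribute access.

```python
def _get(obj, key):
    """Read a key from a dict or an attribute from a typed object."""
    if isinstance(obj, dict):
        return obj.get(key)
    return getattr(obj, key, None)


def _parse_entry(entry):
    """Return a parsed dimension, or None when a required field is missing."""
    required = {k: _get(entry, k) for k in ("id", "weight", "applicable")}
    if any(value is None for value in required.values()):
        return None
    # 'score' may legitimately be absent for non-applicable dimensions.
    return {**required, "score": _get(entry, "score"), "reason": _get(entry, "reason")}


def extract_rubric_scores(sample):
    """Return a list of parsed dimensions, or None if no rubric keys are found."""
    containers = []
    properties = _get(sample, "properties")
    if properties is not None:
        containers.append(properties)  # shapes 1 and 2
    containers.append(sample)          # shape 3: top-level fallback
    for container in containers:
        for key in ("dimension_scores", "rubric_scores"):
            raw = _get(container, key)
            if isinstance(raw, list):
                parsed = (_parse_entry(entry) for entry in raw)
                return [p for p in parsed if p is not None]
    return None
```

Returning None (rather than an empty list) when no rubric key exists lets callers distinguish a built-in evaluator from a rubric whose entries were all malformed.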

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds dotnet/samples/05-end-to-end/Evaluation/Evaluation_FoundryRubric mirroring
the Python evaluate_with_rubric_sample.py:

  - Fetches a pre-existing Foundry agent via AgentAdministrationClient
    (GetAgentAsync for latest, GetAgentVersionAsync when FOUNDRY_AGENT_VERSION
    is pinned).
  - References a rubric evaluator by GeneratedEvaluatorRef(name, version);
    falls back to GeneratedEvaluatorRef.Latest(name) with the documented
    floating-version warning.
  - Mixes the rubric with FoundryEvals.Relevance and FoundryEvals.Coherence
    in a single FoundryEvals run (implicit string-and-ref conversion).
  - Prints per-dimension breakdowns from EvalScoreResult.Dimensions for each
    item.
  - Demonstrates a CI quality gate with AssertDimensionScoreAtLeast("general_quality", 3.0).

Documents the FOUNDRY_PROJECT_ENDPOINT footgun (must be project-scoped URL
.../api/projects/<project>, not the bare Azure OpenAI endpoint) and the
Eval-Definition-vs-Rubric-Evaluator distinction in the README. Ships a
.env.example with the FOUNDRY_* variables.
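A quick pre-flight check for the endpoint footgun might look like the sketch below. It relies only on the `.../api/projects/<project>` path shape quoted above; `looks_project_scoped` is a hypothetical helper that is not part of the sample, and the host names in the usage note are placeholders.

```python
from urllib.parse import urlparse


def looks_project_scoped(endpoint: str) -> bool:
    """Return True when the URL path ends in /api/projects/<project>."""
    parts = [segment for segment in urlparse(endpoint).path.split("/") if segment]
    # Expect the last three path segments to be: api, projects, <project name>.
    return len(parts) >= 3 and parts[-3:-1] == ["api", "projects"]
```

For example, `looks_project_scoped("https://example.invalid/api/projects/demo")` is true, while a bare host such as `https://example.invalid/` is rejected.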

Registers the project in agent-framework-dotnet.slnx and cross-links from
the sibling Evaluation_Multimodal / Evaluation_ExpectedOutputs READMEs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@alliscode alliscode force-pushed the dotnet-adaptive-evals branch from 5564cd0 to 710a8a7 Compare June 2, 2026 17:21
@alliscode alliscode removed the python label Jun 2, 2026
@alliscode alliscode changed the title from "Python: Dotnet - Add support for Foundry Adaptive evals" to "Dotnet - Add support for Foundry Adaptive evals" Jun 2, 2026
alliscode and others added 4 commits June 2, 2026 10:36
Address PR microsoft#6267 review comments on the .NET FoundryEvals integration:

- Add source-compat overloads accepting `string[] evaluators` for `FoundryEvals` ctor, `EvaluateTracesAsync`, and `EvaluateFoundryTargetAsync` so existing callers passing string arrays keep compiling unchanged. New overloads forward via a private `ToSpecs` helper that wraps each name through the implicit `string -> FoundryEvaluatorSpec` conversion.

- Guard against `default(FoundryEvaluatorSpec)` entries (both `BuiltinName` and `GeneratedRef` null) that would NRE the downstream converter. Adds `FoundryEvaluatorSpec.IsValid` / `EnsureValid` plus an internal `EnsureAllSpecsValid` helper, wired into the main ctor and both static evaluation entry points.

- Add 6 unit tests covering the new validation surface.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PR microsoft#6267 review comment: the FoundryRubric sample swallowed the AssertDimensionScoreAtLeast failure, so a CI run that included it as a quality gate would still exit 0 even when the rubric regressed. Set `System.Environment.ExitCode = 1` in the catch so CI fails while still letting the rest of the sample's logging complete cleanly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PR microsoft#6267 review comment: `_extract_rubric_scores` only searched the `properties` dict when the sample exposed one. When the Azure AI Projects typed SDK returns a Sample object that puts `dimension_scores` / `rubric_scores` directly on the instance (no `properties` wrapper), we missed them and surfaced no per-dimension scores.

Add an `else: containers.append(sample)` branch so non-dict typed samples are also inspected for the score keys. Covered by two new tests: one with `dimension_scores` directly on a typed Sample without a `properties` wrapper, and one with the legacy `rubric_scores` key in the same shape.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PR microsoft#6267 review comments: both assertion helpers shipped without unit tests. Add `TestAssertScoreAtLeast` (above threshold, below w/ offenders, evaluator filter, sub_results recursion) and `TestAssertNoFailedItems` (all passing, failed/errored statuses, sub_results recursion) with a shared `_score_results` fixture builder.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@alliscode alliscode force-pushed the dotnet-adaptive-evals branch from 710a8a7 to db25a71 Compare June 2, 2026 17:39
@moonbox3 moonbox3 added the python label Jun 2, 2026
@github-actions github-actions Bot changed the title from "Dotnet - Add support for Foundry Adaptive evals" to "Python: Dotnet - Add support for Foundry Adaptive evals" Jun 2, 2026
…ic sample

The Azure AI Foundry rubric evaluator concept doc page has not yet been published, so the link in the sample README and Program.cs comment 404s. Drop the references until the upstream doc is live.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor

Copilot AI left a comment


Copilot's findings

  • Files reviewed: 20/20 changed files
  • Comments generated: 1

Comment on lines +28 to +31
string projectEndpoint = Environment.GetEnvironmentVariable("FOUNDRY_PROJECT_ENDPOINT_3")
?? throw new InvalidOperationException("FOUNDRY_PROJECT_ENDPOINT is not set.");
string model = Environment.GetEnvironmentVariable("FOUNDRY_MODEL_3")
?? throw new InvalidOperationException("FOUNDRY_MODEL is not set.");
@alliscode alliscode changed the title from "Python: Dotnet - Add support for Foundry Adaptive evals" to "Dotnet - Add support for Foundry Adaptive evals" Jun 2, 2026

Labels

documentation, .NET, python


3 participants