Skip to content

Fix #401: [Model] ComparativeContainment#662

Merged
GiggleLiu merged 11 commits intomainfrom
issue-401-comparative-containment
Mar 17, 2026
Merged

Fix #401: [Model] ComparativeContainment#662
GiggleLiu merged 11 commits intomainfrom
issue-401-comparative-containment

Conversation

@GiggleLiu
Copy link
Contributor

Summary

Fixes #401

@codecov
Copy link

codecov bot commented Mar 16, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 97.11%. Comparing base (302d54d) to head (171b96b).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #662      +/-   ##
==========================================
+ Coverage   97.09%   97.11%   +0.02%     
==========================================
  Files         292      294       +2     
  Lines       38984    39220     +236     
==========================================
+ Hits        37851    38088     +237     
+ Misses       1133     1132       -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@GiggleLiu
Copy link
Contributor Author

Implementation Summary

Changes

  • Added the new ComparativeContainment set model with One, i32, and f64 variants, schema registration, example-db coverage, library exports, and dedicated unit tests.
  • Added pred create ComparativeContainment support, including --r-sets / --s-sets, weight parsing for i32 and f64, /One validation, no-flag help output, and CLI regression tests.
  • Added the Typst paper entry and figure, then regenerated the checked-in schema, reduction-graph, and example-fixture exports.

Deviations from Plan

  • During review, I fixed two issues discovered after the initial implementation: the /f64 CLI path was still parsing weights as integers, and the schema-driven help path advertised --universe-size instead of the actual --universe flag.
  • I also rewrote one sentence in the paper entry to avoid implying reduction edges that are not yet present in the exported catalog.

Open Questions

  • This PR follows the existing weighted-model convention and does not add explicit positivity validation for non-unit weights. If maintainers want stricter enforcement for ComparativeContainment, that can be added consistently in a follow-up.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new ComparativeContainment set-based satisfaction problem to the core library and wires it through the CLI, example DB, and generated documentation so it can be created/serialized consistently like existing models.

Changes:

  • Introduces the ComparativeContainment<W> model (schema registration, variants, example-db spec, and unit tests).
  • Extends pred create with new flags (--r-sets/--s-sets/--r-weights/--s-weights) and help formatting tweaks.
  • Regenerates example fixtures and docs artifacts (schemas + reduction graph) and adds a paper entry.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
src/unit_tests/trait_consistency.rs Adds the new model to the trait-consistency smoke test suite.
src/unit_tests/models/set/comparative_containment.rs New unit tests covering creation, evaluation, brute-force solving, and serialization.
src/models/set/mod.rs Registers and re-exports the new set model; adds example-db specs hook.
src/models/set/comparative_containment.rs Implements the ComparativeContainment model, schema entry, variants, and example-db spec.
src/models/mod.rs Re-exports ComparativeContainment at the models module level.
src/lib.rs Re-exports ComparativeContainment via the crate prelude.
src/example_db/fixtures/examples.json Adds a canonical model example and updates regenerated fixtures.
problemreductions-cli/tests/cli_tests.rs Adds CLI tests for creating ComparativeContainment instances across weight variants.
problemreductions-cli/src/commands/create.rs Adds ComparativeContainment creation logic; generalizes set/weight parsing helpers; improves help flag naming.
problemreductions-cli/src/cli.rs Adds new create flags for ComparativeContainment and updates CLI help banner.
docs/src/reductions/reduction_graph.json Regenerates reduction graph nodes/indices including ComparativeContainment variants.
docs/src/reductions/problem_schemas.json Regenerates exported schema list to include ComparativeContainment.
docs/paper/reductions.typ Adds Comparative Containment entry to the paper (definition + example).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +57 to +76
/// Create a new instance with explicit weights.
pub fn with_weights(
universe_size: usize,
r_sets: Vec<Vec<usize>>,
s_sets: Vec<Vec<usize>>,
r_weights: Vec<W>,
s_weights: Vec<W>,
) -> Self {
assert_eq!(
r_sets.len(),
r_weights.len(),
"number of R sets and R weights must match"
);
assert_eq!(
s_sets.len(),
s_weights.len(),
"number of S sets and S weights must match"
);
validate_set_family("R", universe_size, &r_sets);
validate_set_family("S", universe_size, &s_sets);
Comment on lines +123 to +127
self.valid_config(config)
&& config
.iter()
.enumerate()
.all(|(element, &selected)| selected == 0 || set.contains(&element))
…01-comparative-containment

# Conflicts:
#	docs/paper/reductions.typ
#	docs/src/reductions/reduction_graph.json
#	problemreductions-cli/src/commands/create.rs
#	problemreductions-cli/tests/cli_tests.rs
#	src/lib.rs
#	src/models/mod.rs
#	src/unit_tests/trait_consistency.rs
…01-comparative-containment

# Conflicts:
#	docs/src/reductions/problem_schemas.json
#	docs/src/reductions/reduction_graph.json
#	problemreductions-cli/src/cli.rs
#	problemreductions-cli/src/commands/create.rs
#	src/example_db/fixtures/examples.json
#	src/unit_tests/trait_consistency.rs
@GiggleLiu
Copy link
Contributor Author

Agentic Review Report

Structural Check

Structural Review: model ComparativeContainment

Structural Completeness

# Check Status
1 Model file exists (src/models/set/comparative_containment.rs) PASS
2 inventory::submit! present PASS
3 #[derive(...Serialize, Deserialize)] on struct PASS
4 Problem trait impl PASS
5 SatisfactionProblem impl PASS
6 #[cfg(test)] + #[path = "..."] test link PASS
7 Test file exists (src/unit_tests/models/set/comparative_containment.rs) PASS
8 Test file has >= 3 test functions PASS — 13 test functions found
9 Registered in set/mod.rs PASS
10 Re-exported in models/mod.rs PASS
11 declare_variants! entry exists PASS — 3 variants (One, i32/default, f64), all marked sat
12 CLI resolve_alias entry PASS — uses catalog-based resolution via inventory::submit!
13 CLI create support PASS — full parsing for all 3 weight variants
14 Canonical model example registered PASS — via canonical_model_example_specs() in model file, chained through set/mod.rs into model_builders.rs; fixture present in examples.json
15 Paper display-name entry PASS — "ComparativeContainment": [Comparative Containment]
16 Paper problem-def block PASS — #problem-def("ComparativeContainment") with full example, data-driven diagram, and satisfying-solution enumeration

Build Status

  • make test: PASS — all tests passed, 0 failures
  • make clippy: PASS — no warnings with -D warnings

Semantic Review

  • evaluate() correctness: OK — For each candidate subset Y (encoded as a binary config), it computes R-weight sum and S-weight sum by iterating over sets and checking containment. Returns true if R-total >= S-total. Empty set correctly treated as contained in every set. Manual verification confirms correctness.
  • dims() correctness: OK — Returns vec![2; self.universe_size], matching the binary configuration space.
  • Size getter consistency: OK — universe_size(), num_r_sets(), num_s_sets() provided as inherent methods. Complexity string "2^universe_size" references valid getter.
  • Weight handling: OK — Weights managed via inherent methods. Weight validation uses Any downcasting for i32 and f64.
  • Variant configuration: OK — variant_params![W] with single weight dimension.

Issue Compliance

# Check Status
1 Problem name matches issue OK
2 Mathematical definition matches OK
3 Problem type (opt/sat) matches OK — Satisfaction, Metric = bool
4 Type parameters match OK — W: WeightElement
5 Configuration space matches OK — vec![2; universe_size]
6 Feasibility check matches OK
7 Objective function matches OK — r_total >= s_total
8 Complexity matches OK — "2^universe_size"

Summary

  • 16/16 structural checks passed
  • 8/8 issue compliance checks passed
  • 0 FAIL/ISSUE items found
  • Note: merge conflict with main in problemreductions-cli/src/commands/create.rs (merge logistics, not structural deficiency)

Quality Check

Design Principles

  • DRY: ISSUEis_valid_solution() (line 152-157) and evaluate() (line 190-195) in src/models/set/comparative_containment.rs contain identical logic: both pattern-match on (r_weight_sum, s_weight_sum) and return r_total >= s_total. The is_valid_solution method should delegate to the shared weight comparison (or just call evaluate if the trait bound can be satisfied). If the different trait bounds make delegation impractical, at least extract the shared comparison into a private method that both call.

  • DRY: OK — The validate_set_family and validate_weight_family helper functions are well-factored and reused for both R and S families. The sum_containing_weights helper avoids duplicating the weight-accumulation loop. CLI helper functions properly generalize existing patterns.

  • KISS: ISSUE — The validate_weight_family function (line 224-239) uses dyn Any downcasting to validate weight positivity at runtime. This approach is brittle — adding a new weight type would silently skip validation. A trait-based approach would be more idiomatic. That said, the current weight type set (One, i32, f64) is closed, so it works correctly even if inelegant.

  • KISS: OK — Overall model structure is straightforward. Clean separation of concerns.

  • HC/LC: OK — Each function has a single responsibility. No god functions.

HCI (CLI/MCP changed)

  • Error messages: OK — Actionable error messages with usage examples.
  • Discoverability: OK--help table documents required flags.
  • Consistency: OK--r-sets/--s-sets/--r-weights/--s-weights naming follows model conventions.
  • Least surprise: OK — Default weights are all-ones when flags are omitted.
  • Feedback: OK--universe flag naming tested to avoid confusion.

Test Quality

  • Naive test detection: ISSUEr_weight_sum() and s_weight_sum() public methods (lines 142-149) are never directly tested. Only exercised indirectly through evaluate() and BruteForce::find_all_satisfying().

  • Validation coverage gap: ISSUE — Only the R-family side of i32 weight validation is tested (test_comparative_containment_rejects_nonpositive_i32_weights). No #[should_panic] test for S-family with nonpositive i32 weights, nor for S-family count mismatch.

  • Assert count: OK — Tests have sufficient assertions across both yes/no instances.

  • Adversarial cases: OK — Tests include invalid configs, no-instance, nonpositive weights, non-finite weights, out-of-range elements. Paper example verifies exact solution count (3).

  • Instance sizes: OK — yes_instance uses universe_size=4 with 2 R-sets and 2 S-sets (16 configurations to enumerate).

Issues

Important (Should Fix):

  1. DRY: is_valid_solution duplicates evaluatesrc/models/set/comparative_containment.rs:152-157 and :190-195 contain identical logic. Extract shared comparison into a private method or delegate.

  2. KISS: validate_weight_family uses dyn Any downcastingsrc/models/set/comparative_containment.rs:224-239. Brittle pattern; adding a new weight type silently skips validation. Consider trait-based approach or at minimum a documenting comment.

  3. Test gap: r_weight_sum/s_weight_sum not directly testedsrc/unit_tests/models/set/comparative_containment.rs. Direct value assertions would catch bugs masked by the comparison in evaluate.

  4. Test gap: S-family i32 validation and S-family count mismatch not tested — Missing #[should_panic] tests for nonpositive i32 weight on S side and mismatched S-set/S-weight count.

Minor (Nice to Have):

  1. validate_weight_family silently passes for unknown types — Not a current bug but could become one.

  2. Redundant use std::any::Any import — Only needed for the downcasting pattern.

  3. No test for empty set families — Edge case: r_sets = vec![] or s_sets = vec![] not tested.


Agentic Feature Tests

Summary

Overall verdict: PASS — no issues found. All CLI workflows function correctly from a downstream user's perspective.

Tests Performed

# Test Result
1 Discoverability (pred list) PASS — All 3 variants listed, i32 marked default, correct complexity
2 Problem Information (pred show) PASS — 5 fields with correct types, 0 reductions (expected for new model)
3 Example Creation (pred create --example) PASS — Canonical example produces valid JSON
4 Custom Instance Creation (pred create ...) PASS — All weight variants, default weights, error handling verified
5 Solving (pred solve) PASS — Correct solutions for yes/no instances across all weight variants
6 Reductions (pred reduce) N/A — No reductions exist (expected for new isolated model)
7 Serialization Round-Trip PASS — JSON output from pred create correctly consumed by pred solve
8 Test Suite PASS — 13 unit tests + 6 CLI integration tests pass
9 Paper Entry PASS — Complete problem-def with example, diagram, and solution enumeration

Semantic Correctness Verification

Manually verified evaluate() semantics:

  • "Containing" relation correctly implemented: every selected element of Y must appear in set S
  • Empty set Y correctly handled (contained in every set)
  • Weight sums and >= comparison correct
  • Canonical example [1,0,0,0] (Y={0}) verified: R-sum=7 >= S-sum=3
  • No-instance verified: all 4 configs checked, none satisfying

Issues Found

None. The implementation is complete and correct from a user perspective.


Generated by review-pipeline

GiggleLiu and others added 2 commits March 18, 2026 00:28
- evaluate() now delegates to is_valid_solution() instead of duplicating
  the r_weight_sum >= s_weight_sum comparison
- Add missing #[should_panic] test for S-family i32 nonpositive weights

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@GiggleLiu
Copy link
Contributor Author

Follow-up items (recorded during final review):

  • validate_weight_family in src/models/set/comparative_containment.rs uses dyn Any downcasting for weight validation. If a new weight type is added to WeightElement, validation silently passes. Consider adding trait-based weight validation (e.g., fn validate_positive(&self) on WeightElement) as a cross-cutting improvement across all models that validate weights.

GiggleLiu and others added 2 commits March 18, 2026 01:20
…e coverage

- Replace `validate_weight_family` dyn Any downcasting with WeightElement
  trait: use `weight.to_sum() > zero` via `partial_cmp`, which correctly
  rejects non-positive and NaN values for all weight types
- Remove `use std::any::Any` import
- Tighten impl block bound from `W: 'static` to `W: WeightElement`
- Add direct r_weight_sum/s_weight_sum test with value assertions
- Add S-family weight count mismatch should_panic test
- Update should_panic messages to match unified validation error format

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…elper

The empty_args() helper in create.rs tests constructs CreateArgs
manually and was missing the r_sets/s_sets/r_weights/s_weights fields
added by this PR, causing a compilation error in CI coverage builds.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@GiggleLiu GiggleLiu merged commit 9640b88 into main Mar 17, 2026
5 checks passed
@GiggleLiu GiggleLiu deleted the issue-401-comparative-containment branch March 17, 2026 17:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Model] ComparativeContainment

2 participants