
Incorporate determinism into GPU heuristics + parallel deterministic B&B+GPU exploration #986

Open
aliceb-nv wants to merge 339 commits into NVIDIA:release/26.04 from aliceb-nv:gpudet

Conversation

@aliceb-nv
Contributor

@aliceb-nv aliceb-nv commented Mar 23, 2026

This PR implements the integration of GPU heuristics into the deterministic codepath of the solver. This enables running full end-to-end solves with presolve, parallel B&B, and GPU heuristics, with full run-to-run determinism and heterogeneous (CPU+GPU) execution.

For this purpose, the GPU heuristics were modified to eliminate sources of nondeterminism such as unordered floating-point reductions, atomic operations, and scheduling-dependent operations. Extensive testing has been added to the CI runs to ensure that individual components (such as probing, FJ, recombiners, and FP) are bitwise deterministic, and that their overall orchestration maintains this invariant.
Furthermore, to reduce changes to the codebase and splits in the codepath, wall-clock timers on the heuristics side were replaced with unified termination-checking objects that check against wall time (opportunistic mode) or total work (deterministic mode).
These checkers are hierarchical, and thus allow checking against multiple termination sources at once: for example, a local algorithm's work budget, the global wall-clock time limit, or, in future PRs, user-controlled termination signals such as Ctrl-C or callback handlers.
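
The hierarchical composition described above can be sketched as follows. This is an illustrative stand-in, not the actual cuOpt `termination_checker_t`: it only models the deterministic (work-unit) side, and the class, field, and method names are assumptions.

```cpp
#include <cassert>

// Sketch of a hierarchical termination checker (names are illustrative).
// In deterministic mode a checker counts abstract work units; a child
// terminates when its own budget is exhausted OR any ancestor says stop,
// so a local algorithm's budget composes with the global limit.
class termination_checker_t {
 public:
  explicit termination_checker_t(double work_budget,
                                 const termination_checker_t* parent = nullptr)
    : work_budget_(work_budget), parent_(parent) {}

  void add_work(double units) { work_done_ += units; }

  bool should_terminate() const {
    if (work_done_ >= work_budget_) return true;
    // Walk up the hierarchy: a global limit overrides any local budget.
    return parent_ != nullptr && parent_->should_terminate();
  }

 private:
  double work_budget_;
  double work_done_{0.0};
  const termination_checker_t* parent_;
};
```

In opportunistic mode the same interface would compare `std::chrono` wall time against a deadline instead of counting work units, which is what lets the two modes share one codepath.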

B&B acts as the authority for publishing incumbents to the user. At every horizon sync, the queue of GPU heuristic solutions is checked; solutions whose timestamp is <= the current B&B work total are drained, replayed, and incorporated into the tree exploration.
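
The drain rule can be sketched as below. The types and field names are hypothetical, not the real solver structures; the point is that eligibility depends only on the deterministic work-unit clock, never on wall-clock arrival order.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Hypothetical heuristic solution stamped with the work-unit clock value
// at which it was produced.
struct timestamped_solution_t {
  double work_timestamp;
  double objective;
};

struct cmp_by_timestamp {
  bool operator()(const timestamped_solution_t& a,
                  const timestamped_solution_t& b) const {
    return a.work_timestamp > b.work_timestamp;  // min-heap on timestamp
  }
};

using heuristic_queue_t =
  std::priority_queue<timestamped_solution_t,
                      std::vector<timestamped_solution_t>,
                      cmp_by_timestamp>;

// Drain every solution whose timestamp is <= the current B&B work total.
// Later solutions wait for a future horizon sync, so the replay order is
// a pure function of work units, independent of GPU/CPU scheduling.
std::vector<timestamped_solution_t> drain_ready(heuristic_queue_t& q,
                                                double bnb_work_total) {
  std::vector<timestamped_solution_t> ready;
  while (!q.empty() && q.top().work_timestamp <= bnb_work_total) {
    ready.push_back(q.top());
    q.pop();
  }
  return ready;
}
```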

A new get_solution_callback_ext callback has been introduced to pass more metadata with each published solution while preventing ABI breaks. For now, only the solution origin and its work-unit timestamp are exposed, but additional fields may be added later without breaking the ABI.
The benchmark runner now relies on this callback to record incumbents and their timestamps; it is no longer necessary to parse the logs to compute the primal integral.
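
One common pattern for an ABI-extensible "_ext" callback is sketched below. This is not the actual cuOpt C API; the struct, its fields, and `has_field` are all illustrative assumptions showing how metadata can grow without breaking old clients.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical extended-callback payload: the first field records the
// struct size as seen by the solver, so the solver can append fields in
// later versions and older callbacks only read what they know about.
struct solution_info_ext_t {
  std::size_t struct_size;   // set by the solver to sizeof(its struct)
  int origin;                // e.g. which heuristic produced the solution
  double work_timestamp;     // work-unit clock at publication time
  // future fields are appended here; existing offsets never move
};

typedef void (*get_solution_callback_ext_t)(const double* solution,
                                            const solution_info_ext_t* info,
                                            void* user_data);

// Client-side guard before touching a field that may postdate the solver
// the client was linked against.
bool has_field(const solution_info_ext_t* info, std::size_t field_end_offset) {
  return info->struct_size >= field_end_offset;
}
```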

A few other bugfixes to the existing deterministic B&B exploration are included, concerning node queues. Other fixes address uninitialized memory accesses and allow running compute-sanitizer initcheck without false positives. Determinism logs, disabled by default, have also been added throughout the code.

Some features remain unsupported in deterministic mode, such as Papilo's probing presolver, the concurrent root solve, RINS, and the sub-MIP recombiner. These will be incorporated in later PRs.
Furthermore, the ML estimators have been taken out of this PR to reduce the code-review workload. They will be reintroduced in later PRs, along with improved tuning.
No intentional API- or ABI-breaking changes are introduced.

Opportunistic mode performance remains unchanged:
Primal gap 12.5%, 226 feasible, primal integral 19.1%.

Deterministic mode benchmarks pending.

Closes #144
Closes #882

Forward-merge release/26.04 into main
@github-actions

github-actions bot commented Apr 2, 2026

🔔 Hi @anandhkb, this pull request has had no activity for 7 days. Please update or let us know if it can be closed. Thank you!

If this is an "epic" issue, then please add the "epic" label to this issue.
If it is a PR and not ready for review, then please convert this to draft.
If you just want to switch off this notification, then use the "skip inactivity reminder" label.

Forward-merge release/26.04 into main
@chris-maes chris-maes modified the milestones: 26.04, 26.06 Apr 2, 2026
@chris-maes chris-maes removed the P2 label Apr 2, 2026
mlubin and others added 10 commits April 3, 2026 18:20
These tests are expensive, currently broken, and aren't covered under CI. New contributors will be confused if they see failing tests, so they should be excluded from the instructions.

Authors:
  - Miles Lubin (https://github.com/mlubin)

Approvers:
  - Hugo Linsenmaier (https://github.com/hlinsen)
  - Ramakrishnap (https://github.com/rgsl888prabhu)

URL: NVIDIA#1034
Forward-merge release/26.04 into main
Forward-merge release/26.04 into main
Forward-merge release/26.04 into main
Forward-merge release/26.04 into main
Forward-merge release/26.04 into main

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 15

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu (1)

1484-1494: ⚠️ Potential issue | 🟠 Major

A finite work budget still overruns by one sampling window.

Line 1492 only executes inside the iterations % 100 == 0 block, so work_budget == 0 or any small finite budget still allows up to one extra 100-iteration batch before the loop stops. If this limit is meant to be authoritative, it needs an upfront check or a tighter sampling interval while a finite budget is active.
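
The suggested fix can be sketched as below. This is a simplified stand-in for the real fj_cpu loop (the state struct and counters are illustrative): the budget check moves to the top of every iteration, while the cheaper progress sync stays at the 100-iteration sampling interval.

```cpp
#include <cassert>
#include <cstdint>

// Simplified model of the FJ CPU loop with an authoritative budget check.
struct fj_loop_state_t {
  std::int64_t iterations{0};
  std::int64_t work_units_elapsed{0};
  std::int64_t work_budget{0};
  int syncs{0};  // stand-in for memory_aggregator.collect() / notify_progress()
};

void run_loop(fj_loop_state_t& s) {
  while (true) {
    // Checked every iteration, so work_budget == 0 performs zero iterations
    // instead of overrunning by one 100-iteration sampling window.
    if (s.work_units_elapsed >= s.work_budget) break;
    ++s.iterations;
    ++s.work_units_elapsed;  // stand-in for real per-iteration work accounting
    if (s.iterations % 100 == 0) {
      ++s.syncs;  // expensive bookkeeping stays at the sampling interval
    }
  }
}
```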

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu` around lines 1484 - 1494,
The loop currently only checks the finite budget every 100 iterations (the if
(fj_cpu.iterations % 100 == 0 ...) block), allowing up to one extra
100-iteration batch to overrun a small work_budget; fix this by adding an
authoritative budget check at the start of the loop using
fj_cpu.work_units_elapsed.load(std::memory_order_relaxed) >= fj_cpu.work_budget
and break early if true (so the budget is enforced on every iteration), and keep
the existing 100-iteration sampling (memory_aggregator.collect(), biased_work
update, producer_sync->notify_progress()) as-is.
cpp/src/mip_heuristics/local_search/local_search.cu (1)

264-317: ⚠️ Potential issue | 🟠 Major

Return the merged solution's feasibility, not just cpu_feasible.

After the merge, solution can be feasible because the GPU path found one even when cpu_feasible is false. run_fj_on_zero() forwards this return value directly, so it can miss a valid zero-seeded incumbent and keep going into FP unnecessarily.

💡 Minimal fix
-  solution.compute_feasibility();
-
-  return cpu_feasible;
+  const bool merged_feasible = solution.compute_feasibility();
+  return merged_feasible;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/local_search/local_search.cu` around lines 264 - 317,
The function currently returns cpu_feasible which ignores the merged solution
state after possibly copying the CPU solution into solution; change the return
to reflect the final merged feasibility by returning solution.get_feasible() (or
assign a merged_feasible = solution.get_feasible() after the final
solution.compute_feasibility() and return that). Update the final return
statement (replace "return cpu_feasible;") in the local_search.cu block handling
CPU/GPU merge so callers receive the merged solution's true feasibility.
🧹 Nitpick comments (15)
cpp/CMakeLists.txt (2)

546-556: Convert the commented PAPI block into an option-gated implementation or remove it.

This dead commented block will drift and confuse future maintainers. PAPI is not referenced elsewhere in the CMake configuration, so converting it behind an explicit option(...) or deleting it entirely is safe.

Proposed option-gated implementation
-# find_path(PAPI_INCLUDE_DIR papi.h)
-# find_library(PAPI_LIBRARY papi)
-#
-# if (PAPI_INCLUDE_DIR AND PAPI_LIBRARY)
-#     message(STATUS "Found PAPI in ${PAPI_INCLUDE_DIR}")
-#     target_include_directories(cuopt PRIVATE ${PAPI_INCLUDE_DIR})
-#     target_link_libraries(cuopt PRIVATE ${PAPI_LIBRARY})
-# else()
-#     message(FATAL_ERROR "Could not find PAPI")
-# endif()
+option(CUOPT_ENABLE_PAPI "Enable PAPI integration" OFF)
+if(CUOPT_ENABLE_PAPI)
+  find_path(PAPI_INCLUDE_DIR papi.h)
+  find_library(PAPI_LIBRARY papi)
+  if(PAPI_INCLUDE_DIR AND PAPI_LIBRARY)
+    message(STATUS "Found PAPI in ${PAPI_INCLUDE_DIR}")
+    target_include_directories(cuopt PRIVATE ${PAPI_INCLUDE_DIR})
+    target_link_libraries(cuopt PRIVATE ${PAPI_LIBRARY})
+  else()
+    message(FATAL_ERROR "CUOPT_ENABLE_PAPI=ON but PAPI was not found")
+  endif()
+endif()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/CMakeLists.txt` around lines 546 - 556, The commented PAPI detection
block should be either removed or converted to an option-gated implementation:
add a CMake option like ENABLE_PAPI (default OFF) and wrap the existing
find_path/find_library logic in if(ENABLE_PAPI) ... endif(), using the same
variables PAPI_INCLUDE_DIR and PAPI_LIBRARY and applying
target_include_directories(cuopt PRIVATE ${PAPI_INCLUDE_DIR}) and
target_link_libraries(cuopt PRIVATE ${PAPI_LIBRARY}) when found, and emit a
clear message(FATAL_ERROR "...") only when ENABLE_PAPI is ON and the libraries
are missing; alternatively delete the commented block entirely if PAPI support
is not needed.

666-667: Guard -pie with platform/capability checks and use CMake's linker form.

The unconditional -pie flag (line 681) lacks platform guards. The -pie flag is primarily for Linux ELF binaries; on Windows and macOS, it may be ineffective or cause issues. Use CheckPIESupported, guard with CMAKE_SYSTEM_NAME STREQUAL "Linux", and switch to LINKER:-pie for proper linker driver handling.

Proposed change
+include(CheckPIESupported)
+check_pie_supported()
+
 set_target_properties(cuopt_cli
   PROPERTIES
   CXX_STANDARD 20
   CXX_STANDARD_REQUIRED ON
   CXX_SCAN_FOR_MODULES OFF
   POSITION_INDEPENDENT_CODE ON
 )
 
-target_link_options(cuopt_cli PRIVATE -pie)
+if(CMAKE_SYSTEM_NAME STREQUAL "Linux")
+  target_link_options(cuopt_cli PRIVATE "LINKER:-pie")
+endif()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/CMakeLists.txt` around lines 666 - 667, The CMakeLists currently adds an
unconditional -pie flag; update it to only add PIE when supported and
appropriate by calling CMake's CheckPIESupported (or using CMake's PIE support
macros), guard with a platform check such as CMAKE_SYSTEM_NAME STREQUAL "Linux"
(or the CheckPIESupported result), and add the flag via the
target_link_options/linker form (e.g., LINKER:-pie) instead of raw compile
flags; locate the section that mentions "-pie" and replace the unconditional
addition with a conditional that queries CheckPIESupported and CMAKE_SYSTEM_NAME
and then appends LINKER:-pie to the target_link_options for the relevant
targets.
cpp/src/mip_heuristics/problem/presolve_data.cu (1)

225-228: Make the problem argument const here.

This implementation only reads problem.handle_ptr, but the non-const reference now forces problem_t<i_t, f_t>::papilo_uncrush_assignment(...) to const_cast *this at the call site. Changing this to const problem_t<i_t, f_t>& keeps the API const-correct and removes that footgun.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/problem/presolve_data.cu` around lines 225 - 228,
Change the parameter type of presolve_data_t<i_t,f_t>::papilo_uncrush_assignment
from a non-const reference to a const reference (i.e., const problem_t<i_t,f_t>&
problem) so the method is const-correct; update the function signature
accordingly and ensure all uses inside (e.g., accessing problem.handle_ptr)
remain reads-only so no other changes are required, then remove any call-site
const_casts that were compensating for the non-const parameter.
cpp/src/mip_heuristics/solver_solution.cu (1)

249-255: Make pool_hash reflect the pool order.

XOR collapses permutations and duplicate pairs, so distinct solution pools can emit the same provenance hash. If this field is meant to fingerprint the exact pool state, hash the sequence of per-solution hashes instead.

Possible adjustment
   uint32_t pool_hash = 0;
+  std::vector<uint32_t> pool_hashes;
+  pool_hashes.reserve(solution_pool_.size());
   for (const auto& pool_sol : solution_pool_) {
     if (pool_sol.size() > 0) {
       auto host_pool_sol = cuopt::host_copy(pool_sol, rmm::cuda_stream_default);
-      pool_hash ^= detail::compute_hash(host_pool_sol);
+      pool_hashes.push_back(detail::compute_hash(host_pool_sol));
+    } else {
+      pool_hashes.push_back(0);
     }
   }
+  if (!pool_hashes.empty()) { pool_hash = detail::compute_hash(pool_hashes); }

Also applies to: 261-274

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/solver_solution.cu` around lines 249 - 255, The
current XOR of per-solution hashes (pool_hash ^= detail::compute_hash(...))
collapses order and duplicates; instead compute a sequence-aware combined hash
by iterating solution_pool_, calling cuopt::host_copy for each pool_sol and
computing per_hash = detail::compute_hash(host_pool_sol), then fold per_hash
into pool_hash using an order-sensitive combiner (e.g., pool_hash = pool_hash *
PRIME + per_hash or use a std::hash_combine-style mix) so the final pool_hash
reflects the sequence and content; apply the same change to the analogous block
around the region noted (lines 261-274) so both places use the ordered
combination instead of XOR.
cpp/src/mip_heuristics/presolve/bounds_presolve.cu (1)

170-177: Avoid making the deterministic cap sticky across future calls.

Clamping settings.iteration_limit in-place mutates the stored solver settings, so later invocations can no longer distinguish the user-configured limit from this deterministic cap. Use a local loop bound instead.

♻️ Suggested refactor
-  if ((context.settings.determinism_mode & CUOPT_DETERMINISM_GPU_HEURISTICS)) {
-    settings.iteration_limit = std::min(settings.iteration_limit, 50);
-  }
+  i_t iter_limit = settings.iteration_limit;
+  if ((context.settings.determinism_mode & CUOPT_DETERMINISM_GPU_HEURISTICS)) {
+    iter_limit = std::min(iter_limit, i_t{50});
+  }
 
   i_t iter;
   upd.init_changed_constraints(pb.handle_ptr);
-  for (iter = 0; iter < settings.iteration_limit; ++iter) {
+  for (iter = 0; iter < iter_limit; ++iter) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/presolve/bounds_presolve.cu` around lines 170 - 177,
The code mutates the stored solver settings by clamping settings.iteration_limit
in-place; instead, compute a local loop bound and leave settings untouched. In
bound_presolve_t<i_t, f_t>::bound_update_loop, check
context.settings.determinism_mode & CUOPT_DETERMINISM_GPU_HEURISTICS and set a
local variable (e.g. max_iterations or loop_limit) to
std::min(settings.iteration_limit, 50) and use that local variable for the
loop/termination checks rather than assigning back to settings.iteration_limit
so future calls see the original user-configured value.
cpp/include/cuopt/linear_programming/cuopt_c.h (1)

786-797: Document the extended callback contract in the public header.

Please spell out whether cuOptSetMIPGetSolutionCallbackExt replaces or coexists with cuOptSetMIPGetSolutionCallback, and which thread invokes it. Users need that to avoid duplicate processing and to synchronize shared state safely.

As per coding guidelines, "cpp/include/cuopt/**/*: Flag API changes that may need corresponding docs/ updates" and "Suggest documenting thread-safety, GPU requirements, and numerical behavior".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/include/cuopt/linear_programming/cuopt_c.h` around lines 786 - 797,
Update the public header documentation for cuOptSetMIPGetSolutionCallbackExt to
explicitly state its contract: clarify whether cuOptSetMIPGetSolutionCallbackExt
replaces or coexists with cuOptSetMIPGetSolutionCallback (e.g., "replaces
existing callback" or "coexists; both may be invoked"), name which
thread/context invokes the callback (e.g., solver main thread, solver worker
thread, or a dedicated callback thread), and enumerate thread-safety and
synchronization requirements (reentrancy, whether the callback may be called
concurrently, and whether callers must protect shared state). Also document
GPU/CPU context and numerical behavior expectations (e.g., pointers/arrays are
host memory, any lifetime guarantees for solution buffers, floating-point
formats/scaling), and add a brief note that API changes need corresponding docs
updates per the repository guidelines; reference
cuOptSetMIPGetSolutionCallbackExt and cuOptSetMIPGetSolutionCallback in the
wording so readers can find both symbols.
cpp/src/branch_and_bound/pseudo_costs.hpp (1)

362-367: Document the distinction from queue_update.

record_update() only records the update without mutating the snapshot's sum_*/num_* arrays, unlike queue_update() which does both. A brief doc comment would clarify this intentional difference.

   // Record an update that was already applied to the arrays (e.g. by strong branching).
+  // Unlike queue_update(), this does NOT modify sum_*/num_* arrays - only records for later replay.
   void record_update(
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/branch_and_bound/pseudo_costs.hpp` around lines 362 - 367, Add a
short doc comment above record_update explaining it only appends an update to
updates_ without modifying the snapshot's aggregation arrays (the sum_*/num_*
members), in contrast to queue_update which both records and applies the change
to those arrays; mention the parameters (variable, direction, delta, clock,
worker_id) to make the intent explicit so callers know to use queue_update when
they need the snapshot's sums/counts updated.
cpp/tests/mip/local_search_test.cu (2)

68-68: Empty stub function.

setup_device_symbols is an empty function that casts its parameter to void. If this is a placeholder for future implementation, consider adding a TODO comment. If it's no longer needed, remove it.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/tests/mip/local_search_test.cu` at line 68, The function
setup_device_symbols(rmm::cuda_stream_view stream_view) is an empty stub; either
implement the intended device-symbol initialization logic or remove the function
and its declarations/usages; if you intend to keep it as a placeholder, add a
clear TODO comment explaining what must be implemented and why the parameter is
currently unused (or mark it [[maybe_unused]]), and update any callers or
headers referencing setup_device_symbols accordingly to avoid dead code or
unused-parameter warnings.

196-197: Clarify purpose of spin_stream_raii_t objects.

Two spin_stream_raii_t objects are created but their relationship to the determinism test isn't obvious from the code. Consider adding a comment explaining their purpose (e.g., stress-testing CUDA stream scheduling for determinism).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/tests/mip/local_search_test.cu` around lines 196 - 197, The two
spin_stream_raii_t instances (spin_stream_1 and spin_stream_2) are created
without explanation; add a concise comment above their declarations clarifying
they spawn additional CUDA streams that spin to stress test stream scheduling
and ensure the determinism test exercises concurrent stream behavior (i.e.,
stress-testing CUDA stream scheduling for determinism), so future readers
understand why these dummy/spinning streams are present.
cpp/src/mip_heuristics/diversity/recombiners/sub_mip.cuh (2)

71-72: Consider documenting or extracting the work-unit scaling constant.

The 1e8 divisor is a magic number. The TODO comment suggests this is provisional. Consider extracting this to a named constant or documenting the rationale for this specific scale factor to aid future maintenance.

+// Work unit scale: one work unit per 100M variables from the other parent
+constexpr double SUB_MIP_WORK_SCALE = 1e8;
 // TODO: CHANGE
-double work = static_cast<double>(n_vars_from_other) / 1e8;
+double work = static_cast<double>(n_vars_from_other) / SUB_MIP_WORK_SCALE;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/diversity/recombiners/sub_mip.cuh` around lines 71 -
72, Replace the magic divisor 1e8 used when computing work (double work =
static_cast<double>(n_vars_from_other) / 1e8) with a named constant (e.g.,
kWorkUnitScale or WORK_UNIT_SCALE) and add a short comment explaining the
rationale or units for the scale; update any surrounding TODO to remove
ambiguity and use the new constant where work or similar scaling is needed
referencing the symbol n_vars_from_other and the variable work.

203-203: Using std::isnan to detect valid solutions may be fragile.

Relying on !std::isnan(branch_and_bound_solution.objective) to determine success assumes the objective is initialized to NaN when no solution is found. Consider checking branch_and_bound_status directly for a more explicit success condition.

-return std::make_tuple(offspring, !std::isnan(branch_and_bound_solution.objective), work);
+bool found_solution = (branch_and_bound_status != dual_simplex::mip_status_t::UNSET &&
+                       branch_and_bound_status != dual_simplex::mip_status_t::INFEASIBLE);
+return std::make_tuple(offspring, found_solution, work);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/diversity/recombiners/sub_mip.cuh` at line 203, The
return currently uses !std::isnan(branch_and_bound_solution.objective) to
indicate success; instead use the explicit branch_and_bound_status value to
determine success. Update the return in the function that returns
std::make_tuple(offspring, ..., work) to replace the second element with a
boolean derived from branch_and_bound_status (e.g., branch_and_bound_status ==
<success-state> or simply branch_and_bound_status) so it no longer relies on
NaN-initialization of branch_and_bound_solution.objective; reference the
variables branch_and_bound_solution, branch_and_bound_status, offspring, and
work when making this change.
cpp/src/mip_heuristics/solver_context.cuh (1)

72-74: Null termination pointer could cause issues if not set before use.

The termination pointer is initialized to nullptr and expected to be set by mip_solver_t after construction. Consider adding a getter that asserts non-null, or document more prominently that callers must set this before using any timer functionality that depends on it.

+  cuopt::termination_checker_t& get_termination() const
+  {
+    cuopt_assert(termination != nullptr, "termination must be set before use");
+    return *termination;
+  }
+
   // Root termination checker — set by mip_solver_t after construction.
   // All sub-timers should use this as parent for wall-clock safety.
   cuopt::termination_checker_t* termination{nullptr};
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/solver_context.cuh` around lines 72 - 74, The
termination pointer (termination in solver_context.cuh) may remain nullptr until
mip_solver_t sets it, so add a safe accessor on the solver context (e.g.,
get_termination() or termination_checked()) that asserts or throws if
termination is null and returns the non-null pointer; update any callers that
read termination (timers or sub-timers) to use this accessor instead of
accessing the raw termination member directly, or alternatively change the
member to a non-null reference/init-on-construction pattern so the compiler
enforces it. Ensure the accessor name (get_termination/termination_checked) and
mip_solver_t usage are the unique symbols referenced so the fix is easy to
locate.
cpp/src/mip_heuristics/local_search/feasibility_pump/feasibility_pump.cu (1)

32-39: Disabled determinism logging macro.

The #if 0 block disables verbose determinism logging. Consider using a runtime flag or build option instead of commented-out code, or remove if this is only for temporary debugging.
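
A runtime-flag variant of this suggestion could look like the following sketch. The type and field names are illustrative, not the actual cuOpt logging machinery; the point is that the gate becomes data rather than commented-out code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Sketch: determinism logging gated by a runtime flag instead of `#if 0`.
// `lines` stands in for the real log sink (e.g. CUOPT_LOG_INFO).
struct determinism_logger_t {
  bool enabled{false};  // disabled by default; flipped by an env var or setting
  std::vector<std::string> lines;

  void log(const std::string& msg) {
    if (enabled) lines.push_back(msg);
  }
};
```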

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/local_search/feasibility_pump/feasibility_pump.cu`
around lines 32 - 39, The disabled determinism logging block leaving
CUOPT_DETERMINISM_LOG wrapped in #if 0 should be replaced with a controllable
mechanism: either remove the block if it's temporary, or guard the macro with a
meaningful compile-time flag (e.g., replace #if 0 with #ifdef
ENABLE_CUOPT_DETERMINISM_LOG) and document/emit that flag from your build
system, or implement a runtime check that forwards to CUOPT_LOG_INFO only when a
global/flag (e.g., bool enableDeterminismLog) is true; update the macro
definition for CUOPT_DETERMINISM_LOG and references to CUOPT_LOG_INFO
accordingly so enabling determinism logging is controlled by a build or runtime
option rather than commented-out code.
cpp/src/mip_heuristics/diversity/population.cuh (1)

127-127: Make population.cuh self-contained for work_limit_timer_t.

adjust_threshold(...) and the timer member now expose cuopt::work_limit_timer_t, but this header still only includes utilities/timer.hpp. Please add the direct work_limit_timer.hpp include here so this header does not depend on transitive include order.

💡 Minimal fix
 #include <utilities/timer.hpp>
+#include <utilities/work_limit_timer.hpp>

Keep utilities/timer.hpp as well if external_solution_t::timer still needs timer_t.

Also applies to: 216-216

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/diversity/population.cuh` at line 127, The header
exposes cuopt::work_limit_timer_t (in adjust_threshold and the timer member) but
only includes utilities/timer.hpp; add a direct include of the work_limit_timer
header (work_limit_timer.hpp) to make population.cuh self-contained and not rely
on transitive includes, while keeping utilities/timer.hpp if
external_solution_t::timer still requires timer_t.
cpp/src/branch_and_bound/branch_and_bound.cpp (1)

1130-1148: Consider extracting common feasibility validation logic.

The check_guess() + rejection logging pattern is duplicated across nondeterministic_policy_t::handle_integer_solution, deterministic_bfs_policy_t::handle_integer_solution, and deterministic_diving_policy_t::handle_integer_solution. A shared helper would reduce maintenance burden.

♻️ Example helper signature
// In branch_and_bound_t or as a free function
template <typename i_t, typename f_t>
bool validate_integer_solution(
  const lp_problem_t<i_t, f_t>& lp,
  const simplex_solver_settings_t<i_t, f_t>& settings,
  const std::vector<variable_type_t>& var_types,
  const std::vector<f_t>& x,
  int worker_id,
  int node_id,
  int depth,
  f_t obj,
  logger_t& log);

Then each policy's handle_integer_solution would call this helper and return early if it returns false.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/branch_and_bound/branch_and_bound.cpp` around lines 1130 - 1148,
Extract the duplicated check_guess + rejection logging into a helper (e.g.,
validate_integer_solution) and have each policy's handle_integer_solution call
it and return early on false; specifically move the call to check_guess(...)
plus the bnb.settings_.log.printf(...) path into the helper (accepting
bnb.original_lp_, bnb.settings_, bnb.var_types_, x, node->node_id, node->depth,
obj, and logger) and return a bool so existing callers
(nondeterministic_policy_t::handle_integer_solution,
deterministic_bfs_policy_t::handle_integer_solution,
deterministic_diving_policy_t::handle_integer_solution) simply call
validate_integer_solution(...) and only call bnb.add_feasible_solution(...) when
it returns true.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@ci/release/update-version.sh`:
- Around line 155-157: Wrap the call to ci/utils/update_doc_versions.py in a
RUN_CONTEXT check and add error handling: only invoke the Python script when
RUN_CONTEXT equals "release" (or update if policy says run in both contexts) and
ensure the command's failure is caught and causes the script to exit with a
non‑zero status; reference the RUN_CONTEXT variable and the
ci/utils/update_doc_versions.py invocation so you update that section to
conditionally execute and append a failure handler that echoes an error and
exits.

In `@cpp/src/branch_and_bound/branch_and_bound.cpp`:
- Around line 1422-1425: The two local variables previous_local_upper and
previous_seq are assigned but never used; remove their declarations and
assignments to clean up dead code around the call that updates
this->worker.local_upper_bound and calls
this->worker.queue_integer_solution(obj, x, node->depth). Ensure only the
necessary state updates (this->worker.local_upper_bound = obj and the
queue_integer_solution call) remain and do not alter any other logic in the
surrounding function.
- Around line 640-642: The local variable heuristic_queue_size is computed but
never used; remove the unused declaration to avoid dead code by deleting the
line that assigns const size_t heuristic_queue_size =
heuristic_solution_queue_.size() (located near the call to
heuristic_solution_queue_.push_back and before mutex_heuristic_queue_.unlock()),
or if the queue size is intended to be used later, replace the removal by using
heuristic_queue_size where needed—otherwise simply delete that variable
declaration.

In `@cpp/src/branch_and_bound/pseudo_costs.cpp`:
- Around line 1690-1701: The pseudo-cost update divides by frac (computed from
root_soln[j]) which may be extremely small near integer boundaries; modify the
loops around strong_branch_down[k] and strong_branch_up[k] to guard against
near-zero frac by computing a safe divisor: set an epsilon (e.g. using
std::numeric_limits<f_t>::epsilon() scaled or a small constant like 1e-8) and
either skip the update when frac < eps or replace frac with std::max(frac, eps)
before dividing; apply this change to the fractions computed for down (frac =
root_soln[j] - std::floor(...)) and up (frac = std::ceil(...) - root_soln[j]) so
updates to pseudo_cost_sum_down[j], pseudo_cost_sum_up[j] and increments of
pseudo_cost_num_down[j]/pseudo_cost_num_up[j] only occur with a numerically safe
divisor.

In `@cpp/src/dual_simplex/simplex_solver_settings.hpp`:
- Around line 206-207: The default for
simplex_solver_settings_t::gpu_heur_wait_for_exploration is inconsistent with
the public MIP default; change the member declaration (currently next to
bb_work_unit_scale in simplex_solver_settings_t) to default to false so internal
constructions get the same scheduling behavior as mip_solver_settings_t (ensure
the bool gpu_heur_wait_for_exploration{...} initialization is set to false).

In `@cpp/src/mip_heuristics/diversity/population.cu`:
- Around line 226-231: The code incorrectly updates new_best_feasible_objective
when a non-CPU_FEASIBILITY_JUMP entry is worse (h_entry.objective >
new_best_feasible_objective); change the condition in the else-if that
references h_entry and internals::mip_solution_origin_t::CPU_FEASIBILITY_JUMP so
it only updates new_best_feasible_objective when the incoming objective is
better (use h_entry.objective < new_best_feasible_objective) to tighten the
drain threshold and keep later CPU_FEASIBILITY_JUMP entries correctly skipped.

In `@cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cu`:
- Around line 1037-1041: deterministic_work_estimate and use_graph currently
allow launching a full graph batch even when the remaining deterministic budget
is less than a full batch; change the logic so when deterministic_work_estimate
is true you compute the remaining deterministic budget and cap
iterations_per_batch (and the decision to use the graph) to not exceed
deterministic_batch_work — i.e., replace the unconditional use_graph=true and
iterations_per_batch = iterations_per_graph with logic that if
deterministic_work_estimate && deterministic_batch_work < iterations_per_graph
then set iterations_per_batch = deterministic_batch_work (or fall back to
non-graph single-iteration mode) before calling run_step_device(); apply the
same fix at the other similar sites referenced (the blocks around the other
occurrences) so no full batch is launched when budget would be overshot.
- Around line 995-1017: The new kernel launch
init_lhs_and_violated_constraints<i_t, f_t><<<4096, 256, 0, stream>>>(v) lacks
immediate CUDA error checking; add a CUDA_CHECK (or the project's equivalent)
right after that launch to call cudaGetLastError()/cudaStreamSynchronize as
appropriate using the same stream so failures are reported at the kernel launch
site before the subsequent thrust::transform_reduce calls (refer to
init_lhs_and_violated_constraints and the local variable stream for placement).
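The batch-capping logic asked for in the first comment above can be sketched as a pure decision function; the names (iterations_per_graph, deterministic_batch_work) mirror the review comment and are assumptions about the surrounding code:

```cpp
#include <algorithm>
#include <cstdint>

struct batch_plan_t {
  bool use_graph;
  std::int64_t iterations_per_batch;
};

// Hypothetical sketch: before launching a full graph batch, cap the batch
// size to the remaining deterministic budget, and fall back to non-graph
// mode when even one full graph batch would overshoot the budget.
static batch_plan_t plan_batch(bool deterministic_work_estimate,
                               std::int64_t remaining_budget,
                               std::int64_t iterations_per_graph)
{
  if (deterministic_work_estimate && remaining_budget < iterations_per_graph) {
    // Not enough budget for a full graph launch: run the remainder as a
    // truncated batch (or iteration by iteration) without the graph.
    return {false, std::max<std::int64_t>(remaining_budget, 0)};
  }
  return {true, iterations_per_graph};
}
```

The same decision would be applied at every launch site before calling run_step_device(), so no site can overshoot the deterministic budget.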

In `@cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu`:
- Around line 1333-1337: The direct use of
context.settings.cpufj_work_unit_scale can be zero, negative, or NaN and must be
validated before applying to cpu_work_unit_scale and fj_cpu->work_unit_bias;
update the selection logic around cpu_work_unit_scale (which now chooses between
context.settings.cpufj_work_unit_scale and read_positive_work_unit_scale) to
explicitly check that context.settings.cpufj_work_unit_scale is finite and > 0
(and not NaN) and only use it if valid, otherwise call
read_positive_work_unit_scale(...) or clamp to a safe minimum, ensuring
fj_cpu->work_unit_bias is only multiplied by a strictly positive finite scale.
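The validation described above amounts to a finite-and-positive check; in this sketch read_positive_work_unit_scale is stubbed as a constant, which is an assumption for illustration only:

```cpp
#include <cmath>
#include <limits>

// Stand-in for the project's fallback reader named in the comment.
static double read_positive_work_unit_scale() { return 1.0; }  // assumed default

// Hypothetical sketch: only accept the user-provided scale when it is finite
// and strictly positive (NaN fails both tests), otherwise fall back.
static double select_work_unit_scale(double user_scale)
{
  const bool valid = std::isfinite(user_scale) && user_scale > 0.0;
  return valid ? user_scale : read_positive_work_unit_scale();
}
```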

In `@cpp/src/mip_heuristics/local_search/local_search.cu`:
- Around line 233-249: The helpers are being given independent deterministic
budgets (cpu_fj.fj_cpu->work_budget = time_limit) which lets multiple helpers
consume more than the single deterministic budget; instead, register each
created CPU climber with the shared preempt/work context (use the same
registration mechanism as start_cpufj_deterministic) so they account against one
deterministic budget—replace the per-worker work_budget assignment for entries
in ls_cpu_fj with a call that registers the created cpu_fj.fj_cpu with
context.preempt_heuristic_solver_ (or invoke the same start_cpufj_deterministic
helper) so work_units_elapsed and work_budget are tracked by the shared
producer/consumer context rather than set independently.

In `@cpp/src/mip_heuristics/local_search/rounding/constraint_prop.cu`:
- Around line 945-947: The II-repair call uses the broader
`timer`/`*context.termination` instead of the local per-call budget created by
`apply_round()` (`max_timer`), which lets repair outlive
`max_time_for_bounds_prop`; update `find_integer()` (and the other occurrences
around `bounds_repair.repair_problem`) to construct the `repair_timer` from the
active parent/local `max_timer` (or derive it from the `timer` object passed
down by `apply_round()`), and pass that localized timer into
`bounds_repair.repair_problem(*sol.problem_ptr, *orig_sol.problem_ptr,
repair_timer, sol.handle_ptr)` so repair work is bounded by the per-round
`max_timer` hierarchy rather than `*context.termination`.

In `@cpp/tests/mip/diversity_test.cu`:
- Around line 100-103: The helpers create an ad-hoc work_limit_context_t and
attach a work_limit_timer_t to diversity_manager.timer instead of using the
shared deterministic context; change them to use solver.context.gpu_heur_loop as
the deterministic work context (set solver.context.gpu_heur_loop.deterministic =
true where needed) and construct the work_limit_timer_t and any nested timers
with solver.context.gpu_heur_loop and the existing timer variable so the
nested/local-search/FJ timers read the same deterministic context as in
test_full_run_determinism(). Ensure you update the corresponding locations where
diversity_manager, work_limit_timer_t, and timer are set (also apply the same
change to the other two helper blocks mentioned).

In `@cpp/tests/mip/feasibility_jump_tests.cu`:
- Around line 193-198: The current check using first_val_map and EXPECT_NEAR
only asserts similar numeric objective, not exact determinism; change the test
to record and compare a full deterministic serialization of the solution
(include solution.is_feasible() flag, solution.get_user_objective(), and the
produced assignment/variable values) for each test_instance. Replace
first_val_map with a map from test_instance to the serialized result (e.g.,
string/byte vector) and on subsequent runs serialize the current solution the
same way and ASSERT_EQ the stored bytes with the current bytes. Use the existing
symbols solution.get_user_objective(), the feasible flag accessor (e.g.,
solution.is_feasible()), and however assignments are exposed to build a
byte-for-byte representation to compare exact equality rather than EXPECT_NEAR.
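The byte-for-byte comparison proposed above can be sketched as follows; the field set (feasible flag, objective, assignment) follows the comment, while the serializer itself is a hypothetical stand-in for whatever the test ultimately uses:

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical sketch: pack the feasibility flag, the objective, and the full
// assignment into one byte string so two runs can be compared with exact
// equality (ASSERT_EQ on the bytes) instead of EXPECT_NEAR on the objective.
static std::string serialize_solution(bool feasible,
                                      double objective,
                                      const std::vector<double>& assignment)
{
  std::string bytes;
  bytes.push_back(feasible ? 1 : 0);
  auto append = [&bytes](double v) {
    char buf[sizeof(double)];
    std::memcpy(buf, &v, sizeof(double));  // raw bits, so equality is exact
    bytes.append(buf, sizeof(double));
  };
  append(objective);
  for (double v : assignment) { append(v); }
  return bytes;
}
```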

In `@cpp/tests/mip/presolve_test.cu`:
- Around line 138-144: The determinism serialization currently only reads
a.second[0].var_to_cached_bound_map into sorted_map, so nondeterminism from the
[1] probe is ignored; update the logic that builds
var_to_cached_bound_keys/var_to_cached_bound_lb/var_to_cached_bound_ub to
include both a.second[0].var_to_cached_bound_map and
a.second[1].var_to_cached_bound_map (e.g., iterate index 0 and 1, or merge both
maps before creating sorted_map) so that both cached-bound maps are serialized
into the hash; ensure you use the same sorting/iteration approach and push_back
the corresponding var_id, cached_bound.lb and cached_bound.ub for entries from
both maps.
- Around line 110-117: The root termination_checker_t is initialized with 0.0
which makes it immediately expired and causes work_limit_timer_t (probing_timer)
to abort before compute_probing_cache runs; update the root timer construction
in this test (the termination_checker_t instance used to create probing_timer)
to use a very large limit (e.g., std::numeric_limits<double>::max()) so the root
check does not short-circuit compute_probing_cache and the probing_cache
computation actually executes.

---

Outside diff comments:
In `@cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu`:
- Around line 1484-1494: The loop currently only checks the finite budget every
100 iterations (the if (fj_cpu.iterations % 100 == 0 ...) block), allowing up to
one extra 100-iteration batch to overrun a small work_budget; fix this by adding
an authoritative budget check at the start of the loop using
fj_cpu.work_units_elapsed.load(std::memory_order_relaxed) >= fj_cpu.work_budget
and break early if true (so the budget is enforced on every iteration), and keep
the existing 100-iteration sampling (memory_aggregator.collect(), biased_work
update, producer_sync->notify_progress()) as-is.
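The loop shape suggested above, with the authoritative check on every iteration and the coarse sampling kept at every 100th, can be sketched like this (the struct is a minimal stand-in for the real fj_cpu state):

```cpp
#include <atomic>
#include <cstdint>

struct fj_cpu_t {
  std::atomic<std::int64_t> work_units_elapsed{0};
  std::int64_t work_budget{0};
  std::int64_t iterations{0};
};

// Hypothetical sketch: the budget is enforced at the top of every iteration,
// so a small work_budget can never be overrun by a 100-iteration batch.
static std::int64_t run_loop(fj_cpu_t& fj_cpu, std::int64_t work_per_iteration)
{
  while (true) {
    // Authoritative check first, on every iteration.
    if (fj_cpu.work_units_elapsed.load(std::memory_order_relaxed) >=
        fj_cpu.work_budget) {
      break;
    }
    fj_cpu.work_units_elapsed.fetch_add(work_per_iteration,
                                        std::memory_order_relaxed);
    ++fj_cpu.iterations;
    if (fj_cpu.iterations % 100 == 0) {
      // existing sampling point: collect metrics, notify producer progress.
    }
  }
  return fj_cpu.iterations;
}
```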

In `@cpp/src/mip_heuristics/local_search/local_search.cu`:
- Around line 264-317: The function currently returns cpu_feasible which ignores
the merged solution state after possibly copying the CPU solution into solution;
change the return to reflect the final merged feasibility by returning
solution.get_feasible() (or assign a merged_feasible = solution.get_feasible()
after the final solution.compute_feasibility() and return that). Update the
final return statement (replace "return cpu_feasible;") in the local_search.cu
block handling CPU/GPU merge so callers receive the merged solution's true
feasibility.

---

Nitpick comments:
In `@cpp/CMakeLists.txt`:
- Around line 546-556: The commented PAPI detection block should be either
removed or converted to an option-gated implementation: add a CMake option like
ENABLE_PAPI (default OFF) and wrap the existing find_path/find_library logic in
if(ENABLE_PAPI) ... endif(), using the same variables PAPI_INCLUDE_DIR and
PAPI_LIBRARY and applying target_include_directories(cuopt PRIVATE
${PAPI_INCLUDE_DIR}) and target_link_libraries(cuopt PRIVATE ${PAPI_LIBRARY})
when found, and emit a clear message(FATAL_ERROR "...") only when ENABLE_PAPI is
ON and the libraries are missing; alternatively delete the commented block
entirely if PAPI support is not needed.
- Around line 666-667: The CMakeLists currently adds an unconditional -pie flag;
update it to only add PIE when supported and appropriate by calling CMake's
CheckPIESupported (or using CMake's PIE support macros), guard with a platform
check such as CMAKE_SYSTEM_NAME STREQUAL "Linux" (or the CheckPIESupported
result), and add the flag via the target_link_options/linker form (e.g.,
LINKER:-pie) instead of raw compile flags; locate the section that mentions
"-pie" and replace the unconditional addition with a conditional that queries
CheckPIESupported and CMAKE_SYSTEM_NAME and then appends LINKER:-pie to the
target_link_options for the relevant targets.
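Both CMake suggestions above follow standard patterns; a minimal sketch, assuming the target name cuopt and the variable names from the comments:

```cmake
# Option-gated PAPI detection instead of a commented-out block.
option(ENABLE_PAPI "Enable PAPI-based profiling" OFF)
if(ENABLE_PAPI)
  find_path(PAPI_INCLUDE_DIR papi.h)
  find_library(PAPI_LIBRARY papi)
  if(PAPI_INCLUDE_DIR AND PAPI_LIBRARY)
    target_include_directories(cuopt PRIVATE ${PAPI_INCLUDE_DIR})
    target_link_libraries(cuopt PRIVATE ${PAPI_LIBRARY})
  else()
    message(FATAL_ERROR "ENABLE_PAPI is ON but PAPI was not found")
  endif()
endif()

# PIE: query support instead of passing -pie unconditionally.
include(CheckPIESupported)
check_pie_supported(OUTPUT_VARIABLE pie_check_output LANGUAGES CXX)
if(CMAKE_CXX_LINK_PIE_SUPPORTED AND CMAKE_SYSTEM_NAME STREQUAL "Linux")
  target_link_options(cuopt PRIVATE "LINKER:-pie")
endif()
```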

In `@cpp/include/cuopt/linear_programming/cuopt_c.h`:
- Around line 786-797: Update the public header documentation for
cuOptSetMIPGetSolutionCallbackExt to explicitly state its contract: clarify
whether cuOptSetMIPGetSolutionCallbackExt replaces or coexists with
cuOptSetMIPGetSolutionCallback (e.g., "replaces existing callback" or "coexists;
both may be invoked"), name which thread/context invokes the callback (e.g.,
solver main thread, solver worker thread, or a dedicated callback thread), and
enumerate thread-safety and synchronization requirements (reentrancy, whether
the callback may be called concurrently, and whether callers must protect shared
state). Also document GPU/CPU context and numerical behavior expectations (e.g.,
pointers/arrays are host memory, any lifetime guarantees for solution buffers,
floating-point formats/scaling), and add a brief note that API changes need
corresponding docs updates per the repository guidelines; reference
cuOptSetMIPGetSolutionCallbackExt and cuOptSetMIPGetSolutionCallback in the
wording so readers can find both symbols.

In `@cpp/src/branch_and_bound/branch_and_bound.cpp`:
- Around line 1130-1148: Extract the duplicated check_guess + rejection logging
into a helper (e.g., validate_integer_solution) and have each policy's
handle_integer_solution call it and return early on false; specifically move the
call to check_guess(...) plus the bnb.settings_.log.printf(...) path into the
helper (accepting bnb.original_lp_, bnb.settings_, bnb.var_types_, x,
node->node_id, node->depth, obj, and logger) and return a bool so existing
callers (nondeterministic_policy_t::handle_integer_solution,
deterministic_bfs_policy_t::handle_integer_solution,
deterministic_diving_policy_t::handle_integer_solution) simply call
validate_integer_solution(...) and only call bnb.add_feasible_solution(...) when
it returns true.
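The extraction proposed above can be sketched as a standalone helper; check_guess is stubbed here and the signature is hypothetical, since the real one would take the bnb members named in the comment:

```cpp
#include <cstdio>
#include <vector>

// Stand-in for the real feasibility check named in the comment.
static bool check_guess(const std::vector<double>& x) { return !x.empty(); }

// Hypothetical validate_integer_solution: run the check and the rejection
// logging in one place, returning false so each policy's
// handle_integer_solution can early-return instead of duplicating this code.
static bool validate_integer_solution(const std::vector<double>& x,
                                      int node_id, int depth, double obj)
{
  if (!check_guess(x)) {
    std::printf("rejected integer solution: node %d depth %d obj %g\n",
                node_id, depth, obj);
    return false;
  }
  return true;
}
```

Each policy then calls the helper and only invokes bnb.add_feasible_solution(...) on true.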

In `@cpp/src/branch_and_bound/pseudo_costs.hpp`:
- Around line 362-367: Add a short doc comment above record_update explaining it
only appends an update to updates_ without modifying the snapshot's aggregation
arrays (the sum_*/num_* members), in contrast to queue_update which both records
and applies the change to those arrays; mention the parameters (variable,
direction, delta, clock, worker_id) to make the intent explicit so callers know
to use queue_update when they need the snapshot's sums/counts updated.

In `@cpp/src/mip_heuristics/diversity/population.cuh`:
- Line 127: The header exposes cuopt::work_limit_timer_t (in adjust_threshold
and the timer member) but only includes utilities/timer.hpp; add a direct
include of the work_limit_timer header (work_limit_timer.hpp) to make
population.cuh self-contained and not rely on transitive includes, while keeping
utilities/timer.hpp if external_solution_t::timer still requires timer_t.

In `@cpp/src/mip_heuristics/diversity/recombiners/sub_mip.cuh`:
- Around line 71-72: Replace the magic divisor 1e8 used when computing work
(double work = static_cast<double>(n_vars_from_other) / 1e8) with a named
constant (e.g., kWorkUnitScale or WORK_UNIT_SCALE) and add a short comment
explaining the rationale or units for the scale; update the surrounding TODO to
remove the ambiguity, and use the new constant wherever work or similar scaling
is computed, referencing the symbol n_vars_from_other and the variable work.
- Line 203: The return currently uses
!std::isnan(branch_and_bound_solution.objective) to indicate success; instead
use the explicit branch_and_bound_status value to determine success. Update the
return in the function that returns std::make_tuple(offspring, ..., work) to
replace the second element with a boolean derived from branch_and_bound_status
(e.g., branch_and_bound_status == <success-state> or simply
branch_and_bound_status) so it no longer relies on NaN-initialization of
branch_and_bound_solution.objective; reference the variables
branch_and_bound_solution, branch_and_bound_status, offspring, and work when
making this change.

In `@cpp/src/mip_heuristics/local_search/feasibility_pump/feasibility_pump.cu`:
- Around line 32-39: The disabled determinism logging block that leaves
CUOPT_DETERMINISM_LOG wrapped in `#if 0` should be replaced with a controllable
mechanism: either remove the block if it is temporary, or guard the macro with
a meaningful compile-time flag (e.g., replace `#if 0` with
`#ifdef ENABLE_CUOPT_DETERMINISM_LOG`) and document/emit that flag from the
build system, or implement a runtime check that forwards to CUOPT_LOG_INFO only
when a global flag (e.g., bool enableDeterminismLog) is true; update the macro
definition for CUOPT_DETERMINISM_LOG and references to CUOPT_LOG_INFO
accordingly so enabling determinism logging is controlled by a build or runtime
option rather than commented-out code.
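The compile-time-flag variant suggested above is a few lines of preprocessor; CUOPT_LOG_INFO is stubbed here (with a call counter for illustration), which is an assumption, not the project's logger:

```cpp
#include <cstdio>

// Stand-in for the project's logger; the counter lets a test observe calls.
static int g_log_calls = 0;
#define CUOPT_LOG_INFO(...) (++g_log_calls, std::printf(__VA_ARGS__))

// Hypothetical sketch: the determinism log compiles to a real call only when
// ENABLE_CUOPT_DETERMINISM_LOG is defined by the build system, instead of
// being hidden behind #if 0.
#ifdef ENABLE_CUOPT_DETERMINISM_LOG
#define CUOPT_DETERMINISM_LOG(...) CUOPT_LOG_INFO(__VA_ARGS__)
#else
#define CUOPT_DETERMINISM_LOG(...) ((void)0)
#endif
```

With the flag undefined, CUOPT_DETERMINISM_LOG expands to a no-op and the logger is never reached.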

In `@cpp/src/mip_heuristics/presolve/bounds_presolve.cu`:
- Around line 170-177: The code mutates the stored solver settings by clamping
settings.iteration_limit in-place; instead, compute a local loop bound and leave
settings untouched. In bound_presolve_t<i_t, f_t>::bound_update_loop, check
context.settings.determinism_mode & CUOPT_DETERMINISM_GPU_HEURISTICS and set a
local variable (e.g. max_iterations or loop_limit) to
std::min(settings.iteration_limit, 50) and use that local variable for the
loop/termination checks rather than assigning back to settings.iteration_limit
so future calls see the original user-configured value.
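The non-mutating variant described above can be sketched as a local derivation; the flag name mirrors the comment, while the struct and bit value are illustrative assumptions:

```cpp
#include <algorithm>

struct settings_t {
  int iteration_limit = 1000;
  unsigned determinism_mode = 0;
};
constexpr unsigned CUOPT_DETERMINISM_GPU_HEURISTICS = 1u << 0;  // assumed bit

// Hypothetical sketch: compute a local loop bound and leave the stored
// settings untouched, so later calls still see the user-configured limit.
static int loop_limit_for(const settings_t& settings)
{
  if (settings.determinism_mode & CUOPT_DETERMINISM_GPU_HEURISTICS) {
    return std::min(settings.iteration_limit, 50);  // local cap only
  }
  return settings.iteration_limit;
}
```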

In `@cpp/src/mip_heuristics/problem/presolve_data.cu`:
- Around line 225-228: Change the parameter type of
presolve_data_t<i_t,f_t>::papilo_uncrush_assignment from a non-const reference
to a const reference (i.e., const problem_t<i_t,f_t>& problem) so the method is
const-correct; update the function signature accordingly and ensure all uses
inside (e.g., accessing problem.handle_ptr) remain reads-only so no other
changes are required, then remove any call-site const_casts that were
compensating for the non-const parameter.

In `@cpp/src/mip_heuristics/solver_context.cuh`:
- Around line 72-74: The termination pointer (termination in solver_context.cuh)
may remain nullptr until mip_solver_t sets it, so add a safe accessor on the
solver context (e.g., get_termination() or termination_checked()) that asserts
or throws if termination is null and returns the non-null pointer; update any
callers that read termination (timers or sub-timers) to use this accessor
instead of accessing the raw termination member directly, or alternatively
change the member to a non-null reference/init-on-construction pattern so the
compiler enforces it. Ensure the accessor name
(get_termination/termination_checked) and mip_solver_t usage are the unique
symbols referenced so the fix is easy to locate.
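The checked accessor proposed above is a small pattern; the types here are minimal stand-ins for the real context, not the PR's code:

```cpp
#include <stdexcept>

struct termination_checker_t {
  bool expired = false;
};

// Hypothetical sketch: callers go through get_termination(), which fails
// loudly if mip_solver_t never wired the pointer up, instead of silently
// dereferencing nullptr in a timer or sub-timer.
struct solver_context_t {
  termination_checker_t* termination = nullptr;

  termination_checker_t* get_termination() const
  {
    if (termination == nullptr) {
      throw std::logic_error("solver context used before termination was set");
    }
    return termination;
  }
};
```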

In `@cpp/src/mip_heuristics/solver_solution.cu`:
- Around line 249-255: The current XOR of per-solution hashes (pool_hash ^=
detail::compute_hash(...)) collapses order and duplicates; instead compute a
sequence-aware combined hash by iterating solution_pool_, calling
cuopt::host_copy for each pool_sol and computing per_hash =
detail::compute_hash(host_pool_sol), then fold per_hash into pool_hash using an
order-sensitive combiner (e.g., pool_hash = pool_hash * PRIME + per_hash or use
a std::hash_combine-style mix) so the final pool_hash reflects the sequence and
content; apply the same change to the analogous block around the region noted
(lines 261-274) so both places use the ordered combination instead of XOR.
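The order-sensitive fold asked for above is typically a boost-style hash_combine; this sketch assumes per-solution hashes have already been computed (detail::compute_hash is not reproduced here):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: mix each per-solution hash into the running pool hash
// with an order-sensitive combiner. Unlike XOR, reordering the pool or
// duplicating an entry changes the result.
static std::size_t hash_combine(std::size_t seed, std::size_t value)
{
  // 0x9e3779b97f4a7c15 is the 64-bit golden-ratio constant commonly used here.
  return seed ^ (value + 0x9e3779b97f4a7c15ULL + (seed << 6) + (seed >> 2));
}

static std::size_t pool_hash(const std::vector<std::size_t>& per_solution_hashes)
{
  std::size_t h = 0;
  for (std::size_t per_hash : per_solution_hashes) {
    h = hash_combine(h, per_hash);  // sequence-aware fold
  }
  return h;
}
```

With XOR, two identical hashes cancel to zero and order is lost; the fold above avoids both failure modes.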

In `@cpp/tests/mip/local_search_test.cu`:
- Line 68: The function setup_device_symbols(rmm::cuda_stream_view stream_view)
is an empty stub; either implement the intended device-symbol initialization
logic or remove the function and its declarations/usages; if you intend to keep
it as a placeholder, add a clear TODO comment explaining what must be
implemented and why the parameter is currently unused (or mark it
[[maybe_unused]]), and update any callers or headers referencing
setup_device_symbols accordingly to avoid dead code or unused-parameter
warnings.
- Around line 196-197: The two spin_stream_raii_t instances (spin_stream_1 and
spin_stream_2) are created without explanation; add a concise comment above
their declarations clarifying they spawn additional CUDA streams that spin to
stress test stream scheduling and ensure the determinism test exercises
concurrent stream behavior (i.e., stress-testing CUDA stream scheduling for
determinism), so future readers understand why these dummy/spinning streams are
present.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: bddac155-630c-4b07-9087-ad7acabbd05b

📥 Commits

Reviewing files that changed from the base of the PR and between dc8f073 and 629f681.

📒 Files selected for processing (139)
  • .claude-plugin/marketplace.json
  • .cursor-plugin/plugin.json
  • .github/CODEOWNERS
  • .github/workflows/build.yaml
  • .github/workflows/build_test_publish_images.yaml
  • .github/workflows/pr.yaml
  • .github/workflows/test.yaml
  • .github/workflows/trigger-breaking-change-alert.yaml
  • CONTRIBUTING.md
  • RAPIDS_BRANCH
  • README.md
  • VERSION
  • ci/release/update-version.sh
  • conda/environments/all_cuda-129_arch-aarch64.yaml
  • conda/environments/all_cuda-129_arch-x86_64.yaml
  • conda/environments/all_cuda-131_arch-aarch64.yaml
  • conda/environments/all_cuda-131_arch-x86_64.yaml
  • cpp/CMakeLists.txt
  • cpp/include/cuopt/linear_programming/constants.h
  • cpp/include/cuopt/linear_programming/cuopt_c.h
  • cpp/include/cuopt/linear_programming/mip/solver_settings.hpp
  • cpp/include/cuopt/linear_programming/utilities/internals.hpp
  • cpp/src/barrier/barrier.cu
  • cpp/src/barrier/iterative_refinement.hpp
  • cpp/src/branch_and_bound/branch_and_bound.cpp
  • cpp/src/branch_and_bound/branch_and_bound.hpp
  • cpp/src/branch_and_bound/pseudo_costs.cpp
  • cpp/src/branch_and_bound/pseudo_costs.hpp
  • cpp/src/dual_simplex/phase2.cpp
  • cpp/src/dual_simplex/simplex_solver_settings.hpp
  • cpp/src/math_optimization/solver_settings.cu
  • cpp/src/mip_heuristics/diversity/diversity_config.hpp
  • cpp/src/mip_heuristics/diversity/diversity_manager.cu
  • cpp/src/mip_heuristics/diversity/diversity_manager.cuh
  • cpp/src/mip_heuristics/diversity/lns/rins.cu
  • cpp/src/mip_heuristics/diversity/population.cu
  • cpp/src/mip_heuristics/diversity/population.cuh
  • cpp/src/mip_heuristics/diversity/recombiners/recombiner.cuh
  • cpp/src/mip_heuristics/diversity/recombiners/sub_mip.cuh
  • cpp/src/mip_heuristics/early_heuristic.cuh
  • cpp/src/mip_heuristics/feasibility_jump/early_cpufj.cuh
  • cpp/src/mip_heuristics/feasibility_jump/early_gpufj.cu
  • cpp/src/mip_heuristics/feasibility_jump/early_gpufj.cuh
  • cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cu
  • cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cuh
  • cpp/src/mip_heuristics/feasibility_jump/feasibility_jump_kernels.cu
  • cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu
  • cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cuh
  • cpp/src/mip_heuristics/local_search/feasibility_pump/feasibility_pump.cu
  • cpp/src/mip_heuristics/local_search/feasibility_pump/feasibility_pump.cuh
  • cpp/src/mip_heuristics/local_search/local_search.cu
  • cpp/src/mip_heuristics/local_search/local_search.cuh
  • cpp/src/mip_heuristics/local_search/rounding/bounds_repair.cu
  • cpp/src/mip_heuristics/local_search/rounding/bounds_repair.cuh
  • cpp/src/mip_heuristics/local_search/rounding/constraint_prop.cu
  • cpp/src/mip_heuristics/local_search/rounding/lb_bounds_repair.cu
  • cpp/src/mip_heuristics/presolve/bounds_presolve.cu
  • cpp/src/mip_heuristics/presolve/conditional_bound_strengthening.cu
  • cpp/src/mip_heuristics/presolve/lb_probing_cache.cu
  • cpp/src/mip_heuristics/presolve/load_balanced_bounds_presolve.cu
  • cpp/src/mip_heuristics/presolve/load_balanced_bounds_presolve_helpers.cuh
  • cpp/src/mip_heuristics/presolve/multi_probe.cu
  • cpp/src/mip_heuristics/presolve/probing_cache.cu
  • cpp/src/mip_heuristics/presolve/third_party_presolve.cpp
  • cpp/src/mip_heuristics/presolve/third_party_presolve.hpp
  • cpp/src/mip_heuristics/presolve/trivial_presolve.cuh
  • cpp/src/mip_heuristics/problem/presolve_data.cu
  • cpp/src/mip_heuristics/problem/presolve_data.cuh
  • cpp/src/mip_heuristics/problem/problem.cu
  • cpp/src/mip_heuristics/problem/problem.cuh
  • cpp/src/mip_heuristics/problem/problem_helpers.cuh
  • cpp/src/mip_heuristics/relaxed_lp/relaxed_lp.cu
  • cpp/src/mip_heuristics/solution/solution.cu
  • cpp/src/mip_heuristics/solution/solution.cuh
  • cpp/src/mip_heuristics/solution_callbacks.cuh
  • cpp/src/mip_heuristics/solve.cu
  • cpp/src/mip_heuristics/solver.cu
  • cpp/src/mip_heuristics/solver.cuh
  • cpp/src/mip_heuristics/solver_context.cuh
  • cpp/src/mip_heuristics/solver_solution.cu
  • cpp/src/pdlp/initial_scaling_strategy/initial_scaling.cu
  • cpp/src/pdlp/pdhg.cu
  • cpp/src/pdlp/pdlp.cu
  • cpp/src/pdlp/restart_strategy/pdlp_restart_strategy.cu
  • cpp/src/pdlp/step_size_strategy/adaptive_step_size_strategy.cu
  • cpp/src/pdlp/termination_strategy/convergence_information.cu
  • cpp/src/pdlp/termination_strategy/infeasibility_information.cu
  • cpp/src/pdlp/utils.cuh
  • cpp/src/routing/local_search/compute_compatible.cu
  • cpp/src/routing/route/break_route.cuh
  • cpp/src/routing/route/capacity_route.cuh
  • cpp/src/routing/route/dimensions_route.cuh
  • cpp/src/routing/route/distance_route.cuh
  • cpp/src/routing/route/mismatch_route.cuh
  • cpp/src/routing/route/pdp_route.cuh
  • cpp/src/routing/route/prize_route.cuh
  • cpp/src/routing/route/route.cuh
  • cpp/src/routing/route/service_time_route.cuh
  • cpp/src/routing/route/tasks_route.cuh
  • cpp/src/routing/route/time_route.cuh
  • cpp/src/routing/route/tsp_route.cuh
  • cpp/src/routing/route/vehicle_fixed_cost_route.cuh
  • cpp/src/routing/solution/route_node_map.cuh
  • cpp/src/routing/utilities/check_input.cu
  • cpp/src/utilities/copy_helpers.hpp
  • cpp/tests/mip/CMakeLists.txt
  • cpp/tests/mip/determinism_test.cu
  • cpp/tests/mip/diversity_test.cu
  • cpp/tests/mip/feasibility_jump_tests.cu
  • cpp/tests/mip/load_balancing_test.cu
  • cpp/tests/mip/local_search_test.cu
  • cpp/tests/mip/mip_utils.cuh
  • cpp/tests/mip/multi_probe_test.cu
  • cpp/tests/mip/presolve_test.cu
  • datasets/mip/download_miplib_test_dataset.sh
  • dependencies.yaml
  • docs/cuopt/source/versions1.json
  • gemini-extension.json
  • helmchart/cuopt-server/Chart.yaml
  • helmchart/cuopt-server/values.yaml
  • merge_review_findings_release_26_04.md
  • python/cuopt/pyproject.toml
  • python/cuopt_self_hosted/pyproject.toml
  • python/cuopt_server/pyproject.toml
  • python/libcuopt/pyproject.toml
  • skills/cuopt-developer/SKILL.md
  • skills/cuopt-installation-api-c/SKILL.md
  • skills/cuopt-installation-api-python/SKILL.md
  • skills/cuopt-installation-common/SKILL.md
  • skills/cuopt-installation-developer/SKILL.md
  • skills/cuopt-lp-milp-api-c/SKILL.md
  • skills/cuopt-lp-milp-api-cli/SKILL.md
  • skills/cuopt-lp-milp-api-python/SKILL.md
  • skills/cuopt-qp-api-c/SKILL.md
  • skills/cuopt-qp-api-cli/SKILL.md
  • skills/cuopt-qp-api-python/SKILL.md
  • skills/cuopt-routing-api-python/SKILL.md
  • skills/cuopt-server-api-python/SKILL.md
  • skills/cuopt-server-common/SKILL.md
✅ Files skipped from review due to trivial changes (66)
  • RAPIDS_BRANCH
  • cpp/src/routing/route/break_route.cuh
  • cpp/src/pdlp/utils.cuh
  • .github/workflows/build_test_publish_images.yaml
  • README.md
  • .cursor-plugin/plugin.json
  • .claude-plugin/marketplace.json
  • conda/environments/all_cuda-129_arch-aarch64.yaml
  • cpp/src/routing/route/capacity_route.cuh
  • cpp/src/mip_heuristics/problem/problem_helpers.cuh
  • cpp/src/routing/route/tsp_route.cuh
  • gemini-extension.json
  • cpp/src/pdlp/step_size_strategy/adaptive_step_size_strategy.cu
  • cpp/src/routing/route/service_time_route.cuh
  • cpp/src/routing/route/pdp_route.cuh
  • cpp/src/pdlp/pdhg.cu
  • cpp/src/routing/route/tasks_route.cuh
  • skills/cuopt-lp-milp-api-cli/SKILL.md
  • cpp/src/routing/utilities/check_input.cu
  • cpp/src/routing/local_search/compute_compatible.cu
  • .github/CODEOWNERS
  • skills/cuopt-installation-common/SKILL.md
  • skills/cuopt-installation-developer/SKILL.md
  • datasets/mip/download_miplib_test_dataset.sh
  • skills/cuopt-installation-api-c/SKILL.md
  • cpp/src/mip_heuristics/presolve/load_balanced_bounds_presolve_helpers.cuh
  • cpp/src/pdlp/termination_strategy/infeasibility_information.cu
  • cpp/src/routing/route/vehicle_fixed_cost_route.cuh
  • conda/environments/all_cuda-131_arch-x86_64.yaml
  • skills/cuopt-qp-api-cli/SKILL.md
  • cpp/src/pdlp/termination_strategy/convergence_information.cu
  • helmchart/cuopt-server/Chart.yaml
  • VERSION
  • skills/cuopt-installation-api-python/SKILL.md
  • skills/cuopt-lp-milp-api-c/SKILL.md
  • skills/cuopt-qp-api-c/SKILL.md
  • .github/workflows/trigger-breaking-change-alert.yaml
  • cpp/src/routing/route/time_route.cuh
  • cpp/src/routing/route/route.cuh
  • python/cuopt_self_hosted/pyproject.toml
  • skills/cuopt-qp-api-python/SKILL.md
  • skills/cuopt-server-common/SKILL.md
  • cpp/src/mip_heuristics/presolve/conditional_bound_strengthening.cu
  • cpp/src/routing/route/mismatch_route.cuh
  • skills/cuopt-lp-milp-api-python/SKILL.md
  • helmchart/cuopt-server/values.yaml
  • cpp/src/pdlp/pdlp.cu
  • cpp/src/barrier/barrier.cu
  • conda/environments/all_cuda-129_arch-x86_64.yaml
  • cpp/src/pdlp/restart_strategy/pdlp_restart_strategy.cu
  • skills/cuopt-developer/SKILL.md
  • cpp/src/mip_heuristics/diversity/diversity_config.hpp
  • python/cuopt_server/pyproject.toml
  • cpp/src/mip_heuristics/presolve/trivial_presolve.cuh
  • cpp/src/barrier/iterative_refinement.hpp
  • skills/cuopt-routing-api-python/SKILL.md
  • cpp/src/routing/solution/route_node_map.cuh
  • skills/cuopt-server-api-python/SKILL.md
  • python/cuopt/pyproject.toml
  • cpp/src/routing/route/prize_route.cuh
  • cpp/src/routing/route/dimensions_route.cuh
  • docs/cuopt/source/versions1.json
  • cpp/src/routing/route/distance_route.cuh
  • python/libcuopt/pyproject.toml
  • cpp/src/mip_heuristics/solution_callbacks.cuh
  • cpp/src/mip_heuristics/relaxed_lp/relaxed_lp.cu
🚧 Files skipped from review as they are similar to previous changes (24)
  • cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cuh
  • cpp/src/mip_heuristics/diversity/lns/rins.cu
  • cpp/src/math_optimization/solver_settings.cu
  • cpp/src/mip_heuristics/presolve/load_balanced_bounds_presolve.cu
  • cpp/src/mip_heuristics/presolve/third_party_presolve.hpp
  • cpp/tests/mip/load_balancing_test.cu
  • cpp/src/mip_heuristics/presolve/lb_probing_cache.cu
  • cpp/src/mip_heuristics/local_search/feasibility_pump/feasibility_pump.cuh
  • cpp/src/utilities/copy_helpers.hpp
  • cpp/tests/mip/CMakeLists.txt
  • cpp/src/mip_heuristics/solver.cuh
  • cpp/src/mip_heuristics/problem/presolve_data.cuh
  • cpp/src/mip_heuristics/solution/solution.cu
  • cpp/src/mip_heuristics/problem/problem.cuh
  • cpp/src/mip_heuristics/presolve/multi_probe.cu
  • cpp/tests/mip/mip_utils.cuh
  • cpp/src/mip_heuristics/diversity/diversity_manager.cuh
  • cpp/src/mip_heuristics/local_search/rounding/bounds_repair.cu
  • cpp/src/branch_and_bound/branch_and_bound.hpp
  • cpp/src/mip_heuristics/solver.cu
  • cpp/src/mip_heuristics/local_search/local_search.cuh
  • cpp/src/mip_heuristics/local_search/rounding/lb_bounds_repair.cu
  • cpp/tests/mip/determinism_test.cu
  • cpp/src/mip_heuristics/feasibility_jump/feasibility_jump_kernels.cu

Comment on lines +155 to +157

# Update docs version switcher to include the new version
python ci/utils/update_doc_versions.py

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify the Python script exists and inspect its contents

# Check if the file exists
if [[ -f "ci/utils/update_doc_versions.py" ]]; then
    echo "✓ File exists: ci/utils/update_doc_versions.py"
    echo ""
    echo "=== File contents ==="
    cat ci/utils/update_doc_versions.py
else
    echo "✗ File NOT found: ci/utils/update_doc_versions.py"
    echo ""
    echo "=== Searching for similar files ==="
    fd -t f "update.*version" ci/
fi

Repository: NVIDIA/cuopt

Length of output: 2433


🏁 Script executed:

sed -n '140,160p' ci/release/update-version.sh

Repository: NVIDIA/cuopt

Length of output: 935


🏁 Script executed:

cat -n ci/release/update-version.sh | head -80 && echo "..." && cat -n ci/release/update-version.sh | tail -20

Repository: NVIDIA/cuopt

Length of output: 4490


Conditionally execute the version switcher update and add error handling.

The Python script should be conditional on RUN_CONTEXT, similar to the documentation reference updates (lines 147-154). Running in main context adds development versions to the public version switcher, which is likely unintended. Additionally, add error handling to catch failures in the Python script:

# Update docs version switcher to include the new version (release context only)
if [[ "${RUN_CONTEXT}" == "release" ]]; then
  python ci/utils/update_doc_versions.py || { echo "Error: Failed to update version switcher"; exit 1; }
fi

Or, if this should run in both contexts, at minimum add error handling to prevent silent failures.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ci/release/update-version.sh` around lines 155 - 157, Wrap the call to
ci/utils/update_doc_versions.py in a RUN_CONTEXT check and add error handling:
only invoke the Python script when RUN_CONTEXT equals "release" (or update if
policy says run in both contexts) and ensure the command's failure is caught and
causes the script to exit with a non‑zero status; reference the RUN_CONTEXT
variable and the ci/utils/update_doc_versions.py invocation so you update that
section to conditionally execute and append a failure handler that echoes an
error and exits.

Comment on lines +640 to 642
heuristic_solution_queue_.push_back({solution, user_objective, work_unit_ts, origin});
const size_t heuristic_queue_size = heuristic_solution_queue_.size();
mutex_heuristic_queue_.unlock();

⚠️ Potential issue | 🟡 Minor

Unused variable heuristic_queue_size.

This variable is computed but never used.

🧹 Suggested fix
   mutex_heuristic_queue_.lock();
   heuristic_solution_queue_.push_back({solution, user_objective, work_unit_ts, origin});
-  const size_t heuristic_queue_size = heuristic_solution_queue_.size();
   mutex_heuristic_queue_.unlock();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/branch_and_bound/branch_and_bound.cpp` around lines 640 - 642, The
local variable heuristic_queue_size is computed but never used; remove the
unused declaration to avoid dead code by deleting the line that assigns const
size_t heuristic_queue_size = heuristic_solution_queue_.size() (located near the
call to heuristic_solution_queue_.push_back and before
mutex_heuristic_queue_.unlock()), or if the queue size is intended to be used
later, replace the removal by using heuristic_queue_size where needed—otherwise
simply delete that variable declaration.

Comment on lines +1422 to 1425
const f_t previous_local_upper = this->worker.local_upper_bound;
const int previous_seq = this->worker.next_solution_seq;
this->worker.local_upper_bound = obj;
this->worker.queue_integer_solution(obj, x, node->depth);

⚠️ Potential issue | 🟡 Minor

Unused variables previous_local_upper and previous_seq.

These variables are assigned but never referenced. They appear to be leftover debug code.

🧹 Suggested fix
       }
-      const f_t previous_local_upper = this->worker.local_upper_bound;
-      const int previous_seq         = this->worker.next_solution_seq;
       this->worker.local_upper_bound = obj;
       this->worker.queue_integer_solution(obj, x, node->depth);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-      const f_t previous_local_upper = this->worker.local_upper_bound;
-      const int previous_seq         = this->worker.next_solution_seq;
       this->worker.local_upper_bound = obj;
       this->worker.queue_integer_solution(obj, x, node->depth);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/branch_and_bound/branch_and_bound.cpp` around lines 1422 - 1425, The
two local variables previous_local_upper and previous_seq are assigned but never
used; remove their declarations and assignments to clean up dead code around the
call that updates this->worker.local_upper_bound and calls
this->worker.queue_integer_solution(obj, x, node->depth). Ensure only the
necessary state updates (this->worker.local_upper_bound = obj and the
queue_integer_solution call) remain and do not alter any other logic in the
surrounding function.

Comment on lines 1690 to 1701
  for (i_t k = 0; k < fractional.size(); k++) {
    const i_t j = fractional[k];
-   for (i_t branch = 0; branch < 2; branch++) {
-     if (branch == 0) {
-       f_t change_in_obj = strong_branch_down[k];
-       if (std::isnan(change_in_obj)) { continue; }
-       f_t frac = root_soln[j] - std::floor(root_soln[j]);
-       pseudo_cost_sum_down[j] += change_in_obj / frac;
-       pseudo_cost_num_down[j]++;
-     } else {
-       f_t change_in_obj = strong_branch_up[k];
-       if (std::isnan(change_in_obj)) { continue; }
-       f_t frac = std::ceil(root_soln[j]) - root_soln[j];
-       pseudo_cost_sum_up[j] += change_in_obj / frac;
-       pseudo_cost_num_up[j]++;
-     }
-   }
+   if (!std::isnan(strong_branch_down[k])) {
+     const f_t frac = root_soln[j] - std::floor(root_soln[j]);
+     pseudo_cost_sum_down[j] += strong_branch_down[k] / frac;
+     pseudo_cost_num_down[j]++;
+   }
+   if (!std::isnan(strong_branch_up[k])) {
+     const f_t frac = std::ceil(root_soln[j]) - root_soln[j];
+     pseudo_cost_sum_up[j] += strong_branch_up[k] / frac;
+     pseudo_cost_num_up[j]++;
+   }

⚠️ Potential issue | 🟡 Minor

Consider guarding against near-zero fractions in pseudo-cost updates.

When root_soln[j] is very close to an integer boundary, frac can be extremely small (e.g., 1e-10), causing the pseudo-cost update to be numerically unstable due to division amplification.

🛡️ Suggested guard
   if (!std::isnan(strong_branch_down[k])) {
     const f_t frac = root_soln[j] - std::floor(root_soln[j]);
+    if (frac > 1e-6) {  // guard against near-integer values
       pseudo_cost_sum_down[j] += strong_branch_down[k] / frac;
       pseudo_cost_num_down[j]++;
+    }
   }
   if (!std::isnan(strong_branch_up[k])) {
     const f_t frac = std::ceil(root_soln[j]) - root_soln[j];
+    if (frac > 1e-6) {
       pseudo_cost_sum_up[j] += strong_branch_up[k] / frac;
       pseudo_cost_num_up[j]++;
+    }
   }

As per coding guidelines: "Check numerical stability: prevent overflow/underflow, precision loss, division by zero/near-zero."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/branch_and_bound/pseudo_costs.cpp` around lines 1690 - 1701, The
pseudo-cost update divides by frac (computed from root_soln[j]) which may be
extremely small near integer boundaries; modify the loops around
strong_branch_down[k] and strong_branch_up[k] to guard against near-zero frac by
computing a safe divisor: set an epsilon (e.g. using
std::numeric_limits<f_t>::epsilon() scaled or a small constant like 1e-8) and
either skip the update when frac < eps or replace frac with std::max(frac, eps)
before dividing; apply this change to the fractions computed for down (frac =
root_soln[j] - std::floor(...)) and up (frac = std::ceil(...) - root_soln[j]) so
updates to pseudo_cost_sum_down[j], pseudo_cost_sum_up[j] and increments of
pseudo_cost_num_down[j]/pseudo_cost_num_up[j] only occur with a numerically safe
divisor.

Comment on lines +206 to +207
f_t bb_work_unit_scale{1.0};
bool gpu_heur_wait_for_exploration{true};

⚠️ Potential issue | 🟡 Minor

Align this default with the public MIP setting.

simplex_solver_settings_t::gpu_heur_wait_for_exploration defaults to true here, but mip_solver_settings_t now defaults the same knob to false. Any internal path that constructs simplex_solver_settings_t directly will silently get different scheduling behavior unless every translation layer overwrites it.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/dual_simplex/simplex_solver_settings.hpp` around lines 206 - 207, The
default for simplex_solver_settings_t::gpu_heur_wait_for_exploration is
inconsistent with the public MIP default; change the member declaration
(currently next to bb_work_unit_scale in simplex_solver_settings_t) to default
to false so internal constructions get the same scheduling behavior as
mip_solver_settings_t (ensure the bool gpu_heur_wait_for_exploration{...}
initialization is set to false).

Comment on lines +100 to +103
detail::diversity_manager_t<int, double> diversity_manager(solver.context);
solver.context.gpu_heur_loop.deterministic = true;
diversity_manager.timer =
work_limit_timer_t(solver.context.gpu_heur_loop, settings.work_limit, timer);

⚠️ Potential issue | 🟠 Major

Use solver.context.gpu_heur_loop as the deterministic work context in these helpers too.

The nested timers inside local search/FJ read solver.context.gpu_heur_loop, not the ad-hoc work_limit_context_t created here. Unlike test_full_run_determinism(), these helpers never make that shared context deterministic, so they can bypass the timer/work-accounting path the test is supposed to validate.

💡 Minimal fix

Apply the same pattern used by test_full_run_determinism() in both helpers:

-  work_limit_context_t work_limit_context("DiversityManager");
-  work_limit_context.deterministic = true;
-  diversity_manager.timer          = work_limit_timer_t(work_limit_context, 60000, timer);
+  solver.context.gpu_heur_loop.deterministic = true;
+  diversity_manager.timer =
+    work_limit_timer_t(solver.context.gpu_heur_loop, 60000, timer);

Also applies to: 147-153, 196-202

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/tests/mip/diversity_test.cu` around lines 100 - 103, The helpers create
an ad-hoc work_limit_context_t and attach a work_limit_timer_t to
diversity_manager.timer instead of using the shared deterministic context;
change them to use solver.context.gpu_heur_loop as the deterministic work
context (set solver.context.gpu_heur_loop.deterministic = true where needed) and
construct the work_limit_timer_t and any nested timers with
solver.context.gpu_heur_loop and the existing timer variable so the
nested/local-search/FJ timers read the same deterministic context as in
test_full_run_determinism(). Ensure you update the corresponding locations where
diversity_manager, work_limit_timer_t, and timer are set (also apply the same
change to the other two helper blocks mentioned).

Comment on lines +193 to +198
static std::unordered_map<std::string, double> first_val_map;
if (first_val_map.count(test_instance) == 0) {
first_val_map[test_instance] = solution.get_user_objective();
}
EXPECT_NEAR(solution.get_user_objective(), first_val_map[test_instance], 1.0)
<< test_instance << " determinism objective mismatch";

⚠️ Potential issue | 🟠 Major

This does not verify determinism, only “similar objective.”

EXPECT_NEAR(..., 1.0) will still pass for different incumbents, feasibility states, or assignments as long as the user objective stays within 1. For the determinism regression in this PR, compare the exact feasible flag plus the produced assignment/objective byte-for-byte across runs.
As per coding guidelines, "Write tests validating numerical correctness of optimization results (not just 'runs without error'); test degenerate cases (infeasible, unbounded, empty, singleton problems)."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/tests/mip/feasibility_jump_tests.cu` around lines 193 - 198, The current
check using first_val_map and EXPECT_NEAR only asserts similar numeric
objective, not exact determinism; change the test to record and compare a full
deterministic serialization of the solution (include solution.is_feasible()
flag, solution.get_user_objective(), and the produced assignment/variable
values) for each test_instance. Replace first_val_map with a map from
test_instance to the serialized result (e.g., string/byte vector) and on
subsequent runs serialize the current solution the same way and ASSERT_EQ the
stored bytes with the current bytes. Use the existing symbols
solution.get_user_objective(), the feasible flag accessor (e.g.,
solution.is_feasible()), and however assignments are exposed to build a
byte-for-byte representation to compare exact equality rather than EXPECT_NEAR.

Comment on lines +138 to +144
auto sorted_map = std::map<int, detail::cached_bound_t<double>>(
a.second[0].var_to_cached_bound_map.begin(), a.second[0].var_to_cached_bound_map.end());
for (const auto& [var_id, cached_bound] : sorted_map) {
var_to_cached_bound_keys.push_back(var_id);
var_to_cached_bound_lb.push_back(cached_bound.lb);
var_to_cached_bound_ub.push_back(cached_bound.ub);
}

⚠️ Potential issue | 🟠 Major

Include both cached-bound maps in the determinism hash.

The composite hash only serializes a.second[0].var_to_cached_bound_map. Any nondeterminism in the [1] probe direction is invisible to the test.

💡 Minimal fix
-    auto sorted_map = std::map<int, detail::cached_bound_t<double>>(
-      a.second[0].var_to_cached_bound_map.begin(), a.second[0].var_to_cached_bound_map.end());
-    for (const auto& [var_id, cached_bound] : sorted_map) {
-      var_to_cached_bound_keys.push_back(var_id);
-      var_to_cached_bound_lb.push_back(cached_bound.lb);
-      var_to_cached_bound_ub.push_back(cached_bound.ub);
-    }
+    for (int side = 0; side < 2; ++side) {
+      auto sorted_map = std::map<int, detail::cached_bound_t<double>>(
+        a.second[side].var_to_cached_bound_map.begin(),
+        a.second[side].var_to_cached_bound_map.end());
+      for (const auto& [var_id, cached_bound] : sorted_map) {
+        var_to_cached_bound_keys.push_back(var_id);
+        var_to_cached_bound_lb.push_back(cached_bound.lb);
+        var_to_cached_bound_ub.push_back(cached_bound.ub);
+      }
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/tests/mip/presolve_test.cu` around lines 138 - 144, The determinism
serialization currently only reads a.second[0].var_to_cached_bound_map into
sorted_map, so nondeterminism from the [1] probe is ignored; update the logic
that builds
var_to_cached_bound_keys/var_to_cached_bound_lb/var_to_cached_bound_ub to
include both a.second[0].var_to_cached_bound_map and
a.second[1].var_to_cached_bound_map (e.g., iterate index 0 and 1, or merge both
maps before creating sorted_map) so that both cached-bound maps are serialized
into the hash; ensure you use the same sorting/iteration approach and push_back
the corresponding var_id, cached_bound.lb and cached_bound.ub for entries from
both maps.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cpp/src/branch_and_bound/pseudo_costs.cpp (1)

1184-1202: ⚠️ Potential issue | 🟠 Major

Forward work_unit_context into strong_branch_helper(...) call.

At Line 1184, strong_branch_helper(...) is invoked without the new work_unit_context parameter, so Line 307 falls back to nullptr and deterministic work accounting is dropped in this DS path.

Suggested fix
             strong_branch_helper(start,
                                  end,
                                  start_time,
                                  original_lp,
                                  settings,
                                  var_types,
                                  fractional,
                                  root_solution.x,
                                  root_vstatus,
                                  edge_norms,
                                  root_obj,
                                  upper_bound,
                                  simplex_iteration_limit,
                                  pc,
                                  dual_simplex_obj_down,
                                  dual_simplex_obj_up,
                                  dual_simplex_status_down,
                                  dual_simplex_status_up,
-                                 sb_view);
+                                 sb_view,
+                                 work_unit_context);

As per coding guidelines: "Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/branch_and_bound/pseudo_costs.cpp` around lines 1184 - 1202, The call
to strong_branch_helper(...) at the shown site omits the new work_unit_context
parameter so the helper receives nullptr (losing deterministic work accounting);
update the invocation to pass the existing work_unit_context variable into
strong_branch_helper, and ensure the helper's signature and all other call sites
accept the extra parameter (verify function strong_branch_helper and its
declaration are updated accordingly) so deterministic work accounting is
preserved on this DS path.
♻️ Duplicate comments (7)
cpp/src/branch_and_bound/pseudo_costs.cpp (1)

2086-2094: ⚠️ Potential issue | 🟡 Minor

Guard pseudo-cost updates against near-zero fractional distances.

Line 2088 and Line 2093 divide by frac without a minimum threshold; near-integer values can amplify noise or spike pseudo-costs.

Suggested guard
   for (i_t k = 0; k < fractional.size(); k++) {
     const i_t j = fractional[k];
+    constexpr f_t frac_eps = 1e-8;
     if (!std::isnan(strong_branch_down[k])) {
       const f_t frac = root_soln[j] - std::floor(root_soln[j]);
-      pseudo_cost_sum_down[j] += strong_branch_down[k] / frac;
-      pseudo_cost_num_down[j]++;
+      if (frac > frac_eps) {
+        pseudo_cost_sum_down[j] += strong_branch_down[k] / frac;
+        pseudo_cost_num_down[j]++;
+      }
     }
     if (!std::isnan(strong_branch_up[k])) {
       const f_t frac = std::ceil(root_soln[j]) - root_soln[j];
-      pseudo_cost_sum_up[j] += strong_branch_up[k] / frac;
-      pseudo_cost_num_up[j]++;
+      if (frac > frac_eps) {
+        pseudo_cost_sum_up[j] += strong_branch_up[k] / frac;
+        pseudo_cost_num_up[j]++;
+      }
     }
   }

As per coding guidelines: "Check numerical stability: prevent overflow/underflow, precision loss, division by zero/near-zero, and use epsilon comparisons for floating-point equality checks."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/branch_and_bound/pseudo_costs.cpp` around lines 2086 - 2094, The
pseudo-cost updates in the block that updates
pseudo_cost_sum_down[j]/pseudo_cost_sum_up[j] (using strong_branch_down[k],
strong_branch_up[k] and root_soln[j]) divide by frac computed as
root_soln[j]-floor(root_soln[j]) and ceil(root_soln[j])-root_soln[j] without
guarding tiny denominators; change the logic in that section to protect against
near-zero frac by either clamping frac = std::max(frac, eps) or skipping the
update when frac < eps (choose a small constant eps, e.g. 1e-12 or a
configurable constant), so that pseudo_cost_num_down/up are only incremented
when a safe division occurred and no large spikes are introduced.
cpp/src/mip_heuristics/local_search/rounding/lb_constraint_prop.cuh (1)

103-103: ⚠️ Potential issue | 🟡 Minor

Latent issue: termination_checker_t member lacks initializer.

The termination_checker_t max_timer; declaration has no initializer, but termination_checker_t has no default constructor (per context snippet 1 showing only explicit constructors). This would cause a compilation error.

However, based on learnings, this lb_constraint_prop path is not currently compiled, so this is a latent issue that will only surface if the path is re-enabled. When that happens, the member will need explicit initialization similar to constraint_prop_t:

termination_checker_t max_timer{0.0, cuopt::termination_checker_t::root_tag_t{}};
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/local_search/rounding/lb_constraint_prop.cuh` at line
103, The member declaration termination_checker_t max_timer in
lb_constraint_prop lacks an initializer and will fail to compile because
termination_checker_t has no default constructor; initialize max_timer
explicitly (same pattern used for constraint_prop_t) by constructing it with a
zero timeout and the root_tag_t sentinel (use termination_checker_t{0.0,
cuopt::termination_checker_t::root_tag_t{}} or equivalent) inside the
lb_constraint_prop class so the member is properly initialized when that path is
enabled.
cpp/src/mip_heuristics/local_search/local_search.cu (1)

233-249: ⚠️ Potential issue | 🟠 Major

Account the helper CPUFJ threads against one deterministic budget.

In deterministic mode, each ls_cpu_fj worker gets an independent work_budget = time_limit (line 247). This means with multiple helpers, the aggregate work consumed can exceed the caller's budget, and timestamps won't align with the single work-total that B&B synchronizes against.

Unlike start_cpufj_deterministic() (lines 148-190) which registers with producer_sync, these helpers operate with isolated budgets.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/local_search/local_search.cu` around lines 233 - 249,
The helper CPUFJ threads in ls_cpu_fj are being given independent deterministic
budgets (setting cpu_fj.fj_cpu->work_units_elapsed = 0.0 and
cpu_fj.fj_cpu->work_budget = time_limit), which allows aggregate work to exceed
the single deterministic budget; instead, mirror start_cpufj_deterministic() and
register each helper with the shared producer_sync so they are charged against
the common deterministic budget (or otherwise attach the same shared budget
handle) after create_cpu_climber(...) rather than assigning an isolated
work_budget, ensuring helpers use the synchronized producer_sync accounting.
cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cu (2)

1037-1041: ⚠️ Potential issue | 🟠 Major

Graph batch can overshoot deterministic budget.

When deterministic_work_estimate is true, use_graph remains true, so run_step_device() executes a full 50-iteration batch. The deterministic_batch_work is only charged to the timer after the batch completes (lines 1107-1112). This allows a tiny positive budget to overshoot by an entire batch, shifting work timestamps used for deterministic publication/sync.

Consider disabling graph mode when the remaining deterministic budget is less than a full batch's estimated work.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cu` around lines
1037 - 1041, The current logic keeps use_graph=true even when
deterministic_work_estimate is set, allowing run_step_device() to execute a full
iterations_per_graph batch and overshoot the deterministic budget; modify the
decision so that when deterministic_work_estimate is true and the remaining
deterministic budget (compare against deterministic_batch_work) is less than the
estimated work for a full graph batch, disable graph mode or force
iterations_per_batch=1 (i.e., set use_graph=false or set iterations_per_batch =
1) before calling run_step_device(), referencing the variables
deterministic_work_estimate, use_graph, iterations_per_batch, run_step_device,
and deterministic_batch_work so the small remaining budget is respected and work
is charged before any multi-iteration graph run.

995-1017: ⚠️ Potential issue | 🟠 Major

Add CUDA error checking after the kernel launch.

The init_lhs_and_violated_constraints kernel launch at line 996 lacks immediate error checking. Any failure will only surface later from the subsequent thrust::transform_reduce calls, making the fault harder to localize.

💡 Suggested fix
   data.violated_constraints.clear(stream);
   init_lhs_and_violated_constraints<i_t, f_t><<<4096, 256, 0, stream>>>(v);
+  RAFT_CHECK_CUDA(stream);
   // both transformreduce could be fused; but oh well hardly a bottleneck

As per coding guidelines, "Every CUDA kernel launch and memory operation must have error checking with CUDA_CHECK or equivalent verification."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cu` around lines 995
- 1017, The kernel launch init_lhs_and_violated_constraints<i_t, f_t><<<4096,
256, 0, stream>>>(v) lacks immediate CUDA error checking; add a post-launch
check using the project's CUDA_CHECK (or equivalent) macro/function right after
that launch (and before any thrust::transform_reduce calls) to capture launch
failures on stream, e.g., call CUDA_CHECK(cudaGetLastError()) and/or
CUDA_CHECK(cudaPeekAtLastError()/cudaStreamSynchronize(stream)) as appropriate
so failures are reported at the kernel boundary; update surrounding code in
feasibility_jump.cu near init_lhs_and_violated_constraints,
data.violated_constraints.clear, and before the transform_reduce invocations to
perform this check.
cpp/tests/mip/diversity_test.cu (2)

147-153: ⚠️ Potential issue | 🟠 Major

Use solver.context.gpu_heur_loop as the deterministic work context.

This helper creates an ad-hoc work_limit_context_t instead of using the shared context from solver.context.gpu_heur_loop. The nested timers inside local search/FJ read solver.context.gpu_heur_loop, not this ad-hoc context, so they can bypass the work-accounting path the test is supposed to validate.

💡 Suggested fix
   detail::diversity_manager_t<int, double> diversity_manager(solver.context);
   solver.context.diversity_manager_ptr = &diversity_manager;
-  work_limit_context_t work_limit_context("DiversityManager");
-  work_limit_context.deterministic = true;
-  diversity_manager.timer          = termination_checker_t(work_limit_context, 60000, timer);
+  solver.context.gpu_heur_loop.deterministic = true;
+  diversity_manager.timer =
+    termination_checker_t(solver.context.gpu_heur_loop, 60000, timer);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/tests/mip/diversity_test.cu` around lines 147 - 153, The test creates an
ad-hoc work_limit_context_t and assigns diversity_manager.timer from it, which
bypasses the shared GPU heuristic work accounting; replace that ad-hoc context
with the shared solver.context.gpu_heur_loop: remove the local
work_limit_context_t creation and set diversity_manager.timer =
termination_checker_t(solver.context.gpu_heur_loop, 60000, timer) so nested
timers inside local search/FJ use solver.context.gpu_heur_loop; keep the rest of
the setup (diversity_manager,
diversity_manager.diversity_config.initial_solution_only,
solver.context.diversity_manager_ptr and diversity_manager.run_solver())
unchanged.

196-202: ⚠️ Potential issue | 🟠 Major

Use solver.context.gpu_heur_loop as the deterministic work context.

Same issue as test_initial_solution_determinism - the ad-hoc work_limit_context_t is not connected to the nested timers in FJ/local search.

💡 Suggested fix
   detail::diversity_manager_t<int, double> diversity_manager(solver.context);
   solver.context.diversity_manager_ptr = &diversity_manager;
-  work_limit_context_t work_limit_context("DiversityManager");
-  work_limit_context.deterministic = true;
-  diversity_manager.timer          = termination_checker_t(work_limit_context, 60000, timer);
+  solver.context.gpu_heur_loop.deterministic = true;
+  diversity_manager.timer =
+    termination_checker_t(solver.context.gpu_heur_loop, 60000, timer);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/tests/mip/diversity_test.cu` around lines 196 - 202, The test creates an
ad-hoc work_limit_context_t and uses it to build the termination_checker_t for
diversity_manager (detail::diversity_manager_t<int,double>
diversity_manager(...)), but this isn’t hooked into nested timers; replace the
local work_limit_context_t with the existing solver.context.gpu_heur_loop: set
solver.context.gpu_heur_loop.deterministic = true (or ensure it is true),
construct diversity_manager.timer =
termination_checker_t(solver.context.gpu_heur_loop, 60000, timer), remove the
standalone work_limit_context_t, and keep
diversity_manager.diversity_config.dry_run = true before calling
diversity_manager.run_solver() so the nested FJ/local search timers use the
correct deterministic context.
🧹 Nitpick comments (5)
cpp/src/mip_heuristics/relaxed_lp/relaxed_lp.cu (2)

74-90: Redundant condition and dead code in determinism mode block.

The check !std::isinf(work_limit) on line 79 is redundant because determinism_mode on line 74 is only true when std::isfinite(work_limit) (line 68), which already excludes infinity. Consequently, the else branch (lines 88-90) setting estim_iters = std::numeric_limits<int>::max() is dead code.

♻️ Proposed simplification
   if (determinism_mode) {
     // try to estimate the iteration count based on the requested work limit
     // TODO: replace with an actual model. this is a rather ugly hack to avoid having
     // to touch the PDLP code for this initial PR
     estim_iters = 100;
-    if (!std::isinf(work_limit)) {
-      do {
-        // TODO: use an actual predictor model here
-        double estim_ms = 313 + 200 * op_problem.n_variables - 400 * op_problem.n_constraints +
-                          600 * op_problem.coefficients.size() + 7100 * estim_iters;
-        estim_ms = std::max(0.0, estim_ms);
-        if (estim_ms > work_limit * 1000) { break; }
-        estim_iters += 100;
-      } while (true);
-    } else {
-      estim_iters = std::numeric_limits<int>::max();
-    }
+    do {
+      // TODO: use an actual predictor model here
+      double estim_ms = 313 + 200 * op_problem.n_variables - 400 * op_problem.n_constraints +
+                        600 * op_problem.coefficients.size() + 7100 * estim_iters;
+      estim_ms = std::max(0.0, estim_ms);
+      if (estim_ms > work_limit * 1000) { break; }
+      estim_iters += 100;
+    } while (true);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/relaxed_lp/relaxed_lp.cu` around lines 74 - 90, The
determinism_mode block contains a redundant check for !std::isinf(work_limit)
and a dead else branch; remove the inner if/else and always run the estimators
loop because determinism_mode is only set when std::isfinite(work_limit)
earlier, so delete the else that sets estim_iters =
std::numeric_limits<int>::max() and collapse the loop to run unconditionally
inside the determinism_mode branch, keeping the estim_iters initialization and
the loop that computes estim_ms using op_problem.n_variables,
op_problem.n_constraints, op_problem.coefficients.size(), and updating
estim_iters.

23-23: Unused include: <atomic> is not used in this file.

The #include <atomic> header was added but no std::atomic types are used. The static counter on line 53 uses plain uint64_t. Consider removing the unused include or converting lp_call_counter to std::atomic<uint64_t> if thread-safety is intended.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/relaxed_lp/relaxed_lp.cu` at line 23, The include
<atomic> is unused or lp_call_counter should be atomic for thread-safety; either
remove the unused `#include` <atomic> or change the static counter lp_call_counter
(currently a uint64_t) to std::atomic<uint64_t> and update its
increment/use sites accordingly (use fetch_add or ++ semantics) so thread-safe
increments are correct; ensure to include <atomic> only when converting
lp_call_counter to std::atomic<uint64_t>.
cpp/tests/mip/local_search_test.cu (2)

196-197: Unused spin_stream_raii_t variables.

These RAII objects are declared but appear unused in the test. If they're intended to keep CUDA streams alive during the test, consider adding a comment explaining their purpose. Otherwise, they can be removed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/tests/mip/local_search_test.cu` around lines 196 - 197, The two RAII
stream objects spin_stream_raii_t spin_stream_1 and spin_stream_raii_t
spin_stream_2 are declared but never used; either remove these unused
declarations or explicitly document their purpose if they are intended to keep
CUDA streams alive (add a comment above spin_stream_1/spin_stream_2 stating they
intentionally hold/flush streams for the test), and if they must be used, ensure
they're constructed with the appropriate stream handles where the test creates
CUDA streams so the RAII actually manages those resources.

231-236: Consider enabling FJ_ON_ZERO mode in the test suite.

The FJ_ON_ZERO mode is defined in the enum (line 74) and has a corresponding execution path (lines 162-164), but it's commented out in the test instantiation. If this mode is intended to be deterministic, it should be tested.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/tests/mip/local_search_test.cu` around lines 231 - 236, Enable the
commented-out FJ_ON_ZERO test case in the test instantiation: update the
INSTANTIATE_TEST_SUITE_P call for LocalSearchTests/LocalSearchTestParams to
include std::make_tuple(local_search_mode_t::FJ_ON_ZERO) alongside FP,
FJ_LINE_SEGMENT and FJ_ANNEALING so the FJ_ON_ZERO execution path is covered by
the test suite; ensure you remove the comment markers around the
std::make_tuple(local_search_mode_t::FJ_ON_ZERO) entry so the test framework
will instantiate it.
cpp/tests/mip/diversity_test.cu (1)

240-241: Structured binding discards the third return value.

The recombine function returns std::tuple<solution, bool, double>, but only two elements are captured. A structured binding must name every element of the tuple, so if recombine really returns a three-element tuple this would not compile; either way, binding the third element explicitly makes the intent clear.

💡 Suggested fix
-        auto [offspring, success] =
+        auto [offspring, success, work_done] =
           diversity_manager.recombine(pop_vector[i], pop_vector[j], recombiner);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/tests/mip/diversity_test.cu` around lines 240 - 241, The structured
binding is discarding the third value from diversity_manager.recombine which
returns std::tuple<solution, bool, double>; update the capture to explicitly
bind the third element (for example capture it as a named variable like "metric",
or switch to std::tie with std::ignore, since std::ignore cannot appear in a
structured binding) so the intent is clear when calling
recombine and avoid silent discarding of the double return; update the line that
currently binds "offspring" and "success" to also bind the third value from
recombine.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cpp/src/mip_heuristics/diversity/recombiners/fp_recombiner.cuh`:
- Around line 68-69: The current TODO placeholder sets work to a small static
value and then returns 0.0 on LP-based early exits, losing the work already
spent by get_relaxed_lp_solution; update the code in fp_recombiner.cuh so the
work variable (currently computed from n_vars_from_other) accumulates the LP
pass cost and is returned on all early-exit paths. Concretely: in the scope
around get_relaxed_lp_solution(...) ensure you add the LP work contribution to
the existing work variable (reference n_vars_from_other and work) and return
that accumulated work instead of 0.0; apply the same change to the other
early-exit sites mentioned (around the lines corresponding to 100-105) so
recombine_stats.set_recombiner_work(work) receives the correct nonzero value.

In `@cpp/src/mip_heuristics/problem/presolve_data.cu`:
- Around line 236-241: The code synchronizes the wrong handle's stream: replace
the call to problem.handle_ptr->sync_stream() with a sync on the actual handle
used for subsequent GPU work (the local pointer h), e.g. call h->sync_stream()
or otherwise synchronize h->get_stream() before invoking cuopt::host_copy and
other GPU operations; update any related comments to reflect that
synchronization targets the chosen handle (h) to avoid mismatched streams when
handle_override is provided.

In `@cpp/src/mip_heuristics/relaxed_lp/relaxed_lp.cu`:
- Around line 53-54: The static non-atomic lp_call_counter causes a data race
when multiple threads call get_relaxed_lp_solution; change lp_call_counter to an
atomic (e.g., std::atomic<uint64_t>) and obtain lp_call_id using an atomic
fetch-add (or equivalent) so increments are thread-safe, updating references to
lp_call_counter and lp_call_id in get_relaxed_lp_solution accordingly.

In `@cpp/src/mip_heuristics/solver.cu`:
- Around line 491-496: The async lambda that calls
RAFT_CUDA_TRY(cudaSetDevice(bb_device_id)) must verify the result and fail fast
on error: replace the bare cudaSetDevice call with the project’s CUDA
error-checking helper (e.g., RAFT_CUDA_TRY or CUDA_CHECK) or explicitly check
cudaError_t and handle failures before calling branch_and_bound->solve; update
the lambda that captures bb_device_id and calls
branch_and_bound->solve(branch_and_bound_solution) so it logs/propagates the
cudaSetDevice error (using the same error handling pattern used elsewhere) and
returns/throws an error instead of proceeding to solve when device
initialization fails.

In `@cpp/tests/mip/diversity_test.cu`:
- Around line 267-279: The block that collects hashes and computes final_hash
(using diversity_manager.get_population_pointer(), pop->population_to_vector(),
detail::compute_hash(hashes), and the printf that references path and
final_hash) is dead/unreachable because an earlier return happens; remove this
unreachable code or move its logic before the earlier return so only reachable
code remains—specifically delete the loop that pushes sol.get_hash(), the
compute_hash call, the printf, and the trailing return final_hash; ensure no
references to final_hash or hashes remain after the earlier return.

---

Outside diff comments:
In `@cpp/src/branch_and_bound/pseudo_costs.cpp`:
- Around line 1184-1202: The call to strong_branch_helper(...) at the shown site
omits the new work_unit_context parameter so the helper receives nullptr (losing
deterministic work accounting); update the invocation to pass the existing
work_unit_context variable into strong_branch_helper, and ensure the helper's
signature and all other call sites accept the extra parameter (verify function
strong_branch_helper and its declaration are updated accordingly) so
deterministic work accounting is preserved on this DS path.

---

Duplicate comments:
In `@cpp/src/branch_and_bound/pseudo_costs.cpp`:
- Around line 2086-2094: The pseudo-cost updates in the block that updates
pseudo_cost_sum_down[j]/pseudo_cost_sum_up[j] (using strong_branch_down[k],
strong_branch_up[k] and root_soln[j]) divide by frac computed as
root_soln[j]-floor(root_soln[j]) and ceil(root_soln[j])-root_soln[j] without
guarding tiny denominators; change the logic in that section to protect against
near-zero frac by either clamping frac = std::max(frac, eps) or skipping the
update when frac < eps (choose a small constant eps, e.g. 1e-12 or a
configurable constant), so that pseudo_cost_num_down/up are only incremented
when a safe division occurred and no large spikes are introduced.

In `@cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cu`:
- Around line 1037-1041: The current logic keeps use_graph=true even when
deterministic_work_estimate is set, allowing run_step_device() to execute a full
iterations_per_graph batch and overshoot the deterministic budget; modify the
decision so that when deterministic_work_estimate is true and the remaining
deterministic budget (compare against deterministic_batch_work) is less than the
estimated work for a full graph batch, disable graph mode or force
iterations_per_batch=1 (i.e., set use_graph=false or set iterations_per_batch =
1) before calling run_step_device(), referencing the variables
deterministic_work_estimate, use_graph, iterations_per_batch, run_step_device,
and deterministic_batch_work so the small remaining budget is respected and work
is charged before any multi-iteration graph run.
- Around line 995-1017: The kernel launch init_lhs_and_violated_constraints<i_t,
f_t><<<4096, 256, 0, stream>>>(v) lacks immediate CUDA error checking; add a
post-launch check using the project's CUDA_CHECK (or equivalent) macro/function
right after that launch (and before any thrust::transform_reduce calls) to
capture launch failures on stream, e.g., call CUDA_CHECK(cudaGetLastError())
and/or CUDA_CHECK(cudaPeekAtLastError()/cudaStreamSynchronize(stream)) as
appropriate so failures are reported at the kernel boundary; update surrounding
code in feasibility_jump.cu near init_lhs_and_violated_constraints,
data.violated_constraints.clear, and before the transform_reduce invocations to
perform this check.

In `@cpp/src/mip_heuristics/local_search/local_search.cu`:
- Around line 233-249: The helper CPUFJ threads in ls_cpu_fj are being given
independent deterministic budgets (setting cpu_fj.fj_cpu->work_units_elapsed =
0.0 and cpu_fj.fj_cpu->work_budget = time_limit), which allows aggregate work to
exceed the single deterministic budget; instead, mirror
start_cpufj_deterministic() and register each helper with the shared
producer_sync so they are charged against the common deterministic budget (or
otherwise attach the same shared budget handle) after create_cpu_climber(...)
rather than assigning an isolated work_budget, ensuring helpers use the
synchronized producer_sync accounting.

In `@cpp/src/mip_heuristics/local_search/rounding/lb_constraint_prop.cuh`:
- Line 103: The member declaration termination_checker_t max_timer in
lb_constraint_prop lacks an initializer and will fail to compile because
termination_checker_t has no default constructor; initialize max_timer
explicitly (same pattern used for constraint_prop_t) by constructing it with a
zero timeout and the root_tag_t sentinel (use termination_checker_t{0.0,
cuopt::termination_checker_t::root_tag_t{}} or equivalent) inside the
lb_constraint_prop class so the member is properly initialized when that path is
enabled.

In `@cpp/tests/mip/diversity_test.cu`:
- Around line 147-153: The test creates an ad-hoc work_limit_context_t and
assigns diversity_manager.timer from it, which bypasses the shared GPU heuristic
work accounting; replace that ad-hoc context with the shared
solver.context.gpu_heur_loop: remove the local work_limit_context_t creation and
set diversity_manager.timer =
termination_checker_t(solver.context.gpu_heur_loop, 60000, timer) so nested
timers inside local search/FJ use solver.context.gpu_heur_loop; keep the rest of
the setup (diversity_manager,
diversity_manager.diversity_config.initial_solution_only,
solver.context.diversity_manager_ptr and diversity_manager.run_solver())
unchanged.
- Around line 196-202: The test creates an ad-hoc work_limit_context_t and uses
it to build the termination_checker_t for diversity_manager
(detail::diversity_manager_t<int,double> diversity_manager(...)), but this isn’t
hooked into nested timers; replace the local work_limit_context_t with the
existing solver.context.gpu_heur_loop: set
solver.context.gpu_heur_loop.deterministic = true (or ensure it is true),
construct diversity_manager.timer =
termination_checker_t(solver.context.gpu_heur_loop, 60000, timer), remove the
standalone work_limit_context_t, and keep
diversity_manager.diversity_config.dry_run = true before calling
diversity_manager.run_solver() so the nested FJ/local search timers use the
correct deterministic context.

---

Nitpick comments:
In `@cpp/src/mip_heuristics/relaxed_lp/relaxed_lp.cu`:
- Around line 74-90: The determinism_mode block contains a redundant check for
!std::isinf(work_limit) and a dead else branch; remove the inner if/else and
always run the estimators loop because determinism_mode is only set when
std::isfinite(work_limit) earlier, so delete the else that sets estim_iters =
std::numeric_limits<int>::max() and collapse the loop to run unconditionally
inside the determinism_mode branch, keeping the estim_iters initialization and
the loop that computes estim_ms using op_problem.n_variables,
op_problem.n_constraints, op_problem.coefficients.size(), and updating
estim_iters.
- Line 23: The include <atomic> is unused or lp_call_counter should be atomic
for thread-safety; either remove the unused `#include` <atomic> or change the
static counter lp_call_counter (currently a uint64_t) to std::atomic<uint64_t>
and update its increment/use sites accordingly (use fetch_add or ++ semantics) so
thread-safe increments are correct; ensure to include <atomic> only when
converting lp_call_counter to std::atomic<uint64_t>.

In `@cpp/tests/mip/diversity_test.cu`:
- Around line 240-241: The structured binding is discarding the third value from
diversity_manager.recombine which returns std::tuple<solution, bool, double>;
update the capture to explicitly bind the third element (for example capture it as
a named variable like "metric", or switch to std::tie with std::ignore if you
intend to discard it, since std::ignore cannot appear in a structured binding)
so the intent is clear when calling recombine and avoid silent discarding of the
double return; update the line that currently binds "offspring" and "success" to
also bind the third value from recombine.

In `@cpp/tests/mip/local_search_test.cu`:
- Around line 196-197: The two RAII stream objects spin_stream_raii_t
spin_stream_1 and spin_stream_raii_t spin_stream_2 are declared but never used;
either remove these unused declarations or explicitly document their purpose if
they are intended to keep CUDA streams alive (add a comment above
spin_stream_1/spin_stream_2 stating they intentionally hold/flush streams for
the test), and if they must be used, ensure they're constructed with the
appropriate stream handles where the test creates CUDA streams so the RAII
actually manages those resources.
- Around line 231-236: Enable the commented-out FJ_ON_ZERO test case in the test
instantiation: update the INSTANTIATE_TEST_SUITE_P call for
LocalSearchTests/LocalSearchTestParams to include
std::make_tuple(local_search_mode_t::FJ_ON_ZERO) alongside FP, FJ_LINE_SEGMENT
and FJ_ANNEALING so the FJ_ON_ZERO execution path is covered by the test suite;
ensure you remove the comment markers around the
std::make_tuple(local_search_mode_t::FJ_ON_ZERO) entry so the test framework
will instantiate it.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 85b5b753-f0c9-41b8-be6a-0d56784ce4d5

📥 Commits

Reviewing files that changed from the base of the PR and between 629f681 and 40a72db.

📒 Files selected for processing (55)
  • cpp/CMakeLists.txt
  • cpp/include/cuopt/linear_programming/constants.h
  • cpp/include/cuopt/linear_programming/utilities/internals.hpp
  • cpp/src/branch_and_bound/pseudo_costs.cpp
  • cpp/src/mip_heuristics/diversity/diversity_manager.cu
  • cpp/src/mip_heuristics/diversity/diversity_manager.cuh
  • cpp/src/mip_heuristics/diversity/population.cu
  • cpp/src/mip_heuristics/diversity/population.cuh
  • cpp/src/mip_heuristics/diversity/recombiners/bound_prop_recombiner.cuh
  • cpp/src/mip_heuristics/diversity/recombiners/fp_recombiner.cuh
  • cpp/src/mip_heuristics/diversity/recombiners/line_segment_recombiner.cuh
  • cpp/src/mip_heuristics/early_heuristic.cuh
  • cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cu
  • cpp/src/mip_heuristics/local_search/feasibility_pump/feasibility_pump.cu
  • cpp/src/mip_heuristics/local_search/feasibility_pump/feasibility_pump.cuh
  • cpp/src/mip_heuristics/local_search/line_segment_search/line_segment_search.cu
  • cpp/src/mip_heuristics/local_search/line_segment_search/line_segment_search.cuh
  • cpp/src/mip_heuristics/local_search/local_search.cu
  • cpp/src/mip_heuristics/local_search/local_search.cuh
  • cpp/src/mip_heuristics/local_search/rounding/bounds_repair.cu
  • cpp/src/mip_heuristics/local_search/rounding/bounds_repair.cuh
  • cpp/src/mip_heuristics/local_search/rounding/constraint_prop.cu
  • cpp/src/mip_heuristics/local_search/rounding/constraint_prop.cuh
  • cpp/src/mip_heuristics/local_search/rounding/lb_bounds_repair.cu
  • cpp/src/mip_heuristics/local_search/rounding/lb_bounds_repair.cuh
  • cpp/src/mip_heuristics/local_search/rounding/lb_constraint_prop.cu
  • cpp/src/mip_heuristics/local_search/rounding/lb_constraint_prop.cuh
  • cpp/src/mip_heuristics/presolve/bounds_presolve.cu
  • cpp/src/mip_heuristics/presolve/bounds_presolve.cuh
  • cpp/src/mip_heuristics/presolve/bounds_update_data.cu
  • cpp/src/mip_heuristics/presolve/lb_probing_cache.cu
  • cpp/src/mip_heuristics/presolve/load_balanced_bounds_presolve.cu
  • cpp/src/mip_heuristics/presolve/load_balanced_bounds_presolve.cuh
  • cpp/src/mip_heuristics/presolve/multi_probe.cu
  • cpp/src/mip_heuristics/presolve/multi_probe.cuh
  • cpp/src/mip_heuristics/presolve/probing_cache.cu
  • cpp/src/mip_heuristics/presolve/probing_cache.cuh
  • cpp/src/mip_heuristics/problem/presolve_data.cu
  • cpp/src/mip_heuristics/problem/presolve_data.cuh
  • cpp/src/mip_heuristics/problem/problem.cu
  • cpp/src/mip_heuristics/relaxed_lp/relaxed_lp.cu
  • cpp/src/mip_heuristics/solve.cu
  • cpp/src/mip_heuristics/solver.cu
  • cpp/src/mip_heuristics/solver_context.cuh
  • cpp/src/utilities/copy_helpers.hpp
  • cpp/src/utilities/cuda_helpers.cuh
  • cpp/src/utilities/termination_checker.hpp
  • cpp/tests/CMakeLists.txt
  • cpp/tests/mip/CMakeLists.txt
  • cpp/tests/mip/determinism_test.cu
  • cpp/tests/mip/diversity_test.cu
  • cpp/tests/mip/incumbent_callback_test.cu
  • cpp/tests/mip/local_search_test.cu
  • cpp/tests/mip/presolve_test.cu
  • skills/cuopt-developer/SKILL.md
✅ Files skipped from review due to trivial changes (3)
  • skills/cuopt-developer/SKILL.md
  • cpp/tests/CMakeLists.txt
  • cpp/src/mip_heuristics/diversity/diversity_manager.cu
🚧 Files skipped from review as they are similar to previous changes (19)
  • cpp/src/mip_heuristics/presolve/bounds_update_data.cu
  • cpp/src/mip_heuristics/presolve/multi_probe.cuh
  • cpp/src/mip_heuristics/presolve/load_balanced_bounds_presolve.cu
  • cpp/src/mip_heuristics/local_search/feasibility_pump/feasibility_pump.cuh
  • cpp/src/mip_heuristics/local_search/rounding/constraint_prop.cuh
  • cpp/src/mip_heuristics/presolve/lb_probing_cache.cu
  • cpp/src/mip_heuristics/local_search/rounding/bounds_repair.cuh
  • cpp/src/mip_heuristics/presolve/bounds_presolve.cu
  • cpp/src/mip_heuristics/local_search/rounding/lb_bounds_repair.cu
  • cpp/src/mip_heuristics/problem/presolve_data.cuh
  • cpp/src/mip_heuristics/local_search/rounding/lb_constraint_prop.cu
  • cpp/src/utilities/cuda_helpers.cuh
  • cpp/src/mip_heuristics/solver_context.cuh
  • cpp/src/mip_heuristics/local_search/line_segment_search/line_segment_search.cu
  • cpp/src/mip_heuristics/problem/problem.cu
  • cpp/src/mip_heuristics/presolve/probing_cache.cu
  • cpp/tests/mip/determinism_test.cu
  • cpp/src/mip_heuristics/local_search/rounding/bounds_repair.cu
  • cpp/tests/mip/presolve_test.cu

Comment on lines +68 to +69
// TODO: CHANGE
double work = static_cast<double>(n_vars_from_other) / 1e8;

⚠️ Potential issue | 🟠 Major

Don't drop recombiner work on LP-based early exits.

After get_relaxed_lp_solution(...) runs, these branches return 0.0 even though the infeasibility-detection path already spent work. Since cpp/src/mip_heuristics/diversity/diversity_manager.cu:1146-1150 forwards this value into recombine_stats.set_recombiner_work(work), the current code undercounts exactly the expensive failure cases. At minimum, return the accumulated work here; ideally replace the // TODO: CHANGE placeholder with a cost that also covers the LP pass.

Suggested minimal fix
-    // TODO: CHANGE
     double work = static_cast<double>(n_vars_from_other) / 1e8;
@@
       if (lp_response.get_termination_status() == pdlp_termination_status_t::PrimalInfeasible ||
           lp_response.get_termination_status() == pdlp_termination_status_t::DualInfeasible ||
           lp_response.get_termination_status() == pdlp_termination_status_t::TimeLimit) {
         CUOPT_LOG_DEBUG("FP recombiner failed because LP found infeasible!");
-        return std::make_tuple(offspring, false, 0.0);
+        return std::make_tuple(offspring, false, work);
       }

Also applies to: 100-105

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/diversity/recombiners/fp_recombiner.cuh` around lines
68 - 69, The current TODO placeholder sets work to a small static value and then
returns 0.0 on LP-based early exits, losing the work already spent by
get_relaxed_lp_solution; update the code in fp_recombiner.cuh so the work
variable (currently computed from n_vars_from_other) accumulates the LP pass
cost and is returned on all early-exit paths. Concretely: in the scope around
get_relaxed_lp_solution(...) ensure you add the LP work contribution to the
existing work variable (reference n_vars_from_other and work) and return that
accumulated work instead of 0.0; apply the same change to the other early-exit
sites mentioned (around the lines corresponding to 100-105) so
recombine_stats.set_recombiner_work(work) receives the correct nonzero value.

Comment on lines +236 to 241
+const auto* h = handle_override ? handle_override : problem.handle_ptr;
 cuopt_assert(assignment.size() == papilo_reduced_to_original_map.size(),
              "Papilo uncrush assignment size mismatch");
-auto h_assignment = cuopt::host_copy(assignment, problem.handle_ptr->get_stream());
 problem.handle_ptr->sync_stream();
+auto h_assignment = cuopt::host_copy(assignment, h->get_stream());
 std::vector<f_t> full_assignment;

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 1) Inspect all papilo_uncrush_assignment call sites and passed handle argument.
rg -nP -C3 --type=cu --type=cpp --type=cuh '\bpapilo_uncrush_assignment\s*\('

# 2) Inspect the exact sync/copy sequence in the implementation under review.
rg -nP -C8 'handle_override|sync_stream|host_copy\(assignment|raft::copy\(assignment\.data' cpp/src/mip_heuristics/problem/presolve_data.cu

# 3) Inspect host_copy helper implementation to confirm synchronization behavior.
rg -nP -C8 --type=hpp --type=h --type=cu --type=cpp '\bhost_copy\s*\('

Repository: NVIDIA/cuopt

Length of output: 82


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Search for papilo_uncrush_assignment call sites
echo "=== papilo_uncrush_assignment call sites ==="
rg -nP -C3 'papilo_uncrush_assignment\s*\(' --glob='*.cu' --glob='*.cpp' --glob='*.cuh'

echo ""
echo "=== Sync/copy sequence in presolve_data.cu ==="
rg -nP -C8 'handle_override|sync_stream|host_copy\(assignment|raft::copy\(assignment\.data' cpp/src/mip_heuristics/problem/presolve_data.cu

echo ""
echo "=== host_copy helper signature ==="
rg -nP -C8 'host_copy\s*\(' --glob='*.hpp' --glob='*.h' --glob='*.cu' --glob='*.cpp'

Repository: NVIDIA/cuopt

Length of output: 50369


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Read the exact lines from the file to understand context
echo "=== Content around lines 230-250 of presolve_data.cu ==="
sed -n '230,250p' cpp/src/mip_heuristics/problem/presolve_data.cu | cat -n

echo ""
echo "=== Full function context (broader view) ==="
# Find the function containing line 236 and show more context
awk 'NR==200,NR==260' cpp/src/mip_heuristics/problem/presolve_data.cu | cat -n

Repository: NVIDIA/cuopt

Length of output: 3700


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Find where handle_override is defined and used in this function
echo "=== Function containing the code at line 236 ==="
sed -n '180,280p' cpp/src/mip_heuristics/problem/presolve_data.cu | cat -n

Repository: NVIDIA/cuopt

Length of output: 3522


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Find host_copy implementation
echo "=== Searching for host_copy definition/implementation ==="
find . -type f \( -name "*.hpp" -o -name "*.h" -o -name "*.cu" -o -name "*.cpp" \) -exec grep -l "host_copy" {} \; | head -20

# Look for the template definition
echo ""
echo "=== host_copy template definition ==="
rg -A 15 'template.*host_copy\s*\(' --glob='*.hpp' --glob='*.h' | head -80

Repository: NVIDIA/cuopt

Length of output: 2270


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check what host_copy is - look in header files for implementation
echo "=== Looking for host_copy in utilities ==="
find . -path ./build -prune -o -type f -name "*.hpp" -o -name "*.h" | xargs grep -l 'host_copy' | head -10

# Find it with context
echo ""
echo "=== host_copy usage pattern analysis ==="
rg 'host_copy\(' --glob='*.cu' -B2 -A2 | head -100

Repository: NVIDIA/cuopt

Length of output: 7907


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Find and display host_copy implementation
echo "=== host_copy implementation in copy_helpers.hpp ==="
cat cpp/src/utilities/copy_helpers.hpp | head -200

Repository: NVIDIA/cuopt

Length of output: 4676


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# See the rest of host_copy implementation
echo "=== Continuation of copy_helpers.hpp ==="
sed -n '160,220p' cpp/src/utilities/copy_helpers.hpp

Repository: NVIDIA/cuopt

Length of output: 1859


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check where assignment data comes from in the call sites
echo "=== Call site 1: solution_callbacks.cuh:102 context ==="
sed -n '95,110p' cpp/src/mip_heuristics/solution_callbacks.cuh

echo ""
echo "=== Call site 2: solve.cu:113 context ==="
sed -n '108,120p' cpp/src/mip_heuristics/solve.cu

# Also check if temp_sol is using a different stream
echo ""
echo "=== solution_t structure and stream handling ==="
rg -A 5 'struct solution_t|class solution_t' --glob='*.cuh' --glob='*.hpp' cpp/src/mip_heuristics/solution/ | head -60

Repository: NVIDIA/cuopt

Length of output: 2489


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Now let's understand the rmm::device_uvector and stream associations
echo "=== Check how assignment is created/initialized ==="
rg -B5 -A5 'rmm::device_uvector<f_t>& assignment' cpp/src/mip_heuristics/problem/presolve_data.cu | head -40

# Check what stream assignment is associated with
echo ""
echo "=== Looking for assignment initialization in solution_t ==="
rg -B3 -A3 'assignment.*resize|assignment.*rmm' cpp/src/mip_heuristics/solution/solution.cu | head -50

# Check if there are any operations on assignment before papilo_uncrush_assignment is called
echo ""
echo "=== In solution_callbacks.cuh, what happens to assignment before papilo_uncrush_assignment ==="
sed -n '85,105p' cpp/src/mip_heuristics/solution_callbacks.cuh

Repository: NVIDIA/cuopt

Length of output: 4351


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Let's understand stream lifetime in solution_t copy constructor
echo "=== solution_t copy constructor ==="
rg -A 20 'solution_t<i_t, f_t>::solution_t\(const solution_t' cpp/src/mip_heuristics/solution/solution.cu

echo ""
echo "=== solution_t constructor from problem ==="
rg -A 15 'solution_t<i_t, f_t>::solution_t\(problem_t' cpp/src/mip_heuristics/solution/solution.cu | head -40

# Let me check if assignment retains its stream affinity across copies
echo ""
echo "=== rmm::device_uvector stream handling in copy ==="
rg 'assignment = other.assignment|assignment\(' cpp/src/mip_heuristics/solution/solution.cu | head -20

Repository: NVIDIA/cuopt

Length of output: 3059


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Let me trace through the actual execution in solution_callbacks.cuh more carefully
echo "=== Check solution_t constructor with handle_override ==="
sed -n '90,110p' cpp/src/mip_heuristics/solution_callbacks.cuh

echo ""
echo "=== Now let's look at the actual fix - does syncing problem.handle_ptr help or hurt? ==="
# The key question: when handle_override differs from problem.handle_ptr, what stream is assignment on?
echo "When papilo_uncrush_assignment is called with handle_override:"
echo "- h = handle_override (if not nullptr)"
echo "- h = problem.handle_ptr (if nullptr)"
echo ""
echo "The assignment device_uvector - which handle does it use?"
echo "Looking at line 240-246 operations:"
rg -B2 -A10 'const auto\* h = handle_override' cpp/src/mip_heuristics/problem/presolve_data.cu

echo ""
echo "=== Check if there are similar mixed-handle sync patterns elsewhere ==="
rg 'problem\.handle_ptr->sync|h->sync|handle_override' cpp/src/mip_heuristics/problem/presolve_data.cu

Repository: NVIDIA/cuopt

Length of output: 2706


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Search for similar patterns: sync on one handle, use another handle's stream
echo "=== Looking for mixed-handle sync/stream patterns ==="
rg -B3 'sync_stream\(\)' cpp/src/mip_heuristics/problem/ --glob='*.cu' | grep -A3 -B3 'handle_override\|h->get_stream\|h->sync'

echo ""
echo "=== Confirm the issue: in solution_callbacks.cuh, sol.handle_ptr vs problem_ptr->handle_ptr ==="
sed -n '88,105p' cpp/src/mip_heuristics/solution_callbacks.cuh | cat -n

Repository: NVIDIA/cuopt

Length of output: 2727


Synchronize the correct stream before GPU operations

At line 240, the code syncs problem.handle_ptr->sync_stream(), but the subsequent GPU operations on lines 241 and 244 use h->get_stream(), which may differ when handle_override is provided (as in solution_callbacks.cuh:102). This mixed-handle synchronization is incorrect; the initial sync should target the stream actually used for GPU work.

Suggested fix
-  problem.handle_ptr->sync_stream();
+  h->sync_stream();
   auto h_assignment = cuopt::host_copy(assignment, h->get_stream());
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
 const auto* h = handle_override ? handle_override : problem.handle_ptr;
 cuopt_assert(assignment.size() == papilo_reduced_to_original_map.size(),
              "Papilo uncrush assignment size mismatch");
-problem.handle_ptr->sync_stream();
+h->sync_stream();
 auto h_assignment = cuopt::host_copy(assignment, h->get_stream());
 std::vector<f_t> full_assignment;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/problem/presolve_data.cu` around lines 236 - 241, The
code synchronizes the wrong handle's stream: replace the call to
problem.handle_ptr->sync_stream() with a sync on the actual handle used for
subsequent GPU work (the local pointer h), e.g. call h->sync_stream() or
otherwise synchronize h->get_stream() before invoking cuopt::host_copy and other
GPU operations; update any related comments to reflect that synchronization
targets the chosen handle (h) to avoid mismatched streams when handle_override
is provided.

Comment on lines +491 to +496
int bb_device_id = context.handle_ptr->get_device();
branch_and_bound_status_future =
  std::async(std::launch::async, [&branch_and_bound, &branch_and_bound_solution, bb_device_id] {
    RAFT_CUDA_TRY(cudaSetDevice(bb_device_id));
    return branch_and_bound->solve(branch_and_bound_solution);
  });

⚠️ Potential issue | 🟡 Minor

Handle cudaSetDevice failures explicitly in the async B&B thread.

RAFT_CUDA_TRY throws on failure, and an exception thrown inside this async lambda only surfaces when the future is queried; checking the cudaError_t explicitly lets the thread log the failure and return an error status as soon as device initialization fails, instead of proceeding toward solve.

💡 Suggested fix
```diff
     branch_and_bound_status_future =
       std::async(std::launch::async, [&branch_and_bound, &branch_and_bound_solution, bb_device_id] {
-        RAFT_CUDA_TRY(cudaSetDevice(bb_device_id));
+        cudaError_t err = cudaSetDevice(bb_device_id);
+        if (err != cudaSuccess) {
+          CUOPT_LOG_ERROR("Failed to set CUDA device %d: %s", bb_device_id, cudaGetErrorString(err));
+          return dual_simplex::mip_status_t::NUMERICAL_ISSUES;
+        }
         return branch_and_bound->solve(branch_and_bound_solution);
       });
```

As per coding guidelines, "Every CUDA kernel launch and memory operation must have error checking with CUDA_CHECK or equivalent verification."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/solver.cu` around lines 491 - 496, The async lambda
that calls RAFT_CUDA_TRY(cudaSetDevice(bb_device_id)) must verify the result and
fail fast on error: replace the bare cudaSetDevice call with the project’s CUDA
error-checking helper (e.g., RAFT_CUDA_TRY or CUDA_CHECK) or explicitly check
cudaError_t and handle failures before calling branch_and_bound->solve; update
the lambda that captures bb_device_id and calls
branch_and_bound->solve(branch_and_bound_solution) so it logs/propagates the
cudaSetDevice error (using the same error handling pattern used elsewhere) and
returns/throws an error instead of proceeding to solve when device
initialization fails.

Comment on lines +267 to +279

```cpp
  auto pop = diversity_manager.get_population_pointer();
  for (const auto& sol : pop->population_to_vector()) {
    hashes.push_back(sol.get_hash());
  }

  uint32_t final_hash = detail::compute_hash(hashes);
  printf("%s: final hash: 0x%x, pop size %d\n",
         path.c_str(),
         final_hash,
         (int)pop->population_to_vector().size());
  return final_hash;
}
```

⚠️ Potential issue | 🟡 Minor

Remove unreachable dead code.

Lines 267-279 are unreachable because line 266 returns early. This appears to be leftover code from an earlier iteration.

💡 Suggested fix
```diff
   return detail::compute_hash(hashes);
-
-  auto pop = diversity_manager.get_population_pointer();
-  for (const auto& sol : pop->population_to_vector()) {
-    hashes.push_back(sol.get_hash());
-  }
-
-  uint32_t final_hash = detail::compute_hash(hashes);
-  printf("%s: final hash: 0x%x, pop size %d\n",
-         path.c_str(),
-         final_hash,
-         (int)pop->population_to_vector().size());
-  return final_hash;
 }
```
return detail::compute_hash(hashes);
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/tests/mip/diversity_test.cu` around lines 267 - 279, The block that
collects hashes and computes final_hash (using
diversity_manager.get_population_pointer(), pop->population_to_vector(),
detail::compute_hash(hashes), and the printf that references path and
final_hash) is dead/unreachable because an earlier return happens; remove this
unreachable code or move its logic before the earlier return so only reachable
code remains—specifically delete the loop that pushes sol.get_hash(), the
compute_hash call, the printf, and the trailing return final_hash; ensure no
references to final_hash or hashes remain after the earlier return.

@coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (6)
cpp/tests/mip/diversity_test.cu (4)

69-69: Empty function should be removed or documented.

setup_device_symbols is a no-op function. If it's a placeholder for future implementation, add a TODO comment; otherwise, remove it to avoid confusion.

💡 Suggested fix

Either remove the function and its call sites, or document the intent:

```diff
-static void setup_device_symbols(rmm::cuda_stream_view stream_view) { (void)stream_view; }
+// TODO: Implement device symbol initialization if needed for determinism tests
+// static void setup_device_symbols(rmm::cuda_stream_view stream_view) { (void)stream_view; }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/tests/mip/diversity_test.cu` at line 69, The function
setup_device_symbols(rmm::cuda_stream_view stream_view) is a no-op; either
remove this unused function (and clean up any call sites that only exist to
invoke it) or make its purpose explicit by adding a TODO/comment indicating it's
a deliberate placeholder for future device-symbol setup; update references to
setup_device_symbols in tests (e.g., any calls in diversity_test.cu) to reflect
the removal or to clarify that the call is intentionally a no-op.

71-73: Unused seed parameter in test runner functions.

The seed parameter in test_full_run_determinism (line 72) and test_initial_solution_determinism (line 121) has a default value but is never used within the function body. The global seed is set externally via cuopt::seed_generator::set_seed(seed) in the calling test. Either remove the parameter or use it to seed something locally.

💡 Suggested fix

If the seed is intended to be set locally within these functions:

```diff
 static uint32_t test_full_run_determinism(std::string path,
-                                          unsigned long seed = std::random_device{}(),
+                                          unsigned long seed,
                                           float work_limit   = 10.0f)
 {
+  cuopt::seed_generator::set_seed(seed);
   const raft::handle_t handle_{};
```

Or remove the parameter if external seeding is intentional.

Also applies to: 120-121

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/tests/mip/diversity_test.cu` around lines 71 - 73, The two test runner
functions test_full_run_determinism and test_initial_solution_determinism
declare an unused seed parameter; either remove the seed parameter from both
function signatures (and update any callers) if external seeding is intended, or
make the functions actually use the value by calling
cuopt::seed_generator::set_seed(seed) (or equivalent local seeding) at the start
of each function to apply the provided seed; pick one approach and apply it
consistently to both functions.

382-394: Test instantiation includes duplicate entries.

Lines 387-388 instantiate mip/neos5.mps and mip/gen-ip054.mps, which appear both in active entries and as commented-out entries (lines 385-386). The commented-out duplicates should be removed to avoid confusion.

💡 Suggested cleanup
```diff
 INSTANTIATE_TEST_SUITE_P(DiversityTest,
                          DiversityTestParams,
                          testing::Values(
-                           // std::make_tuple("mip/gen-ip054.mps", 5.0f),
-                           // std::make_tuple("mip/pk1.mps", 5.0f),
                            std::make_tuple("mip/neos5.mps", 5.0f),
                            std::make_tuple("mip/gen-ip054.mps", 5.0f),
                            std::make_tuple("mip/pk1.mps", 5.0f),
-                           //  std::make_tuple("uccase9.mps"),
-                           // std::make_tuple("mip/neos5.mps", 5.0f),
                            std::make_tuple("mip/50v-10.mps", 5.0f)
-                           // std::make_tuple("mip/rmatr200-p5.mps", 5.0f)
                            ));
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/tests/mip/diversity_test.cu` around lines 382 - 394, The
INSTANTIATE_TEST_SUITE_P block for DiversityTest (using DiversityTestParams)
contains commented-out entries that duplicate active tuples (e.g.,
"mip/neos5.mps" and "mip/gen-ip054.mps"); to clean up, remove the commented
duplicate lines so only one declaration remains per test case in the
INSTANTIATE_TEST_SUITE_P(DiversityTest, DiversityTestParams,
testing::Values(...)) invocation and leave the active std::make_tuple entries
(such as "mip/neos5.mps", "mip/gen-ip054.mps", "mip/pk1.mps", "mip/50v-10.mps")
intact.

283-314: Test structure is sound but consider removing debug logging in CI.

The test correctly validates recombiner determinism by comparing hashes across runs. However, the debug-level logging setup (lines 286-288) may produce excessive output in CI. Consider either:

  • Removing the logger configuration, or
  • Guarding it behind an environment variable check
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/tests/mip/diversity_test.cu` around lines 283 - 314, The test
recombiners_deterministic currently forces debug logging by calling
cuopt::default_logger().set_pattern, set_level and flush_on at the start of the
test; remove these three logger configuration calls or wrap them in a runtime
guard (e.g., check an env var like CUOPT_DEBUG_LOG) so CI runs don't emit
excessive debug output; locate the calls in the TEST_P(DiversityTestParams,
recombiners_deterministic) body and either delete the
cuopt::default_logger().set_pattern(...),
cuopt::default_logger().set_level(...), and
cuopt::default_logger().flush_on(...) lines or condition them on
std::getenv("CUOPT_DEBUG_LOG") before applying them.
cpp/tests/mip/determinism_test.cu (1)

103-136: Consider removing unused helper function.

count_callbacks_with_origin (lines 115-121) is defined but not used anywhere in this file. If it's intended for future use, consider adding a comment; otherwise, it could be removed to avoid dead code.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/tests/mip/determinism_test.cu` around lines 103 - 136, The helper
function count_callbacks_with_origin is currently unused in this file; either
remove its definition to eliminate dead code or, if you plan to use it later,
add a brief explanatory comment above count_callbacks_with_origin indicating its
intended future use (or mark it with a clear TODO), so reviewers know it’s
intentional; locate the function by name and delete it or add the comment
accordingly while leaving the other helpers (is_gpu_callback_origin,
count_gpu_callbacks, count_branch_and_bound_callbacks) unchanged.
cpp/src/mip_heuristics/relaxed_lp/relaxed_lp.cu (1)

82-84: Model produces unreliable estimates for constraint-heavy problems.

The negative coefficient for n_constraints (-400) causes the model to underestimate work for problems with many constraints relative to variables. For such problems, estim_ms clamps to 0, and the loop may overshoot the iteration estimate significantly.

Since this is marked as a temporary placeholder (per the TODO), consider adding a tracking issue or ensuring the predictor replacement is prioritized.

Do you want me to open a tracking issue for replacing this heuristic model?

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/relaxed_lp/relaxed_lp.cu` around lines 82 - 84, The
current heuristic for estim_ms uses a negative weight on
op_problem.n_constraints which causes underestimation and clamping to 0 for
constraint-heavy problems; update the formula in relaxed_lp.cu (the estim_ms
computation using op_problem.n_variables, op_problem.n_constraints,
op_problem.coefficients.size(), and estim_iters) to remove or make the
n_constraints coefficient non-negative (e.g., replace -400 with a small positive
or zero weight) and ensure a reasonable lower bound (e.g., min_ms > 0) instead
of hard clamping to 0; also add a TODO comment and open a tracking issue to
prioritize replacing this temporary predictor as referenced by the existing TODO
so the heuristic is not relied on long-term.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 6d186eaf-cff0-4117-9f13-2cb2e79896ba

📥 Commits

Reviewing files that changed from the base of the PR and between 40a72db and 079a84e.

📒 Files selected for processing (5)
  • cpp/src/branch_and_bound/pseudo_costs.cpp
  • cpp/src/mip_heuristics/local_search/rounding/constraint_prop.cu
  • cpp/src/mip_heuristics/relaxed_lp/relaxed_lp.cu
  • cpp/tests/mip/determinism_test.cu
  • cpp/tests/mip/diversity_test.cu
🚧 Files skipped from review as they are similar to previous changes (2)
  • cpp/src/branch_and_bound/pseudo_costs.cpp
  • cpp/src/mip_heuristics/local_search/rounding/constraint_prop.cu

Comment on lines +79 to +90
```cpp
    if (!std::isinf(work_limit)) {
      do {
        // TODO: use an actual predictor model here
        double estim_ms = 313 + 200 * op_problem.n_variables - 400 * op_problem.n_constraints +
                          600 * op_problem.coefficients.size() + 7100 * estim_iters;
        estim_ms = std::max(0.0, estim_ms);
        if (estim_ms > work_limit * 1000) { break; }
        estim_iters += 100;
      } while (true);
    } else {
      estim_iters = std::numeric_limits<int>::max();
    }
```

⚠️ Potential issue | 🟠 Major

Potential unbounded loop and signed integer overflow.

When work_limit is finite but very large (e.g., work_limit = 1e6 seconds), this loop may iterate until estim_iters overflows. Since estim_iters is a signed int, overflow is undefined behavior.

Consider adding bounds:

🛡️ Proposed fix to bound iterations
```diff
+    constexpr int max_estim_iters = std::numeric_limits<int>::max() - 100;
     if (!std::isinf(work_limit)) {
       do {
         // TODO: use an actual predictor model here
         double estim_ms = 313 + 200 * op_problem.n_variables - 400 * op_problem.n_constraints +
                           600 * op_problem.coefficients.size() + 7100 * estim_iters;
         estim_ms = std::max(0.0, estim_ms);
         if (estim_ms > work_limit * 1000) { break; }
+        if (estim_iters >= max_estim_iters) { break; }
         estim_iters += 100;
       } while (true);
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/relaxed_lp/relaxed_lp.cu` around lines 79 - 90, The
loop that increases estim_iters can overflow because estim_iters is an int and
may wrap if work_limit is huge; update the logic in the block that computes
estim_ms and increments estim_iters so it cannot run unbounded: either change
estim_iters to a wider integer type (e.g., int64_t) and/or add an explicit upper
bound check (compare against a safe MAX_ITERS constant or
std::numeric_limits<int>::max()/a guarded cap) and break when reached, and
ensure the estim_ms > work_limit*1000 check is preserved; locate and modify the
variables/loop referencing estim_iters, estim_ms, and work_limit inside the
do/while to add the guard and avoid signed overflow.


Labels

improvement Improves an existing functionality non-breaking Introduces a non-breaking change

Projects

None yet


7 participants