Delay new node partitioning #1660
In some cases we know we won't need more than one iteration of sync, so we can cut our communication in half by skipping the second "check if everything's fine" iteration.
This lets us identify new nodes when sync'ing up a distributed mesh later. *That* lets us truly respect skip_partitioning requests, because we can assign processor ids to new nodes without risking inadvertently reassigning the processor id of an existing node.
Still passing Rattlesnake? In that case I'll merge once the rest of the CI checkboxes are happy.
Hmm... or maybe use this excuse to add some more expensive optional MOOSE tests, since GRINS-dbg might take a while.
There are DistributedMesh recover exodiff failures at 3 processors with variables/fe_hier.test_hier_2_1d, at 10 with mesh/named_entities.test_periodic_names, at 12 with restart/restart.test_nodal_var_2, and at 16 with executioners/executioner.test_steady, misc/exception.parallel_exception_jacobian_transient_non_zero_rank, and misc/exception.parallel_exception_residual_transient_non_zero_rank... and I can't replicate a single one of those failures. We still have those tests marked "failed but allowed" (for good reason; there was one long-standing failure there that I never managed to replicate), and this PR may not be what broke them. (When's the last time we did a distributed recover pass on Civet? The https://civet.inl.gov/recipe_events/23169/ log doesn't show any of the previous runs.) So I'm going to merge anyway, but I'm despairing of figuring out how to bisect test failures I can't reproduce.
The second half of the work in #1659; this delays partitioning nodes newly created by refinement, allowing us to partition them with any heuristic but without accidentally repartitioning old nodes if the user has disallowed that.
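To make the idea concrete, here is a minimal Python sketch of "assign owners only to new nodes". It is not the code in this PR: the sentinel `INVALID_PROCESSOR_ID`, the function `assign_new_node_owners`, the dictionary-based mesh layout, and the "lowest rank among touching elements" heuristic are all assumptions chosen for illustration. The point it shows is that a node still carrying the sentinel can be recognized as new and given an owner, while any node that already has an owner is left alone.

```python
# Hypothetical sentinel marking a node that has not been partitioned yet.
INVALID_PROCESSOR_ID = -1


def assign_new_node_owners(node_owner, elem_owner, elem_nodes):
    """Return a copy of node_owner with only the unassigned nodes filled in.

    node_owner: dict node id -> processor id (or INVALID_PROCESSOR_ID)
    elem_owner: dict element id -> processor id
    elem_nodes: dict element id -> list of node ids on that element

    Illustrative heuristic (an assumption, not necessarily the library's):
    a new node goes to the lowest rank among the elements touching it.
    """
    candidates = {}
    for elem, nodes in elem_nodes.items():
        rank = elem_owner[elem]
        for node in nodes:
            # Existing nodes keep their owner, so they are never candidates.
            if node_owner[node] != INVALID_PROCESSOR_ID:
                continue
            candidates[node] = min(rank, candidates.get(node, rank))

    result = dict(node_owner)
    result.update(candidates)
    return result


# Nodes 0 and 1 already belong to rank 1; node 2 was just created by refinement.
owners = assign_new_node_owners(
    node_owner={0: 1, 1: 1, 2: INVALID_PROCESSOR_ID},
    elem_owner={10: 0, 11: 1},
    elem_nodes={10: [0, 2], 11: [1, 2]},
)
```

In this example node 0 touches an element owned by rank 0 but stays on rank 1, which is the behavior a "skip partitioning" request needs; only node 2 receives a new owner.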