The tuple-stress benchmark appears to be ridiculously slow with NLL. Profiling suggests that the majority of costs come from the liveness constraint generation code:

    // rust/src/librustc_mir/borrow_check/nll/type_check/liveness.rs, lines 36 to 42 in 860d169
    pub(super) fn generate<'gcx, 'tcx>(
        cx: &mut TypeChecker<'_, 'gcx, 'tcx>,
        mir: &Mir<'tcx>,
        liveness: &LivenessResults,
        flow_inits: &mut FlowAtLocation<MaybeInitializedPlaces<'_, 'gcx, 'tcx>>,
        move_data: &MoveData<'tcx>,
    ) {

Specifically, roughly half of the samples (50%) occur in the `push_type_live_constraint` function:

    // rust/src/librustc_mir/borrow_check/nll/type_check/liveness.rs, lines 158 to 163 in 860d169
    fn push_type_live_constraint<T>(
        cx: &mut TypeChecker<'_, 'gcx, 'tcx>,
        value: T,
        location: Location,
    ) where
        T: TypeFoldable<'tcx>,

This function primarily consists of a walk over all the free regions within a type:

    // rust/src/librustc_mir/borrow_check/nll/type_check/liveness.rs, lines 170 to 172 in 860d169
    cx.tcx().for_each_free_region(&value, |live_region| {
        cx.constraints.liveness_set.push((live_region, location));
    });

However, the types in question don't really involve regions (they are things like `(u32, f64, u32)`, etc.). It turns out that we have a "flags" mechanism that tracks the content of types, designed for just such a purpose. This should allow us to quickly skip over types that contain no regions. The flags are defined here, using the `bitflags!` macro:

    // rust/src/librustc/ty/mod.rs, lines 418 to 419 in 860d169
    bitflags! {
        pub struct TypeFlags: u32 {

The flag we are interested in is `HAS_FREE_REGIONS`:

    // rust/src/librustc/ty/mod.rs, lines 432 to 434 in 860d169
    /// Does this have any region that "appears free" in the type?
    /// Basically anything but `ReLateBound` and `ReErased`.
    const HAS_FREE_REGIONS = 1 << 6;

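To make the idea of a cached flag concrete, here is a minimal, self-contained sketch. It is a toy model, not rustc's implementation: `Kind`, `Ty`, `Ty::new` and `has_free_regions` are hypothetical stand-ins, and the flag is a plain `u32` constant rather than a `bitflags!` type. The point it illustrates is that the flag is computed once, when a type is built, by OR-ing together the flags of its components, so asking "does this type mention any free region?" later costs a single bit test.

```rust
// Toy model only: the names echo the issue's vocabulary but are NOT rustc's real definitions.
const HAS_FREE_REGIONS: u32 = 1 << 6;

#[derive(Debug)]
enum Kind {
    U32,
    F64,
    // A reference carrying a named (free) region, e.g. &'a T.
    Ref(&'static str, Box<Ty>),
    Tuple(Vec<Ty>),
}

#[derive(Debug)]
struct Ty {
    kind: Kind,
    // Cached summary of everything reachable from this type.
    flags: u32,
}

impl Ty {
    // Compute the flags bottom-up at construction time.
    fn new(kind: Kind) -> Ty {
        let flags = match &kind {
            Kind::U32 | Kind::F64 => 0,
            Kind::Ref(_, inner) => HAS_FREE_REGIONS | inner.flags,
            Kind::Tuple(elems) => elems.iter().fold(0u32, |acc, t| acc | t.flags),
        };
        Ty { kind, flags }
    }

    // Constant-time check; no traversal of `kind` is needed.
    fn has_free_regions(&self) -> bool {
        self.flags & HAS_FREE_REGIONS != 0
    }
}
```

In this model, a tuple like `(u32, f64, u32)` ends up with `flags == 0`, while any tuple containing a `Ref` somewhere inside it has the `HAS_FREE_REGIONS` bit set.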
We should be able to optimize `for_each_free_region` to consult this flag and quickly skip past types that do not contain any regions. `for_each_free_region` is defined here:

    // rust/src/librustc/ty/fold.rs, lines 256 to 260 in 860d169
    pub fn for_each_free_region<T,F>(self,
                                     value: &T,
                                     callback: F)
        where F: FnMut(ty::Region<'tcx>),
              T: TypeFoldable<'tcx>,

It uses a "type visitor" to do its work:

    // rust/src/librustc/ty/fold.rs, lines 289 to 290 in 860d169
    impl<'tcx, F> TypeVisitor<'tcx> for RegionVisitor<F>
        where F : FnMut(ty::Region<'tcx>)

We want to add a callback for the case of visiting types, which will check this flag. Something like the following ought to do it:

    fn visit_ty(&mut self, ty: Ty<'tcx>) -> bool {
        // Only descend into types that mention a free region somewhere.
        if ty.flags.intersects(TypeFlags::HAS_FREE_REGIONS) {
            ty.super_visit_with(self)
        } else {
            false // keep visiting
        }
    }
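
The effect of that check can be illustrated outside the compiler. The sketch below is a toy model under stated assumptions: `Kind`, `Ty`, the constructor helpers, the `skip_by_flag` switch and `count_visits` are all hypothetical, and this `RegionVisitor` only borrows the name from the issue. It counts how many type nodes the visitor actually walks with and without the flag check, using the same "return `false` to keep visiting" convention.

```rust
// Toy stand-ins for rustc's types; every name here is hypothetical.
const HAS_FREE_REGIONS: u32 = 1 << 6;

enum Kind {
    Scalar,
    Ref(&'static str, Box<Ty>),
    Tuple(Vec<Ty>),
}

struct Ty {
    kind: Kind,
    flags: u32, // cached at construction time
}

fn scalar() -> Ty {
    Ty { kind: Kind::Scalar, flags: 0 }
}

fn reference(region: &'static str, inner: Ty) -> Ty {
    let flags = HAS_FREE_REGIONS | inner.flags;
    Ty { kind: Kind::Ref(region, Box::new(inner)), flags }
}

fn tuple(elems: Vec<Ty>) -> Ty {
    let flags = elems.iter().fold(0u32, |acc, t| acc | t.flags);
    Ty { kind: Kind::Tuple(elems), flags }
}

struct RegionVisitor<F: FnMut(&'static str)> {
    callback: F,
    visited: usize,     // number of type nodes actually walked
    skip_by_flag: bool, // toggles the optimization, for comparison only
}

impl<F: FnMut(&'static str)> RegionVisitor<F> {
    // Returns true to stop early, false to keep visiting.
    fn visit_ty(&mut self, ty: &Ty) -> bool {
        if self.skip_by_flag && ty.flags & HAS_FREE_REGIONS == 0 {
            return false; // nothing of interest below this node
        }
        self.visited += 1;
        match &ty.kind {
            Kind::Scalar => false,
            Kind::Ref(region, inner) => {
                (self.callback)(*region);
                self.visit_ty(inner)
            }
            Kind::Tuple(elems) => elems.iter().any(|t| self.visit_ty(t)),
        }
    }
}

// Walk `ty` and report (nodes walked, regions passed to the callback).
fn count_visits(ty: &Ty, skip_by_flag: bool) -> (usize, Vec<&'static str>) {
    let mut regions = Vec::new();
    let visited = {
        let mut v = RegionVisitor {
            callback: |r: &'static str| regions.push(r),
            visited: 0,
            skip_by_flag,
        };
        v.visit_ty(ty);
        v.visited
    };
    (visited, regions)
}
```

For a region-free tuple of three scalars, the unoptimized walk touches every node, whereas the flagged walk returns at the root without descending. When a region is present, both walks report the same regions to the callback; the flagged one just skips the region-free siblings.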