Two reasons:
- BLAS offers this as axpy, so it can be optimized (note: axpy is for vectors, not strided matrices)
- It's a common case.
This operation is already _available efficiently_ using `.zip_mut_with`:
`a1.zip_mut_with(&a2, |x, &y| *x += k * y)`
The question is whether this is so common, and the `zip_mut_with` form so ugly, that it warrants its own method.
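
For comparison, here is a rough sketch of what the closure computes, written in plain Rust over contiguous slices rather than ndarray arrays. The name `axpy_slices` is hypothetical and only for illustration; it is not an ndarray or BLAS function, and it assumes both operands have the same length.

```rust
/// Hypothetical helper (not part of ndarray): computes a1[i] += k * a2[i]
/// over contiguous slices, the same per-element update as the closure
/// passed to zip_mut_with above.
fn axpy_slices(a1: &mut [f64], k: f64, a2: &[f64]) {
    // Assumption of this sketch: lengths must match exactly.
    assert_eq!(a1.len(), a2.len(), "operands must have the same length");
    for (x, &y) in a1.iter_mut().zip(a2) {
        *x += k * y;
    }
}
```

A dedicated method would essentially give this one-liner a name, and could dispatch to BLAS axpy when the data is laid out as plain vectors.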