Add checked, wrapping and saturating arithmetic for add/sub/neg#190
Add checked, wrapping and saturating arithmetic for add/sub/neg#190kimikage merged 1 commit intoJuliaMath:masterfrom
Conversation
|
Although I didn't intend to work on optimization in this PR, I added the specialized methods for unsigned saturation operation. It may be better for Julia to support the LLVM's saturation arithmetic intrinsics. Perhaps that should have been discussed, but I couldn't find the latest info. Anyway, julia> x_n0f8 = collect(rand(N0f8, 1000, 1000)); y_n0f8 = collect(rand(N0f8, 1000, 1000));
julia> x_q0f7 = collect(rand(Q0f7, 1000, 1000)); y_q0f7 = collect(rand(Q0f7, 1000, 1000));
julia> @btime wrapping_add.($x_n0f8, $y_n0f8); # Julia v1.4.2, v1.5.0
87.900 μs (2 allocations: 976.70 KiB)
julia> @btime saturating_add.($x_n0f8, $y_n0f8); # Julia v1.4.2
424.299 μs (2 allocations: 976.70 KiB)
julia> @btime saturating_add.($x_n0f8, $y_n0f8); # Julia v1.5.0 (using AVX2 vpaddusb instruction)
98.499 μs (2 allocations: 976.70 KiB)julia> @btime saturating_add.($x_q0f7, $y_q0f7); # Julia v1.4.2 using add_with_overflow
2.523 ms (2 allocations: 976.70 KiB)
julia> @btime saturating_add.($x_q0f7, $y_q0f7); # Julia v1.5.0 using add_with_overflow
679.901 μs (2 allocations: 976.70 KiB)Edit:
They are still slow, but it's better than not doing anything. |
Codecov Report
@@ Coverage Diff @@
## master #190 +/- ##
==========================================
+ Coverage 87.89% 88.24% +0.34%
==========================================
Files 6 6
Lines 471 485 +14
==========================================
+ Hits 414 428 +14
Misses 57 57
Continue to review full report at Codecov.
|
32f7463 to
7248e33
Compare
timholy
left a comment
There was a problem hiding this comment.
Very nice bit twiddling!
Do we need to check any intermediate values? All the tests use extreme values (typemin/typemax/zero/eps).
|
I foresee that the 100% test of 16-bit types for division family will take time. (Of course, the division operation itself should not be slow.) However, I feel like we can test the 8-bit types. |
|
That's kind of what I was thinking too. |
|
I added the 8-bit |
The default arithmetic is still the wrapping arithmetic.
|
There are still opportunities to review the test code. I plan to improve the multiplication implementation in two or three PRs after this. |
|
Checked arithmetic may throw an exception, so a slowdown is unavoidable. Instead of julia> x = rand(N0f8, 1000, 1000);
julia> y = typemax(N0f8) .- x;
julia> function checked_add_arrays(x, y)
s = saturating_add.(x, y)
all(wrapping_add.(x, y).==s) || throw(OverflowError("overflowed somewhere"))
s
end
checked_add_arrays (generic function with 1 method)
julia> @btime wrapping_add.($x, $y);
86.299 μs (2 allocations: 976.70 KiB)
julia> @btime checked_add.($x, $y);
898.900 μs (2 allocations: 976.70 KiB) # Windows
678.800 μs (2 allocations: 976.70 KiB) # Linux
julia> @btime checked_add_arrays($x, $y);
313.200 μs (6 allocations: 1.08 MiB)Edit: function checked_add(x::C, y::C) where {C <: AbstractRGB{<:FixedPoint}}
s = mapc(saturating_add, x, y)
s === mapc(wrapping_add, x, y) || throw(OverflowError(""))
s
end |
Closes #152.
The default arithmetic is still the wrapping arithmetic.
We may want to customize the overflow message.