ARROW-9131: [C++] Faster ascii_lower and ascii_upper.
Following up on #7418, I tried and benchmarked a different implementation of:
* ascii_lower
* ascii_upper
Before (lower is similar):
```
--------------------------------------------------
Benchmark Time CPU Iterations
--------------------------------------------------
AsciiUpper_median 4922843 ns 4918961 ns 10 bytes_per_second=3.1457G/s items_per_second=213.17M/s
```
After:
```
--------------------------------------------------
Benchmark Time CPU Iterations
--------------------------------------------------
AsciiUpper_median 1391272 ns 1390014 ns 10 bytes_per_second=11.132G/s items_per_second=754.363M/s
```
This is roughly a 3.5x speedup by the medians above (4922843 ns vs. 1391272 ns), measured on an AMD machine.
Using http://quick-bench.com/JaDErmVCY23Z1tu6YZns_KBt0qU I found 4.6x speedup for clang 9, 6.4x for GCC 9.2.
Also, the test is expanded a bit to include a non-ASCII codepoint, to make explicit that it is fine to
upper- or lowercase a UTF-8 string this way. The non-overlapping encoding of UTF-8 makes this OK: bytes of
multi-byte sequences never fall in the ASCII range (see section 2.5 of the Unicode Standard Core
Specification v13.0).
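
To illustrate why byte-wise case conversion is safe on UTF-8 input, here is a minimal sketch. It is not the code from this patch: `AsciiToUpperSketch` and `AsciiUpperSketch` are hypothetical names, and the loop is a plain per-byte transform that only assumes the compiler can vectorize it.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not Arrow's API): map 'a'..'z' to 'A'..'Z' and
// return every other byte unchanged.
inline std::uint8_t AsciiToUpperSketch(std::uint8_t c) {
  return (c >= 'a' && c <= 'z') ? static_cast<std::uint8_t>(c - 32) : c;
}

// Hypothetical helper: apply the per-byte transform to a whole string.
// Every byte of a multi-byte UTF-8 sequence is >= 0x80, so it never
// matches the 'a'..'z' range and passes through untouched.
inline std::string AsciiUpperSketch(const std::string& in) {
  std::string out(in.size(), '\0');
  for (std::size_t i = 0; i < in.size(); ++i) {
    out[i] = static_cast<char>(
        AsciiToUpperSketch(static_cast<std::uint8_t>(in[i])));
  }
  return out;
}
```

For example, `AsciiUpperSketch("caf\xc3\xa9")` uppercases the three ASCII letters and leaves the two bytes encoding `é` as they are.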
Closes #7434 from maartenbreddels/ARROW-9131
Authored-by: Maarten A. Breddels <maartenbreddels@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>