commit | 981cab75d3bd75fbc770cae50b49cedc53ee0ee7 | |
---|---|---|
author | Heres, Daniel <danielheres@gmail.com> | Sun Nov 29 08:20:35 2020 +0200 |
committer | Neville Dipale <nevilledips@gmail.com> | Sun Nov 29 08:20:35 2020 +0200 |
tree | bf6c904834c457a951c68b10f1a44d6ddd2dfb20 | |
parent | a2d17081f8e4e48efe0ae5fac714de99088fd55e | |

ARROW-10763: [Rust] Speed up take for primitive / boolean for non-null arrays

This PR significantly speeds up the take (primitive and boolean) kernels for non-null arrays (even more if indices contain nulls)

```
take i32 512            time:   [1.1847 us 1.1879 us 1.1915 us]
                        change: [-47.038% -46.813% -46.609%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

Benchmarking take i32 1024: Collecting 100 samples in estimated 5.0083 s (2.2M i
take i32 1024           time:   [2.2183 us 2.2255 us 2.2330 us]
                        change: [-48.699% -47.683% -46.797%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) low severe
  3 (3.00%) low mild
  2 (2.00%) high mild

Benchmarking take i32 nulls 512: Collecting 100 samples in estimated 5.0016 s (3
take i32 nulls 512      time:   [1.2828 us 1.2882 us 1.2941 us]
                        change: [-44.592% -44.377% -44.178%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 11 outliers among 100 measurements (11.00%)
  1 (1.00%) low severe
  6 (6.00%) high mild
  4 (4.00%) high severe

Benchmarking take i32 nulls 1024: Collecting 100 samples in estimated 5.0112 s (
take i32 nulls 1024     time:   [2.3798 us 2.3846 us 2.3894 us]
                        change: [-41.139% -40.735% -40.358%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) low mild
  1 (1.00%) high mild

Benchmarking take bool 512: Collecting 100 samples in estimated 5.0061 s (3.6M i
take bool 512           time:   [1.3864 us 1.3937 us 1.4009 us]
                        change: [-38.319% -38.028% -37.734%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild

Benchmarking take bool 1024: Collecting 100 samples in estimated 5.0006 s (2.0M
take bool 1024          time:   [2.4654 us 2.4722 us 2.4790 us]
                        change: [-36.041% -35.820% -35.621%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

Benchmarking take bool nulls 512: Collecting 100 samples in estimated 5.0002 s (
take bool nulls 512     time:   [1.1865 us 1.1901 us 1.1939 us]
                        change: [-66.326% -65.988% -65.656%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  2 (2.00%) low mild
  3 (3.00%) high mild
  2 (2.00%) high severe

Benchmarking take bool nulls 1024: Collecting 100 samples in estimated 5.0098 s
take bool nulls 1024    time:   [2.0748 us 2.0814 us 2.0889 us]
                        change: [-73.180% -73.053% -72.925%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe
```

Closes #8795 from Dandandan/opt_take

Authored-by: Heres, Daniel <danielheres@gmail.com>
Signed-off-by: Neville Dipale <nevilledips@gmail.com>

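
The commit message describes a faster `take` for arrays without nulls, but this page does not show the diff. The sketch below illustrates one way such a fast path could be structured, and it is not the code from this commit. It assumes a hypothetical `Int32Column` type (a value buffer plus an optional validity mask) rather than the Arrow crate's array types, and a hypothetical `take` function that decides once whether the values contain nulls, so that the per-element validity check is skipped when they do not.

```rust
/// Hypothetical, simplified model of a primitive array (not the Arrow API):
/// a value buffer plus an optional validity mask. `None` means "no nulls".
#[derive(Debug, PartialEq)]
struct Int32Column {
    values: Vec<i32>,
    validity: Option<Vec<bool>>,
}

/// Gathers `column[i]` for every index; a `None` index yields a null slot.
/// Panics if an index is out of bounds.
fn take(column: &Int32Column, indices: &[Option<usize>]) -> Int32Column {
    match &column.validity {
        // Fast path: the values hold no nulls, so no per-element validity
        // lookup is needed. Only null indices can produce null outputs.
        None => {
            let values: Vec<i32> = indices
                .iter()
                .map(|&idx| idx.map_or(0, |i| column.values[i]))
                .collect();
            let validity = if indices.iter().any(|idx| idx.is_none()) {
                Some(indices.iter().map(|idx| idx.is_some()).collect())
            } else {
                // Neither input has nulls: the output needs no mask at all.
                None
            };
            Int32Column { values, validity }
        }
        // General path: check the validity of every selected value.
        Some(mask) => {
            let mut values = Vec::with_capacity(indices.len());
            let mut validity = Vec::with_capacity(indices.len());
            for &idx in indices {
                match idx {
                    Some(i) if mask[i] => {
                        values.push(column.values[i]);
                        validity.push(true);
                    }
                    // Null index or null value: emit a placeholder and mark it null.
                    _ => {
                        values.push(0);
                        validity.push(false);
                    }
                }
            }
            Int32Column { values, validity: Some(validity) }
        }
    }
}
```

In this model, a call such as `take(&col, &[Some(2), None, Some(0)])` on a column without a validity mask never reads a mask for the values and builds the output mask from the indices alone. Arrow itself stores validity as a packed bitmap rather than a `Vec<bool>`, so the real kernels differ in detail.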
Apache Arrow is a development platform for in-memory analytics. It contains a set of technologies that enable big data systems to process and move data fast.
Major components of the project include:
Arrow is an Apache Software Foundation project. Learn more at arrow.apache.org.
The reference Arrow libraries contain many distinct software components:
The official Arrow libraries in this repository are in different stages of implementing the Arrow format and related features. See our current feature matrix on git master.
Please read our latest project contribution guide.
Even if you do not plan to contribute to Apache Arrow itself or Arrow integrations in other projects, we'd be happy to have you involved: