Posted to dev@arrow.apache.org by "Yordan Pavlov (Jira)" <ji...@apache.org> on 2020/05/17 09:41:00 UTC

[jira] [Created] (ARROW-8831) [Rust] incomplete SIMD implementation in simd_compare_op

Yordan Pavlov created ARROW-8831:
------------------------------------

             Summary: [Rust] incomplete SIMD implementation in simd_compare_op
                 Key: ARROW-8831
                 URL: https://issues.apache.org/jira/browse/ARROW-8831
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Rust
    Affects Versions: 0.17.0
            Reporter: Yordan Pavlov


Currently the simd_compare_op function defined here [https://github.com/apache/arrow/blob/master/rust/arrow/src/compute/kernels/comparison.rs#L204] is only about 10% faster than the non-SIMD implementation and takes approximately the same time for types of different widths, which indicates that the SIMD implementation is incomplete. Below are benchmark results for the Int8 and Float32 types:

eq Int8 time: [947.53 us 947.81 us 948.05 us]
eq Int8 simd time: [855.02 us 858.26 us 862.48 us]
neq Int8 time: [904.09 us 907.34 us 911.44 us]
neq Int8 simd time: [848.49 us 849.28 us 850.28 us]
lt Int8 time: [900.87 us 902.65 us 904.86 us]
lt Int8 simd time: [850.32 us 850.96 us 851.90 us]
lt_eq Int8 time: [974.68 us 983.03 us 991.98 us]
lt_eq Int8 simd time: [851.83 us 852.22 us 852.74 us]
gt Int8 time: [908.48 us 911.76 us 914.72 us]
gt Int8 simd time: [851.93 us 852.43 us 853.04 us]
gt_eq Int8 time: [981.53 us 983.37 us 986.31 us]
gt_eq Int8 simd time: [855.59 us 856.83 us 858.61 us]

eq Float32 time: [911.46 us 911.70 us 912.01 us]
eq Float32 simd time: [884.74 us 885.97 us 887.74 us]
neq Float32 time: [904.26 us 904.73 us 905.27 us]
neq Float32 simd time: [884.40 us 892.32 us 901.98 us]
lt Float32 time: [907.90 us 908.54 us 909.34 us]
lt Float32 simd time: [883.23 us 886.05 us 889.31 us]
lt_eq Float32 time: [911.44 us 911.62 us 911.82 us]
lt_eq Float32 simd time: [882.78 us 886.78 us 891.05 us]
gt Float32 time: [906.88 us 907.96 us 909.32 us]
gt Float32 simd time: [879.78 us 883.03 us 886.63 us]
gt_eq Float32 time: [924.72 us 926.03 us 928.29 us]
gt_eq Float32 simd time: [884.80 us 885.93 us 887.35 us]
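
To put the timings above in relative terms, here is a minimal sketch that computes the SIMD speedup from the middle value of each reported interval. The helper name speedup_percent is hypothetical and is not part of the Arrow code base; only four of the benchmark pairs are included for brevity.

```rust
/// Relative speedup of the SIMD kernel over the scalar kernel, in percent.
/// (Hypothetical helper for illustration; not part of Arrow.)
fn speedup_percent(scalar_us: f64, simd_us: f64) -> f64 {
    (scalar_us / simd_us - 1.0) * 100.0
}

fn main() {
    // (name, scalar time, SIMD time) in microseconds, using the middle
    // value of each interval from the benchmark output above.
    let results = [
        ("eq Int8", 947.81, 858.26),
        ("lt_eq Int8", 983.03, 852.22),
        ("eq Float32", 911.70, 885.97),
        ("lt_eq Float32", 911.62, 886.78),
    ];

    for (name, scalar_us, simd_us) in results.iter() {
        println!(
            "{}: SIMD is {:.1}% faster",
            name,
            speedup_percent(*scalar_us, *simd_us)
        );
    }
}
```

For these pairs the Int8 speedups are larger than the Float32 ones, but both remain far below what the difference in element width would suggest.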

In the benchmark results above, notice that both the SIMD and non-SIMD operations take similar amounts of time for types of different sizes (Int8 and Float32). This is expected for a non-SIMD implementation, which processes one value at a time, but not for a SIMD implementation: a SIMD register has a fixed width and holds more values of a smaller type, so a single instruction can compare more Int8 values than Float32 values.
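
As a rough illustration of why element width should matter, the sketch below counts how many values fit in one SIMD register and how many full-register compare operations an array would need. The 256-bit register width and the array length are assumptions chosen for the example, and the function names are hypothetical; this is not the code in comparison.rs.

```rust
use std::mem::size_of;

/// Number of values of type `T` that fit in one SIMD register
/// of `register_bits` bits.
fn lanes_per_register<T>(register_bits: usize) -> usize {
    register_bits / (8 * size_of::<T>())
}

/// Number of register-wide compare operations needed to process
/// `len` values of type `T` (rounding up for a partial last chunk).
fn vector_ops_needed<T>(len: usize, register_bits: usize) -> usize {
    let lanes = lanes_per_register::<T>(register_bits);
    (len + lanes - 1) / lanes
}

fn main() {
    // Assumption: a 256-bit register (for example AVX2 on x86_64).
    const REGISTER_BITS: usize = 256;
    // Assumption: an arbitrary array length for the example.
    const LEN: usize = 65_536;

    println!(
        "Int8:    {} lanes, {} compare ops",
        lanes_per_register::<i8>(REGISTER_BITS),
        vector_ops_needed::<i8>(LEN, REGISTER_BITS)
    );
    println!(
        "Float32: {} lanes, {} compare ops",
        lanes_per_register::<f32>(REGISTER_BITS),
        vector_ops_needed::<f32>(LEN, REGISTER_BITS)
    );
}
```

Under these assumptions an Int8 comparison needs a quarter of the compare operations that a Float32 comparison needs. Actual run time also depends on memory bandwidth and on how the results are packed into the output bitmap, so the ratio of timings will not necessarily be exactly 4x.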

 

This pull request attempts to fix that: [https://github.com/apache/arrow/pull/7204]


