Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/12/14 17:34:21 UTC

[GitHub] [arrow-rs] tustvold commented on pull request #1041: Generify ColumnReaderImpl and RecordReader (#1040)

tustvold commented on pull request #1041:
URL: https://github.com/apache/arrow-rs/pull/1041#issuecomment-993816497


   Running benchmarks on my local machine, I get somewhat erratic results, from which I conclude that this has no major impact on performance:
   
   ```
   arrow_array_reader/read Int32Array, plain encoded, mandatory, no NULLs - old                                                                             
                           time:   [3.7939 us 3.8031 us 3.8114 us]
                           change: [-3.6579% -3.4154% -3.1951%] (p = 0.00 < 0.05)
                           Performance has improved.
   arrow_array_reader/read Int32Array, plain encoded, mandatory, no NULLs - new                                                                             
                           time:   [2.3030 us 2.3048 us 2.3073 us]
                           change: [+2.5908% +2.7441% +2.9142%] (p = 0.00 < 0.05)
                           Performance has regressed.
   arrow_array_reader/read Int32Array, plain encoded, optional, no NULLs - old                                                                            
                           time:   [59.193 us 59.275 us 59.363 us]
                           change: [-4.2623% -4.1285% -4.0009%] (p = 0.00 < 0.05)
                           Performance has improved.
   arrow_array_reader/read Int32Array, plain encoded, optional, no NULLs - new                                                                             
                           time:   [23.209 us 23.221 us 23.236 us]
                           change: [+32.531% +32.663% +32.835%] (p = 0.00 < 0.05)
                           Performance has regressed.
   arrow_array_reader/read Int32Array, plain encoded, optional, half NULLs - old                                                                            
                           time:   [142.37 us 142.41 us 142.44 us]
                           change: [+5.5942% +6.6789% +7.7376%] (p = 0.00 < 0.05)
                           Performance has regressed.
   arrow_array_reader/read Int32Array, plain encoded, optional, half NULLs - new                                                                            
                           time:   [139.07 us 139.89 us 140.59 us]
                           change: [+0.4422% +0.9960% +1.6028%] (p = 0.00 < 0.05)
                           Change within noise threshold.
   arrow_array_reader/read Int32Array, dictionary encoded, mandatory, no NULLs - old                                                                             
                           time:   [21.919 us 21.923 us 21.927 us]
                           change: [+1.3392% +1.7681% +2.0113%] (p = 0.00 < 0.05)
                           Performance has regressed.
   arrow_array_reader/read Int32Array, dictionary encoded, mandatory, no NULLs - new                                                                            
                           time:   [99.347 us 101.00 us 102.37 us]
                           change: [+5.5715% +6.7636% +8.2107%] (p = 0.00 < 0.05)
                           Performance has regressed.
   arrow_array_reader/read Int32Array, dictionary encoded, optional, no NULLs - old                                                                            
                           time:   [75.648 us 75.663 us 75.681 us]
                           change: [-1.5816% -1.5384% -1.4963%] (p = 0.00 < 0.05)
                           Performance has improved.
   arrow_array_reader/read Int32Array, dictionary encoded, optional, no NULLs - new                                                                            
                           time:   [112.52 us 113.33 us 114.36 us]
                           change: [+5.2751% +7.2166% +9.0108%] (p = 0.00 < 0.05)
                           Performance has regressed.
   arrow_array_reader/read Int32Array, dictionary encoded, optional, half NULLs - old                                                                            
                           time:   [144.77 us 144.80 us 144.83 us]
                           change: [-11.013% -10.318% -9.6258%] (p = 0.00 < 0.05)
                           Performance has improved.
   arrow_array_reader/read Int32Array, dictionary encoded, optional, half NULLs - new                                                                            
                           time:   [191.06 us 191.12 us 191.18 us]
                           change: [+3.4773% +3.5370% +3.5957%] (p = 0.00 < 0.05)
                           Performance has regressed.
   arrow_array_reader/read StringArray, plain encoded, mandatory, no NULLs - old                                                                            
                           time:   [800.06 us 800.19 us 800.32 us]
                           change: [-1.6826% -1.6388% -1.5967%] (p = 0.00 < 0.05)
                           Performance has improved.
   arrow_array_reader/read StringArray, plain encoded, mandatory, no NULLs - new                                                                            
                           time:   [124.84 us 124.86 us 124.88 us]
                           change: [+4.1077% +4.1575% +4.2088%] (p = 0.00 < 0.05)
                           Performance has regressed.
   arrow_array_reader/read StringArray, plain encoded, optional, no NULLs - old                                                                            
                           time:   [846.35 us 846.59 us 846.87 us]
                           change: [+0.8637% +0.9228% +0.9834%] (p = 0.00 < 0.05)
                           Change within noise threshold.
   arrow_array_reader/read StringArray, plain encoded, optional, no NULLs - new                                                                            
                           time:   [143.25 us 143.30 us 143.35 us]
                           change: [+2.6977% +2.7794% +2.8847%] (p = 0.00 < 0.05)
                           Performance has regressed.
   arrow_array_reader/read StringArray, plain encoded, optional, half NULLs - old                                                                            
                           time:   [773.74 us 776.61 us 779.87 us]
                           change: [+3.2218% +3.4681% +3.7063%] (p = 0.00 < 0.05)
                           Performance has regressed.
   arrow_array_reader/read StringArray, plain encoded, optional, half NULLs - new                                                                            
                           time:   [264.22 us 264.80 us 265.57 us]
                           change: [-1.3401% -1.1712% -0.9903%] (p = 0.00 < 0.05)
                           Change within noise threshold.
   arrow_array_reader/read StringArray, dictionary encoded, mandatory, no NULLs - old                                                                            
                           time:   [726.17 us 726.74 us 727.44 us]
                           change: [+1.2812% +1.3725% +1.4618%] (p = 0.00 < 0.05)
                           Performance has regressed.
   arrow_array_reader/read StringArray, dictionary encoded, mandatory, no NULLs - new                                                                            
                           time:   [116.83 us 116.91 us 116.99 us]
                           change: [-3.2217% -3.0893% -2.9282%] (p = 0.00 < 0.05)
                           Performance has improved.
   arrow_array_reader/read StringArray, dictionary encoded, optional, no NULLs - old                                                                            
                           time:   [802.16 us 803.89 us 805.57 us]
                           change: [-0.4055% -0.2549% -0.1073%] (p = 0.00 < 0.05)
                           Change within noise threshold.
   arrow_array_reader/read StringArray, dictionary encoded, optional, no NULLs - new                                                                            
                           time:   [134.39 us 134.43 us 134.48 us]
                           change: [+0.0304% +0.2086% +0.3678%] (p = 0.02 < 0.05)
                           Change within noise threshold.
   arrow_array_reader/read StringArray, dictionary encoded, optional, half NULLs - old                                                                            
                           time:   [742.00 us 742.57 us 743.00 us]
                           change: [+3.4464% +3.6453% +3.8440%] (p = 0.00 < 0.05)
                           Performance has regressed.
   arrow_array_reader/read StringArray, dictionary encoded, optional, half NULLs - new                                                                            
                           time:   [236.67 us 237.14 us 238.07 us]
                           change: [+1.7094% +1.9629% +2.5264%] (p = 0.00 < 0.05)
                           Performance has regressed.
   ```
   
   What is strange to me is that this seems to have a consistent ~5% impact on the new `StringArrayReader`, despite this change touching none of the code used by it. I suspect we're in the weeds of the whims of LLVM, which I'm not sure it makes sense to optimise for at this stage; there's a lot of lower-hanging fruit.
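
For anyone reading the labels in the output above, here is a rough sketch of how a change interval maps to "improved", "regressed", or "within noise threshold". This is not Criterion's actual code: the `Change` enum and `classify` function are hypothetical names, and the 0.05 significance level and 1% noise threshold are assumed to be the defaults in use for this run. Bounds are passed as fractions, so 1% is 0.01.

```rust
/// Outcome labels, mirroring the wording in the benchmark output.
#[derive(Debug, PartialEq)]
enum Change {
    Improved,
    Regressed,
    WithinNoise,
    NotDetected,
}

/// Classify a relative change from the lower and upper bounds of its
/// confidence interval (hypothetical helper, assumed thresholds).
fn classify(lower: f64, upper: f64, p_value: f64, significance: f64, noise: f64) -> Change {
    if p_value >= significance {
        // Not statistically significant: no change reported at all.
        Change::NotDetected
    } else if upper < -noise {
        // Whole interval lies below the negative noise threshold.
        Change::Improved
    } else if lower > noise {
        // Whole interval lies above the positive noise threshold.
        Change::Regressed
    } else {
        // Significant, but the interval overlaps the noise band.
        Change::WithinNoise
    }
}
```

Under those assumptions, `classify(0.004422, 0.016028, 0.0, 0.05, 0.01)` yields `Change::WithinNoise`, which matches the label on the "Int32Array, plain encoded, optional, half NULLs - new" entry: the interval is statistically significant but straddles the 1% band.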

