Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/06/22 23:36:17 UTC

[GitHub] [arrow] wesm commented on pull request #7521: ARROW-9210: [C++] Use BitBlockCounter in array/visitor_inline.h

wesm commented on pull request #7521:
URL: https://github.com/apache/arrow/pull/7521#issuecomment-647821487


   Here's a benchmark run with gcc-8 
   
   ```
   ---------------------------------------------------------------
   Benchmark                        Time           CPU Iterations
   ---------------------------------------------------------------
   BuildDictionary            3219443 ns    3219440 ns        218   1.21215GB/s
   BuildStringDictionary      3692881 ns    3692881 ns        192   81.7532MB/s
   UniqueInt64/0             14413456 ns   14413251 ns         48 null_percent=0   2.16814GB/s
   UniqueInt64/1             15516052 ns   15515737 ns         45 null_percent=0.1   2.01408GB/s
   UniqueInt64/2             17031282 ns   17031266 ns         41 null_percent=1   1.83486GB/s
   UniqueInt64/3             20680114 ns   20680064 ns         34 null_percent=10   1.51112GB/s
   UniqueInt64/4             12018069 ns   12017844 ns         57 null_percent=99    2.6003GB/s
   UniqueInt64/5              9179953 ns    9179946 ns         77 null_percent=100   3.40416GB/s
   UniqueInt64/6             15501523 ns   15501496 ns         45 null_percent=0   2.01593GB/s
   UniqueInt64/7             16482935 ns   16482300 ns         41 null_percent=0.1   1.89597GB/s
   UniqueInt64/8             18349988 ns   18349317 ns         38 null_percent=1   1.70306GB/s
   UniqueInt64/9             21439268 ns   21439244 ns         32 null_percent=10   1.45761GB/s
   UniqueInt64/10            12530067 ns   12529871 ns         55 null_percent=99   2.49404GB/s
   UniqueInt64/11             9167314 ns    9167365 ns         75 null_percent=100   3.40883GB/s
   UniqueString10bytes/0     43535899 ns   43535846 ns         16 null_percent=0   918.783MB/s
   UniqueString10bytes/1     45130595 ns   45129634 ns         16 null_percent=0.1   886.336MB/s
   UniqueString10bytes/2     45249034 ns   45247983 ns         15 null_percent=1   884.017MB/s
   UniqueString10bytes/3     45101533 ns   45100209 ns         16 null_percent=10   886.914MB/s
   UniqueString10bytes/4      4316048 ns    4316019 ns        163 null_percent=99   9.05059GB/s
   UniqueString10bytes/5      1435781 ns    1435763 ns        485 null_percent=100   27.2068GB/s
   UniqueString10bytes/6     59100344 ns   59098817 ns         12 null_percent=0   676.832MB/s
   UniqueString10bytes/7     59797544 ns   59795857 ns         12 null_percent=0.1   668.943MB/s
   UniqueString10bytes/8     61024697 ns   61023090 ns         11 null_percent=1    655.49MB/s
   UniqueString10bytes/9     59817211 ns   59816339 ns         12 null_percent=10   668.714MB/s
   UniqueString10bytes/10     4950387 ns    4950242 ns        134 null_percent=99   7.89103GB/s
   UniqueString10bytes/11     1443482 ns    1443434 ns        446 null_percent=100   27.0622GB/s
   UniqueString100bytes/0    95609006 ns   95606132 ns          7 null_percent=0   4.08577GB/s
   UniqueString100bytes/1    96850582 ns   96849441 ns          7 null_percent=0.1   4.03332GB/s
   UniqueString100bytes/2    95404742 ns   95404634 ns          7 null_percent=1    4.0944GB/s
   UniqueString100bytes/3    89401775 ns   89401006 ns          8 null_percent=10   4.36936GB/s
   UniqueString100bytes/4     4705868 ns    4705746 ns        148 null_percent=99   83.0102GB/s
   UniqueString100bytes/5     1434077 ns    1434055 ns        486 null_percent=100   272.392GB/s
   UniqueString100bytes/6   206155133 ns  206148425 ns          3 null_percent=0   1.89487GB/s
   UniqueString100bytes/7   204661287 ns  204653659 ns          3 null_percent=0.1   1.90871GB/s
   UniqueString100bytes/8   205941884 ns  205941271 ns          3 null_percent=1   1.89678GB/s
   UniqueString100bytes/9   192074501 ns  192073431 ns          4 null_percent=10   2.03373GB/s
   UniqueString100bytes/10    6180349 ns    6180227 ns        111 null_percent=99   63.2056GB/s
   UniqueString100bytes/11    1474565 ns    1474564 ns        482 null_percent=100   264.909GB/s
   UniqueUInt8/0              1990025 ns    1990023 ns        348 null_percent=0   1.96292GB/s
   UniqueUInt8/1              2594146 ns    2594089 ns        272 null_percent=0.1   1.50583GB/s
   UniqueUInt8/2              4726027 ns    4726053 ns        145 null_percent=1   846.372MB/s
   UniqueUInt8/3              9465222 ns    9465126 ns         75 null_percent=10   422.604MB/s
   UniqueUInt8/4              3557141 ns    3557135 ns        195 null_percent=99    1124.5MB/s
   UniqueUInt8/5              2259664 ns    2259664 ns        314 null_percent=100   1.72869GB/s
   ```
   
   Here is the % diff versus the baseline. 
   
   * Cases 1 and 7 are the mostly-not-null cases (0.1% null). For UniqueInt64 these show a 15-20% perf improvement.
   * Cases 5 and 11 are the all-null cases.
   * Cases 4 and 10 are the 99% null cases.
   * The "BuildDictionary" case at the bottom, with the perf regression, is one of the "worst case scenarios": 89% of the values are null, so we almost never observe an all-null or all-not-null block. Using `BitUtil::GetBit` instead of `BitmapReader` causes this slight regression, since nearly every validity bit must be checked separately. I don't think it's worth optimizing for this case, since the others are more representative of real-world data.
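
To make the last bullet concrete, here is a minimal sketch of block-wise validity visitation. It is not the code in this PR: `VisitBlocks`, `VisitStats`, and the local `GetBit` are hypothetical names, and the sketch compares each 64-bit word against all-ones and all-zeros rather than counting set bits. It only illustrates why all-valid and all-null blocks are cheap while mixed blocks fall back to one check per bit.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: test one validity bit, least-significant bit first.
inline bool GetBit(const uint8_t* bitmap, int64_t i) {
  return ((bitmap[i / 8] >> (i % 8)) & 1) != 0;
}

struct VisitStats {
  int64_t valid = 0;           // slots seen as not-null
  int64_t null = 0;            // slots seen as null
  int64_t per_bit_checks = 0;  // how often the slow path tested a single bit
};

// Walk `length` slots of a validity bitmap in 64-slot blocks.
// The bitmap must hold at least ceil(length / 8) bytes.
inline VisitStats VisitBlocks(const uint8_t* bitmap, int64_t length) {
  assert(bitmap != nullptr && length >= 0);
  VisitStats stats;
  int64_t pos = 0;
  while (length - pos >= 64) {
    uint64_t word;
    std::memcpy(&word, bitmap + pos / 8, sizeof(word));  // pos is a multiple of 64
    if (word == ~uint64_t{0}) {
      stats.valid += 64;  // all-not-null block: no per-bit work
    } else if (word == 0) {
      stats.null += 64;   // all-null block: no per-bit work
    } else {
      // Mixed block: every validity bit is checked separately.
      for (int64_t i = pos; i < pos + 64; ++i) {
        ++stats.per_bit_checks;
        if (GetBit(bitmap, i)) {
          ++stats.valid;
        } else {
          ++stats.null;
        }
      }
    }
    pos += 64;
  }
  // Tail shorter than one block: checked bit by bit.
  for (; pos < length; ++pos) {
    ++stats.per_bit_checks;
    if (GetBit(bitmap, pos)) {
      ++stats.valid;
    } else {
      ++stats.null;
    }
  }
  return stats;
}
```

With nulls scattered at around 89%, nearly every 64-slot block is mixed, so `per_bit_checks` approaches `length` and the block test becomes pure overhead on top of the per-bit work.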
   
   ```
                     benchmark          baseline        contender  change %  regression
   8    UniqueString100bytes/5    40.668 GiB/sec  272.392 GiB/sec   569.787       False
   37    UniqueString10bytes/5     4.064 GiB/sec   27.207 GiB/sec   569.456       False
   33   UniqueString10bytes/11     4.065 GiB/sec   27.062 GiB/sec   565.751       False
   12  UniqueString100bytes/11    40.578 GiB/sec  264.909 GiB/sec   552.841       False
   0     UniqueString10bytes/4     3.568 GiB/sec    9.051 GiB/sec   153.692       False
   36   UniqueString100bytes/4    34.408 GiB/sec   83.010 GiB/sec   141.252       False
   19   UniqueString10bytes/10     3.375 GiB/sec    7.891 GiB/sec   133.794       False
   24            UniqueUInt8/1   677.981 MiB/sec    1.506 GiB/sec   127.435       False
   5   UniqueString100bytes/10    30.775 GiB/sec   63.206 GiB/sec   105.381       False
   27            UniqueUInt8/5  1000.163 MiB/sec    1.729 GiB/sec    76.989       False
   13            UniqueUInt8/2   650.819 MiB/sec  846.372 MiB/sec    30.047       False
   29           UniqueInt64/11     2.703 GiB/sec    3.409 GiB/sec    26.126       False
   7             UniqueInt64/5     2.704 GiB/sec    3.404 GiB/sec    25.903       False
   18            UniqueUInt8/4   932.926 MiB/sec    1.098 GiB/sec    20.535       False
   23            UniqueInt64/1     1.681 GiB/sec    2.014 GiB/sec    19.840       False
   21            UniqueInt64/7     1.628 GiB/sec    1.896 GiB/sec    16.476       False
   31            UniqueInt64/2     1.658 GiB/sec    1.835 GiB/sec    10.651       False
   20    UniqueString10bytes/7   612.647 MiB/sec  668.943 MiB/sec     9.189       False
   16            UniqueInt64/3     1.386 GiB/sec    1.511 GiB/sec     9.053       False
   38    UniqueString10bytes/8   601.259 MiB/sec  655.490 MiB/sec     9.019       False
   1             UniqueUInt8/0     1.808 GiB/sec    1.963 GiB/sec     8.588       False
   41            UniqueInt64/9     1.355 GiB/sec    1.458 GiB/sec     7.562       False
   14    UniqueString10bytes/1   830.614 MiB/sec  886.336 MiB/sec     6.709       False
   4             UniqueInt64/8     1.603 GiB/sec    1.703 GiB/sec     6.260       False
   32    UniqueString10bytes/2   847.018 MiB/sec  884.017 MiB/sec     4.368       False
   42            UniqueInt64/4     2.508 GiB/sec    2.600 GiB/sec     3.701       False
   39    UniqueString10bytes/3   855.985 MiB/sec  886.914 MiB/sec     3.613       False
   28           UniqueInt64/10     2.413 GiB/sec    2.494 GiB/sec     3.360       False
   34   UniqueString100bytes/3     4.254 GiB/sec    4.369 GiB/sec     2.722       False
   11   UniqueString100bytes/2     3.993 GiB/sec    4.094 GiB/sec     2.544       False
   9     UniqueString10bytes/9   654.257 MiB/sec  668.714 MiB/sec     2.210       False
   35    UniqueString10bytes/6   662.915 MiB/sec  676.832 MiB/sec     2.099       False
   6     BuildStringDictionary    80.971 MiB/sec   81.753 MiB/sec     0.966       False
   22   UniqueString100bytes/1     4.002 GiB/sec    4.033 GiB/sec     0.783       False
   25            UniqueInt64/0     2.153 GiB/sec    2.168 GiB/sec     0.697       False
   17    UniqueString10bytes/0   917.726 MiB/sec  918.783 MiB/sec     0.115       False
   43            UniqueInt64/6     2.017 GiB/sec    2.016 GiB/sec    -0.071       False
   40   UniqueString100bytes/0     4.091 GiB/sec    4.086 GiB/sec    -0.130       False
   3    UniqueString100bytes/7     1.938 GiB/sec    1.909 GiB/sec    -1.519       False
   26   UniqueString100bytes/8     1.954 GiB/sec    1.897 GiB/sec    -2.935       False
   2    UniqueString100bytes/9     2.114 GiB/sec    2.034 GiB/sec    -3.782       False
   30   UniqueString100bytes/6     2.008 GiB/sec    1.895 GiB/sec    -5.649        True
   10            UniqueUInt8/3   474.468 MiB/sec  422.604 MiB/sec   -10.931        True
   15          BuildDictionary     1.776 GiB/sec    1.212 GiB/sec   -31.742        True
   ```
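
For reading the table, the `change %` column appears to be the relative throughput difference against the baseline. A small sketch of that arithmetic follows. `PercentChange` and `IsRegression` are hypothetical names, and the 5% threshold is an assumption inferred from which rows are flagged `True` above, not something stated in this comment. Because the table prints rounded throughputs, recomputing from them gives values that differ slightly in the last digits.

```cpp
#include <cassert>
#include <cmath>

// Relative change of contender vs. baseline, in percent.
// Both throughputs must be in the same unit (convert MiB/sec to GiB/sec first
// for rows such as UniqueUInt8/1, where the units differ).
inline double PercentChange(double baseline, double contender) {
  assert(baseline > 0.0);
  return (contender - baseline) / baseline * 100.0;
}

// Assumed rule: flag a regression when throughput drops by more than
// `threshold_percent` (5% is a guess consistent with the table above).
inline bool IsRegression(double baseline, double contender,
                         double threshold_percent = 5.0) {
  return PercentChange(baseline, contender) < -threshold_percent;
}
```

For example, `PercentChange(1.776, 1.212)` reproduces the BuildDictionary row to within rounding, and `IsRegression(2.114, 2.034)` is false for UniqueString100bytes/9, matching its `False` flag.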

