Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/11/25 19:56:48 UTC

[GitHub] [arrow] pitrou opened a new pull request #8770: ARROW-10696: [C++] Add SetBitRunReader

pitrou opened a new pull request #8770:
URL: https://github.com/apache/arrow/pull/8770


   A specialized bitmap reader that yields runs of set bits, for use cases where reset bits (e.g. null bits) don't need any handling.
   
   In some use cases it can be significantly faster than the alternatives.
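
To make the idea concrete, here is a minimal sketch of what "yielding runs of set bits" could look like. It is not the implementation from this PR: the names `SetBitRun`, `GetBit`, `CollectSetBitRuns` and `SumValid` are hypothetical, the scan is a naive bit-by-bit loop rather than an optimized word-at-a-time one, and it assumes an LSB-first bitmap, which is the bit order Arrow validity bitmaps use.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical type for illustration; not the API added by the PR.
struct SetBitRun {
  int64_t position;  // index of the first set bit of the run, relative to the offset
  int64_t length;    // number of consecutive set bits
};

// Read bit i of an LSB-first bitmap.
inline bool GetBit(const uint8_t* bitmap, int64_t i) {
  return ((bitmap[i / 8] >> (i % 8)) & 1) != 0;
}

// Collect every maximal run of set bits in bitmap[offset, offset + length).
// Cleared bits are skipped and never reported to the caller.
std::vector<SetBitRun> CollectSetBitRuns(const uint8_t* bitmap, int64_t offset,
                                         int64_t length) {
  std::vector<SetBitRun> runs;
  int64_t i = 0;
  while (i < length) {
    // Skip cleared bits (e.g. nulls).
    while (i < length && !GetBit(bitmap, offset + i)) ++i;
    const int64_t start = i;
    // Extend the run while bits are set.
    while (i < length && GetBit(bitmap, offset + i)) ++i;
    if (i > start) runs.push_back({start, i - start});
  }
  return runs;
}

// Example consumer: sum only the valid (non-null) values, one run at a time.
int64_t SumValid(const int32_t* values, const uint8_t* validity, int64_t length) {
  int64_t sum = 0;
  for (const SetBitRun& run : CollectSetBitRuns(validity, 0, length)) {
    for (int64_t j = 0; j < run.length; ++j) {
      sum += values[run.position + j];
    }
  }
  return sum;
}
```

The inner loop of a consumer such as `SumValid` runs over a contiguous range with no per-element null check, which is where a run-based reader can pay off when runs are long.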


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou edited a comment on pull request #8770: ARROW-10696: [C++] Add SetBitRunReader

Posted by GitBox <gi...@apache.org>.
pitrou edited a comment on pull request #8770:
URL: https://github.com/apache/arrow/pull/8770#issuecomment-734223439


   Aggregation benchmarks:
   ```
                                 benchmark          baseline        contender   change %                                                                                                                                                                      counters
   108         ModeKernelBoolean/1048576/0    57.487 MiB/sec   37.406 GiB/sec  66530.741            {'run_name': 'ModeKernelBoolean/1048576/0', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 41, 'null_percent': 0.0}
   93          ModeKernelBoolean/1048576/2    28.239 MiB/sec    1.863 GiB/sec   6655.773           {'run_name': 'ModeKernelBoolean/1048576/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 20, 'null_percent': 50.0}
   56         ModeKernelBoolean/1048576/10    43.613 MiB/sec    1.799 GiB/sec   4123.421          {'run_name': 'ModeKernelBoolean/1048576/10', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 31, 'null_percent': 10.0}
   29        ModeKernelBoolean/1048576/100    58.888 MiB/sec    1.826 GiB/sec   3075.803          {'run_name': 'ModeKernelBoolean/1048576/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 39, 'null_percent': 1.0}
   124     ModeKernelBoolean/1048576/10000    61.189 MiB/sec    1.864 GiB/sec   3018.637       {'run_name': 'ModeKernelBoolean/1048576/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 43, 'null_percent': 0.01}
   35             ModeKernelInt8/1048576/0   792.467 MiB/sec    2.473 GiB/sec    219.603             {'run_name': 'ModeKernelInt8/1048576/0', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 1000, 'null_percent': 0.0}
   36        ModeKernelInt16/1048576/10000     1.922 GiB/sec    4.217 GiB/sec    119.409       {'run_name': 'ModeKernelInt16/1048576/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 2138, 'null_percent': 0.01}
   100    VarianceKernelDouble/1048576/100     2.445 GiB/sec    4.960 GiB/sec    102.888     {'run_name': 'VarianceKernelDouble/1048576/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 1746, 'null_percent': 1.0}
   76            ModeKernelInt32/1048576/0     4.050 GiB/sec    8.075 GiB/sec     99.390            {'run_name': 'ModeKernelInt32/1048576/0', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 4419, 'null_percent': 0.0}
   109     VarianceKernelFloat/1048576/100     1.216 GiB/sec    2.387 GiB/sec     96.341       {'run_name': 'VarianceKernelFloat/1048576/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 871, 'null_percent': 1.0}
   20        ModeKernelInt64/1048576/10000     5.547 GiB/sec    9.831 GiB/sec     77.229       {'run_name': 'ModeKernelInt64/1048576/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 6475, 'null_percent': 0.01}
   54      VarianceKernelInt64/1048576/100     3.511 GiB/sec    5.748 GiB/sec     63.693      {'run_name': 'VarianceKernelInt64/1048576/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 2519, 'null_percent': 1.0}
   28          ModeKernelInt16/1048576/100     1.292 GiB/sec    2.077 GiB/sec     60.697           {'run_name': 'ModeKernelInt16/1048576/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 927, 'null_percent': 1.0}
   94          ModeKernelInt32/1048576/100     2.530 GiB/sec    3.969 GiB/sec     56.872          {'run_name': 'ModeKernelInt32/1048576/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 1805, 'null_percent': 1.0}
   67             ModeKernelInt8/1048576/2   237.264 MiB/sec  362.684 MiB/sec     52.861             {'run_name': 'ModeKernelInt8/1048576/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 166, 'null_percent': 50.0}
   11        VarianceKernelFloat/1048576/2   508.083 MiB/sec  727.822 MiB/sec     43.249        {'run_name': 'VarianceKernelFloat/1048576/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 355, 'null_percent': 50.0}
   65       VarianceKernelDouble/1048576/2     1.030 GiB/sec    1.457 GiB/sec     41.439       {'run_name': 'VarianceKernelDouble/1048576/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 734, 'null_percent': 50.0}
   84            ModeKernelInt64/1048576/2     1.003 GiB/sec    1.409 GiB/sec     40.411            {'run_name': 'ModeKernelInt64/1048576/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 713, 'null_percent': 50.0}
   22            ModeKernelInt16/1048576/2   253.082 MiB/sec  355.311 MiB/sec     40.394            {'run_name': 'ModeKernelInt16/1048576/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 177, 'null_percent': 50.0}
   15        VarianceKernelInt32/1048576/2  1011.938 MiB/sec    1.387 GiB/sec     40.305        {'run_name': 'VarianceKernelInt32/1048576/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 701, 'null_percent': 50.0}
   98            ModeKernelInt32/1048576/2   502.640 MiB/sec  702.642 MiB/sec     39.790            {'run_name': 'ModeKernelInt32/1048576/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 341, 'null_percent': 50.0}
   61        VarianceKernelInt64/1048576/2     1.024 GiB/sec    1.422 GiB/sec     38.871        {'run_name': 'VarianceKernelInt64/1048576/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 726, 'null_percent': 50.0}
   81       VarianceKernelFloat/1048576/10   854.282 MiB/sec    1.153 GiB/sec     38.189       {'run_name': 'VarianceKernelFloat/1048576/10', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 598, 'null_percent': 10.0}
   34          ModeKernelInt64/1048576/100     4.766 GiB/sec    6.547 GiB/sec     37.389          {'run_name': 'ModeKernelInt64/1048576/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 3416, 'null_percent': 1.0}
   2       VarianceKernelDouble/1048576/10     1.670 GiB/sec    2.290 GiB/sec     37.083     {'run_name': 'VarianceKernelDouble/1048576/10', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 1195, 'null_percent': 10.0}
   111     VarianceKernelInt32/1048576/100     4.159 GiB/sec    5.506 GiB/sec     32.395      {'run_name': 'VarianceKernelInt32/1048576/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 2978, 'null_percent': 1.0}
   30           ModeKernelInt8/1048576/100     1.276 GiB/sec    1.634 GiB/sec     28.042            {'run_name': 'ModeKernelInt8/1048576/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 919, 'null_percent': 1.0}
   73    VarianceKernelInt32/1048576/10000     6.548 GiB/sec    7.310 GiB/sec     11.634   {'run_name': 'VarianceKernelInt32/1048576/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 4687, 'null_percent': 0.01}
   1         VarianceKernelInt32/1048576/0     7.168 GiB/sec    7.952 GiB/sec     10.944        {'run_name': 'VarianceKernelInt32/1048576/0', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 5116, 'null_percent': 0.0}
   [...]
   16           ModeKernelInt64/1048576/10     2.376 GiB/sec    2.181 GiB/sec     -8.192          {'run_name': 'ModeKernelInt64/1048576/10', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 1700, 'null_percent': 10.0}
   63        ModeKernelInt32/1048576/10000     6.489 GiB/sec    3.047 GiB/sec    -53.040       {'run_name': 'ModeKernelInt32/1048576/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 4653, 'null_percent': 0.01}
   45         ModeKernelInt8/1048576/10000     2.262 GiB/sec    1.002 GiB/sec    -55.719        {'run_name': 'ModeKernelInt8/1048576/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 1617, 'null_percent': 0.01}
   ```
   



[GitHub] [arrow] pitrou edited a comment on pull request #8770: ARROW-10696: [C++] Add SetBitRunReader

Posted by GitBox <gi...@apache.org>.
pitrou edited a comment on pull request #8770:
URL: https://github.com/apache/arrow/pull/8770#issuecomment-733918851


   ArrayRangeEquals benchmarks:
   ```
                                                                       benchmark            baseline           contender  change %                                                                                                                                                                                       counters
   22             ArrayRangeEqualsStruct/32768/100   23.863m items/sec  343.067m items/sec  1337.631                {'run_name': 'ArrayRangeEqualsStruct/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 510, 'null_percent': 1.0}
   5     ArrayRangeEqualsFixedSizeBinary/32768/100  614.186m items/sec    5.185b items/sec   744.270     {'run_name': 'ArrayRangeEqualsFixedSizeBinary/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 12979, 'null_percent': 1.0}
   33        ArrayRangeEqualsListOfInt32/32768/100   26.215m items/sec  215.239m items/sec   721.041           {'run_name': 'ArrayRangeEqualsListOfInt32/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 560, 'null_percent': 1.0}
   21          ArrayRangeEqualsBoolean/32768/10000    2.826b items/sec   19.479b items/sec   589.262          {'run_name': 'ArrayRangeEqualsBoolean/32768/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 61059, 'null_percent': 0.01}
   51           ArrayRangeEqualsStruct/32768/10000  222.532m items/sec    1.483b items/sec   566.535            {'run_name': 'ArrayRangeEqualsStruct/32768/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 4789, 'null_percent': 0.01}
   47             ArrayRangeEqualsString/32768/100  252.039m items/sec    1.350b items/sec   435.816               {'run_name': 'ArrayRangeEqualsString/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 5365, 'null_percent': 1.0}
   20              ArrayRangeEqualsStruct/32768/10   12.987m items/sec   61.633m items/sec   374.582                {'run_name': 'ArrayRangeEqualsStruct/32768/10', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 277, 'null_percent': 10.0}
   36            ArrayRangeEqualsFloat32/32768/100    1.096b items/sec    4.474b items/sec   308.366             {'run_name': 'ArrayRangeEqualsFloat32/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 23270, 'null_percent': 1.0}
   29              ArrayRangeEqualsInt32/32768/100    1.675b items/sec    6.533b items/sec   290.081               {'run_name': 'ArrayRangeEqualsInt32/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 35608, 'null_percent': 1.0}
   46              ArrayRangeEqualsString/32768/10  135.386m items/sec  526.754m items/sec   289.076               {'run_name': 'ArrayRangeEqualsString/32768/10', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 2853, 'null_percent': 10.0}
   38         ArrayRangeEqualsListOfInt32/32768/10   12.637m items/sec   43.741m items/sec   246.142           {'run_name': 'ArrayRangeEqualsListOfInt32/32768/10', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 269, 'null_percent': 10.0}
   28     ArrayRangeEqualsFixedSizeBinary/32768/10  294.337m items/sec  969.941m items/sec   229.534      {'run_name': 'ArrayRangeEqualsFixedSizeBinary/32768/10', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 6252, 'null_percent': 10.0}
   42      ArrayRangeEqualsListOfInt32/32768/10000  202.207m items/sec  656.251m items/sec   224.544       {'run_name': 'ArrayRangeEqualsListOfInt32/32768/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 4360, 'null_percent': 0.01}
   13            ArrayRangeEqualsBoolean/32768/100  979.715m items/sec    2.908b items/sec   196.868             {'run_name': 'ArrayRangeEqualsBoolean/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 21042, 'null_percent': 1.0}
   40              ArrayRangeEqualsBoolean/32768/1   12.017b items/sec   35.217b items/sec   193.068            {'run_name': 'ArrayRangeEqualsBoolean/32768/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 256448, 'null_percent': 100.0}
   45      ArrayRangeEqualsFixedSizeBinary/32768/1   12.204b items/sec   35.234b items/sec   188.703    {'run_name': 'ArrayRangeEqualsFixedSizeBinary/32768/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 269241, 'null_percent': 100.0}
   17                ArrayRangeEqualsInt32/32768/1   12.406b items/sec   34.810b items/sec   180.598              {'run_name': 'ArrayRangeEqualsInt32/32768/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 265638, 'null_percent': 100.0}
   26          ArrayRangeEqualsListOfInt32/32768/1   12.626b items/sec   35.288b items/sec   179.485        {'run_name': 'ArrayRangeEqualsListOfInt32/32768/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 269942, 'null_percent': 100.0}
   44               ArrayRangeEqualsString/32768/1   13.196b items/sec   35.124b items/sec   166.178             {'run_name': 'ArrayRangeEqualsString/32768/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 281694, 'null_percent': 100.0}
   35              ArrayRangeEqualsFloat32/32768/1   12.898b items/sec   34.159b items/sec   164.846            {'run_name': 'ArrayRangeEqualsFloat32/32768/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 277754, 'null_percent': 100.0}
   53               ArrayRangeEqualsStruct/32768/1   13.618b items/sec   35.687b items/sec   162.059             {'run_name': 'ArrayRangeEqualsStruct/32768/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 291514, 'null_percent': 100.0}
   15  ArrayRangeEqualsFixedSizeBinary/32768/10000    4.166b items/sec   10.835b items/sec   160.089  {'run_name': 'ArrayRangeEqualsFixedSizeBinary/32768/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 89321, 'null_percent': 0.01}
   16            ArrayRangeEqualsInt32/32768/10000    6.886b items/sec   15.942b items/sec   131.514           {'run_name': 'ArrayRangeEqualsInt32/32768/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 146029, 'null_percent': 0.01}
   19               ArrayRangeEqualsStruct/32768/0  748.134m items/sec    1.686b items/sec   125.426                {'run_name': 'ArrayRangeEqualsStruct/32768/0', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 16023, 'null_percent': 0.0}
   32      ArrayRangeEqualsFixedSizeBinary/32768/2  196.551m items/sec  375.563m items/sec    91.077       {'run_name': 'ArrayRangeEqualsFixedSizeBinary/32768/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 4153, 'null_percent': 50.0}
   49               ArrayRangeEqualsString/32768/2  139.017m items/sec  239.038m items/sec    71.948                {'run_name': 'ArrayRangeEqualsString/32768/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 2953, 'null_percent': 50.0}
   31          ArrayRangeEqualsFloat32/32768/10000    4.639b items/sec    7.537b items/sec    62.473          {'run_name': 'ArrayRangeEqualsFloat32/32768/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 96758, 'null_percent': 0.01}
   39          ArrayRangeEqualsListOfInt32/32768/2   12.135m items/sec   19.630m items/sec    61.761            {'run_name': 'ArrayRangeEqualsListOfInt32/32768/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 259, 'null_percent': 50.0}
   2                ArrayRangeEqualsStruct/32768/2   21.952m items/sec   34.669m items/sec    57.935                 {'run_name': 'ArrayRangeEqualsStruct/32768/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 465, 'null_percent': 50.0}
   1                ArrayRangeEqualsInt32/32768/10  678.038m items/sec    1.057b items/sec    55.821               {'run_name': 'ArrayRangeEqualsInt32/32768/10', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 14391, 'null_percent': 10.0}
   12              ArrayRangeEqualsFloat32/32768/2  271.242m items/sec  394.340m items/sec    45.383               {'run_name': 'ArrayRangeEqualsFloat32/32768/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 5633, 'null_percent': 50.0}
   24                ArrayRangeEqualsInt32/32768/2  274.259m items/sec  393.021m items/sec    43.303                 {'run_name': 'ArrayRangeEqualsInt32/32768/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 5764, 'null_percent': 50.0}
   50              ArrayRangeEqualsBoolean/32768/2  221.275m items/sec  307.534m items/sec    38.983               {'run_name': 'ArrayRangeEqualsBoolean/32768/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 4708, 'null_percent': 50.0}
   48           ArrayRangeEqualsString/32768/10000    1.330b items/sec    1.816b items/sec    36.476           {'run_name': 'ArrayRangeEqualsString/32768/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 28580, 'null_percent': 0.01}
   8           ArrayRangeEqualsSparseUnion/32768/0   34.852m items/sec   45.628m items/sec    30.919             {'run_name': 'ArrayRangeEqualsSparseUnion/32768/0', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 737, 'null_percent': 0.0}
   23           ArrayRangeEqualsDenseUnion/32768/0   36.416m items/sec   45.582m items/sec    25.171              {'run_name': 'ArrayRangeEqualsDenseUnion/32768/0', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 773, 'null_percent': 0.0}
   9           ArrayRangeEqualsSparseUnion/32768/1   22.389m items/sec   27.076m items/sec    20.932           {'run_name': 'ArrayRangeEqualsSparseUnion/32768/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 478, 'null_percent': 100.0}
   14             ArrayRangeEqualsFloat32/32768/10  569.970m items/sec  660.566m items/sec    15.895             {'run_name': 'ArrayRangeEqualsFloat32/32768/10', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 11970, 'null_percent': 10.0}
   11           ArrayRangeEqualsDenseUnion/32768/1   23.336m items/sec   25.912m items/sec    11.037            {'run_name': 'ArrayRangeEqualsDenseUnion/32768/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 496, 'null_percent': 100.0}
   0               ArrayRangeEqualsFloat32/32768/0   10.265b items/sec   11.049b items/sec     7.641              {'run_name': 'ArrayRangeEqualsFloat32/32768/0', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 218383, 'null_percent': 0.0}
   43          ArrayRangeEqualsListOfInt32/32768/0  790.628m items/sec  838.173m items/sec     6.014           {'run_name': 'ArrayRangeEqualsListOfInt32/32768/0', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 17131, 'null_percent': 0.0}
   27              ArrayRangeEqualsBoolean/32768/0   48.625b items/sec   51.207b items/sec     5.309             {'run_name': 'ArrayRangeEqualsBoolean/32768/0', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 1034591, 'null_percent': 0.0}
   6                ArrayRangeEqualsString/32768/0    1.847b items/sec    1.942b items/sec     5.149                {'run_name': 'ArrayRangeEqualsString/32768/0', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 39452, 'null_percent': 0.0}
   10      ArrayRangeEqualsFixedSizeBinary/32768/0   15.947b items/sec   16.438b items/sec     3.075      {'run_name': 'ArrayRangeEqualsFixedSizeBinary/32768/0', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 337925, 'null_percent': 0.0}
   30          ArrayRangeEqualsSparseUnion/32768/2   20.058m items/sec   20.362m items/sec     1.514            {'run_name': 'ArrayRangeEqualsSparseUnion/32768/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 427, 'null_percent': 50.0}
   3          ArrayRangeEqualsSparseUnion/32768/10   20.014m items/sec   20.036m items/sec     0.109           {'run_name': 'ArrayRangeEqualsSparseUnion/32768/10', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 425, 'null_percent': 10.0}
   7         ArrayRangeEqualsSparseUnion/32768/100   20.000m items/sec   19.925m items/sec    -0.373           {'run_name': 'ArrayRangeEqualsSparseUnion/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 424, 'null_percent': 1.0}
   25      ArrayRangeEqualsSparseUnion/32768/10000   19.848m items/sec   19.658m items/sec    -0.958        {'run_name': 'ArrayRangeEqualsSparseUnion/32768/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 422, 'null_percent': 0.01}
   37                ArrayRangeEqualsInt32/32768/0   31.782b items/sec   31.411b items/sec    -1.168                {'run_name': 'ArrayRangeEqualsInt32/32768/0', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 665046, 'null_percent': 0.0}
   4              ArrayRangeEqualsBoolean/32768/10  512.687m items/sec  496.923m items/sec    -3.075             {'run_name': 'ArrayRangeEqualsBoolean/32768/10', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 10930, 'null_percent': 10.0}
   18         ArrayRangeEqualsDenseUnion/32768/100   20.641m items/sec   19.655m items/sec    -4.777            {'run_name': 'ArrayRangeEqualsDenseUnion/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 440, 'null_percent': 1.0}
   52       ArrayRangeEqualsDenseUnion/32768/10000   20.665m items/sec   19.609m items/sec    -5.113         {'run_name': 'ArrayRangeEqualsDenseUnion/32768/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 442, 'null_percent': 0.01}
   41          ArrayRangeEqualsDenseUnion/32768/10   20.551m items/sec   19.388m items/sec    -5.658            {'run_name': 'ArrayRangeEqualsDenseUnion/32768/10', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 441, 'null_percent': 10.0}
   34           ArrayRangeEqualsDenseUnion/32768/2   20.645m items/sec   19.374m items/sec    -6.156             {'run_name': 'ArrayRangeEqualsDenseUnion/32768/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 440, 'null_percent': 50.0}
   ```
   



[GitHub] [arrow] github-actions[bot] commented on pull request #8770: ARROW-10696: [C++] Add SetBitRunReader

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #8770:
URL: https://github.com/apache/arrow/pull/8770#issuecomment-733918746


   https://issues.apache.org/jira/browse/ARROW-10696



[GitHub] [arrow] pitrou edited a comment on pull request #8770: ARROW-10696: [C++] Add SetBitRunReader

Posted by GitBox <gi...@apache.org>.
pitrou edited a comment on pull request #8770:
URL: https://github.com/apache/arrow/pull/8770#issuecomment-733918851


   ArrayRangeEquals benchmarks:
   ```
                                         benchmark            baseline           contender  change %                                                                                                                                                                                counters
   22             ArrayRangeEqualsStruct/32768/100   23.863m items/sec  343.067m items/sec  1337.631                {'run_name': 'ArrayRangeEqualsStruct/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 510, 'null_percent': 1.0}
   5     ArrayRangeEqualsFixedSizeBinary/32768/100  614.186m items/sec    5.185b items/sec   744.270     {'run_name': 'ArrayRangeEqualsFixedSizeBinary/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 12979, 'null_percent': 1.0}
   33        ArrayRangeEqualsListOfInt32/32768/100   26.215m items/sec  215.239m items/sec   721.041           {'run_name': 'ArrayRangeEqualsListOfInt32/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 560, 'null_percent': 1.0}
   21          ArrayRangeEqualsBoolean/32768/10000    2.826b items/sec   19.479b items/sec   589.262          {'run_name': 'ArrayRangeEqualsBoolean/32768/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 61059, 'null_percent': 0.01}
   51           ArrayRangeEqualsStruct/32768/10000  222.532m items/sec    1.483b items/sec   566.535            {'run_name': 'ArrayRangeEqualsStruct/32768/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 4789, 'null_percent': 0.01}
   47             ArrayRangeEqualsString/32768/100  252.039m items/sec    1.350b items/sec   435.816               {'run_name': 'ArrayRangeEqualsString/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 5365, 'null_percent': 1.0}
   20              ArrayRangeEqualsStruct/32768/10   12.987m items/sec   61.633m items/sec   374.582                {'run_name': 'ArrayRangeEqualsStruct/32768/10', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 277, 'null_percent': 10.0}
   36            ArrayRangeEqualsFloat32/32768/100    1.096b items/sec    4.474b items/sec   308.366             {'run_name': 'ArrayRangeEqualsFloat32/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 23270, 'null_percent': 1.0}
   29              ArrayRangeEqualsInt32/32768/100    1.675b items/sec    6.533b items/sec   290.081               {'run_name': 'ArrayRangeEqualsInt32/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 35608, 'null_percent': 1.0}
   46              ArrayRangeEqualsString/32768/10  135.386m items/sec  526.754m items/sec   289.076               {'run_name': 'ArrayRangeEqualsString/32768/10', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 2853, 'null_percent': 10.0}
   38         ArrayRangeEqualsListOfInt32/32768/10   12.637m items/sec   43.741m items/sec   246.142           {'run_name': 'ArrayRangeEqualsListOfInt32/32768/10', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 269, 'null_percent': 10.0}
   28     ArrayRangeEqualsFixedSizeBinary/32768/10  294.337m items/sec  969.941m items/sec   229.534      {'run_name': 'ArrayRangeEqualsFixedSizeBinary/32768/10', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 6252, 'null_percent': 10.0}
   42      ArrayRangeEqualsListOfInt32/32768/10000  202.207m items/sec  656.251m items/sec   224.544       {'run_name': 'ArrayRangeEqualsListOfInt32/32768/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 4360, 'null_percent': 0.01}
   13            ArrayRangeEqualsBoolean/32768/100  979.715m items/sec    2.908b items/sec   196.868             {'run_name': 'ArrayRangeEqualsBoolean/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 21042, 'null_percent': 1.0}
   40              ArrayRangeEqualsBoolean/32768/1   12.017b items/sec   35.217b items/sec   193.068            {'run_name': 'ArrayRangeEqualsBoolean/32768/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 256448, 'null_percent': 100.0}
   45      ArrayRangeEqualsFixedSizeBinary/32768/1   12.204b items/sec   35.234b items/sec   188.703    {'run_name': 'ArrayRangeEqualsFixedSizeBinary/32768/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 269241, 'null_percent': 100.0}
   17                ArrayRangeEqualsInt32/32768/1   12.406b items/sec   34.810b items/sec   180.598              {'run_name': 'ArrayRangeEqualsInt32/32768/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 265638, 'null_percent': 100.0}
   26          ArrayRangeEqualsListOfInt32/32768/1   12.626b items/sec   35.288b items/sec   179.485        {'run_name': 'ArrayRangeEqualsListOfInt32/32768/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 269942, 'null_percent': 100.0}
   44               ArrayRangeEqualsString/32768/1   13.196b items/sec   35.124b items/sec   166.178             {'run_name': 'ArrayRangeEqualsString/32768/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 281694, 'null_percent': 100.0}
   35              ArrayRangeEqualsFloat32/32768/1   12.898b items/sec   34.159b items/sec   164.846            {'run_name': 'ArrayRangeEqualsFloat32/32768/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 277754, 'null_percent': 100.0}
   53               ArrayRangeEqualsStruct/32768/1   13.618b items/sec   35.687b items/sec   162.059             {'run_name': 'ArrayRangeEqualsStruct/32768/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 291514, 'null_percent': 100.0}
   15  ArrayRangeEqualsFixedSizeBinary/32768/10000    4.166b items/sec   10.835b items/sec   160.089  {'run_name': 'ArrayRangeEqualsFixedSizeBinary/32768/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 89321, 'null_percent': 0.01}
   16            ArrayRangeEqualsInt32/32768/10000    6.886b items/sec   15.942b items/sec   131.514           {'run_name': 'ArrayRangeEqualsInt32/32768/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 146029, 'null_percent': 0.01}
   19               ArrayRangeEqualsStruct/32768/0  748.134m items/sec    1.686b items/sec   125.426                {'run_name': 'ArrayRangeEqualsStruct/32768/0', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 16023, 'null_percent': 0.0}
   32      ArrayRangeEqualsFixedSizeBinary/32768/2  196.551m items/sec  375.563m items/sec    91.077       {'run_name': 'ArrayRangeEqualsFixedSizeBinary/32768/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 4153, 'null_percent': 50.0}
   49               ArrayRangeEqualsString/32768/2  139.017m items/sec  239.038m items/sec    71.948                {'run_name': 'ArrayRangeEqualsString/32768/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 2953, 'null_percent': 50.0}
   31          ArrayRangeEqualsFloat32/32768/10000    4.639b items/sec    7.537b items/sec    62.473          {'run_name': 'ArrayRangeEqualsFloat32/32768/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 96758, 'null_percent': 0.01}
   39          ArrayRangeEqualsListOfInt32/32768/2   12.135m items/sec   19.630m items/sec    61.761            {'run_name': 'ArrayRangeEqualsListOfInt32/32768/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 259, 'null_percent': 50.0}
   2                ArrayRangeEqualsStruct/32768/2   21.952m items/sec   34.669m items/sec    57.935                 {'run_name': 'ArrayRangeEqualsStruct/32768/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 465, 'null_percent': 50.0}
   1                ArrayRangeEqualsInt32/32768/10  678.038m items/sec    1.057b items/sec    55.821               {'run_name': 'ArrayRangeEqualsInt32/32768/10', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 14391, 'null_percent': 10.0}
   12              ArrayRangeEqualsFloat32/32768/2  271.242m items/sec  394.340m items/sec    45.383               {'run_name': 'ArrayRangeEqualsFloat32/32768/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 5633, 'null_percent': 50.0}
   24                ArrayRangeEqualsInt32/32768/2  274.259m items/sec  393.021m items/sec    43.303                 {'run_name': 'ArrayRangeEqualsInt32/32768/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 5764, 'null_percent': 50.0}
   50              ArrayRangeEqualsBoolean/32768/2  221.275m items/sec  307.534m items/sec    38.983               {'run_name': 'ArrayRangeEqualsBoolean/32768/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 4708, 'null_percent': 50.0}
   48           ArrayRangeEqualsString/32768/10000    1.330b items/sec    1.816b items/sec    36.476           {'run_name': 'ArrayRangeEqualsString/32768/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 28580, 'null_percent': 0.01}
   8           ArrayRangeEqualsSparseUnion/32768/0   34.852m items/sec   45.628m items/sec    30.919             {'run_name': 'ArrayRangeEqualsSparseUnion/32768/0', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 737, 'null_percent': 0.0}
   23           ArrayRangeEqualsDenseUnion/32768/0   36.416m items/sec   45.582m items/sec    25.171              {'run_name': 'ArrayRangeEqualsDenseUnion/32768/0', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 773, 'null_percent': 0.0}
   9           ArrayRangeEqualsSparseUnion/32768/1   22.389m items/sec   27.076m items/sec    20.932           {'run_name': 'ArrayRangeEqualsSparseUnion/32768/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 478, 'null_percent': 100.0}
   14             ArrayRangeEqualsFloat32/32768/10  569.970m items/sec  660.566m items/sec    15.895             {'run_name': 'ArrayRangeEqualsFloat32/32768/10', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 11970, 'null_percent': 10.0}
   11           ArrayRangeEqualsDenseUnion/32768/1   23.336m items/sec   25.912m items/sec    11.037            {'run_name': 'ArrayRangeEqualsDenseUnion/32768/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 496, 'null_percent': 100.0}
   0               ArrayRangeEqualsFloat32/32768/0   10.265b items/sec   11.049b items/sec     7.641              {'run_name': 'ArrayRangeEqualsFloat32/32768/0', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 218383, 'null_percent': 0.0}
   43          ArrayRangeEqualsListOfInt32/32768/0  790.628m items/sec  838.173m items/sec     6.014           {'run_name': 'ArrayRangeEqualsListOfInt32/32768/0', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 17131, 'null_percent': 0.0}
   27              ArrayRangeEqualsBoolean/32768/0   48.625b items/sec   51.207b items/sec     5.309             {'run_name': 'ArrayRangeEqualsBoolean/32768/0', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 1034591, 'null_percent': 0.0}
   6                ArrayRangeEqualsString/32768/0    1.847b items/sec    1.942b items/sec     5.149                {'run_name': 'ArrayRangeEqualsString/32768/0', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 39452, 'null_percent': 0.0}
   10      ArrayRangeEqualsFixedSizeBinary/32768/0   15.947b items/sec   16.438b items/sec     3.075      {'run_name': 'ArrayRangeEqualsFixedSizeBinary/32768/0', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 337925, 'null_percent': 0.0}
   30          ArrayRangeEqualsSparseUnion/32768/2   20.058m items/sec   20.362m items/sec     1.514            {'run_name': 'ArrayRangeEqualsSparseUnion/32768/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 427, 'null_percent': 50.0}
   3          ArrayRangeEqualsSparseUnion/32768/10   20.014m items/sec   20.036m items/sec     0.109           {'run_name': 'ArrayRangeEqualsSparseUnion/32768/10', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 425, 'null_percent': 10.0}
   7         ArrayRangeEqualsSparseUnion/32768/100   20.000m items/sec   19.925m items/sec    -0.373           {'run_name': 'ArrayRangeEqualsSparseUnion/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 424, 'null_percent': 1.0}
   25      ArrayRangeEqualsSparseUnion/32768/10000   19.848m items/sec   19.658m items/sec    -0.958        {'run_name': 'ArrayRangeEqualsSparseUnion/32768/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 422, 'null_percent': 0.01}
   37                ArrayRangeEqualsInt32/32768/0   31.782b items/sec   31.411b items/sec    -1.168                {'run_name': 'ArrayRangeEqualsInt32/32768/0', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 665046, 'null_percent': 0.0}
   4              ArrayRangeEqualsBoolean/32768/10  512.687m items/sec  496.923m items/sec    -3.075             {'run_name': 'ArrayRangeEqualsBoolean/32768/10', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 10930, 'null_percent': 10.0}
   18         ArrayRangeEqualsDenseUnion/32768/100   20.641m items/sec   19.655m items/sec    -4.777            {'run_name': 'ArrayRangeEqualsDenseUnion/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 440, 'null_percent': 1.0}
   52       ArrayRangeEqualsDenseUnion/32768/10000   20.665m items/sec   19.609m items/sec    -5.113         {'run_name': 'ArrayRangeEqualsDenseUnion/32768/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 442, 'null_percent': 0.01}
   41          ArrayRangeEqualsDenseUnion/32768/10   20.551m items/sec   19.388m items/sec    -5.658            {'run_name': 'ArrayRangeEqualsDenseUnion/32768/10', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 441, 'null_percent': 10.0}
   34           ArrayRangeEqualsDenseUnion/32768/2   20.645m items/sec   19.374m items/sec    -6.156             {'run_name': 'ArrayRangeEqualsDenseUnion/32768/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 440, 'null_percent': 50.0}
   ```
   





[GitHub] [arrow] pitrou closed pull request #8770: ARROW-10696: [C++] Add SetBitRunReader

Posted by GitBox <gi...@apache.org>.
pitrou closed pull request #8770:
URL: https://github.com/apache/arrow/pull/8770


   





[GitHub] [arrow] pitrou commented on pull request #8770: ARROW-10696: [C++] Add SetBitRunReader

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #8770:
URL: https://github.com/apache/arrow/pull/8770#issuecomment-733918851


   ArrayRangeEquals benchmarks:
   ```
                                         benchmark            baseline           contender  change %                                                                                                                                                                                counters
   22             ArrayRangeEqualsStruct/32768/100   23.863m items/sec  343.067m items/sec  1337.631                {'run_name': 'ArrayRangeEqualsStruct/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 510, 'null_percent': 1.0}
   5     ArrayRangeEqualsFixedSizeBinary/32768/100  614.186m items/sec    5.185b items/sec   744.270     {'run_name': 'ArrayRangeEqualsFixedSizeBinary/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 12979, 'null_percent': 1.0}
   33        ArrayRangeEqualsListOfInt32/32768/100   26.215m items/sec  215.239m items/sec   721.041           {'run_name': 'ArrayRangeEqualsListOfInt32/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 560, 'null_percent': 1.0}
   21          ArrayRangeEqualsBoolean/32768/10000    2.826b items/sec   19.479b items/sec   589.262          {'run_name': 'ArrayRangeEqualsBoolean/32768/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 61059, 'null_percent': 0.01}
   51           ArrayRangeEqualsStruct/32768/10000  222.532m items/sec    1.483b items/sec   566.535            {'run_name': 'ArrayRangeEqualsStruct/32768/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 4789, 'null_percent': 0.01}
   47             ArrayRangeEqualsString/32768/100  252.039m items/sec    1.350b items/sec   435.816               {'run_name': 'ArrayRangeEqualsString/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 5365, 'null_percent': 1.0}
   20              ArrayRangeEqualsStruct/32768/10   12.987m items/sec   61.633m items/sec   374.582                {'run_name': 'ArrayRangeEqualsStruct/32768/10', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 277, 'null_percent': 10.0}
   36            ArrayRangeEqualsFloat32/32768/100    1.096b items/sec    4.474b items/sec   308.366             {'run_name': 'ArrayRangeEqualsFloat32/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 23270, 'null_percent': 1.0}
   29              ArrayRangeEqualsInt32/32768/100    1.675b items/sec    6.533b items/sec   290.081               {'run_name': 'ArrayRangeEqualsInt32/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 35608, 'null_percent': 1.0}
   46              ArrayRangeEqualsString/32768/10  135.386m items/sec  526.754m items/sec   289.076               {'run_name': 'ArrayRangeEqualsString/32768/10', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 2853, 'null_percent': 10.0}
   38         ArrayRangeEqualsListOfInt32/32768/10   12.637m items/sec   43.741m items/sec   246.142           {'run_name': 'ArrayRangeEqualsListOfInt32/32768/10', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 269, 'null_percent': 10.0}
   28     ArrayRangeEqualsFixedSizeBinary/32768/10  294.337m items/sec  969.941m items/sec   229.534      {'run_name': 'ArrayRangeEqualsFixedSizeBinary/32768/10', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 6252, 'null_percent': 10.0}
   42      ArrayRangeEqualsListOfInt32/32768/10000  202.207m items/sec  656.251m items/sec   224.544       {'run_name': 'ArrayRangeEqualsListOfInt32/32768/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 4360, 'null_percent': 0.01}
   13            ArrayRangeEqualsBoolean/32768/100  979.715m items/sec    2.908b items/sec   196.868             {'run_name': 'ArrayRangeEqualsBoolean/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 21042, 'null_percent': 1.0}
   40              ArrayRangeEqualsBoolean/32768/1   12.017b items/sec   35.217b items/sec   193.068            {'run_name': 'ArrayRangeEqualsBoolean/32768/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 256448, 'null_percent': 100.0}
   45      ArrayRangeEqualsFixedSizeBinary/32768/1   12.204b items/sec   35.234b items/sec   188.703    {'run_name': 'ArrayRangeEqualsFixedSizeBinary/32768/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 269241, 'null_percent': 100.0}
   17                ArrayRangeEqualsInt32/32768/1   12.406b items/sec   34.810b items/sec   180.598              {'run_name': 'ArrayRangeEqualsInt32/32768/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 265638, 'null_percent': 100.0}
   26          ArrayRangeEqualsListOfInt32/32768/1   12.626b items/sec   35.288b items/sec   179.485        {'run_name': 'ArrayRangeEqualsListOfInt32/32768/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 269942, 'null_percent': 100.0}
   44               ArrayRangeEqualsString/32768/1   13.196b items/sec   35.124b items/sec   166.178             {'run_name': 'ArrayRangeEqualsString/32768/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 281694, 'null_percent': 100.0}
   35              ArrayRangeEqualsFloat32/32768/1   12.898b items/sec   34.159b items/sec   164.846            {'run_name': 'ArrayRangeEqualsFloat32/32768/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 277754, 'null_percent': 100.0}
   53               ArrayRangeEqualsStruct/32768/1   13.618b items/sec   35.687b items/sec   162.059             {'run_name': 'ArrayRangeEqualsStruct/32768/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 291514, 'null_percent': 100.0}
   15  ArrayRangeEqualsFixedSizeBinary/32768/10000    4.166b items/sec   10.835b items/sec   160.089  {'run_name': 'ArrayRangeEqualsFixedSizeBinary/32768/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 89321, 'null_percent': 0.01}
   16            ArrayRangeEqualsInt32/32768/10000    6.886b items/sec   15.942b items/sec   131.514           {'run_name': 'ArrayRangeEqualsInt32/32768/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 146029, 'null_percent': 0.01}
   19               ArrayRangeEqualsStruct/32768/0  748.134m items/sec    1.686b items/sec   125.426                {'run_name': 'ArrayRangeEqualsStruct/32768/0', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 16023, 'null_percent': 0.0}
   32      ArrayRangeEqualsFixedSizeBinary/32768/2  196.551m items/sec  375.563m items/sec    91.077       {'run_name': 'ArrayRangeEqualsFixedSizeBinary/32768/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 4153, 'null_percent': 50.0}
   49               ArrayRangeEqualsString/32768/2  139.017m items/sec  239.038m items/sec    71.948                {'run_name': 'ArrayRangeEqualsString/32768/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 2953, 'null_percent': 50.0}
   31          ArrayRangeEqualsFloat32/32768/10000    4.639b items/sec    7.537b items/sec    62.473          {'run_name': 'ArrayRangeEqualsFloat32/32768/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 96758, 'null_percent': 0.01}
   39          ArrayRangeEqualsListOfInt32/32768/2   12.135m items/sec   19.630m items/sec    61.761            {'run_name': 'ArrayRangeEqualsListOfInt32/32768/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 259, 'null_percent': 50.0}
   2                ArrayRangeEqualsStruct/32768/2   21.952m items/sec   34.669m items/sec    57.935                 {'run_name': 'ArrayRangeEqualsStruct/32768/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 465, 'null_percent': 50.0}
   1                ArrayRangeEqualsInt32/32768/10  678.038m items/sec    1.057b items/sec    55.821               {'run_name': 'ArrayRangeEqualsInt32/32768/10', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 14391, 'null_percent': 10.0}
   12              ArrayRangeEqualsFloat32/32768/2  271.242m items/sec  394.340m items/sec    45.383               {'run_name': 'ArrayRangeEqualsFloat32/32768/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 5633, 'null_percent': 50.0}
   24                ArrayRangeEqualsInt32/32768/2  274.259m items/sec  393.021m items/sec    43.303                 {'run_name': 'ArrayRangeEqualsInt32/32768/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 5764, 'null_percent': 50.0}
   50              ArrayRangeEqualsBoolean/32768/2  221.275m items/sec  307.534m items/sec    38.983               {'run_name': 'ArrayRangeEqualsBoolean/32768/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 4708, 'null_percent': 50.0}
   48           ArrayRangeEqualsString/32768/10000    1.330b items/sec    1.816b items/sec    36.476           {'run_name': 'ArrayRangeEqualsString/32768/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 28580, 'null_percent': 0.01}
   8           ArrayRangeEqualsSparseUnion/32768/0   34.852m items/sec   45.628m items/sec    30.919             {'run_name': 'ArrayRangeEqualsSparseUnion/32768/0', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 737, 'null_percent': 0.0}
   23           ArrayRangeEqualsDenseUnion/32768/0   36.416m items/sec   45.582m items/sec    25.171              {'run_name': 'ArrayRangeEqualsDenseUnion/32768/0', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 773, 'null_percent': 0.0}
   9           ArrayRangeEqualsSparseUnion/32768/1   22.389m items/sec   27.076m items/sec    20.932           {'run_name': 'ArrayRangeEqualsSparseUnion/32768/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 478, 'null_percent': 100.0}
   14             ArrayRangeEqualsFloat32/32768/10  569.970m items/sec  660.566m items/sec    15.895             {'run_name': 'ArrayRangeEqualsFloat32/32768/10', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 11970, 'null_percent': 10.0}
   11           ArrayRangeEqualsDenseUnion/32768/1   23.336m items/sec   25.912m items/sec    11.037            {'run_name': 'ArrayRangeEqualsDenseUnion/32768/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 496, 'null_percent': 100.0}
   0               ArrayRangeEqualsFloat32/32768/0   10.265b items/sec   11.049b items/sec     7.641              {'run_name': 'ArrayRangeEqualsFloat32/32768/0', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 218383, 'null_percent': 0.0}
   43          ArrayRangeEqualsListOfInt32/32768/0  790.628m items/sec  838.173m items/sec     6.014           {'run_name': 'ArrayRangeEqualsListOfInt32/32768/0', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 17131, 'null_percent': 0.0}
   27              ArrayRangeEqualsBoolean/32768/0   48.625b items/sec   51.207b items/sec     5.309             {'run_name': 'ArrayRangeEqualsBoolean/32768/0', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 1034591, 'null_percent': 0.0}
   6                ArrayRangeEqualsString/32768/0    1.847b items/sec    1.942b items/sec     5.149                {'run_name': 'ArrayRangeEqualsString/32768/0', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 39452, 'null_percent': 0.0}
   10      ArrayRangeEqualsFixedSizeBinary/32768/0   15.947b items/sec   16.438b items/sec     3.075      {'run_name': 'ArrayRangeEqualsFixedSizeBinary/32768/0', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 337925, 'null_percent': 0.0}
   30          ArrayRangeEqualsSparseUnion/32768/2   20.058m items/sec   20.362m items/sec     1.514            {'run_name': 'ArrayRangeEqualsSparseUnion/32768/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 427, 'null_percent': 50.0}
   3          ArrayRangeEqualsSparseUnion/32768/10   20.014m items/sec   20.036m items/sec     0.109           {'run_name': 'ArrayRangeEqualsSparseUnion/32768/10', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 425, 'null_percent': 10.0}
   7         ArrayRangeEqualsSparseUnion/32768/100   20.000m items/sec   19.925m items/sec    -0.373           {'run_name': 'ArrayRangeEqualsSparseUnion/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 424, 'null_percent': 1.0}
   25      ArrayRangeEqualsSparseUnion/32768/10000   19.848m items/sec   19.658m items/sec    -0.958        {'run_name': 'ArrayRangeEqualsSparseUnion/32768/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 422, 'null_percent': 0.01}
   37                ArrayRangeEqualsInt32/32768/0   31.782b items/sec   31.411b items/sec    -1.168                {'run_name': 'ArrayRangeEqualsInt32/32768/0', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 665046, 'null_percent': 0.0}
   4              ArrayRangeEqualsBoolean/32768/10  512.687m items/sec  496.923m items/sec    -3.075             {'run_name': 'ArrayRangeEqualsBoolean/32768/10', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 10930, 'null_percent': 10.0}
   18         ArrayRangeEqualsDenseUnion/32768/100   20.641m items/sec   19.655m items/sec    -4.777            {'run_name': 'ArrayRangeEqualsDenseUnion/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 440, 'null_percent': 1.0}
   52       ArrayRangeEqualsDenseUnion/32768/10000   20.665m items/sec   19.609m items/sec    -5.113         {'run_name': 'ArrayRangeEqualsDenseUnion/32768/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 442, 'null_percent': 0.01}
   41          ArrayRangeEqualsDenseUnion/32768/10   20.551m items/sec   19.388m items/sec    -5.658            {'run_name': 'ArrayRangeEqualsDenseUnion/32768/10', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 441, 'null_percent': 10.0}
   34           ArrayRangeEqualsDenseUnion/32768/2   20.645m items/sec   19.374m items/sec    -6.156             {'run_name': 'ArrayRangeEqualsDenseUnion/32768/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 440, 'null_percent': 50.0}
   ```
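
For context on what these benchmarks exercise: the PR description says the new reader yields runs of set bits, so that reset bits (e.g. null bits) need no handling. Below is a minimal sketch of that idea, not the PR's implementation. It assumes least-significant-bit-first order within each byte, as in Arrow validity bitmaps, and it scans one bit at a time for clarity rather than speed. The names `Run`, `GetBit` and `CollectSetBitRuns` are hypothetical and are not part of Arrow's API.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical type for illustration only; not Arrow's actual run struct.
struct Run {
  int64_t position;  // index of the first set bit in the run
  int64_t length;    // number of consecutive set bits
};

// Returns true if bit i is set, assuming LSB-first order within each byte.
inline bool GetBit(const uint8_t* bitmap, int64_t i) {
  return ((bitmap[i / 8] >> (i % 8)) & 1) != 0;
}

// Collects every maximal run of set bits in bitmap[0, length).
// Naive bit-at-a-time scan; an optimized reader would work on whole words.
std::vector<Run> CollectSetBitRuns(const uint8_t* bitmap, int64_t length) {
  assert(length >= 0);
  std::vector<Run> runs;
  int64_t i = 0;
  while (i < length) {
    // Skip reset bits: the caller never sees them.
    while (i < length && !GetBit(bitmap, i)) ++i;
    const int64_t start = i;
    // Consume the run of set bits that follows.
    while (i < length && GetBit(bitmap, i)) ++i;
    if (i > start) runs.push_back({start, i - start});
  }
  return runs;
}
```

A caller would then loop over the returned runs and process `values[position]` through `values[position + length - 1]` in a tight inner loop with no per-element null check, which is presumably where the speedups on sparsely-null inputs come from.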
   





[GitHub] [arrow] pitrou commented on pull request #8770: ARROW-10696: [C++] Add SetBitRunReader

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #8770:
URL: https://github.com/apache/arrow/pull/8770#issuecomment-734223439


   Aggregation benchmarks:
   ```
                                 benchmark          baseline        contender   change %                                                                                                                                                                      counters
   108         ModeKernelBoolean/1048576/0    57.487 MiB/sec   37.406 GiB/sec  66530.741            {'run_name': 'ModeKernelBoolean/1048576/0', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 41, 'null_percent': 0.0}
   93          ModeKernelBoolean/1048576/2    28.239 MiB/sec    1.863 GiB/sec   6655.773           {'run_name': 'ModeKernelBoolean/1048576/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 20, 'null_percent': 50.0}
   56         ModeKernelBoolean/1048576/10    43.613 MiB/sec    1.799 GiB/sec   4123.421          {'run_name': 'ModeKernelBoolean/1048576/10', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 31, 'null_percent': 10.0}
   29        ModeKernelBoolean/1048576/100    58.888 MiB/sec    1.826 GiB/sec   3075.803          {'run_name': 'ModeKernelBoolean/1048576/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 39, 'null_percent': 1.0}
   124     ModeKernelBoolean/1048576/10000    61.189 MiB/sec    1.864 GiB/sec   3018.637       {'run_name': 'ModeKernelBoolean/1048576/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 43, 'null_percent': 0.01}
   35             ModeKernelInt8/1048576/0   792.467 MiB/sec    2.473 GiB/sec    219.603             {'run_name': 'ModeKernelInt8/1048576/0', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 1000, 'null_percent': 0.0}
   36        ModeKernelInt16/1048576/10000     1.922 GiB/sec    4.217 GiB/sec    119.409       {'run_name': 'ModeKernelInt16/1048576/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 2138, 'null_percent': 0.01}
   100    VarianceKernelDouble/1048576/100     2.445 GiB/sec    4.960 GiB/sec    102.888     {'run_name': 'VarianceKernelDouble/1048576/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 1746, 'null_percent': 1.0}
   76            ModeKernelInt32/1048576/0     4.050 GiB/sec    8.075 GiB/sec     99.390            {'run_name': 'ModeKernelInt32/1048576/0', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 4419, 'null_percent': 0.0}
   109     VarianceKernelFloat/1048576/100     1.216 GiB/sec    2.387 GiB/sec     96.341       {'run_name': 'VarianceKernelFloat/1048576/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 871, 'null_percent': 1.0}
   20        ModeKernelInt64/1048576/10000     5.547 GiB/sec    9.831 GiB/sec     77.229       {'run_name': 'ModeKernelInt64/1048576/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 6475, 'null_percent': 0.01}
   54      VarianceKernelInt64/1048576/100     3.511 GiB/sec    5.748 GiB/sec     63.693      {'run_name': 'VarianceKernelInt64/1048576/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 2519, 'null_percent': 1.0}
   28          ModeKernelInt16/1048576/100     1.292 GiB/sec    2.077 GiB/sec     60.697           {'run_name': 'ModeKernelInt16/1048576/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 927, 'null_percent': 1.0}
   94          ModeKernelInt32/1048576/100     2.530 GiB/sec    3.969 GiB/sec     56.872          {'run_name': 'ModeKernelInt32/1048576/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 1805, 'null_percent': 1.0}
   67             ModeKernelInt8/1048576/2   237.264 MiB/sec  362.684 MiB/sec     52.861             {'run_name': 'ModeKernelInt8/1048576/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 166, 'null_percent': 50.0}
   11        VarianceKernelFloat/1048576/2   508.083 MiB/sec  727.822 MiB/sec     43.249        {'run_name': 'VarianceKernelFloat/1048576/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 355, 'null_percent': 50.0}
   65       VarianceKernelDouble/1048576/2     1.030 GiB/sec    1.457 GiB/sec     41.439       {'run_name': 'VarianceKernelDouble/1048576/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 734, 'null_percent': 50.0}
   84            ModeKernelInt64/1048576/2     1.003 GiB/sec    1.409 GiB/sec     40.411            {'run_name': 'ModeKernelInt64/1048576/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 713, 'null_percent': 50.0}
   22            ModeKernelInt16/1048576/2   253.082 MiB/sec  355.311 MiB/sec     40.394            {'run_name': 'ModeKernelInt16/1048576/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 177, 'null_percent': 50.0}
   15        VarianceKernelInt32/1048576/2  1011.938 MiB/sec    1.387 GiB/sec     40.305        {'run_name': 'VarianceKernelInt32/1048576/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 701, 'null_percent': 50.0}
   98            ModeKernelInt32/1048576/2   502.640 MiB/sec  702.642 MiB/sec     39.790            {'run_name': 'ModeKernelInt32/1048576/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 341, 'null_percent': 50.0}
   61        VarianceKernelInt64/1048576/2     1.024 GiB/sec    1.422 GiB/sec     38.871        {'run_name': 'VarianceKernelInt64/1048576/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 726, 'null_percent': 50.0}
   81       VarianceKernelFloat/1048576/10   854.282 MiB/sec    1.153 GiB/sec     38.189       {'run_name': 'VarianceKernelFloat/1048576/10', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 598, 'null_percent': 10.0}
   34          ModeKernelInt64/1048576/100     4.766 GiB/sec    6.547 GiB/sec     37.389          {'run_name': 'ModeKernelInt64/1048576/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 3416, 'null_percent': 1.0}
   2       VarianceKernelDouble/1048576/10     1.670 GiB/sec    2.290 GiB/sec     37.083     {'run_name': 'VarianceKernelDouble/1048576/10', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 1195, 'null_percent': 10.0}
   111     VarianceKernelInt32/1048576/100     4.159 GiB/sec    5.506 GiB/sec     32.395      {'run_name': 'VarianceKernelInt32/1048576/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 2978, 'null_percent': 1.0}
   30           ModeKernelInt8/1048576/100     1.276 GiB/sec    1.634 GiB/sec     28.042            {'run_name': 'ModeKernelInt8/1048576/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 919, 'null_percent': 1.0}
   73    VarianceKernelInt32/1048576/10000     6.548 GiB/sec    7.310 GiB/sec     11.634   {'run_name': 'VarianceKernelInt32/1048576/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 4687, 'null_percent': 0.01}
   1         VarianceKernelInt32/1048576/0     7.168 GiB/sec    7.952 GiB/sec     10.944        {'run_name': 'VarianceKernelInt32/1048576/0', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 5116, 'null_percent': 0.0}
   [...]
   63        ModeKernelInt32/1048576/10000     6.489 GiB/sec    3.047 GiB/sec    -53.040       {'run_name': 'ModeKernelInt32/1048576/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 4653, 'null_percent': 0.01}
   45         ModeKernelInt8/1048576/10000     2.262 GiB/sec    1.002 GiB/sec    -55.719        {'run_name': 'ModeKernelInt8/1048576/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 1617, 'null_percent': 0.01}
   ```
   





[GitHub] [arrow] pitrou commented on pull request #8770: ARROW-10696: [C++] Add SetBitRunReader

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #8770:
URL: https://github.com/apache/arrow/pull/8770#issuecomment-733919445


   Parquet benchmarks:
   ```
   199                                     BM_PlainDecodingSpacedFloat/32768/100       5.506 GiB/sec      18.466 GiB/sec   235.349                {'run_name': 'BM_PlainDecodingSpacedFloat/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 31377, 'null_percent': 1.0}
   177                                 BM_PlainEncodingSpacedBoolean/32768/10000      11.377 GiB/sec      33.462 GiB/sec   194.106         {'run_name': 'BM_PlainEncodingSpacedBoolean/32768/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 259382, 'null_percent': 100.0}
   179                                   BM_PlainDecodingSpacedBoolean/32768/100       1.358 GiB/sec       3.572 GiB/sec   163.120              {'run_name': 'BM_PlainDecodingSpacedBoolean/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 31133, 'null_percent': 1.0}
   282                                     BM_PlainEncodingSpacedFloat/32768/100       6.276 GiB/sec      15.605 GiB/sec   148.621                {'run_name': 'BM_PlainEncodingSpacedFloat/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 35719, 'null_percent': 1.0}
   152                                  BM_ArrowBinaryDict/EncodeLowLevel/262144      82.896 MiB/sec     203.718 MiB/sec   145.751                                     {'run_name': 'BM_ArrowBinaryDict/EncodeLowLevel/262144', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 38}
   298                                  BM_PlainEncodingSpacedDouble/32768/10000     115.446 GiB/sec     262.819 GiB/sec   127.654          {'run_name': 'BM_PlainEncodingSpacedDouble/32768/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 330346, 'null_percent': 100.0}
   247                                    BM_PlainDecodingSpacedDouble/32768/100       9.953 GiB/sec      22.653 GiB/sec   127.594               {'run_name': 'BM_PlainDecodingSpacedDouble/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 28669, 'null_percent': 1.0}
   207                                   BM_PlainEncodingSpacedFloat/32768/10000      57.732 GiB/sec     130.097 GiB/sec   125.348           {'run_name': 'BM_PlainEncodingSpacedFloat/32768/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 330727, 'null_percent': 100.0}
   203                                       BM_PlainDecodingSpacedFloat/32768/1      21.382 GiB/sec      36.628 GiB/sec    71.298                {'run_name': 'BM_PlainDecodingSpacedFloat/32768/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 122797, 'null_percent': 0.01}
   231                                    BM_PlainEncodingSpacedDouble/32768/100      10.467 GiB/sec      17.029 GiB/sec    62.697               {'run_name': 'BM_PlainEncodingSpacedDouble/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 30042, 'null_percent': 1.0}
   256                                    BM_PlainEncodingSpacedFloat/32768/5000    1008.863 MiB/sec       1.592 GiB/sec    61.538               {'run_name': 'BM_PlainEncodingSpacedFloat/32768/5000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 5619, 'null_percent': 50.0}
   273                                   BM_PlainEncodingSpacedDouble/32768/5000       2.022 GiB/sec       3.201 GiB/sec    58.281              {'run_name': 'BM_PlainEncodingSpacedDouble/32768/5000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 5712, 'null_percent': 50.0}
   162                                    BM_PlainDecodingSpacedFloat/32768/5000     981.563 MiB/sec       1.516 GiB/sec    58.168               {'run_name': 'BM_PlainDecodingSpacedFloat/32768/5000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 5473, 'null_percent': 50.0}
   173                                   BM_PlainDecodingSpacedDouble/32768/5000       1.891 GiB/sec       2.991 GiB/sec    58.157              {'run_name': 'BM_PlainDecodingSpacedDouble/32768/5000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 5401, 'null_percent': 50.0}
   238                                  BM_PlainDecodingSpacedBoolean/32768/5000     250.841 MiB/sec     384.756 MiB/sec    53.386             {'run_name': 'BM_PlainDecodingSpacedBoolean/32768/5000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 5585, 'null_percent': 50.0}
   264                                  BM_PlainEncodingSpacedBoolean/32768/5000     234.661 MiB/sec     353.009 MiB/sec    50.433             {'run_name': 'BM_PlainEncodingSpacedBoolean/32768/5000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 5211, 'null_percent': 50.0}
   286                                    BM_PlainDecodingSpacedFloat/32768/1000       2.531 GiB/sec       3.739 GiB/sec    47.699              {'run_name': 'BM_PlainDecodingSpacedFloat/32768/1000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 14280, 'null_percent': 10.0}
   229                                    BM_PlainEncodingSpacedFloat/32768/1000       2.680 GiB/sec       3.783 GiB/sec    41.159              {'run_name': 'BM_PlainEncodingSpacedFloat/32768/1000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 15210, 'null_percent': 10.0}
   251                                      BM_PlainDecodingSpacedDouble/32768/1      25.016 GiB/sec      34.929 GiB/sec    39.627                {'run_name': 'BM_PlainDecodingSpacedDouble/32768/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 71628, 'null_percent': 0.01}
   201                                   BM_PlainDecodingSpacedDouble/32768/1000       4.987 GiB/sec       6.893 GiB/sec    38.224             {'run_name': 'BM_PlainDecodingSpacedDouble/32768/1000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 14031, 'null_percent': 10.0}
   244                                   BM_PlainEncodingSpacedBoolean/32768/100     664.198 MiB/sec     913.312 MiB/sec    37.506              {'run_name': 'BM_PlainEncodingSpacedBoolean/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 14952, 'null_percent': 1.0}
   239                                   BM_PlainEncodingSpacedDouble/32768/1000       5.174 GiB/sec       7.071 GiB/sec    36.652             {'run_name': 'BM_PlainEncodingSpacedDouble/32768/1000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 14708, 'null_percent': 10.0}
   230                                  BM_PlainDecodingSpacedBoolean/32768/1000     678.559 MiB/sec     857.465 MiB/sec    26.366            {'run_name': 'BM_PlainDecodingSpacedBoolean/32768/1000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 15333, 'null_percent': 10.0}
   102       BM_WriteInt64Column<Repetition::OPTIONAL, Compression::LZ4>/1048576     332.343 MiB/sec     409.541 MiB/sec    23.228         {'run_name': 'BM_WriteInt64Column<Repetition::OPTIONAL, Compression::LZ4>/1048576', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 109}
   185                                       BM_PlainEncodingSpacedFloat/32768/1      17.571 GiB/sec      21.578 GiB/sec    22.805                {'run_name': 'BM_PlainEncodingSpacedFloat/32768/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 100736, 'null_percent': 0.01}
   176                                  BM_PlainEncodingSpacedBoolean/32768/1000     450.133 MiB/sec     545.271 MiB/sec    21.136             {'run_name': 'BM_PlainEncodingSpacedBoolean/32768/1000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 9948, 'null_percent': 10.0}
   187                                     BM_PlainDecodingSpacedBoolean/32768/1       4.496 GiB/sec       5.314 GiB/sec    18.200              {'run_name': 'BM_PlainDecodingSpacedBoolean/32768/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 103077, 'null_percent': 0.01}
   25                                            BM_WriteColumn<false,Int64Type>       1.057 GiB/sec       1.220 GiB/sec    15.454                                              {'run_name': 'BM_WriteColumn<false,Int64Type>', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 10}
   39                                       BM_ReadColumn<true,BooleanType>/5/10     250.138 MiB/sec     283.639 MiB/sec    13.393                                         {'run_name': 'BM_ReadColumn<true,BooleanType>/5/10', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 18}
   75        BM_WriteInt64Column<Repetition::REQUIRED, Compression::LZ4>/1048576       1.084 GiB/sec       1.216 GiB/sec    12.132         {'run_name': 'BM_WriteInt64Column<Repetition::REQUIRED, Compression::LZ4>/1048576', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 443}
   115                                                    BM_RleEncoding/32768/1     724.153 MiB/sec     810.490 MiB/sec    11.923                                                     {'run_name': 'BM_RleEncoding/32768/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 8141}
   92                                                      BM_RleEncoding/1024/1     706.002 MiB/sec     788.645 MiB/sec    11.706                                                    {'run_name': 'BM_RleEncoding/1024/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 232581}
   106                         BM_WriteInt64Column<Repetition::REPEATED>/1048576     218.593 MiB/sec     243.729 MiB/sec    11.499                            {'run_name': 'BM_WriteInt64Column<Repetition::REPEATED>/1048576', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 77}
   109                                                    BM_RleEncoding/65536/1     729.356 MiB/sec     811.227 MiB/sec    11.225                                                     {'run_name': 'BM_RleEncoding/65536/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 4076}
   89                                                      BM_RleEncoding/4096/1     725.079 MiB/sec     804.486 MiB/sec    10.952                                                     {'run_name': 'BM_RleEncoding/4096/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 64655}
   7                                        BM_ReadColumn<false,Int32Type>/-1/10       1.811 GiB/sec       2.008 GiB/sec    10.851                                         {'run_name': 'BM_ReadColumn<false,Int32Type>/-1/10', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 32}
   58                                        BM_ReadColumn<false,Int32Type>/-1/1       5.426 GiB/sec       5.999 GiB/sec    10.568                                          {'run_name': 'BM_ReadColumn<false,Int32Type>/-1/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 89}
   87     BM_ReadInt64Column<Repetition::REQUIRED, Compression::ZSTD>/65536/1024      12.156 GiB/sec      11.951 GiB/sec    -1.682    {'run_name': 'BM_ReadInt64Column<Repetition::REQUIRED, Compression::ZSTD>/65536/1024', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 69528}
   [...]
   204                                             BM_PlainEncodingBoolean/65536     802.645 MiB/sec     707.974 MiB/sec   -11.795                                              {'run_name': 'BM_PlainEncodingBoolean/65536', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 8937}
   132                                                    BM_RleEncoding/65536/8     482.792 MiB/sec     424.143 MiB/sec   -12.148                                                     {'run_name': 'BM_RleEncoding/65536/8', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 2684}
   15                                            BM_WriteColumn<true,DoubleType>     622.954 MiB/sec     498.484 MiB/sec   -19.981                                               {'run_name': 'BM_WriteColumn<true,DoubleType>', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 6}
   16                                             BM_WriteColumn<true,Int64Type>     668.120 MiB/sec     526.837 MiB/sec   -21.146                                                {'run_name': 'BM_WriteColumn<true,Int64Type>', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 6}
   34                                             BM_WriteColumn<true,Int32Type>     362.327 MiB/sec     256.361 MiB/sec   -29.246                                                {'run_name': 'BM_WriteColumn<true,Int32Type>', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 6}
   27                                           BM_WriteColumn<true,BooleanType>     101.732 MiB/sec      69.602 MiB/sec   -31.583                                              {'run_name': 'BM_WriteColumn<true,BooleanType>', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 7}
   60                                       BM_ReadColumn<true,BooleanType>/-1/1     480.372 MiB/sec     219.376 MiB/sec   -54.332                                         {'run_name': 'BM_ReadColumn<true,BooleanType>/-1/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 32}
   289                                 BM_ArrowBinaryDict/EncodeLowLevel/1048576     316.208 MiB/sec     131.127 MiB/sec   -58.531                                    {'run_name': 'BM_ArrowBinaryDict/EncodeLowLevel/1048576', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 37}
   ```





[GitHub] [arrow] pitrou commented on a change in pull request #8770: ARROW-10696: [C++] Add SetBitRunReader

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #8770:
URL: https://github.com/apache/arrow/pull/8770#discussion_r533955521



##########
File path: cpp/src/arrow/util/bit_run_reader.h
##########
@@ -166,7 +167,350 @@ class ARROW_EXPORT BitRunReader {
 using BitRunReader = BitRunReaderLinear;
 #endif
 
-// TODO SetBitRunReader?
+struct SetBitRun {
+  int64_t position;
+  int64_t length;
+
+  bool AtEnd() const { return length == 0; }
+
+  std::string ToString() const {
+    return std::string("{pos=") + std::to_string(position) +
+           ", len=" + std::to_string(length) + "}";
+  }
+
+  bool operator==(const SetBitRun& other) const {
+    return position == other.position && length == other.length;
+  }
+  bool operator!=(const SetBitRun& other) const {
+    return position != other.position || length != other.length;
+  }
+};
+
+template <bool Reverse>
+class BaseSetBitRunReader {
+ public:
+  /// \brief Constructs new SetBitRunReader.
+  ///
+  /// \param[in] bitmap source data
+  /// \param[in] start_offset bit offset into the source data
+  /// \param[in] length number of bits to copy
+  inline BaseSetBitRunReader(const uint8_t* bitmap, int64_t start_offset, int64_t length);
+
+  SetBitRun NextRun() {
+    int64_t pos = 0;
+    int64_t len = 0;
+    if (current_num_bits_) {
+      const auto run = FindCurrentRun();
+      assert(remaining_ >= 0);
+      if (run.length && current_num_bits_) {
+        // The run ends in current_word_
+        return AdjustRun(run);
+      }
+      pos = run.position;
+      len = run.length;
+    }
+    if (!len) {
+      // We didn't get any ones in current_word_, so we can skip any zeros
+      // in the following words
+      SkipNextZeros();
+      if (remaining_ == 0) {
+        return {0, 0};
+      }
+      assert(current_num_bits_);
+      pos = position();
+    } else if (!current_num_bits_) {
+      if (ARROW_PREDICT_TRUE(remaining_ >= 64)) {
+        current_word_ = LoadFullWord();
+        current_num_bits_ = 64;
+      } else if (remaining_ > 0) {
+        current_word_ = LoadPartialWord(/*bit_offset=*/0, remaining_);
+        current_num_bits_ = static_cast<int32_t>(remaining_);
+      } else {
+        // No bits remaining, perhaps we found a run?
+        return AdjustRun({pos, len});
+      }
+      // If current word starts with a zero, we got a full run
+      if (!(current_word_ & kFirstBit)) {
+        return AdjustRun({pos, len});
+      }
+    }
+    // Current word should now start with a set bit
+    len += CountNextOnes();
+    return AdjustRun({pos, len});
+  }
+
+ protected:
+  int64_t position() const {
+    if (Reverse) {
+      return remaining_;
+    } else {
+      return length_ - remaining_;
+    }
+  }
+
+  SetBitRun AdjustRun(SetBitRun run) {
+    if (Reverse) {
+      assert(run.position >= run.length);
+      run.position -= run.length;
+    }
+    return run;
+  }
+
+  uint64_t LoadFullWord() {
+    uint64_t word;
+    if (Reverse) {
+      bitmap_ -= 8;
+    }
+    memcpy(&word, bitmap_, 8);
+    if (!Reverse) {
+      bitmap_ += 8;
+    }
+    return BitUtil::ToLittleEndian(word);
+  }
+
+  uint64_t LoadPartialWord(int8_t bit_offset, int64_t num_bits) {
+    assert(num_bits > 0);
+    uint64_t word = 0;
+    const int64_t num_bytes = BitUtil::BytesForBits(num_bits);
+    if (Reverse) {
+      // Read in the most significant bytes of the word
+      bitmap_ -= num_bytes;
+      memcpy(reinterpret_cast<char*>(&word) + 8 - num_bytes, bitmap_, num_bytes);
+      // XXX MostSignificantBitmask
+      return (BitUtil::ToLittleEndian(word) << bit_offset) &
+             ~BitUtil::LeastSignificantBitMask(64 - num_bits);
+    } else {
+      memcpy(&word, bitmap_, num_bytes);
+      bitmap_ += num_bytes;
+      return (BitUtil::ToLittleEndian(word) >> bit_offset) &
+             BitUtil::LeastSignificantBitMask(num_bits);
+    }
+  }
+
+  void SkipNextZeros() {
+    assert(current_num_bits_ == 0);
+    while (ARROW_PREDICT_TRUE(remaining_ >= 64)) {
+      current_word_ = LoadFullWord();
+      const auto num_zeros = CountFirstZeros(current_word_);
+      if (num_zeros < 64) {
+        current_word_ = ConsumeBits(current_word_, num_zeros);
+        current_num_bits_ = 64 - num_zeros;
+        remaining_ -= num_zeros;
+        assert(remaining_ >= 0);
+        assert(current_num_bits_ >= 0);
+        return;
+      }
+      remaining_ -= 64;
+    }
+    if (remaining_ > 0) {
+      current_word_ = LoadPartialWord(/*bit_offset=*/0, remaining_);
+      current_num_bits_ = static_cast<int32_t>(remaining_);
+      const auto num_zeros =
+          std::min<int32_t>(current_num_bits_, CountFirstZeros(current_word_));
+      current_word_ = ConsumeBits(current_word_, num_zeros);
+      current_num_bits_ -= num_zeros;
+      remaining_ -= num_zeros;
+      assert(remaining_ >= 0);
+      assert(current_num_bits_ >= 0);
+    }
+  }
+
+  int64_t CountNextOnes() {
+    assert(current_word_ & kFirstBit);
+
+    int64_t len;
+    if (~current_word_) {
+      const auto num_ones = CountFirstZeros(~current_word_);
+      assert(num_ones <= current_num_bits_);
+      assert(num_ones <= remaining_);
+      remaining_ -= num_ones;
+      current_word_ = ConsumeBits(current_word_, num_ones);
+      current_num_bits_ -= num_ones;
+      if (current_num_bits_) {
+        // There are pending zeros in current_word_
+        return num_ones;
+      }
+      len = num_ones;
+    } else {
+      // current_word_ is all ones
+      remaining_ -= 64;
+      current_num_bits_ = 0;
+      len = 64;
+    }
+
+    while (ARROW_PREDICT_TRUE(remaining_ >= 64)) {
+      current_word_ = LoadFullWord();
+      const auto num_ones = CountFirstZeros(~current_word_);
+      len += num_ones;
+      remaining_ -= num_ones;
+      if (num_ones < 64) {

Review comment:
       Well, depending on the distribution of input bits, this may be commonly true (same for the other comment above).
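
To make the point about input distribution concrete, here is a minimal standalone sketch (not code from this PR). It counts how many 64-bit words of a synthetic bitmap would take a `num_ones < 64` style branch. `CountFirstOnes`, `MakePeriodicNulls` and `CountWordsEndingRun` are hypothetical helpers introduced only for illustration, and the periodic null pattern is an assumption; real data may behave differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of consecutive set bits starting at the least significant bit.
inline int CountFirstOnes(uint64_t word) {
  int n = 0;
  while (n < 64 && (word & 1)) {
    ++n;
    word >>= 1;
  }
  return n;
}

// Hypothetical helper: an all-ones bitmap of `num_words` 64-bit words in
// which every `period`-th bit is cleared (standing in for a null).
inline std::vector<uint64_t> MakePeriodicNulls(int64_t num_words, int64_t period) {
  assert(num_words >= 0 && period > 0);
  std::vector<uint64_t> words(static_cast<std::size_t>(num_words), ~uint64_t{0});
  for (int64_t bit = period - 1; bit < num_words * 64; bit += period) {
    words[static_cast<std::size_t>(bit / 64)] &= ~(uint64_t{1} << (bit % 64));
  }
  return words;
}

// Number of words in which a run of ones starting at bit 0 of the word
// ends inside that same word, i.e. the words for which `num_ones < 64`.
inline int64_t CountWordsEndingRun(const std::vector<uint64_t>& words) {
  int64_t hits = 0;
  for (uint64_t word : words) {
    if (CountFirstOnes(word) < 64) {
      ++hits;
    }
  }
  return hits;
}
```

With one cleared bit out of every 2, every word takes the branch; with one cleared bit out of every 10000, only a handful of words do. A static prediction hint would therefore be right for one of these inputs and wrong for the other.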







[GitHub] [arrow] bkietz commented on a change in pull request #8770: ARROW-10696: [C++] Add SetBitRunReader

Posted by GitBox <gi...@apache.org>.
bkietz commented on a change in pull request #8770:
URL: https://github.com/apache/arrow/pull/8770#discussion_r533640298



##########
File path: cpp/src/arrow/util/bit_util_test.cc
##########
@@ -66,6 +66,15 @@ using internal::InvertBitmap;
 
 using ::testing::ElementsAreArray;
 
+namespace internal {
+void PrintTo(const BitRun& run, std::ostream* os) {
+  *os << run.ToString();  // whatever needed to print bar to os
+}
+void PrintTo(const SetBitRun& run, std::ostream* os) {
+  *os << run.ToString();  // whatever needed to print bar to os
+}

Review comment:
       ```suggestion
   void PrintTo(const BitRun& run, std::ostream* os) { *os << run.ToString(); }
   void PrintTo(const SetBitRun& run, std::ostream* os) { *os << run.ToString(); }
   ```

##########
File path: cpp/src/arrow/util/bit_run_reader.h
##########
@@ -166,7 +167,350 @@ class ARROW_EXPORT BitRunReader {
 using BitRunReader = BitRunReaderLinear;
 #endif
 
-// TODO SetBitRunReader?
+struct SetBitRun {
+  int64_t position;
+  int64_t length;
+
+  bool AtEnd() const { return length == 0; }
+
+  std::string ToString() const {
+    return std::string("{pos=") + std::to_string(position) +
+           ", len=" + std::to_string(length) + "}";
+  }
+
+  bool operator==(const SetBitRun& other) const {
+    return position == other.position && length == other.length;
+  }
+  bool operator!=(const SetBitRun& other) const {
+    return position != other.position || length != other.length;
+  }
+};
+
+template <bool Reverse>
+class BaseSetBitRunReader {
+ public:
+  /// \brief Constructs new SetBitRunReader.
+  ///
+  /// \param[in] bitmap source data
+  /// \param[in] start_offset bit offset into the source data
+  /// \param[in] length number of bits to copy
+  inline BaseSetBitRunReader(const uint8_t* bitmap, int64_t start_offset, int64_t length);
+
+  SetBitRun NextRun() {
+    int64_t pos = 0;
+    int64_t len = 0;
+    if (current_num_bits_) {
+      const auto run = FindCurrentRun();
+      assert(remaining_ >= 0);
+      if (run.length && current_num_bits_) {
+        // The run ends in current_word_
+        return AdjustRun(run);
+      }
+      pos = run.position;
+      len = run.length;
+    }
+    if (!len) {
+      // We didn't get any ones in current_word_, so we can skip any zeros
+      // in the following words
+      SkipNextZeros();
+      if (remaining_ == 0) {
+        return {0, 0};
+      }
+      assert(current_num_bits_);
+      pos = position();
+    } else if (!current_num_bits_) {
+      if (ARROW_PREDICT_TRUE(remaining_ >= 64)) {
+        current_word_ = LoadFullWord();
+        current_num_bits_ = 64;
+      } else if (remaining_ > 0) {
+        current_word_ = LoadPartialWord(/*bit_offset=*/0, remaining_);
+        current_num_bits_ = static_cast<int32_t>(remaining_);
+      } else {
+        // No bits remaining, perhaps we found a run?
+        return AdjustRun({pos, len});
+      }
+      // If current word starts with a zero, we got a full run
+      if (!(current_word_ & kFirstBit)) {
+        return AdjustRun({pos, len});
+      }
+    }
+    // Current word should now start with a set bit
+    len += CountNextOnes();
+    return AdjustRun({pos, len});
+  }
+
+ protected:
+  int64_t position() const {
+    if (Reverse) {
+      return remaining_;
+    } else {
+      return length_ - remaining_;
+    }
+  }
+
+  SetBitRun AdjustRun(SetBitRun run) {
+    if (Reverse) {
+      assert(run.position >= run.length);
+      run.position -= run.length;
+    }
+    return run;
+  }
+
+  uint64_t LoadFullWord() {
+    uint64_t word;
+    if (Reverse) {
+      bitmap_ -= 8;
+    }
+    memcpy(&word, bitmap_, 8);
+    if (!Reverse) {
+      bitmap_ += 8;
+    }
+    return BitUtil::ToLittleEndian(word);
+  }
+
+  uint64_t LoadPartialWord(int8_t bit_offset, int64_t num_bits) {
+    assert(num_bits > 0);
+    uint64_t word = 0;
+    const int64_t num_bytes = BitUtil::BytesForBits(num_bits);
+    if (Reverse) {
+      // Read in the most significant bytes of the word
+      bitmap_ -= num_bytes;
+      memcpy(reinterpret_cast<char*>(&word) + 8 - num_bytes, bitmap_, num_bytes);
+      // XXX MostSignificantBitmask
+      return (BitUtil::ToLittleEndian(word) << bit_offset) &
+             ~BitUtil::LeastSignificantBitMask(64 - num_bits);
+    } else {
+      memcpy(&word, bitmap_, num_bytes);
+      bitmap_ += num_bytes;
+      return (BitUtil::ToLittleEndian(word) >> bit_offset) &
+             BitUtil::LeastSignificantBitMask(num_bits);
+    }
+  }
+
+  void SkipNextZeros() {
+    assert(current_num_bits_ == 0);
+    while (ARROW_PREDICT_TRUE(remaining_ >= 64)) {
+      current_word_ = LoadFullWord();
+      const auto num_zeros = CountFirstZeros(current_word_);
+      if (num_zeros < 64) {

Review comment:
       ```suggestion
         if (ARROW_PREDICT_FALSE(num_zeros < 64)) {
   ```
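
For readers unfamiliar with this kind of hint, the following is a rough standalone sketch of what such a macro typically does. It assumes the hint maps to `__builtin_expect` on GCC and Clang and to a no-op elsewhere; `SKETCH_PREDICT_FALSE`, `CountFirstZerosSketch` and `FindFirstSetBit` are hypothetical names, not Arrow's definitions. The hint only influences how the compiler lays out the branch, not the result.

```cpp
#include <cstdint>
#include <vector>

// Assumed expansion of a "predict false" hint; a no-op on other compilers.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_PREDICT_FALSE(x) (__builtin_expect(!!(x), 0))
#else
#define SKETCH_PREDICT_FALSE(x) (x)
#endif

// Number of zero bits before the first set bit, scanning from the least
// significant bit. Returns 64 for an all-zero word.
inline int CountFirstZerosSketch(uint64_t word) {
  int n = 0;
  while (n < 64 && !(word & 1)) {
    ++n;
    word >>= 1;
  }
  return n;
}

// Position of the first set bit across `words`, or the total number of
// bits if no bit is set. The hinted branch is the one that leaves the loop.
inline int64_t FindFirstSetBit(const std::vector<uint64_t>& words) {
  int64_t skipped = 0;
  for (uint64_t word : words) {
    const int num_zeros = CountFirstZerosSketch(word);
    if (SKETCH_PREDICT_FALSE(num_zeros < 64)) {
      return skipped + num_zeros;
    }
    skipped += 64;
  }
  return skipped;
}
```

The hint is a good bet when long stretches of zero words are expected, and a poor one when most words contain at least one set bit.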

##########
File path: cpp/src/arrow/compute/kernels/vector_selection.cc
##########
@@ -589,37 +590,9 @@ class PrimitiveFilterImpl {
 
   void ExecNonNull() {
     // Fast filter when values and filter are not null
-    // Bit counters used for both null_selection behaviors
-    BitBlockCounter filter_counter(filter_data_, filter_offset_, values_length_);
-
-    int64_t in_position = 0;
-    BitBlockCount current_block = filter_counter.NextWord();
-    while (in_position < values_length_) {
-      if (current_block.AllSet()) {
-        int64_t run_length = 0;
-        // If we've found a all-true block, then we scan forward until we find
-        // a block that has some false values (or we reach the end
-        while (current_block.length > 0 && current_block.AllSet()) {
-          run_length += current_block.length;
-          current_block = filter_counter.NextWord();
-        }
-        WriteValueSegment(in_position, run_length);
-        in_position += run_length;
-      } else if (current_block.NoneSet()) {
-        // Nothing selected
-        in_position += current_block.length;
-        current_block = filter_counter.NextWord();
-      } else {
-        // Some values selected
-        for (int64_t i = 0; i < current_block.length; ++i) {
-          if (BitUtil::GetBit(filter_data_, filter_offset_ + in_position)) {
-            WriteValue(in_position);
-          }
-          ++in_position;
-        }
-        current_block = filter_counter.NextWord();
-      }
-    }
+    ::arrow::internal::VisitSetBitRunsVoid(
+        filter_data_, filter_offset_, values_length_,
+        [&](int64_t position, int64_t length) { WriteValueSegment(position, length); });

Review comment:
       :rocket: 

##########
File path: cpp/src/arrow/util/bit_run_reader.h
##########
@@ -166,7 +167,350 @@ class ARROW_EXPORT BitRunReader {
 using BitRunReader = BitRunReaderLinear;
 #endif
 
-// TODO SetBitRunReader?
+struct SetBitRun {
+  int64_t position;
+  int64_t length;
+
+  bool AtEnd() const { return length == 0; }
+
+  std::string ToString() const {
+    return std::string("{pos=") + std::to_string(position) +
+           ", len=" + std::to_string(length) + "}";
+  }
+
+  bool operator==(const SetBitRun& other) const {
+    return position == other.position && length == other.length;
+  }
+  bool operator!=(const SetBitRun& other) const {
+    return position != other.position || length != other.length;
+  }
+};
+
+template <bool Reverse>
+class BaseSetBitRunReader {
+ public:
+  /// \brief Constructs new SetBitRunReader.
+  ///
+  /// \param[in] bitmap source data
+  /// \param[in] start_offset bit offset into the source data
+  /// \param[in] length number of bits to copy
+  inline BaseSetBitRunReader(const uint8_t* bitmap, int64_t start_offset, int64_t length);
+
+  SetBitRun NextRun() {
+    int64_t pos = 0;
+    int64_t len = 0;
+    if (current_num_bits_) {
+      const auto run = FindCurrentRun();
+      assert(remaining_ >= 0);
+      if (run.length && current_num_bits_) {
+        // The run ends in current_word_
+        return AdjustRun(run);
+      }
+      pos = run.position;
+      len = run.length;
+    }
+    if (!len) {
+      // We didn't get any ones in current_word_, so we can skip any zeros
+      // in the following words
+      SkipNextZeros();
+      if (remaining_ == 0) {
+        return {0, 0};
+      }
+      assert(current_num_bits_);
+      pos = position();
+    } else if (!current_num_bits_) {
+      if (ARROW_PREDICT_TRUE(remaining_ >= 64)) {
+        current_word_ = LoadFullWord();
+        current_num_bits_ = 64;
+      } else if (remaining_ > 0) {
+        current_word_ = LoadPartialWord(/*bit_offset=*/0, remaining_);
+        current_num_bits_ = static_cast<int32_t>(remaining_);
+      } else {
+        // No bits remaining, perhaps we found a run?
+        return AdjustRun({pos, len});
+      }
+      // If current word starts with a zero, we got a full run
+      if (!(current_word_ & kFirstBit)) {
+        return AdjustRun({pos, len});
+      }
+    }
+    // Current word should now start with a set bit
+    len += CountNextOnes();
+    return AdjustRun({pos, len});
+  }
+
+ protected:
+  int64_t position() const {
+    if (Reverse) {
+      return remaining_;
+    } else {
+      return length_ - remaining_;
+    }
+  }
+
+  SetBitRun AdjustRun(SetBitRun run) {
+    if (Reverse) {
+      assert(run.position >= run.length);
+      run.position -= run.length;
+    }
+    return run;
+  }
+
+  uint64_t LoadFullWord() {
+    uint64_t word;
+    if (Reverse) {
+      bitmap_ -= 8;
+    }
+    memcpy(&word, bitmap_, 8);
+    if (!Reverse) {
+      bitmap_ += 8;
+    }
+    return BitUtil::ToLittleEndian(word);
+  }
+
+  uint64_t LoadPartialWord(int8_t bit_offset, int64_t num_bits) {
+    assert(num_bits > 0);
+    uint64_t word = 0;
+    const int64_t num_bytes = BitUtil::BytesForBits(num_bits);
+    if (Reverse) {
+      // Read in the most significant bytes of the word
+      bitmap_ -= num_bytes;
+      memcpy(reinterpret_cast<char*>(&word) + 8 - num_bytes, bitmap_, num_bytes);
+      // XXX MostSignificantBitmask
+      return (BitUtil::ToLittleEndian(word) << bit_offset) &
+             ~BitUtil::LeastSignificantBitMask(64 - num_bits);
+    } else {
+      memcpy(&word, bitmap_, num_bytes);
+      bitmap_ += num_bytes;
+      return (BitUtil::ToLittleEndian(word) >> bit_offset) &
+             BitUtil::LeastSignificantBitMask(num_bits);
+    }
+  }
+
+  void SkipNextZeros() {
+    assert(current_num_bits_ == 0);
+    while (ARROW_PREDICT_TRUE(remaining_ >= 64)) {
+      current_word_ = LoadFullWord();
+      const auto num_zeros = CountFirstZeros(current_word_);
+      if (num_zeros < 64) {

Review comment:
       IMHO, this is much easier to read as `if (num_zeros >= 64) { remaining_ -= 64; continue; } // rest`
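A rough sketch of the early-`continue` shape proposed here, written as a standalone function over full 64-bit words only. The names `CountZerosFromLsb` and `SkipZerosSketch` are hypothetical, and the partial-word tail handled by the real `SkipNextZeros` is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Counts zero bits starting from the least significant bit.
// Returns 64 for an all-zero word.
inline int CountZerosFromLsb(uint64_t word) {
  int n = 0;
  while (n < 64 && ((word >> n) & 1) == 0) ++n;
  return n;
}

// Returns the bit position of the first set bit, or the total number of
// bits if no bit is set.
inline int64_t SkipZerosSketch(const std::vector<uint64_t>& words) {
  const int64_t length = static_cast<int64_t>(words.size()) * 64;
  int64_t remaining = length;
  std::size_t i = 0;
  while (remaining >= 64) {
    const int num_zeros = CountZerosFromLsb(words[i++]);
    if (num_zeros >= 64) {
      // Whole word is zero: consume it and keep scanning
      remaining -= 64;
      continue;
    }
    // The word contains a set bit: consume only the leading zeros
    remaining -= num_zeros;
    assert(remaining >= 0);
    return length - remaining;
  }
  return length - remaining;
}
```

The behaviour is the same as with the nested `if`; the difference is purely that the common all-zero case is dispatched first and the remainder of the loop body is no longer indented.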

##########
File path: cpp/src/arrow/util/bit_run_reader.h
##########
@@ -166,7 +167,350 @@ class ARROW_EXPORT BitRunReader {
 using BitRunReader = BitRunReaderLinear;
 #endif
 
-// TODO SetBitRunReader?
+struct SetBitRun {
+  int64_t position;
+  int64_t length;
+
+  bool AtEnd() const { return length == 0; }
+
+  std::string ToString() const {
+    return std::string("{pos=") + std::to_string(position) +
+           ", len=" + std::to_string(length) + "}";
+  }
+
+  bool operator==(const SetBitRun& other) const {
+    return position == other.position && length == other.length;
+  }
+  bool operator!=(const SetBitRun& other) const {
+    return position != other.position || length != other.length;
+  }
+};
+
+template <bool Reverse>
+class BaseSetBitRunReader {
+ public:
+  /// \brief Constructs new SetBitRunReader.
+  ///
+  /// \param[in] bitmap source data
+  /// \param[in] start_offset bit offset into the source data
+  /// \param[in] length number of bits to copy
+  inline BaseSetBitRunReader(const uint8_t* bitmap, int64_t start_offset, int64_t length);
+
+  SetBitRun NextRun() {
+    int64_t pos = 0;
+    int64_t len = 0;
+    if (current_num_bits_) {
+      const auto run = FindCurrentRun();
+      assert(remaining_ >= 0);
+      if (run.length && current_num_bits_) {
+        // The run ends in current_word_
+        return AdjustRun(run);
+      }
+      pos = run.position;
+      len = run.length;
+    }
+    if (!len) {
+      // We didn't get any ones in current_word_, so we can skip any zeros
+      // in the following words
+      SkipNextZeros();
+      if (remaining_ == 0) {
+        return {0, 0};
+      }
+      assert(current_num_bits_);
+      pos = position();
+    } else if (!current_num_bits_) {
+      if (ARROW_PREDICT_TRUE(remaining_ >= 64)) {
+        current_word_ = LoadFullWord();
+        current_num_bits_ = 64;
+      } else if (remaining_ > 0) {
+        current_word_ = LoadPartialWord(/*bit_offset=*/0, remaining_);
+        current_num_bits_ = static_cast<int32_t>(remaining_);
+      } else {
+        // No bits remaining, perhaps we found a run?
+        return AdjustRun({pos, len});
+      }
+      // If current word starts with a zero, we got a full run
+      if (!(current_word_ & kFirstBit)) {
+        return AdjustRun({pos, len});
+      }
+    }
+    // Current word should now start with a set bit
+    len += CountNextOnes();
+    return AdjustRun({pos, len});
+  }
+
+ protected:
+  int64_t position() const {
+    if (Reverse) {
+      return remaining_;
+    } else {
+      return length_ - remaining_;
+    }
+  }
+
+  SetBitRun AdjustRun(SetBitRun run) {
+    if (Reverse) {
+      assert(run.position >= run.length);
+      run.position -= run.length;
+    }
+    return run;
+  }
+
+  uint64_t LoadFullWord() {
+    uint64_t word;
+    if (Reverse) {
+      bitmap_ -= 8;
+    }
+    memcpy(&word, bitmap_, 8);
+    if (!Reverse) {
+      bitmap_ += 8;
+    }
+    return BitUtil::ToLittleEndian(word);
+  }
+
+  uint64_t LoadPartialWord(int8_t bit_offset, int64_t num_bits) {
+    assert(num_bits > 0);
+    uint64_t word = 0;
+    const int64_t num_bytes = BitUtil::BytesForBits(num_bits);
+    if (Reverse) {
+      // Read in the most significant bytes of the word
+      bitmap_ -= num_bytes;
+      memcpy(reinterpret_cast<char*>(&word) + 8 - num_bytes, bitmap_, num_bytes);
+      // XXX MostSignificantBitmask
+      return (BitUtil::ToLittleEndian(word) << bit_offset) &
+             ~BitUtil::LeastSignificantBitMask(64 - num_bits);
+    } else {
+      memcpy(&word, bitmap_, num_bytes);
+      bitmap_ += num_bytes;
+      return (BitUtil::ToLittleEndian(word) >> bit_offset) &
+             BitUtil::LeastSignificantBitMask(num_bits);
+    }
+  }
+
+  void SkipNextZeros() {
+    assert(current_num_bits_ == 0);
+    while (ARROW_PREDICT_TRUE(remaining_ >= 64)) {
+      current_word_ = LoadFullWord();
+      const auto num_zeros = CountFirstZeros(current_word_);
+      if (num_zeros < 64) {
+        current_word_ = ConsumeBits(current_word_, num_zeros);
+        current_num_bits_ = 64 - num_zeros;
+        remaining_ -= num_zeros;
+        assert(remaining_ >= 0);
+        assert(current_num_bits_ >= 0);
+        return;
+      }
+      remaining_ -= 64;
+    }
+    if (remaining_ > 0) {
+      current_word_ = LoadPartialWord(/*bit_offset=*/0, remaining_);
+      current_num_bits_ = static_cast<int32_t>(remaining_);
+      const auto num_zeros =
+          std::min<int32_t>(current_num_bits_, CountFirstZeros(current_word_));
+      current_word_ = ConsumeBits(current_word_, num_zeros);
+      current_num_bits_ -= num_zeros;
+      remaining_ -= num_zeros;
+      assert(remaining_ >= 0);
+      assert(current_num_bits_ >= 0);
+    }
+  }
+
+  int64_t CountNextOnes() {
+    assert(current_word_ & kFirstBit);
+
+    int64_t len;
+    if (~current_word_) {
+      const auto num_ones = CountFirstZeros(~current_word_);
+      assert(num_ones <= current_num_bits_);
+      assert(num_ones <= remaining_);
+      remaining_ -= num_ones;
+      current_word_ = ConsumeBits(current_word_, num_ones);
+      current_num_bits_ -= num_ones;
+      if (current_num_bits_) {
+        // There are pending zeros in current_word_
+        return num_ones;
+      }
+      len = num_ones;
+    } else {
+      // current_word_ is all ones
+      remaining_ -= 64;
+      current_num_bits_ = 0;
+      len = 64;
+    }
+
+    while (ARROW_PREDICT_TRUE(remaining_ >= 64)) {
+      current_word_ = LoadFullWord();
+      const auto num_ones = CountFirstZeros(~current_word_);
+      len += num_ones;
+      remaining_ -= num_ones;
+      if (num_ones < 64) {

Review comment:
       ```suggestion
         if (ARROW_PREDICT_FALSE(num_ones < 64)) {
   ```
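The loop above counts ones by calling `CountFirstZeros` on the inverted word. A minimal sketch of that trick follows, assuming the forward (least-significant-bit-first) case and using a portable loop in place of a hardware count-trailing-zeros instruction. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Counts zero bits starting from the least significant bit.
// Returns 64 for an all-zero word.
inline int TrailingZerosSketch(uint64_t word) {
  int n = 0;
  while (n < 64 && ((word >> n) & 1) == 0) ++n;
  return n;
}

// Counts one bits starting from the least significant bit: the run of
// ones in `word` is exactly the run of zeros in `~word`.
inline int TrailingOnesSketch(uint64_t word) {
  const int n = TrailingZerosSketch(~word);
  assert(n >= 0 && n <= 64);
  return n;
}
```

An all-ones word inverts to zero, so the helper returns 64 and the run continues into the next word, which is why `num_ones < 64` is the exit condition being hinted as unlikely.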







[GitHub] [arrow] wesm commented on a change in pull request #8770: ARROW-10696: [C++] Add SetBitRunReader

Posted by GitBox <gi...@apache.org>.
wesm commented on a change in pull request #8770:
URL: https://github.com/apache/arrow/pull/8770#discussion_r532736857



##########
File path: cpp/src/arrow/compute/kernels/vector_selection.cc
##########
@@ -589,37 +590,9 @@ class PrimitiveFilterImpl {
 
   void ExecNonNull() {
     // Fast filter when values and filter are not null
-    // Bit counters used for both null_selection behaviors
-    BitBlockCounter filter_counter(filter_data_, filter_offset_, values_length_);
-
-    int64_t in_position = 0;
-    BitBlockCount current_block = filter_counter.NextWord();
-    while (in_position < values_length_) {
-      if (current_block.AllSet()) {
-        int64_t run_length = 0;
-        // If we've found a all-true block, then we scan forward until we find
-        // a block that has some false values (or we reach the end
-        while (current_block.length > 0 && current_block.AllSet()) {
-          run_length += current_block.length;
-          current_block = filter_counter.NextWord();
-        }
-        WriteValueSegment(in_position, run_length);
-        in_position += run_length;
-      } else if (current_block.NoneSet()) {
-        // Nothing selected
-        in_position += current_block.length;
-        current_block = filter_counter.NextWord();
-      } else {
-        // Some values selected
-        for (int64_t i = 0; i < current_block.length; ++i) {
-          if (BitUtil::GetBit(filter_data_, filter_offset_ + in_position)) {
-            WriteValue(in_position);
-          }
-          ++in_position;
-        }
-        current_block = filter_counter.NextWord();
-      }
-    }
+    ::arrow::internal::VisitSetBitRunsVoid(
+        filter_data_, filter_offset_, values_length_,
+        [&](int64_t position, int64_t length) { WriteValueSegment(position, length); });

Review comment:
       Very nice
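For context, the callback contract used by the replacement above (one call per run of set bits, with a position and a length) can be illustrated with a bit-at-a-time reference. This is only a sketch with hypothetical names: it is not the word-based reader from the PR, and it assumes least-significant-bit-first bitmaps and positions reported relative to the start offset.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Reads bit i of an LSB-first bitmap.
inline bool GetBitSketch(const uint8_t* bitmap, int64_t i) {
  return (bitmap[i / 8] >> (i % 8)) & 1;
}

// Calls visit(position, length) once per maximal run of set bits in
// [offset, offset + length). Positions are relative to offset.
template <typename Visitor>
void VisitSetBitRunsSketch(const uint8_t* bitmap, int64_t offset, int64_t length,
                           Visitor&& visit) {
  assert(offset >= 0 && length >= 0);
  int64_t i = 0;
  while (i < length) {
    // Skip clear bits
    while (i < length && !GetBitSketch(bitmap, offset + i)) ++i;
    const int64_t start = i;
    // Extend the run over set bits
    while (i < length && GetBitSketch(bitmap, offset + i)) ++i;
    if (i > start) visit(start, i - start);
  }
}

// Convenience wrapper that collects (position, length) pairs.
inline std::vector<std::pair<int64_t, int64_t>> CollectRunsSketch(
    const std::vector<uint8_t>& bitmap, int64_t offset, int64_t length) {
  std::vector<std::pair<int64_t, int64_t>> runs;
  VisitSetBitRunsSketch(bitmap.data(), offset, length,
                        [&](int64_t pos, int64_t len) { runs.emplace_back(pos, len); });
  return runs;
}
```

With a filter bitmap as input, each reported run maps directly onto one contiguous copy of selected values, which is what lets the three-way block dispatch in the removed code collapse into a single lambda.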







[GitHub] [arrow] pitrou edited a comment on pull request #8770: ARROW-10696: [C++] Add SetBitRunReader

Posted by GitBox <gi...@apache.org>.
pitrou edited a comment on pull request #8770:
URL: https://github.com/apache/arrow/pull/8770#issuecomment-733919445


   Parquet benchmarks:
   ```
                                                                       benchmark            baseline           contender  change %                                                                                                                                                                                       counters
   199                                     BM_PlainDecodingSpacedFloat/32768/100       5.506 GiB/sec      18.466 GiB/sec   235.349                {'run_name': 'BM_PlainDecodingSpacedFloat/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 31377, 'null_percent': 1.0}
   177                                 BM_PlainEncodingSpacedBoolean/32768/10000      11.377 GiB/sec      33.462 GiB/sec   194.106         {'run_name': 'BM_PlainEncodingSpacedBoolean/32768/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 259382, 'null_percent': 100.0}
   179                                   BM_PlainDecodingSpacedBoolean/32768/100       1.358 GiB/sec       3.572 GiB/sec   163.120              {'run_name': 'BM_PlainDecodingSpacedBoolean/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 31133, 'null_percent': 1.0}
   282                                     BM_PlainEncodingSpacedFloat/32768/100       6.276 GiB/sec      15.605 GiB/sec   148.621                {'run_name': 'BM_PlainEncodingSpacedFloat/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 35719, 'null_percent': 1.0}
   152                                  BM_ArrowBinaryDict/EncodeLowLevel/262144      82.896 MiB/sec     203.718 MiB/sec   145.751                                     {'run_name': 'BM_ArrowBinaryDict/EncodeLowLevel/262144', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 38}
   298                                  BM_PlainEncodingSpacedDouble/32768/10000     115.446 GiB/sec     262.819 GiB/sec   127.654          {'run_name': 'BM_PlainEncodingSpacedDouble/32768/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 330346, 'null_percent': 100.0}
   247                                    BM_PlainDecodingSpacedDouble/32768/100       9.953 GiB/sec      22.653 GiB/sec   127.594               {'run_name': 'BM_PlainDecodingSpacedDouble/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 28669, 'null_percent': 1.0}
   207                                   BM_PlainEncodingSpacedFloat/32768/10000      57.732 GiB/sec     130.097 GiB/sec   125.348           {'run_name': 'BM_PlainEncodingSpacedFloat/32768/10000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 330727, 'null_percent': 100.0}
   203                                       BM_PlainDecodingSpacedFloat/32768/1      21.382 GiB/sec      36.628 GiB/sec    71.298                {'run_name': 'BM_PlainDecodingSpacedFloat/32768/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 122797, 'null_percent': 0.01}
   231                                    BM_PlainEncodingSpacedDouble/32768/100      10.467 GiB/sec      17.029 GiB/sec    62.697               {'run_name': 'BM_PlainEncodingSpacedDouble/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 30042, 'null_percent': 1.0}
   256                                    BM_PlainEncodingSpacedFloat/32768/5000    1008.863 MiB/sec       1.592 GiB/sec    61.538               {'run_name': 'BM_PlainEncodingSpacedFloat/32768/5000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 5619, 'null_percent': 50.0}
   273                                   BM_PlainEncodingSpacedDouble/32768/5000       2.022 GiB/sec       3.201 GiB/sec    58.281              {'run_name': 'BM_PlainEncodingSpacedDouble/32768/5000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 5712, 'null_percent': 50.0}
   162                                    BM_PlainDecodingSpacedFloat/32768/5000     981.563 MiB/sec       1.516 GiB/sec    58.168               {'run_name': 'BM_PlainDecodingSpacedFloat/32768/5000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 5473, 'null_percent': 50.0}
   173                                   BM_PlainDecodingSpacedDouble/32768/5000       1.891 GiB/sec       2.991 GiB/sec    58.157              {'run_name': 'BM_PlainDecodingSpacedDouble/32768/5000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 5401, 'null_percent': 50.0}
   238                                  BM_PlainDecodingSpacedBoolean/32768/5000     250.841 MiB/sec     384.756 MiB/sec    53.386             {'run_name': 'BM_PlainDecodingSpacedBoolean/32768/5000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 5585, 'null_percent': 50.0}
   264                                  BM_PlainEncodingSpacedBoolean/32768/5000     234.661 MiB/sec     353.009 MiB/sec    50.433             {'run_name': 'BM_PlainEncodingSpacedBoolean/32768/5000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 5211, 'null_percent': 50.0}
   286                                    BM_PlainDecodingSpacedFloat/32768/1000       2.531 GiB/sec       3.739 GiB/sec    47.699              {'run_name': 'BM_PlainDecodingSpacedFloat/32768/1000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 14280, 'null_percent': 10.0}
   229                                    BM_PlainEncodingSpacedFloat/32768/1000       2.680 GiB/sec       3.783 GiB/sec    41.159              {'run_name': 'BM_PlainEncodingSpacedFloat/32768/1000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 15210, 'null_percent': 10.0}
   251                                      BM_PlainDecodingSpacedDouble/32768/1      25.016 GiB/sec      34.929 GiB/sec    39.627                {'run_name': 'BM_PlainDecodingSpacedDouble/32768/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 71628, 'null_percent': 0.01}
   201                                   BM_PlainDecodingSpacedDouble/32768/1000       4.987 GiB/sec       6.893 GiB/sec    38.224             {'run_name': 'BM_PlainDecodingSpacedDouble/32768/1000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 14031, 'null_percent': 10.0}
   244                                   BM_PlainEncodingSpacedBoolean/32768/100     664.198 MiB/sec     913.312 MiB/sec    37.506              {'run_name': 'BM_PlainEncodingSpacedBoolean/32768/100', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 14952, 'null_percent': 1.0}
   239                                   BM_PlainEncodingSpacedDouble/32768/1000       5.174 GiB/sec       7.071 GiB/sec    36.652             {'run_name': 'BM_PlainEncodingSpacedDouble/32768/1000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 14708, 'null_percent': 10.0}
   230                                  BM_PlainDecodingSpacedBoolean/32768/1000     678.559 MiB/sec     857.465 MiB/sec    26.366            {'run_name': 'BM_PlainDecodingSpacedBoolean/32768/1000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 15333, 'null_percent': 10.0}
   102       BM_WriteInt64Column<Repetition::OPTIONAL, Compression::LZ4>/1048576     332.343 MiB/sec     409.541 MiB/sec    23.228         {'run_name': 'BM_WriteInt64Column<Repetition::OPTIONAL, Compression::LZ4>/1048576', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 109}
   185                                       BM_PlainEncodingSpacedFloat/32768/1      17.571 GiB/sec      21.578 GiB/sec    22.805                {'run_name': 'BM_PlainEncodingSpacedFloat/32768/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 100736, 'null_percent': 0.01}
   176                                  BM_PlainEncodingSpacedBoolean/32768/1000     450.133 MiB/sec     545.271 MiB/sec    21.136             {'run_name': 'BM_PlainEncodingSpacedBoolean/32768/1000', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 9948, 'null_percent': 10.0}
   187                                     BM_PlainDecodingSpacedBoolean/32768/1       4.496 GiB/sec       5.314 GiB/sec    18.200              {'run_name': 'BM_PlainDecodingSpacedBoolean/32768/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 103077, 'null_percent': 0.01}
   25                                            BM_WriteColumn<false,Int64Type>       1.057 GiB/sec       1.220 GiB/sec    15.454                                              {'run_name': 'BM_WriteColumn<false,Int64Type>', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 10}
   39                                       BM_ReadColumn<true,BooleanType>/5/10     250.138 MiB/sec     283.639 MiB/sec    13.393                                         {'run_name': 'BM_ReadColumn<true,BooleanType>/5/10', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 18}
   75        BM_WriteInt64Column<Repetition::REQUIRED, Compression::LZ4>/1048576       1.084 GiB/sec       1.216 GiB/sec    12.132         {'run_name': 'BM_WriteInt64Column<Repetition::REQUIRED, Compression::LZ4>/1048576', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 443}
   115                                                    BM_RleEncoding/32768/1     724.153 MiB/sec     810.490 MiB/sec    11.923                                                     {'run_name': 'BM_RleEncoding/32768/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 8141}
   92                                                      BM_RleEncoding/1024/1     706.002 MiB/sec     788.645 MiB/sec    11.706                                                    {'run_name': 'BM_RleEncoding/1024/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 232581}
   106                         BM_WriteInt64Column<Repetition::REPEATED>/1048576     218.593 MiB/sec     243.729 MiB/sec    11.499                            {'run_name': 'BM_WriteInt64Column<Repetition::REPEATED>/1048576', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 77}
   109                                                    BM_RleEncoding/65536/1     729.356 MiB/sec     811.227 MiB/sec    11.225                                                     {'run_name': 'BM_RleEncoding/65536/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 4076}
   89                                                      BM_RleEncoding/4096/1     725.079 MiB/sec     804.486 MiB/sec    10.952                                                     {'run_name': 'BM_RleEncoding/4096/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 64655}
   7                                        BM_ReadColumn<false,Int32Type>/-1/10       1.811 GiB/sec       2.008 GiB/sec    10.851                                         {'run_name': 'BM_ReadColumn<false,Int32Type>/-1/10', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 32}
   58                                        BM_ReadColumn<false,Int32Type>/-1/1       5.426 GiB/sec       5.999 GiB/sec    10.568                                          {'run_name': 'BM_ReadColumn<false,Int32Type>/-1/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 89}
   87     BM_ReadInt64Column<Repetition::REQUIRED, Compression::ZSTD>/65536/1024      12.156 GiB/sec      11.951 GiB/sec    -1.682    {'run_name': 'BM_ReadInt64Column<Repetition::REQUIRED, Compression::ZSTD>/65536/1024', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 69528}
   [...]
   204                                             BM_PlainEncodingBoolean/65536     802.645 MiB/sec     707.974 MiB/sec   -11.795                                              {'run_name': 'BM_PlainEncodingBoolean/65536', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 8937}
   132                                                    BM_RleEncoding/65536/8     482.792 MiB/sec     424.143 MiB/sec   -12.148                                                     {'run_name': 'BM_RleEncoding/65536/8', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 2684}
   15                                            BM_WriteColumn<true,DoubleType>     622.954 MiB/sec     498.484 MiB/sec   -19.981                                               {'run_name': 'BM_WriteColumn<true,DoubleType>', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 6}
   16                                             BM_WriteColumn<true,Int64Type>     668.120 MiB/sec     526.837 MiB/sec   -21.146                                                {'run_name': 'BM_WriteColumn<true,Int64Type>', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 6}
   34                                             BM_WriteColumn<true,Int32Type>     362.327 MiB/sec     256.361 MiB/sec   -29.246                                                {'run_name': 'BM_WriteColumn<true,Int32Type>', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 6}
   27                                           BM_WriteColumn<true,BooleanType>     101.732 MiB/sec      69.602 MiB/sec   -31.583                                              {'run_name': 'BM_WriteColumn<true,BooleanType>', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 7}
   60                                       BM_ReadColumn<true,BooleanType>/-1/1     480.372 MiB/sec     219.376 MiB/sec   -54.332                                         {'run_name': 'BM_ReadColumn<true,BooleanType>/-1/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 32}
   289                                 BM_ArrowBinaryDict/EncodeLowLevel/1048576     316.208 MiB/sec     131.127 MiB/sec   -58.531                                    {'run_name': 'BM_ArrowBinaryDict/EncodeLowLevel/1048576', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 37}
   ```
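The `change %` column appears to be the relative throughput change, (contender - baseline) / baseline * 100, with GiB/sec converted to MiB/sec where the units differ. A small sketch for rechecking individual rows follows. The helper names are hypothetical, and results differ slightly from the table because the displayed throughputs are rounded.

```cpp
#include <cassert>

// Converts a GiB/sec figure to MiB/sec (1 GiB = 1024 MiB).
inline double GiBToMiB(double gib_per_sec) { return gib_per_sec * 1024.0; }

// Relative change in percent; positive when the contender is faster.
// Both arguments must be in the same unit.
inline double PercentChange(double baseline, double contender) {
  assert(baseline > 0.0);
  return (contender - baseline) / baseline * 100.0;
}
```

For example, `PercentChange(5.506, 18.466)` gives roughly 235.4 for `BM_PlainDecodingSpacedFloat/32768/100`, and `PercentChange(1008.863, GiBToMiB(1.592))` gives roughly 61.6 for `BM_PlainEncodingSpacedFloat/32768/5000`.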





[GitHub] [arrow] pitrou commented on pull request #8770: ARROW-10696: [C++] Add SetBitRunReader

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #8770:
URL: https://github.com/apache/arrow/pull/8770#issuecomment-733926480


   Also some speedups on no-null filters:
   * before:
   ```
   FilterInt64FilterNoNulls/524288/0          19993 ns        19988 ns        34897 bytes_per_second=24.4281G/s data null%=0 mask null%=0 select%=99.9 size=524.288k
   FilterInt64FilterNoNulls/524288/1         249210 ns       249153 ns         2805 bytes_per_second=1.95976G/s data null%=0 mask null%=0 select%=50 size=524.288k
   FilterInt64FilterNoNulls/524288/2          31896 ns        31889 ns        21867 bytes_per_second=15.3118G/s data null%=0 mask null%=0 select%=1 size=524.288k
   
   FilterStringFilterNoNulls/524288/0         54006 ns        53996 ns        12804 bytes_per_second=9.04295G/s data null%=0 mask null%=0 select%=99.9 size=524.288k
   FilterStringFilterNoNulls/524288/1        256494 ns       256438 ns         2721 bytes_per_second=1.90409G/s data null%=0 mask null%=0 select%=50 size=524.288k
   FilterStringFilterNoNulls/524288/2         16881 ns        16878 ns        40537 bytes_per_second=28.9302G/s data null%=0 mask null%=0 select%=1 size=524.288k
   ```
   
   * after:
   ```
   FilterInt64FilterNoNulls/524288/0          14416 ns        14413 ns        49758 bytes_per_second=33.8771G/s data null%=0 mask null%=0 select%=99.9 size=524.288k
   FilterInt64FilterNoNulls/524288/1         151899 ns       151870 ns         4514 bytes_per_second=3.21512G/s data null%=0 mask null%=0 select%=50 size=524.288k
   FilterInt64FilterNoNulls/524288/2           8578 ns         8577 ns        82823 bytes_per_second=56.931G/s data null%=0 mask null%=0 select%=1 size=524.288k
   
   FilterStringFilterNoNulls/524288/0         45116 ns        45104 ns        15887 bytes_per_second=10.8256G/s data null%=0 mask null%=0 select%=99.9 size=524.288k
   FilterStringFilterNoNulls/524288/1        146822 ns       146796 ns         4746 bytes_per_second=3.32625G/s data null%=0 mask null%=0 select%=50 size=524.288k
   FilterStringFilterNoNulls/524288/2          5438 ns         5437 ns       126135 bytes_per_second=89.8084G/s data null%=0 mask null%=0 select%=1 size=524.288k
   ```
   
   (couldn't run a diff because of ARROW-10738)
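
In the absence of an automated diff, the speedup can be read off the first timing column by hand. A minimal sketch with a hypothetical helper name, taking the before and after times in nanoseconds:

```cpp
#include <cassert>

// Speedup factor from two timings of the same benchmark: values above 1
// mean the "after" run is faster.
inline double SpeedupFactor(double before_ns, double after_ns) {
  assert(before_ns > 0.0 && after_ns > 0.0);
  return before_ns / after_ns;
}
```

Applied to the numbers above, `FilterInt64FilterNoNulls/524288/2` (select%=1) goes from 31896 ns to 8578 ns, about 3.7x, while the same benchmark at select%=99.9 goes from 19993 ns to 14416 ns, about 1.4x.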
   
   

