You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/09/01 09:32:37 UTC

[GitHub] [arrow] cyb70289 opened a new pull request #8091: ARROW-9873: [C++][Compute] Optimize mode kernel for integers in small…

cyb70289 opened a new pull request #8091:
URL: https://github.com/apache/arrow/pull/8091


   … value range
   
   For int16/32/64 arrays with reasonable length, scan the array to find
   min/max values first. If (max-min) is within some threshold, instead
   of general hashmap, using a value indexed array can improve performance
   significantly.
   
   To be compatible with chunked array, value count array is transferred to
   hashmap before merging with others. This is an overhead for short array.
   Finding min/max may also introduce performance penalty in some cases.
   
   Please note it's hard to benefit all use cases. By applying this patch:
   - about 2x performance uplift for integers in small value range
   - no obvious performance drop for normal cases
   - non-trivial performance drop in some cases
     * 40% drop for short int8 array (8k length)
     * 10% drop for sparse array (few distinct values, big value gap)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] cyb70289 removed a comment on pull request #8091: ARROW-9873: [C++][Compute] Optimize mode kernel for integers in small value range

Posted by GitBox <gi...@apache.org>.
cyb70289 removed a comment on pull request #8091:
URL: https://github.com/apache/arrow/pull/8091#issuecomment-684666816


   Tested on skylake (knight landing).
   Removed unrelated tests for clarity.
   100% null benchmark is not stable, can be ignored.
   
   ```
   $ archery benchmark diff --suite-filter="arrow-compute-aggregate-benchmark" --cc=clang-9 --cxx=clang++-9
   
                             benchmark         baseline        contender  change %
   1        ModeKernelInt16/1048576/10  122.148 MiB/sec  364.273 MiB/sec   198.222  {'null_percent': 10.0}
   24    ModeKernelInt32/1048576/10000  227.845 MiB/sec  676.265 MiB/sec   196.809  {'null_percent': 0.01}
   14      ModeKernelInt16/1048576/100  116.265 MiB/sec  342.747 MiB/sec   194.798  {'null_percent': 1.0}
   23      ModeKernelInt32/1048576/100  229.216 MiB/sec  669.534 MiB/sec   192.098  {'null_percent': 1.0}
   17       ModeKernelInt32/1048576/10  241.768 MiB/sec  700.500 MiB/sec   189.740  {'null_percent': 10.0}
   16    ModeKernelInt16/1048576/10000  115.484 MiB/sec  324.841 MiB/sec   181.287  {'null_percent': 0.01}
   8     ModeKernelInt64/1048576/10000  481.374 MiB/sec    1.308 GiB/sec   178.214  {'null_percent': 0.01}
   10      ModeKernelInt64/1048576/100  485.696 MiB/sec    1.313 GiB/sec   176.807  {'null_percent': 1.0}
   20       ModeKernelInt64/1048576/10  527.967 MiB/sec    1.383 GiB/sec   168.292  {'null_percent': 10.0}
   12        ModeKernelInt32/1048576/0  257.092 MiB/sec  687.083 MiB/sec   167.251  {'null_percent': 0.0}
   15        ModeKernelInt16/1048576/0  128.398 MiB/sec  342.670 MiB/sec   166.880  {'null_percent': 0.0}
   11        ModeKernelInt64/1048576/0  513.891 MiB/sec    1.325 GiB/sec   164.088  {'null_percent': 0.0}
   4         ModeKernelInt16/1048576/2  162.258 MiB/sec  384.148 MiB/sec   136.752  {'null_percent': 50.0}
   25        ModeKernelInt32/1048576/2  320.094 MiB/sec  650.958 MiB/sec   103.365  {'null_percent': 50.0}
   2         ModeKernelInt64/1048576/2  811.335 MiB/sec    1.403 GiB/sec    77.111  {'null_percent': 50.0}
   0         ModeKernelInt64/1048576/1    7.906 GiB/sec    7.888 GiB/sec    -0.230  {'null_percent': 100.0}
   9       ModeKernelBoolean/1048576/0   26.502 MiB/sec   26.148 MiB/sec    -1.337  {'null_percent': 0.0}
   19      ModeKernelBoolean/1048576/2   27.081 MiB/sec   26.518 MiB/sec    -2.078  {'null_percent': 50.0}
   18  ModeKernelBoolean/1048576/10000   26.126 MiB/sec   25.361 MiB/sec    -2.927  {'null_percent': 0.01}
   5      ModeKernelBoolean/1048576/10   27.987 MiB/sec   27.111 MiB/sec    -3.131  {'null_percent': 10.0}
   27    ModeKernelBoolean/1048576/100   25.729 MiB/sec   24.911 MiB/sec    -3.181  {'null_percent': 1.0}
   28      ModeKernelBoolean/1048576/1  123.582 MiB/sec  118.548 MiB/sec    -4.074  {'null_percent': 100.0}
   6      ModeKernelInt8/1048576/10000  248.084 MiB/sec  232.688 MiB/sec    -6.206  {'null_percent': 0.01}
   21       ModeKernelInt8/1048576/100  253.614 MiB/sec  230.012 MiB/sec    -9.306  {'null_percent': 1.0}
   26         ModeKernelInt8/1048576/0  270.337 MiB/sec  244.879 MiB/sec    -9.417  {'null_percent': 0.0}
   3         ModeKernelInt8/1048576/10  247.946 MiB/sec  223.306 MiB/sec    -9.938  {'null_percent': 10.0}
   13         ModeKernelInt8/1048576/2  198.628 MiB/sec  178.205 MiB/sec   -10.282  {'null_percent': 50.0}
   29         ModeKernelInt8/1048576/1  948.757 MiB/sec  828.408 MiB/sec   -12.685  {'null_percent': 100.0}
   22        ModeKernelInt16/1048576/1    2.852 GiB/sec    2.265 GiB/sec   -20.587  {'null_percent': 100.0}
   7         ModeKernelInt32/1048576/1    5.714 GiB/sec    3.430 GiB/sec   -39.976  {'null_percent': 100.0}
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou closed pull request #8091: ARROW-9873: [C++][Compute] Optimize mode kernel for integers in small value range

Posted by GitBox <gi...@apache.org>.
pitrou closed pull request #8091:
URL: https://github.com/apache/arrow/pull/8091


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #8091: ARROW-9873: [C++][Compute] Optimize mode kernel for integers in small…

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #8091:
URL: https://github.com/apache/arrow/pull/8091#issuecomment-684681790


   https://issues.apache.org/jira/browse/ARROW-9873


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #8091: ARROW-9873: [C++][Compute] Optimize mode kernel for integers in small value range

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #8091:
URL: https://github.com/apache/arrow/pull/8091#discussion_r481275863



##########
File path: cpp/src/arrow/compute/kernels/aggregate_mode.cc
##########
@@ -26,95 +26,187 @@ namespace aggregate {
 
 namespace {
 
+// Count values and return value:count map
+template <typename T>
+struct ValueCounter {
+  virtual void CountOne(T value) = 0;

Review comment:
       I don't think polymorphism is a good idea for performance. Instead, you should probably use templated code.
   Also, when `null_count == length`, the implementation can be trivial.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] cyb70289 commented on pull request #8091: ARROW-9873: [C++][Compute] Optimize mode kernel for integers in small…

Posted by GitBox <gi...@apache.org>.
cyb70289 commented on pull request #8091:
URL: https://github.com/apache/arrow/pull/8091#issuecomment-684666816


   Tested on skylake (knight landing).
   Removed unrelated tests for clarity.
   100% null benchmark is not stable, can be ignored.
   
   ```
   $ archery benchmark diff --suite-filter="arrow-compute-aggregate-benchmark" --cc=clang-9 --cxx=clang++-9
   
                             benchmark         baseline        contender  change %
   1        ModeKernelInt16/1048576/10  122.148 MiB/sec  364.273 MiB/sec   198.222  {'null_percent': 10.0}
   24    ModeKernelInt32/1048576/10000  227.845 MiB/sec  676.265 MiB/sec   196.809  {'null_percent': 0.01}
   14      ModeKernelInt16/1048576/100  116.265 MiB/sec  342.747 MiB/sec   194.798  {'null_percent': 1.0}
   23      ModeKernelInt32/1048576/100  229.216 MiB/sec  669.534 MiB/sec   192.098  {'null_percent': 1.0}
   17       ModeKernelInt32/1048576/10  241.768 MiB/sec  700.500 MiB/sec   189.740  {'null_percent': 10.0}
   16    ModeKernelInt16/1048576/10000  115.484 MiB/sec  324.841 MiB/sec   181.287  {'null_percent': 0.01}
   8     ModeKernelInt64/1048576/10000  481.374 MiB/sec    1.308 GiB/sec   178.214  {'null_percent': 0.01}
   10      ModeKernelInt64/1048576/100  485.696 MiB/sec    1.313 GiB/sec   176.807  {'null_percent': 1.0}
   20       ModeKernelInt64/1048576/10  527.967 MiB/sec    1.383 GiB/sec   168.292  {'null_percent': 10.0}
   12        ModeKernelInt32/1048576/0  257.092 MiB/sec  687.083 MiB/sec   167.251  {'null_percent': 0.0}
   15        ModeKernelInt16/1048576/0  128.398 MiB/sec  342.670 MiB/sec   166.880  {'null_percent': 0.0}
   11        ModeKernelInt64/1048576/0  513.891 MiB/sec    1.325 GiB/sec   164.088  {'null_percent': 0.0}
   4         ModeKernelInt16/1048576/2  162.258 MiB/sec  384.148 MiB/sec   136.752  {'null_percent': 50.0}
   25        ModeKernelInt32/1048576/2  320.094 MiB/sec  650.958 MiB/sec   103.365  {'null_percent': 50.0}
   2         ModeKernelInt64/1048576/2  811.335 MiB/sec    1.403 GiB/sec    77.111  {'null_percent': 50.0}
   0         ModeKernelInt64/1048576/1    7.906 GiB/sec    7.888 GiB/sec    -0.230  {'null_percent': 100.0}
   9       ModeKernelBoolean/1048576/0   26.502 MiB/sec   26.148 MiB/sec    -1.337  {'null_percent': 0.0}
   19      ModeKernelBoolean/1048576/2   27.081 MiB/sec   26.518 MiB/sec    -2.078  {'null_percent': 50.0}
   18  ModeKernelBoolean/1048576/10000   26.126 MiB/sec   25.361 MiB/sec    -2.927  {'null_percent': 0.01}
   5      ModeKernelBoolean/1048576/10   27.987 MiB/sec   27.111 MiB/sec    -3.131  {'null_percent': 10.0}
   27    ModeKernelBoolean/1048576/100   25.729 MiB/sec   24.911 MiB/sec    -3.181  {'null_percent': 1.0}
   28      ModeKernelBoolean/1048576/1  123.582 MiB/sec  118.548 MiB/sec    -4.074  {'null_percent': 100.0}
   6      ModeKernelInt8/1048576/10000  248.084 MiB/sec  232.688 MiB/sec    -6.206  {'null_percent': 0.01}
   21       ModeKernelInt8/1048576/100  253.614 MiB/sec  230.012 MiB/sec    -9.306  {'null_percent': 1.0}
   26         ModeKernelInt8/1048576/0  270.337 MiB/sec  244.879 MiB/sec    -9.417  {'null_percent': 0.0}
   3         ModeKernelInt8/1048576/10  247.946 MiB/sec  223.306 MiB/sec    -9.938  {'null_percent': 10.0}
   13         ModeKernelInt8/1048576/2  198.628 MiB/sec  178.205 MiB/sec   -10.282  {'null_percent': 50.0}
   29         ModeKernelInt8/1048576/1  948.757 MiB/sec  828.408 MiB/sec   -12.685  {'null_percent': 100.0}
   22        ModeKernelInt16/1048576/1    2.852 GiB/sec    2.265 GiB/sec   -20.587  {'null_percent': 100.0}
   7         ModeKernelInt32/1048576/1    5.714 GiB/sec    3.430 GiB/sec   -39.976  {'null_percent': 100.0}
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] cyb70289 commented on a change in pull request #8091: ARROW-9873: [C++][Compute] Optimize mode kernel for integers in small value range

Posted by GitBox <gi...@apache.org>.
cyb70289 commented on a change in pull request #8091:
URL: https://github.com/apache/arrow/pull/8091#discussion_r481789852



##########
File path: cpp/src/arrow/compute/kernels/aggregate_mode.cc
##########
@@ -26,95 +26,187 @@ namespace aggregate {
 
 namespace {
 
+// Count values and return value:count map
+template <typename T>
+struct ValueCounter {
+  virtual void CountOne(T value) = 0;

Review comment:
       Changed to template. Did get better performance. Thanks.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on pull request #8091: ARROW-9873: [C++][Compute] Optimize mode kernel for integers in small value range

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #8091:
URL: https://github.com/apache/arrow/pull/8091#issuecomment-685539220


   I get similar speedups on AMD Ryzen.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] cyb70289 commented on pull request #8091: ARROW-9873: [C++][Compute] Optimize mode kernel for integers in small value range

Posted by GitBox <gi...@apache.org>.
cyb70289 commented on pull request #8091:
URL: https://github.com/apache/arrow/pull/8091#issuecomment-685472515


   "RTools 35" CI failure is not related


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] cyb70289 commented on pull request #8091: ARROW-9873: [C++][Compute] Optimize mode kernel for integers in small value range

Posted by GitBox <gi...@apache.org>.
cyb70289 commented on pull request #8091:
URL: https://github.com/apache/arrow/pull/8091#issuecomment-685348952


   Latest benchmark result after re-implementation.
   
   ```
   # Tested on skylake (knight landing)
   $ archery benchmark diff --suite-filter="arrow-compute-aggregate-benchmark" --benchmark-filter="^Mode" --cc=clang-9 --cxx=clang++-9
   
                         benchmark         baseline        contender    change %
   // igonre these 100% cases, huge boost due to a simple trick, not very useful
       ModeKernelBoolean/1048576/1  123.216 MiB/sec  847.125 GiB/sec  703915.356 null_percent': 100.0
          ModeKernelInt8/1048576/1  896.330 MiB/sec  617.997 GiB/sec   70502.192 null_percent': 100.0
         ModeKernelInt16/1048576/1    2.886 GiB/sec  965.237 GiB/sec   33340.541 null_percent': 100.0
         ModeKernelInt32/1048576/1    5.732 GiB/sec  960.476 GiB/sec   16657.027 null_percent': 100.0
         ModeKernelInt64/1048576/1    7.925 GiB/sec  974.705 GiB/sec   12198.487 null_percent': 100.0
   
   // big improvement for int16/32/64 with limited value range 
         ModeKernelInt16/1048576/0  128.522 MiB/sec  495.771 MiB/sec     285.749  'null_percent': 0.0
         ModeKernelInt32/1048576/0  257.694 MiB/sec  953.232 MiB/sec     269.909  'null_percent': 0.0
         ModeKernelInt64/1048576/0  516.624 MiB/sec    1.715 GiB/sec     240.027  'null_percent': 0.0
     ModeKernelInt32/1048576/10000  227.404 MiB/sec  690.032 MiB/sec     203.439 'null_percent': 0.01
     ModeKernelInt16/1048576/10000  115.419 MiB/sec  349.055 MiB/sec     202.425 'null_percent': 0.01
       ModeKernelInt32/1048576/100  229.661 MiB/sec  684.149 MiB/sec     197.895  'null_percent': 1.0
       ModeKernelInt16/1048576/100  116.084 MiB/sec  342.620 MiB/sec     195.148  'null_percent': 1.0
     ModeKernelInt64/1048576/10000  481.409 MiB/sec    1.302 GiB/sec     176.913 'null_percent': 0.01
       ModeKernelInt64/1048576/100  486.266 MiB/sec    1.297 GiB/sec     173.114  'null_percent': 1.0
        ModeKernelInt16/1048576/10  121.865 MiB/sec  315.932 MiB/sec     159.247 'null_percent': 10.0
        ModeKernelInt32/1048576/10  242.074 MiB/sec  625.162 MiB/sec     158.252 'null_percent': 10.0
        ModeKernelInt64/1048576/10  527.976 MiB/sec    1.199 GiB/sec     132.580 'null_percent': 10.0
         ModeKernelInt32/1048576/2  320.156 MiB/sec  429.196 MiB/sec      34.058 'null_percent': 50.0
         ModeKernelInt16/1048576/2  162.121 MiB/sec  196.310 MiB/sec      21.089 'null_percent': 50.0
   
   // no obvious difference for bool/int8
        ModeKernelInt8/1048576/100  234.422 MiB/sec  251.464 MiB/sec       7.270  'null_percent': 1.0
         ModeKernelInt8/1048576/10  246.324 MiB/sec  258.110 MiB/sec       4.785 'null_percent': 10.0
      ModeKernelInt8/1048576/10000  239.496 MiB/sec  250.469 MiB/sec       4.582 'null_percent': 0.01
         ModeKernelInt64/1048576/2  812.020 MiB/sec  832.610 MiB/sec       2.536 'null_percent': 50.0
   ModeKernelBoolean/1048576/10000   26.318 MiB/sec   26.509 MiB/sec       0.728 'null_percent': 0.01
     ModeKernelBoolean/1048576/100   26.510 MiB/sec   26.597 MiB/sec       0.327  'null_percent': 1.0
       ModeKernelBoolean/1048576/0   28.271 MiB/sec   28.274 MiB/sec       0.009  'null_percent': 0.0
          ModeKernelInt8/1048576/0  270.401 MiB/sec  269.025 MiB/sec      -0.509  'null_percent': 0.0
          ModeKernelInt8/1048576/2  190.410 MiB/sec  187.876 MiB/sec      -1.331 'null_percent': 50.0
      ModeKernelBoolean/1048576/10   28.007 MiB/sec   27.599 MiB/sec      -1.455 'null_percent': 10.0
       ModeKernelBoolean/1048576/2   27.157 MiB/sec   24.209 MiB/sec     -10.857 'null_percent': 50.0
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org