Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/06/15 17:10:16 UTC

[GitHub] [arrow] lidavidm commented on a change in pull request #10520: ARROW-12709: [C++] Add binary_join_element_wise

lidavidm commented on a change in pull request #10520:
URL: https://github.com/apache/arrow/pull/10520#discussion_r651990126



##########
File path: cpp/src/arrow/compute/kernels/scalar_string_benchmark.cc
##########
@@ -169,6 +169,48 @@ static void BinaryJoinArrayArray(benchmark::State& state) {
   });
 }
 
+static void BinaryJoinElementWise(benchmark::State& state,
+                                  SeparatorFactory make_separator) {
+  // Unfortunately benchmark is not 1:1 with BinaryJoin since BinaryJoin can join a
+  // varying number of inputs per output
+  const int64_t n_strings = 1000;

Review comment:
       Ah, that's fair. I've bumped it up to 65536 rows.
   
   Old impl:
   
   ```
   -----------------------------------------------------------------------------------------------
   Benchmark                                     Time             CPU   Iterations UserCounters...
   -----------------------------------------------------------------------------------------------
   BinaryJoinArrayScalar                    104542 ns       104540 ns         6671 bytes_per_second=1113.52M/s
   BinaryJoinArrayArray                     114749 ns       114750 ns         6062 bytes_per_second=1014.45M/s
   BinaryJoinElementWiseArrayScalar/2      3017902 ns      3017894 ns          229 bytes_per_second=506.693M/s
   BinaryJoinElementWiseArrayScalar/8     10470391 ns     10470204 ns           67 bytes_per_second=585.289M/s
   BinaryJoinElementWiseArrayScalar/64    77328574 ns     77273626 ns            9 bytes_per_second=634.122M/s
   BinaryJoinElementWiseArrayScalar/128  112622860 ns    112606581 ns            6 bytes_per_second=870.334M/s
   BinaryJoinElementWiseArrayArray/2       3508926 ns      3508898 ns          200 bytes_per_second=435.791M/s
   BinaryJoinElementWiseArrayArray/8      11349990 ns     11349827 ns           61 bytes_per_second=539.928M/s
   BinaryJoinElementWiseArrayArray/64     76564083 ns     76563761 ns            9 bytes_per_second=640.001M/s
   BinaryJoinElementWiseArrayArray/128   111560076 ns    111556809 ns            6 bytes_per_second=878.524M/s
   ```
   
   Current impl:
   ```
   -----------------------------------------------------------------------------------------------
   Benchmark                                     Time             CPU   Iterations UserCounters...
   -----------------------------------------------------------------------------------------------
   BinaryJoinArrayScalar                    107309 ns       107307 ns         6537 bytes_per_second=1084.81M/s
   BinaryJoinArrayArray                     117132 ns       117131 ns         5966 bytes_per_second=993.829M/s
   BinaryJoinElementWiseArrayScalar/2      2624376 ns      2624345 ns          267 bytes_per_second=582.677M/s
   BinaryJoinElementWiseArrayScalar/8      9394312 ns      9394273 ns           73 bytes_per_second=652.322M/s
   BinaryJoinElementWiseArrayScalar/64    77934790 ns     77934514 ns            9 bytes_per_second=628.744M/s
   BinaryJoinElementWiseArrayScalar/128  128550012 ns    128549432 ns            5 bytes_per_second=762.394M/s
   BinaryJoinElementWiseArrayArray/2       3258818 ns      3258767 ns          214 bytes_per_second=469.24M/s
   BinaryJoinElementWiseArrayArray/8      10201948 ns     10201752 ns           66 bytes_per_second=600.69M/s
   BinaryJoinElementWiseArrayArray/64     79305591 ns     79303789 ns            8 bytes_per_second=617.888M/s
   BinaryJoinElementWiseArrayArray/128   129524057 ns    129524481 ns            5 bytes_per_second=756.655M/s
   ```
   
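   Still about the same relative to each other: going by the Time column, the current impl is about 7-13% faster for the /2 and /8 cases, within 4% for /64, and about 14-16% slower for /128.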
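
For anyone who wants to check the old-vs-current comparison, here is a minimal sketch that computes the percentage change in the Time column for the element-wise benchmarks in the two tables above. The names `Row`, `PercentChange`, `ElementWiseRows`, and `PrintComparison` are hypothetical helpers made up for this illustration; they are not part of Arrow or of the benchmark in this PR.

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

// One benchmark row: time per iteration in nanoseconds, copied from the
// "Time" column of the old and current tables. (Hypothetical helper type.)
struct Row {
  const char* name;
  int64_t old_ns;
  int64_t current_ns;
};

// Percentage change in time going from the old to the current implementation.
// Negative means the current implementation is faster.
double PercentChange(int64_t old_ns, int64_t current_ns) {
  return 100.0 * static_cast<double>(current_ns - old_ns) /
         static_cast<double>(old_ns);
}

// The eight element-wise rows reported above.
std::vector<Row> ElementWiseRows() {
  return {
      {"BinaryJoinElementWiseArrayScalar/2", 3017902, 2624376},
      {"BinaryJoinElementWiseArrayScalar/8", 10470391, 9394312},
      {"BinaryJoinElementWiseArrayScalar/64", 77328574, 77934790},
      {"BinaryJoinElementWiseArrayScalar/128", 112622860, 128550012},
      {"BinaryJoinElementWiseArrayArray/2", 3508926, 3258818},
      {"BinaryJoinElementWiseArrayArray/8", 11349990, 10201948},
      {"BinaryJoinElementWiseArrayArray/64", 76564083, 79305591},
      {"BinaryJoinElementWiseArrayArray/128", 111560076, 129524057},
  };
}

// Print one line per benchmark with the signed percentage change.
void PrintComparison() {
  for (const Row& row : ElementWiseRows()) {
    std::printf("%-40s %+6.1f%%\n", row.name,
                PercentChange(row.old_ns, row.current_ns));
  }
}
```

Calling `PrintComparison()` from a `main` prints the signed change per benchmark. These are single-run numbers, so small differences may well be noise.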



