You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/01/21 00:57:01 UTC

[GitHub] [arrow] wesm opened a new pull request #9280: ARROW-8928: [C++] Add microbenchmarks to help measure ExecBatchIterator overhead

wesm opened a new pull request #9280:
URL: https://github.com/apache/arrow/pull/9280


   These are only preliminary benchmarks but may help in examining microperformance overhead related to `ExecBatch` and its implementation (as a `vector<Datum>`).
   
   It may be desirable to devise an "array reference" data structure with few or no heap-allocated data structures and no `shared_ptr` interactions required to obtain memory addresses and other array information.
   
   On my test machine (macOS i9-9880H 2.3ghz), I see about 472 CPU cycles per field overhead for each ExecBatch produced. These benchmarks take a record batch with 1M rows and 10 columns/fields and iterates through the rows in smaller ExecBatches of the indicated sizes
   
   ```
   BM_ExecBatchIterator/256      8207877 ns      8204914 ns           81 items_per_second=121.878/s
   BM_ExecBatchIterator/512      4421049 ns      4419958 ns          166 items_per_second=226.247/s
   BM_ExecBatchIterator/1024     2056636 ns      2055369 ns          333 items_per_second=486.531/s
   BM_ExecBatchIterator/2048     1056415 ns      1056264 ns          682 items_per_second=946.733/s
   BM_ExecBatchIterator/4096      514276 ns       514136 ns         1246 items_per_second=1.94501k/s
   BM_ExecBatchIterator/8192      262539 ns       262391 ns         2736 items_per_second=3.81111k/s
   BM_ExecBatchIterator/16384     128995 ns       128974 ns         5398 items_per_second=7.75351k/s
   BM_ExecBatchIterator/32768      64987 ns        64970 ns        10811 items_per_second=15.3917k/s
   ```
   
   So for the 1024 case, it takes 2,055,367 ns to iterate through all 1024 batches. That seems a bit expensive to me (?) — I suspect we can do better while also improving compilation times and reducing generated code size by using simpler data structures in our compute internals.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #9280: ARROW-8928: [C++] Add microbenchmarks to help measure ExecBatchIterator overhead

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #9280:
URL: https://github.com/apache/arrow/pull/9280#issuecomment-764142561


   https://issues.apache.org/jira/browse/ARROW-8928


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #9280: ARROW-8928: [C++] Add microbenchmarks to help measure ExecBatchIterator overhead

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #9280:
URL: https://github.com/apache/arrow/pull/9280#discussion_r562057646



##########
File path: cpp/src/arrow/compute/internals_benchmark.cc
##########
@@ -0,0 +1,86 @@
+// Licensed to the Apache Software Foundation (ASF) under one

Review comment:
       We already have `function_benchmark.cc` for such internals benchmarks.

##########
File path: cpp/src/arrow/compute/internals_benchmark.cc
##########
@@ -0,0 +1,86 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "benchmark/benchmark.h"
+
+#include "arrow/compute/exec_internal.h"
+#include "arrow/testing/gtest_util.h"
+#include "arrow/testing/random.h"
+#include "arrow/util/benchmark_util.h"
+
+namespace arrow {
+namespace compute {
+namespace detail {
+
+constexpr int32_t kSeed = 0xfede4a7e;
+
+void BM_ExecBatchIterator(benchmark::State& state) {
+  // Measure overhead related to deconstructing vector<Datum> into a sequence of ExecBatch
+  random::RandomArrayGenerator rag(kSeed);
+
+  const int64_t length = 1 << 20;
+  const int num_fields = 10;
+
+  std::vector<Datum> args(num_fields);
+  for (int i = 0; i < num_fields; ++i) {
+    args[i] = rag.Int64(length, 0, 100)->data();
+  }
+
+  for (auto _ : state) {
+    std::unique_ptr<ExecBatchIterator> it =
+        *ExecBatchIterator::Make(args, state.range(0));
+    ExecBatch batch;
+    while (it->Next(&batch)) {
+      for (int i = 0; i < num_fields; ++i) {
+        auto data = batch.values[i].array()->buffers[1]->data();
+        benchmark::DoNotOptimize(data);
+      }
+      continue;
+    }
+    benchmark::DoNotOptimize(batch);
+  }
+
+  state.SetItemsProcessed(state.iterations());
+}
+
+void BM_DatumSlice(benchmark::State& state) {
+  // Measure overhead related to deconstructing vector<Datum> into a sequence of ExecBatch
+  random::RandomArrayGenerator rag(kSeed);
+
+  const int64_t length = 1000;
+
+  int num_datums = 1000;
+  std::vector<Datum> datums(num_datums);
+  for (int i = 0; i < num_datums; ++i) {
+    datums[i] = rag.Int64(length, 0, 100)->data();
+  }
+
+  for (auto _ : state) {
+    for (const Datum& datum : datums) {
+      auto slice = datum.array()->Slice(16, 64);

Review comment:
       Is it significantly different from slicing an array? Otherwise, is there a point for the specific "slice a datum" operation?

##########
File path: cpp/src/arrow/compute/internals_benchmark.cc
##########
@@ -0,0 +1,86 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "benchmark/benchmark.h"
+
+#include "arrow/compute/exec_internal.h"
+#include "arrow/testing/gtest_util.h"
+#include "arrow/testing/random.h"
+#include "arrow/util/benchmark_util.h"
+
+namespace arrow {
+namespace compute {
+namespace detail {
+
+constexpr int32_t kSeed = 0xfede4a7e;
+
+void BM_ExecBatchIterator(benchmark::State& state) {
+  // Measure overhead related to deconstructing vector<Datum> into a sequence of ExecBatch
+  random::RandomArrayGenerator rag(kSeed);
+
+  const int64_t length = 1 << 20;
+  const int num_fields = 10;
+
+  std::vector<Datum> args(num_fields);
+  for (int i = 0; i < num_fields; ++i) {
+    args[i] = rag.Int64(length, 0, 100)->data();
+  }
+
+  for (auto _ : state) {
+    std::unique_ptr<ExecBatchIterator> it =
+        *ExecBatchIterator::Make(args, state.range(0));
+    ExecBatch batch;
+    while (it->Next(&batch)) {
+      for (int i = 0; i < num_fields; ++i) {
+        auto data = batch.values[i].array()->buffers[1]->data();
+        benchmark::DoNotOptimize(data);
+      }
+      continue;
+    }
+    benchmark::DoNotOptimize(batch);
+  }
+
+  state.SetItemsProcessed(state.iterations());

Review comment:
       "items" is a bit ambiguous in this benchmark, but I would expect something else than the number of iterations. Perhaps the number of arrays yielded in the inner loop above?

##########
File path: cpp/src/arrow/compute/internals_benchmark.cc
##########
@@ -0,0 +1,86 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "benchmark/benchmark.h"
+
+#include "arrow/compute/exec_internal.h"
+#include "arrow/testing/gtest_util.h"
+#include "arrow/testing/random.h"
+#include "arrow/util/benchmark_util.h"
+
+namespace arrow {
+namespace compute {
+namespace detail {
+
+constexpr int32_t kSeed = 0xfede4a7e;
+
+void BM_ExecBatchIterator(benchmark::State& state) {
+  // Measure overhead related to deconstructing vector<Datum> into a sequence of ExecBatch
+  random::RandomArrayGenerator rag(kSeed);
+
+  const int64_t length = 1 << 20;
+  const int num_fields = 10;
+
+  std::vector<Datum> args(num_fields);
+  for (int i = 0; i < num_fields; ++i) {
+    args[i] = rag.Int64(length, 0, 100)->data();
+  }
+
+  for (auto _ : state) {
+    std::unique_ptr<ExecBatchIterator> it =
+        *ExecBatchIterator::Make(args, state.range(0));
+    ExecBatch batch;
+    while (it->Next(&batch)) {
+      for (int i = 0; i < num_fields; ++i) {
+        auto data = batch.values[i].array()->buffers[1]->data();
+        benchmark::DoNotOptimize(data);
+      }
+      continue;
+    }
+    benchmark::DoNotOptimize(batch);
+  }
+
+  state.SetItemsProcessed(state.iterations());
+}
+
+void BM_DatumSlice(benchmark::State& state) {
+  // Measure overhead related to deconstructing vector<Datum> into a sequence of ExecBatch

Review comment:
       Looks like this was copy-pasted?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] cyb70289 commented on a change in pull request #9280: ARROW-8928: [C++] Add microbenchmarks to help measure ExecBatchIterator overhead

Posted by GitBox <gi...@apache.org>.
cyb70289 commented on a change in pull request #9280:
URL: https://github.com/apache/arrow/pull/9280#discussion_r561589279



##########
File path: cpp/src/arrow/compute/internals_benchmark.cc
##########
@@ -0,0 +1,86 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "benchmark/benchmark.h"
+
+#include "arrow/compute/exec_internal.h"
+#include "arrow/testing/gtest_util.h"
+#include "arrow/testing/random.h"
+#include "arrow/util/benchmark_util.h"
+
+namespace arrow {
+namespace compute {
+namespace detail {
+
+constexpr int32_t kSeed = 0xfede4a7e;
+
+void BM_ExecBatchIterator(benchmark::State& state) {
+  // Measure overhead related to deconstructing vector<Datum> into a sequence of ExecBatch
+  random::RandomArrayGenerator rag(kSeed);
+
+  const int64_t length = 1 << 20;
+  const int num_fields = 10;
+
+  std::vector<Datum> args(num_fields);
+  for (int i = 0; i < num_fields; ++i) {
+    args[i] = rag.Int64(length, 0, 100)->data();
+  }
+
+  for (auto _ : state) {
+    std::unique_ptr<ExecBatchIterator> it =
+        *ExecBatchIterator::Make(args, state.range(0));
+    ExecBatch batch;
+    while (it->Next(&batch)) {
+      for (int i = 0; i < num_fields; ++i) {
+        auto data = batch.values[i].array()->buffers[1]->data();
+        benchmark::DoNotOptimize(data);
+      }
+      continue;

Review comment:
       remove it?

##########
File path: cpp/src/arrow/compute/internals_benchmark.cc
##########
@@ -0,0 +1,86 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "benchmark/benchmark.h"
+
+#include "arrow/compute/exec_internal.h"
+#include "arrow/testing/gtest_util.h"
+#include "arrow/testing/random.h"
+#include "arrow/util/benchmark_util.h"
+
+namespace arrow {
+namespace compute {
+namespace detail {
+
+constexpr int32_t kSeed = 0xfede4a7e;
+
+void BM_ExecBatchIterator(benchmark::State& state) {
+  // Measure overhead related to deconstructing vector<Datum> into a sequence of ExecBatch
+  random::RandomArrayGenerator rag(kSeed);
+
+  const int64_t length = 1 << 20;
+  const int num_fields = 10;
+
+  std::vector<Datum> args(num_fields);
+  for (int i = 0; i < num_fields; ++i) {
+    args[i] = rag.Int64(length, 0, 100)->data();
+  }
+
+  for (auto _ : state) {
+    std::unique_ptr<ExecBatchIterator> it =
+        *ExecBatchIterator::Make(args, state.range(0));
+    ExecBatch batch;
+    while (it->Next(&batch)) {
+      for (int i = 0; i < num_fields; ++i) {
+        auto data = batch.values[i].array()->buffers[1]->data();
+        benchmark::DoNotOptimize(data);
+      }
+      continue;
+    }
+    benchmark::DoNotOptimize(batch);
+  }
+
+  state.SetItemsProcessed(state.iterations());
+}
+
+void BM_DatumSlice(benchmark::State& state) {
+  // Measure overhead related to deconstructing vector<Datum> into a sequence of ExecBatch
+  random::RandomArrayGenerator rag(kSeed);
+
+  const int64_t length = 1000;
+
+  int num_datums = 1000;

Review comment:
       also define as const?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] wesm commented on a change in pull request #9280: ARROW-8928: [C++] Add microbenchmarks to help measure ExecBatchIterator overhead

Posted by GitBox <gi...@apache.org>.
wesm commented on a change in pull request #9280:
URL: https://github.com/apache/arrow/pull/9280#discussion_r562121639



##########
File path: cpp/src/arrow/compute/internals_benchmark.cc
##########
@@ -0,0 +1,86 @@
+// Licensed to the Apache Software Foundation (ASF) under one

Review comment:
       I can merge these together then. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] wesm commented on a change in pull request #9280: ARROW-8928: [C++] Add microbenchmarks to help measure ExecBatchIterator overhead

Posted by GitBox <gi...@apache.org>.
wesm commented on a change in pull request #9280:
URL: https://github.com/apache/arrow/pull/9280#discussion_r679355446



##########
File path: cpp/src/arrow/compute/internals_benchmark.cc
##########
@@ -0,0 +1,86 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "benchmark/benchmark.h"
+
+#include "arrow/compute/exec_internal.h"
+#include "arrow/testing/gtest_util.h"
+#include "arrow/testing/random.h"
+#include "arrow/util/benchmark_util.h"
+
+namespace arrow {
+namespace compute {
+namespace detail {
+
+constexpr int32_t kSeed = 0xfede4a7e;
+
+void BM_ExecBatchIterator(benchmark::State& state) {
+  // Measure overhead related to deconstructing vector<Datum> into a sequence of ExecBatch
+  random::RandomArrayGenerator rag(kSeed);
+
+  const int64_t length = 1 << 20;
+  const int num_fields = 10;
+
+  std::vector<Datum> args(num_fields);
+  for (int i = 0; i < num_fields; ++i) {
+    args[i] = rag.Int64(length, 0, 100)->data();
+  }
+
+  for (auto _ : state) {
+    std::unique_ptr<ExecBatchIterator> it =
+        *ExecBatchIterator::Make(args, state.range(0));
+    ExecBatch batch;
+    while (it->Next(&batch)) {
+      for (int i = 0; i < num_fields; ++i) {
+        auto data = batch.values[i].array()->buffers[1]->data();
+        benchmark::DoNotOptimize(data);
+      }
+      continue;
+    }
+    benchmark::DoNotOptimize(batch);
+  }
+
+  state.SetItemsProcessed(state.iterations());

Review comment:
       I added a comment to explain that iterations-per-second gives easier interpretation of the input-splitting overhead (so 300 iterations/second would mean 3.33ms of input splitting overhead for each use)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] cyb70289 commented on a change in pull request #9280: ARROW-8928: [C++] Add microbenchmarks to help measure ExecBatchIterator overhead

Posted by GitBox <gi...@apache.org>.
cyb70289 commented on a change in pull request #9280:
URL: https://github.com/apache/arrow/pull/9280#discussion_r561589436



##########
File path: cpp/src/arrow/compute/internals_benchmark.cc
##########
@@ -0,0 +1,86 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "benchmark/benchmark.h"
+
+#include "arrow/compute/exec_internal.h"
+#include "arrow/testing/gtest_util.h"
+#include "arrow/testing/random.h"
+#include "arrow/util/benchmark_util.h"
+
+namespace arrow {
+namespace compute {
+namespace detail {
+
+constexpr int32_t kSeed = 0xfede4a7e;
+
+void BM_ExecBatchIterator(benchmark::State& state) {
+  // Measure overhead related to deconstructing vector<Datum> into a sequence of ExecBatch
+  random::RandomArrayGenerator rag(kSeed);
+
+  const int64_t length = 1 << 20;
+  const int num_fields = 10;
+
+  std::vector<Datum> args(num_fields);
+  for (int i = 0; i < num_fields; ++i) {
+    args[i] = rag.Int64(length, 0, 100)->data();
+  }
+
+  for (auto _ : state) {
+    std::unique_ptr<ExecBatchIterator> it =
+        *ExecBatchIterator::Make(args, state.range(0));
+    ExecBatch batch;
+    while (it->Next(&batch)) {
+      for (int i = 0; i < num_fields; ++i) {
+        auto data = batch.values[i].array()->buffers[1]->data();
+        benchmark::DoNotOptimize(data);
+      }
+      continue;
+    }
+    benchmark::DoNotOptimize(batch);
+  }
+
+  state.SetItemsProcessed(state.iterations());
+}
+
+void BM_DatumSlice(benchmark::State& state) {
+  // Measure overhead related to deconstructing vector<Datum> into a sequence of ExecBatch
+  random::RandomArrayGenerator rag(kSeed);
+
+  const int64_t length = 1000;
+
+  int num_datums = 1000;

Review comment:
       also define as const?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou closed pull request #9280: ARROW-8928: [C++] Add microbenchmarks to help measure ExecBatchIterator overhead

Posted by GitBox <gi...@apache.org>.
pitrou closed pull request #9280:
URL: https://github.com/apache/arrow/pull/9280


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] cyb70289 commented on a change in pull request #9280: ARROW-8928: [C++] Add microbenchmarks to help measure ExecBatchIterator overhead

Posted by GitBox <gi...@apache.org>.
cyb70289 commented on a change in pull request #9280:
URL: https://github.com/apache/arrow/pull/9280#discussion_r561589279



##########
File path: cpp/src/arrow/compute/internals_benchmark.cc
##########
@@ -0,0 +1,86 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "benchmark/benchmark.h"
+
+#include "arrow/compute/exec_internal.h"
+#include "arrow/testing/gtest_util.h"
+#include "arrow/testing/random.h"
+#include "arrow/util/benchmark_util.h"
+
+namespace arrow {
+namespace compute {
+namespace detail {
+
+constexpr int32_t kSeed = 0xfede4a7e;
+
+void BM_ExecBatchIterator(benchmark::State& state) {
+  // Measure overhead related to deconstructing vector<Datum> into a sequence of ExecBatch
+  random::RandomArrayGenerator rag(kSeed);
+
+  const int64_t length = 1 << 20;
+  const int num_fields = 10;
+
+  std::vector<Datum> args(num_fields);
+  for (int i = 0; i < num_fields; ++i) {
+    args[i] = rag.Int64(length, 0, 100)->data();
+  }
+
+  for (auto _ : state) {
+    std::unique_ptr<ExecBatchIterator> it =
+        *ExecBatchIterator::Make(args, state.range(0));
+    ExecBatch batch;
+    while (it->Next(&batch)) {
+      for (int i = 0; i < num_fields; ++i) {
+        auto data = batch.values[i].array()->buffers[1]->data();
+        benchmark::DoNotOptimize(data);
+      }
+      continue;

Review comment:
       remove it?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #9280: ARROW-8928: [C++] Add microbenchmarks to help measure ExecBatchIterator overhead

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #9280:
URL: https://github.com/apache/arrow/pull/9280#issuecomment-764142561


   https://issues.apache.org/jira/browse/ARROW-8928


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] wesm commented on pull request #9280: ARROW-8928: [C++] Add microbenchmarks to help measure ExecBatchIterator overhead

Posted by GitBox <gi...@apache.org>.
wesm commented on pull request #9280:
URL: https://github.com/apache/arrow/pull/9280#issuecomment-889333841


   @cyb70289 @pitrou I addressed your comments, I think, could you take another look and then we can canonize this benchmark to help with the `ExecBatch` performance revamp? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] wesm commented on pull request #9280: ARROW-8928: [C++] Add microbenchmarks to help measure ExecBatchIterator overhead

Posted by GitBox <gi...@apache.org>.
wesm commented on pull request #9280:
URL: https://github.com/apache/arrow/pull/9280#issuecomment-887924786


   Some updated performance (gcc 9.3 locally on x86):
   
   ```
   -------------------------------------------------------------------------------------
   Benchmark                           Time             CPU   Iterations UserCounters...
   -------------------------------------------------------------------------------------
   BM_ExecBatchIterator/256     11314787 ns     11313272 ns           62 items_per_second=88.3918/s
   BM_ExecBatchIterator/512      5670423 ns      5669598 ns          123 items_per_second=176.379/s
   BM_ExecBatchIterator/1024     2903937 ns      2903272 ns          242 items_per_second=344.439/s
   BM_ExecBatchIterator/2048     1461982 ns      1461711 ns          481 items_per_second=684.13/s
   BM_ExecBatchIterator/4096      739382 ns       739235 ns          951 items_per_second=1.35275k/s
   BM_ExecBatchIterator/8192      370264 ns       370207 ns         1892 items_per_second=2.70119k/s
   BM_ExecBatchIterator/16384     186622 ns       186573 ns         3755 items_per_second=5.35983k/s
   BM_ExecBatchIterator/32768      93581 ns        93567 ns         7437 items_per_second=10.6876k/s
   ```
   
   The way to read this is that breaking `ExecBatch` with 32 primitive array fields into smaller ExecBatches (and then accessing a a data pointer in each batch) has an overhead of approximately:
   
   * 2800 nanoseconds per batch
   * 88.6 nanoseconds per batch per field
   
   So if you wanted to break a batch with 1M elements into batches of size 1024 for finer-grained parallel processing, you would pay  2900 microseconds to do so. On this same machine, I have:
   
   ```
   In [2]: arr = np.random.randn(1 << 20)                                                                                                                                                         
   
   In [3]: timeit arr * 2                                                                                                                                                                         
   395 µs ± 8.61 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
   ```
   
   This seems problematic if we wish to enable array expression evaluation on smaller batch sizes to keep more data in CPU caches. I'll bring this up on the mailing list to see what people think. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #9280: ARROW-8928: [C++] Add microbenchmarks to help measure ExecBatchIterator overhead

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #9280:
URL: https://github.com/apache/arrow/pull/9280#discussion_r562057646



##########
File path: cpp/src/arrow/compute/internals_benchmark.cc
##########
@@ -0,0 +1,86 @@
+// Licensed to the Apache Software Foundation (ASF) under one

Review comment:
       We already have `function_benchmark.cc` for such internals benchmarks.

##########
File path: cpp/src/arrow/compute/internals_benchmark.cc
##########
@@ -0,0 +1,86 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "benchmark/benchmark.h"
+
+#include "arrow/compute/exec_internal.h"
+#include "arrow/testing/gtest_util.h"
+#include "arrow/testing/random.h"
+#include "arrow/util/benchmark_util.h"
+
+namespace arrow {
+namespace compute {
+namespace detail {
+
+constexpr int32_t kSeed = 0xfede4a7e;
+
+void BM_ExecBatchIterator(benchmark::State& state) {
+  // Measure overhead related to deconstructing vector<Datum> into a sequence of ExecBatch
+  random::RandomArrayGenerator rag(kSeed);
+
+  const int64_t length = 1 << 20;
+  const int num_fields = 10;
+
+  std::vector<Datum> args(num_fields);
+  for (int i = 0; i < num_fields; ++i) {
+    args[i] = rag.Int64(length, 0, 100)->data();
+  }
+
+  for (auto _ : state) {
+    std::unique_ptr<ExecBatchIterator> it =
+        *ExecBatchIterator::Make(args, state.range(0));
+    ExecBatch batch;
+    while (it->Next(&batch)) {
+      for (int i = 0; i < num_fields; ++i) {
+        auto data = batch.values[i].array()->buffers[1]->data();
+        benchmark::DoNotOptimize(data);
+      }
+      continue;
+    }
+    benchmark::DoNotOptimize(batch);
+  }
+
+  state.SetItemsProcessed(state.iterations());
+}
+
+void BM_DatumSlice(benchmark::State& state) {
+  // Measure overhead related to deconstructing vector<Datum> into a sequence of ExecBatch
+  random::RandomArrayGenerator rag(kSeed);
+
+  const int64_t length = 1000;
+
+  int num_datums = 1000;
+  std::vector<Datum> datums(num_datums);
+  for (int i = 0; i < num_datums; ++i) {
+    datums[i] = rag.Int64(length, 0, 100)->data();
+  }
+
+  for (auto _ : state) {
+    for (const Datum& datum : datums) {
+      auto slice = datum.array()->Slice(16, 64);

Review comment:
       Is it significantly different from slicing an array? Otherwise, is there a point for the specific "slice a datum" operation?

##########
File path: cpp/src/arrow/compute/internals_benchmark.cc
##########
@@ -0,0 +1,86 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "benchmark/benchmark.h"
+
+#include "arrow/compute/exec_internal.h"
+#include "arrow/testing/gtest_util.h"
+#include "arrow/testing/random.h"
+#include "arrow/util/benchmark_util.h"
+
+namespace arrow {
+namespace compute {
+namespace detail {
+
+constexpr int32_t kSeed = 0xfede4a7e;
+
+void BM_ExecBatchIterator(benchmark::State& state) {
+  // Measure overhead related to deconstructing vector<Datum> into a sequence of ExecBatch
+  random::RandomArrayGenerator rag(kSeed);
+
+  const int64_t length = 1 << 20;
+  const int num_fields = 10;
+
+  std::vector<Datum> args(num_fields);
+  for (int i = 0; i < num_fields; ++i) {
+    args[i] = rag.Int64(length, 0, 100)->data();
+  }
+
+  for (auto _ : state) {
+    std::unique_ptr<ExecBatchIterator> it =
+        *ExecBatchIterator::Make(args, state.range(0));
+    ExecBatch batch;
+    while (it->Next(&batch)) {
+      for (int i = 0; i < num_fields; ++i) {
+        auto data = batch.values[i].array()->buffers[1]->data();
+        benchmark::DoNotOptimize(data);
+      }
+      continue;
+    }
+    benchmark::DoNotOptimize(batch);
+  }
+
+  state.SetItemsProcessed(state.iterations());

Review comment:
       "items" is a bit ambiguous in this benchmark, but I would expect something else than the number of iterations. Perhaps the number of arrays yielded in the inner loop above?

##########
File path: cpp/src/arrow/compute/internals_benchmark.cc
##########
@@ -0,0 +1,86 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "benchmark/benchmark.h"
+
+#include "arrow/compute/exec_internal.h"
+#include "arrow/testing/gtest_util.h"
+#include "arrow/testing/random.h"
+#include "arrow/util/benchmark_util.h"
+
+namespace arrow {
+namespace compute {
+namespace detail {
+
+constexpr int32_t kSeed = 0xfede4a7e;
+
+void BM_ExecBatchIterator(benchmark::State& state) {
+  // Measure overhead related to deconstructing vector<Datum> into a sequence of ExecBatch
+  random::RandomArrayGenerator rag(kSeed);
+
+  const int64_t length = 1 << 20;
+  const int num_fields = 10;
+
+  std::vector<Datum> args(num_fields);
+  for (int i = 0; i < num_fields; ++i) {
+    args[i] = rag.Int64(length, 0, 100)->data();
+  }
+
+  for (auto _ : state) {
+    std::unique_ptr<ExecBatchIterator> it =
+        *ExecBatchIterator::Make(args, state.range(0));
+    ExecBatch batch;
+    while (it->Next(&batch)) {
+      for (int i = 0; i < num_fields; ++i) {
+        auto data = batch.values[i].array()->buffers[1]->data();
+        benchmark::DoNotOptimize(data);
+      }
+      continue;
+    }
+    benchmark::DoNotOptimize(batch);
+  }
+
+  state.SetItemsProcessed(state.iterations());
+}
+
+void BM_DatumSlice(benchmark::State& state) {
+  // Measure overhead related to deconstructing vector<Datum> into a sequence of ExecBatch

Review comment:
       Looks like this was copy-pasted?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] wesm commented on a change in pull request #9280: ARROW-8928: [C++] Add microbenchmarks to help measure ExecBatchIterator overhead

Posted by GitBox <gi...@apache.org>.
wesm commented on a change in pull request #9280:
URL: https://github.com/apache/arrow/pull/9280#discussion_r562121639



##########
File path: cpp/src/arrow/compute/internals_benchmark.cc
##########
@@ -0,0 +1,86 @@
+// Licensed to the Apache Software Foundation (ASF) under one

Review comment:
       I can merge these together then. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org