Posted to github@arrow.apache.org by "WillAyd (via GitHub)" <gi...@apache.org> on 2023/03/31 06:11:02 UTC
[GitHub] [arrow] WillAyd opened a new pull request, #15041: GH-14937: [C++] String Sort / Rank Benchmarks
WillAyd opened a new pull request, #15041:
URL: https://github.com/apache/arrow/pull/15041
Follow up to #14938
* Closes: #14937
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] WillAyd commented on pull request #15041: GH-14937: [C++] String Sort / Rank Benchmarks
Posted by GitBox <gi...@apache.org>.
WillAyd commented on PR #15041:
URL: https://github.com/apache/arrow/pull/15041#issuecomment-1371244885
Gotcha - I was going back in my head to thinking we wanted to measure the bytes allocated to the null array mask for all types, but if that's not the case, the fixed vs variable width distinction makes sense.
So then I think this is in a good spot? Not sure if we wanted to go crazy with templating versus branching in the dedicated string function to set the `bytes_processed`. And within the string function I am branching on an `array->null_bitmap() == nullptr` condition - I didn't see a clear winner between that and `array->null_count() > 0` in the code base, so I'm not sure it matters.
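The two conditions mentioned above are not interchangeable: an Arrow array may carry an allocated validity bitmap even when none of its entries is null. A minimal sketch of the difference, using a hypothetical `ArrayModel` struct as a stand-in (it is not the real `arrow::Array` API, and both helper names are made up for illustration):

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for an Arrow array; not the real arrow::Array API.
struct ArrayModel {
  int64_t length = 0;
  int64_t null_count = 0;
  // nullptr when no validity bitmap was allocated.
  std::shared_ptr<std::vector<uint8_t>> null_bitmap;
};

// Count the bitmap whenever one is allocated: one bit per entry, rounded up.
inline int64_t BitmapBytesIfAllocated(const ArrayModel& array) {
  assert(array.length >= 0);
  if (array.null_bitmap == nullptr) return 0;
  return (array.length + 7) / 8;
}

// Alternative: count the bitmap only when at least one entry is null.
inline int64_t BitmapBytesIfAnyNull(const ArrayModel& array) {
  assert(array.length >= 0);
  if (array.null_count == 0) return 0;
  return (array.length + 7) / 8;
}
```

For an array of 100 entries with an allocated bitmap but no nulls, the first helper reports 13 bytes and the second reports 0, so the choice only matters when a bitmap exists without any null entries.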
[GitHub] [arrow] WillAyd commented on pull request #15041: GH-14937: [C++] String Sort / Rank Benchmarks
Posted by GitBox <gi...@apache.org>.
WillAyd commented on PR #15041:
URL: https://github.com/apache/arrow/pull/15041#issuecomment-1370080209
Ah OK - I understand how to read the output a bit better now; thanks for the callouts. I think I have some ideas on how to tackle this - will get back shortly.
[GitHub] [arrow] amol- closed pull request #15041: GH-14937: [C++] String Sort / Rank Benchmarks
Posted by "amol- (via GitHub)" <gi...@apache.org>.
amol- closed pull request #15041: GH-14937: [C++] String Sort / Rank Benchmarks
URL: https://github.com/apache/arrow/pull/15041
[GitHub] [arrow] WillAyd commented on a diff in pull request #15041: GH-14937: [C++] String Sort / Rank Benchmarks
Posted by GitBox <gi...@apache.org>.
WillAyd commented on code in PR #15041:
URL: https://github.com/apache/arrow/pull/15041#discussion_r1055664267
##########
cpp/src/arrow/compute/kernels/vector_sort_benchmark.cc:
##########
@@ -106,6 +125,19 @@ static void ArraySortFuncBoolBenchmark(benchmark::State& state, const Runner& ru
ArraySortFuncBenchmark(state, runner, values);
}
+template <typename Runner>
+static void ArraySortFuncStringBenchmark(benchmark::State& state, const Runner& runner,
+ int64_t min_length, int64_t max_length) {
+ RegressionArgs args(state);
+
+ const int64_t array_size = args.size / sizeof(int64_t);
Review Comment:
So my read on the situation is that RegressionSetArgs is basically parametrizing the test with the different CPU cache sizes plus one that exceeds cache, and we are trying to fit an array into those sizes. That seems easy enough to control with the primitive types, but I may be misunderstanding how the variable length random strings would be guaranteed to fit into that same cache even with `(min_length + max_length) / 2` without querying the size of the array at runtime. Also unclear if I need to account for the offsets buffer as part of the cache bounding requirement
[GitHub] [arrow] WillAyd commented on a diff in pull request #15041: GH-14937: [C++] String Sort / Rank Benchmarks
Posted by GitBox <gi...@apache.org>.
WillAyd commented on code in PR #15041:
URL: https://github.com/apache/arrow/pull/15041#discussion_r1052902450
##########
cpp/src/arrow/compute/kernels/vector_sort_benchmark.cc:
##########
@@ -95,6 +95,25 @@ static void ChunkedArraySortFuncInt64Benchmark(benchmark::State& state,
ArraySortFuncBenchmark(state, runner, std::make_shared<ChunkedArray>(chunks));
}
+template <typename Runner>
+static void ChunkedArraySortFuncStringBenchmark(benchmark::State& state,
+ const Runner& runner, int64_t min_length,
+ int64_t max_length) {
+ RegressionArgs args(state);
+
+ const int64_t n_chunks = 10;
+ const int64_t array_size = args.size / n_chunks / sizeof(int64_t);
+ auto rand = random::RandomArrayGenerator(kSeed);
+
+ ArrayVector chunks;
+ for (int64_t i = 0; i < n_chunks; ++i) {
+ chunks.push_back(std::static_pointer_cast<StringArray>(
+ rand.String(array_size, min_length, max_length, args.null_proportion)));
Review Comment:
Doesn't appear so. I originally copied the cast from https://github.com/apache/arrow/blob/4e9b65a4a33f57005472eaf9f2654ae61ff83101/cpp/src/arrow/compute/kernels/vector_selection_benchmark.cc#L133 where it also seems unnecessary - want me to remove there as well?
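Why the cast looks unnecessary can be sketched with plain standard-library types. The `Array`, `StringArray` and `MakeString` names below are hypothetical stand-ins, and the sketch assumes the generator already returns a pointer to the base class, as the comment above suggests:

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-ins for arrow::Array and arrow::StringArray.
struct Array {
  virtual ~Array() = default;
};
struct StringArray : Array {};

using ArrayVector = std::vector<std::shared_ptr<Array>>;

// Stand-in for a generator that returns a base-class pointer.
inline std::shared_ptr<Array> MakeString() {
  return std::make_shared<StringArray>();
}

inline ArrayVector BuildChunks(int n_chunks) {
  assert(n_chunks >= 0);
  ArrayVector chunks;
  for (int i = 0; i < n_chunks; ++i) {
    // No cast needed: the vector stores shared_ptr<Array>, which is
    // exactly what MakeString() returns.
    chunks.push_back(MakeString());
  }
  return chunks;
}
```

Casting to the derived type first would only be converted straight back to the base pointer by `push_back`, so dropping it does not change what ends up in the vector.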
[GitHub] [arrow] WillAyd commented on a diff in pull request #15041: GH-14937: [C++] String Sort / Rank Benchmarks
Posted by GitBox <gi...@apache.org>.
WillAyd commented on code in PR #15041:
URL: https://github.com/apache/arrow/pull/15041#discussion_r1059189441
##########
cpp/src/arrow/compute/kernels/vector_sort_benchmark.cc:
##########
@@ -130,6 +164,18 @@ static void ArraySortIndicesBool(benchmark::State& state) {
ArraySortFuncBoolBenchmark(state, SortRunner(state));
}
+static void ArraySortIndicesString(benchmark::State& state) {
+ const auto min_length = 0;
+ const auto max_length = 32;
Review Comment:
No strong preference, though I think this way if we wanted to try more combinations of benchmarks in the future it would be easier to extend
[GitHub] [arrow] github-actions[bot] commented on pull request #15041: GH-14937: [C++] String Sort / Rank Benchmarks
Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #15041:
URL: https://github.com/apache/arrow/pull/15041#issuecomment-1358395294
* Closes: #14937
[GitHub] [arrow] pitrou commented on a diff in pull request #15041: GH-14937: [C++] String Sort / Rank Benchmarks
Posted by GitBox <gi...@apache.org>.
pitrou commented on code in PR #15041:
URL: https://github.com/apache/arrow/pull/15041#discussion_r1055674843
##########
cpp/src/arrow/compute/kernels/vector_sort_benchmark.cc:
##########
@@ -106,6 +125,19 @@ static void ArraySortFuncBoolBenchmark(benchmark::State& state, const Runner& ru
ArraySortFuncBenchmark(state, runner, values);
}
+template <typename Runner>
+static void ArraySortFuncStringBenchmark(benchmark::State& state, const Runner& runner,
+ int64_t min_length, int64_t max_length) {
+ RegressionArgs args(state);
+
+ const int64_t array_size = args.size / sizeof(int64_t);
Review Comment:
As for whether offsets should be taken into account, there's no obviously right answer, but given that we don't count the null bitmap, I'd say we don't _need_ to count the offsets buffer either.
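To make the trade-off concrete, here is a rough sketch of the two buffers in question for a string array with 32-bit offsets. `StringArrayBytes` and `EstimateStringArrayBytes` are hypothetical names used only for illustration:

```cpp
#include <cassert>
#include <cstdint>

// Byte footprint of a variable-width string array, buffer by buffer.
struct StringArrayBytes {
  int64_t data;     // concatenated character data
  int64_t offsets;  // (length + 1) 32-bit offsets
};

inline StringArrayBytes EstimateStringArrayBytes(int64_t length,
                                                 int64_t total_chars) {
  assert(length >= 0 && total_chars >= 0);
  // A string array of N entries stores N + 1 offsets.
  return {total_chars, (length + 1) * static_cast<int64_t>(sizeof(int32_t))};
}
```

With 1000 strings totalling 16000 characters, the data buffer is 16000 bytes and the offsets buffer 4004 bytes, so leaving the offsets out understates the footprint by about a fifth in that case.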
[GitHub] [arrow] pitrou commented on pull request #15041: GH-14937: [C++] String Sort / Rank Benchmarks
Posted by GitBox <gi...@apache.org>.
pitrou commented on PR #15041:
URL: https://github.com/apache/arrow/pull/15041#issuecomment-1380352325
I expect I'll have to rewrite / fix those benchmarks, but this is quite low-priority for me. For now the PR isn't ok:
* `~RegressionArgs` calls `SetBytesProcessed`
* the null bitmap shouldn't be taken into account, as it is not for other benchmarks (and we want figures to be comparable)
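The first point can be illustrated with a small model. `StateModel` and `RegressionArgsModel` are hypothetical stand-ins for `benchmark::State` and `RegressionArgs`; the thread only states that the destructor calls `SetBytesProcessed`, so the formula used inside it here is an assumption:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for benchmark::State.
struct StateModel {
  int64_t bytes_processed = 0;
  int64_t iterations = 0;
  void SetBytesProcessed(int64_t bytes) { bytes_processed = bytes; }
};

// Hypothetical stand-in for RegressionArgs.
struct RegressionArgsModel {
  StateModel& state;
  int64_t size;
  RegressionArgsModel(StateModel& s, int64_t size_bytes)
      : state(s), size(size_bytes) {
    assert(size_bytes >= 0);
  }
  // Assumed behaviour: the destructor reports bytes itself,
  // replacing whatever was set earlier.
  ~RegressionArgsModel() { state.SetBytesProcessed(state.iterations * size); }
};

inline int64_t RunBenchmarkModel(StateModel& state, int64_t actual_bytes) {
  RegressionArgsModel args(state, /*size_bytes=*/49152);
  state.iterations = 10;
  // Manual call made while `args` is still alive...
  state.SetBytesProcessed(state.iterations * actual_bytes);
  // ...is visible here, but is overwritten when `args` is destroyed
  // at the end of this function.
  return state.bytes_processed;
}
```

In this model, a value set by hand inside the benchmark function does not survive past the end of the function, which would make any manual `bytes_processed` adjustment ineffective.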
[GitHub] [arrow] WillAyd commented on a diff in pull request #15041: GH-14937: [C++] String Sort / Rank Benchmarks
Posted by GitBox <gi...@apache.org>.
WillAyd commented on code in PR #15041:
URL: https://github.com/apache/arrow/pull/15041#discussion_r1053791676
##########
cpp/src/arrow/compute/kernels/vector_sort_benchmark.cc:
##########
@@ -106,6 +125,19 @@ static void ArraySortFuncBoolBenchmark(benchmark::State& state, const Runner& ru
ArraySortFuncBenchmark(state, runner, values);
}
+template <typename Runner>
+static void ArraySortFuncStringBenchmark(benchmark::State& state, const Runner& runner,
+ int64_t min_length, int64_t max_length) {
+ RegressionArgs args(state);
+
+ const int64_t array_size = args.size / sizeof(int64_t);
Review Comment:
Right on - enjoy the time off!
[GitHub] [arrow] pitrou commented on pull request #15041: GH-14937: [C++] String Sort / Rank Benchmarks
Posted by GitBox <gi...@apache.org>.
pitrou commented on PR #15041:
URL: https://github.com/apache/arrow/pull/15041#issuecomment-1359069389
Please, double-check that the reported `bytes_per_second` and `items_per_second` correspond to the actual numbers.
[GitHub] [arrow] WillAyd commented on a diff in pull request #15041: GH-14937: [C++] String Sort / Rank Benchmarks
Posted by GitBox <gi...@apache.org>.
WillAyd commented on code in PR #15041:
URL: https://github.com/apache/arrow/pull/15041#discussion_r1053772951
##########
cpp/src/arrow/compute/kernels/vector_sort_benchmark.cc:
##########
@@ -106,6 +125,19 @@ static void ArraySortFuncBoolBenchmark(benchmark::State& state, const Runner& ru
ArraySortFuncBenchmark(state, runner, values);
}
+template <typename Runner>
+static void ArraySortFuncStringBenchmark(benchmark::State& state, const Runner& runner,
+ int64_t min_length, int64_t max_length) {
+ RegressionArgs args(state);
+
+ const int64_t array_size = args.size / sizeof(int64_t);
Review Comment:
Hmm OK. IIUC we are parametrizing a benchmark with a bunch of different sized byte streams, the length of which is available through args.size. So is the suggested fix here to divide that by `sizeof(string)`? Or should it be `int32_t` to match the offset of `StringType`?
[GitHub] [arrow] cyb70289 commented on a diff in pull request #15041: GH-14937: [C++] String Sort / Rank Benchmarks
Posted by GitBox <gi...@apache.org>.
cyb70289 commented on code in PR #15041:
URL: https://github.com/apache/arrow/pull/15041#discussion_r1057417311
##########
cpp/src/arrow/compute/kernels/vector_sort_benchmark.cc:
##########
@@ -130,6 +164,18 @@ static void ArraySortIndicesBool(benchmark::State& state) {
ArraySortFuncBoolBenchmark(state, SortRunner(state));
}
+static void ArraySortIndicesString(benchmark::State& state) {
+ const auto min_length = 0;
+ const auto max_length = 32;
Review Comment:
Define as global const and remove duplicates?
[GitHub] [arrow] amol- commented on pull request #15041: GH-14937: [C++] String Sort / Rank Benchmarks
Posted by "amol- (via GitHub)" <gi...@apache.org>.
amol- commented on PR #15041:
URL: https://github.com/apache/arrow/pull/15041#issuecomment-1490644930
Closing because it has been untouched for a while. In case it's still relevant, feel free to reopen and move it forward 👍
[GitHub] [arrow] amol- commented on pull request #15041: GH-14937: [C++] String Sort / Rank Benchmarks
Posted by "amol- (via GitHub)" <gi...@apache.org>.
amol- commented on PR #15041:
URL: https://github.com/apache/arrow/pull/15041#issuecomment-1507182337
@westonpace can you find the time to review this one? It has been practically sitting there since last year, and it would be great to make a decision about its future to avoid keeping @WillAyd in limbo 🙂
[GitHub] [arrow] cyb70289 commented on a diff in pull request #15041: GH-14937: [C++] String Sort / Rank Benchmarks
Posted by GitBox <gi...@apache.org>.
cyb70289 commented on code in PR #15041:
URL: https://github.com/apache/arrow/pull/15041#discussion_r1055069896
##########
cpp/src/arrow/compute/kernels/vector_sort_benchmark.cc:
##########
@@ -106,6 +125,19 @@ static void ArraySortFuncBoolBenchmark(benchmark::State& state, const Runner& ru
ArraySortFuncBenchmark(state, runner, values);
}
+template <typename Runner>
+static void ArraySortFuncStringBenchmark(benchmark::State& state, const Runner& runner,
+ int64_t min_length, int64_t max_length) {
+ RegressionArgs args(state);
+
+ const int64_t array_size = args.size / sizeof(int64_t);
Review Comment:
Dividing by `sizeof(int64_t)` is not right. If `args.size` is in bytes, it should be divided by `(min_length + max_length) / 2`, but the calculation must also take `null_proportion` into account.
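One possible reading of this suggestion is sketched below. `EstimateStringArrayLength` is a hypothetical helper, and the way nulls are handled (null entries are assumed to contribute no character data) is an assumption rather than something the thread settles:

```cpp
#include <cassert>
#include <cstdint>

// Number of strings whose character data should roughly fill `size_bytes`.
inline int64_t EstimateStringArrayLength(int64_t size_bytes, int64_t min_length,
                                         int64_t max_length,
                                         double null_proportion) {
  assert(size_bytes >= 0 && min_length >= 0 && max_length >= min_length);
  assert(null_proportion >= 0.0 && null_proportion <= 1.0);
  const double mean_length = (min_length + max_length) / 2.0;
  // Assumption: null entries contribute no character data.
  double bytes_per_item = mean_length * (1.0 - null_proportion);
  // Avoid dividing by (almost) zero for all-null or all-empty arrays.
  if (bytes_per_item < 1.0) bytes_per_item = 1.0;
  return static_cast<int64_t>(size_bytes / bytes_per_item);
}
```

With `size_bytes = 49152`, `min_length = 0` and `max_length = 32`, this gives 3072 entries when there are no nulls and 6144 entries at a null proportion of 0.5. The clamp for the all-null case is an arbitrary choice made only to keep the sketch well defined.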
[GitHub] [arrow] github-actions[bot] commented on pull request #15041: [C++] String Sort / Rank Benchmarks
Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #15041:
URL: https://github.com/apache/arrow/pull/15041#issuecomment-1358393384
Thanks for opening a pull request!
If this is not a [minor PR](https://github.com/apache/arrow/blob/master/CONTRIBUTING.md#Minor-Fixes), could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose
Opening GitHub issues ahead of time contributes to the [Openness](http://theapacheway.com/open/#:~:text=Openness%20allows%20new%20users%20the,must%20happen%20in%20the%20open.) of the Apache Arrow project.
Then could you also rename the pull request title in the following format?
GH-${GITHUB_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}
or
MINOR: [${COMPONENT}] ${SUMMARY}
In the case of old issues on JIRA the title also supports:
ARROW-${JIRA_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}
PARQUET-${JIRA_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}
See also:
* [Other pull requests](https://github.com/apache/arrow/pulls/)
* [Contribution Guidelines - How to contribute patches](https://arrow.apache.org/docs/developers/contributing.html#how-to-contribute-patches)
[GitHub] [arrow] WillAyd commented on pull request #15041: GH-14937: [C++] String Sort / Rank Benchmarks
Posted by GitBox <gi...@apache.org>.
WillAyd commented on PR #15041:
URL: https://github.com/apache/arrow/pull/15041#issuecomment-1358394998
@pitrou
[GitHub] [arrow] pitrou commented on a diff in pull request #15041: GH-14937: [C++] String Sort / Rank Benchmarks
Posted by GitBox <gi...@apache.org>.
pitrou commented on code in PR #15041:
URL: https://github.com/apache/arrow/pull/15041#discussion_r1055674117
##########
cpp/src/arrow/compute/kernels/vector_sort_benchmark.cc:
##########
@@ -106,6 +125,19 @@ static void ArraySortFuncBoolBenchmark(benchmark::State& state, const Runner& ru
ArraySortFuncBenchmark(state, runner, values);
}
+template <typename Runner>
+static void ArraySortFuncStringBenchmark(benchmark::State& state, const Runner& runner,
+ int64_t min_length, int64_t max_length) {
+ RegressionArgs args(state);
+
+ const int64_t array_size = args.size / sizeof(int64_t);
Review Comment:
The cache size-based settings are just an approximation, we don't care if the actual size fits exactly (so @cyb70289's proposed formula is fine).
What's more important is to have accurate figures for bytes/sec and items/sec.
[GitHub] [arrow] WillAyd commented on pull request #15041: GH-14937: [C++] String Sort / Rank Benchmarks
Posted by GitBox <gi...@apache.org>.
WillAyd commented on PR #15041:
URL: https://github.com/apache/arrow/pull/15041#issuecomment-1358911144
```sh
./release/arrow-compute-vector-sort-benchmark --benchmark_filter="String"
2022-12-19T22:47:03-08:00
Running ./release/arrow-compute-vector-sort-benchmark
Run on (12 X 4567.83 MHz CPU s)
CPU Caches:
L1 Data 48 KiB (x6)
L1 Instruction 32 KiB (x6)
L2 Unified 1280 KiB (x6)
L3 Unified 12288 KiB (x1)
Load Average: 7.85, 7.14, 3.35
----------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations UserCounters...
----------------------------------------------------------------------------------------------------
ArraySortIndicesString/49152/10000 498 us 498 us 1436 bytes_per_second=94.0741M/s items_per_second=12.3305M/s null_percent=0.01 size=49.152k
ArraySortIndicesString/49152/100 494 us 494 us 1309 bytes_per_second=94.9745M/s items_per_second=12.4485M/s null_percent=1 size=49.152k
ArraySortIndicesString/49152/10 464 us 464 us 1517 bytes_per_second=101.073M/s items_per_second=13.2478M/s null_percent=10 size=49.152k
ArraySortIndicesString/49152/2 246 us 246 us 2850 bytes_per_second=190.649M/s items_per_second=24.9887M/s null_percent=50 size=49.152k
ArraySortIndicesString/49152/1 7.69 us 7.69 us 91466 bytes_per_second=5.95597G/s items_per_second=799.397M/s null_percent=100 size=49.152k
ArraySortIndicesString/49152/0 505 us 505 us 1362 bytes_per_second=92.871M/s items_per_second=12.1728M/s null_percent=0 size=49.152k
ArraySortIndicesString/1048576/100 15890 us 15886 us 44 bytes_per_second=62.9489M/s items_per_second=8.25084M/s null_percent=1 size=1048.58k
ArraySortIndicesString/8388608/100 186745 us 186648 us 4 bytes_per_second=42.8614M/s items_per_second=5.61793M/s null_percent=1 size=8.38861M
ChunkedArraySortIndicesString/49152/10000 611 us 611 us 1123 bytes_per_second=76.7591M/s items_per_second=10.0544M/s null_percent=0.01 size=49.152k
ChunkedArraySortIndicesString/49152/100 615 us 615 us 1135 bytes_per_second=76.2125M/s items_per_second=9.98282M/s null_percent=1 size=49.152k
ChunkedArraySortIndicesString/49152/10 561 us 561 us 1240 bytes_per_second=83.5067M/s items_per_second=10.9383M/s null_percent=10 size=49.152k
ChunkedArraySortIndicesString/49152/2 315 us 314 us 2210 bytes_per_second=149.073M/s items_per_second=19.5266M/s null_percent=50 size=49.152k
ChunkedArraySortIndicesString/49152/1 8.50 us 8.49 us 82709 bytes_per_second=5.3887G/s items_per_second=722.788M/s null_percent=100 size=49.152k
ChunkedArraySortIndicesString/49152/0 612 us 612 us 1136 bytes_per_second=76.622M/s items_per_second=10.0365M/s null_percent=0 size=49.152k
ChunkedArraySortIndicesString/1048576/100 18271 us 18266 us 38 bytes_per_second=54.7465M/s items_per_second=7.17562M/s null_percent=1 size=1048.58k
ChunkedArraySortIndicesString/8388608/100 214537 us 214485 us 3 bytes_per_second=37.2987M/s items_per_second=4.88879M/s null_percent=1 size=8.38861M
ArrayRankString/49152/10000/tiebreaker:0 517674 ns 517561 ns 1344 bytes_per_second=90.5691M/s items_per_second=11.8711M/s null_percent=0.01 size=49.152k
ArrayRankString/49152/10000/tiebreaker:2 506983 ns 506881 ns 1375 bytes_per_second=92.4774M/s items_per_second=12.1212M/s null_percent=0.01 size=49.152k
ArrayRankString/49152/10000/tiebreaker:3 526033 ns 525948 ns 1351 bytes_per_second=89.1247M/s items_per_second=11.6818M/s null_percent=0.01 size=49.152k
ArrayRankString/49152/100/tiebreaker:0 516791 ns 516690 ns 1322 bytes_per_second=90.7217M/s items_per_second=11.8911M/s null_percent=1 size=49.152k
ArrayRankString/49152/100/tiebreaker:2 514438 ns 514341 ns 1350 bytes_per_second=91.136M/s items_per_second=11.9454M/s null_percent=1 size=49.152k
ArrayRankString/49152/100/tiebreaker:3 532658 ns 532582 ns 1358 bytes_per_second=88.0147M/s items_per_second=11.5363M/s null_percent=1 size=49.152k
ArrayRankString/49152/10/tiebreaker:0 486885 ns 486760 ns 1390 bytes_per_second=96.3M/s items_per_second=12.6222M/s null_percent=10 size=49.152k
ArrayRankString/49152/10/tiebreaker:2 479222 ns 479156 ns 1485 bytes_per_second=97.8283M/s items_per_second=12.8225M/s null_percent=10 size=49.152k
ArrayRankString/49152/10/tiebreaker:3 477106 ns 477040 ns 1455 bytes_per_second=98.2622M/s items_per_second=12.8794M/s null_percent=10 size=49.152k
ArrayRankString/49152/2/tiebreaker:0 258415 ns 258376 ns 2730 bytes_per_second=181.422M/s items_per_second=23.7793M/s null_percent=50 size=49.152k
ArrayRankString/49152/2/tiebreaker:2 249241 ns 249196 ns 2780 bytes_per_second=188.105M/s items_per_second=24.6553M/s null_percent=50 size=49.152k
ArrayRankString/49152/2/tiebreaker:3 261936 ns 261899 ns 2777 bytes_per_second=178.981M/s items_per_second=23.4594M/s null_percent=50 size=49.152k
ArrayRankString/49152/1/tiebreaker:0 8339 ns 8338 ns 83149 bytes_per_second=5.49025G/s items_per_second=736.889M/s null_percent=100 size=49.152k
ArrayRankString/49152/1/tiebreaker:2 8485 ns 8484 ns 82981 bytes_per_second=5.3955G/s items_per_second=724.172M/s null_percent=100 size=49.152k
ArrayRankString/49152/1/tiebreaker:3 8451 ns 8450 ns 80592 bytes_per_second=5.41745G/s items_per_second=727.118M/s null_percent=100 size=49.152k
ArrayRankString/49152/0/tiebreaker:0 520132 ns 520083 ns 1364 bytes_per_second=90.1299M/s items_per_second=11.8135M/s null_percent=0 size=49.152k
ArrayRankString/49152/0/tiebreaker:2 506557 ns 506498 ns 1383 bytes_per_second=92.5473M/s items_per_second=12.1304M/s null_percent=0 size=49.152k
ArrayRankString/49152/0/tiebreaker:3 513895 ns 513846 ns 1364 bytes_per_second=91.2238M/s items_per_second=11.9569M/s null_percent=0 size=49.152k
ArrayRankString/1048576/100/tiebreaker:0 16106428 ns 16105459 ns 44 bytes_per_second=62.0907M/s items_per_second=8.13836M/s null_percent=1 size=1048.58k
ArrayRankString/1048576/100/tiebreaker:2 15814357 ns 15811895 ns 44 bytes_per_second=63.2435M/s items_per_second=8.28946M/s null_percent=1 size=1048.58k
ArrayRankString/1048576/100/tiebreaker:3 16040548 ns 16039613 ns 44 bytes_per_second=62.3456M/s items_per_second=8.17177M/s null_percent=1 size=1048.58k
ArrayRankString/8388608/100/tiebreaker:0 196319844 ns 196274245 ns 4 bytes_per_second=40.7593M/s items_per_second=5.3424M/s null_percent=1 size=8.38861M
ArrayRankString/8388608/100/tiebreaker:2 188597361 ns 188574567 ns 4 bytes_per_second=42.4235M/s items_per_second=5.56054M/s null_percent=1 size=8.38861M
ArrayRankString/8388608/100/tiebreaker:3 195841917 ns 195800818 ns 4 bytes_per_second=40.8578M/s items_per_second=5.35532M/s null_percent=1 size=8.38861M
```
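One way to sanity-check the counters in the output above is to divide the byte rate by the item rate, which gives the number of bytes the benchmark attributes to each item. The sketch below assumes that Google Benchmark prints `bytes_per_second` with 1024-based prefixes and `items_per_second` with 1000-based ones; `ImpliedBytesPerItem` is a hypothetical helper, not part of any library:

```cpp
#include <cassert>

// Bytes per item implied by a pair of rate counters.
// Assumes bytes_per_second is reported in MiB/s (1024-based) and
// items_per_second in millions of items per second (1000-based).
inline double ImpliedBytesPerItem(double mib_per_second,
                                  double mega_items_per_second) {
  assert(mega_items_per_second > 0.0);
  return (mib_per_second * 1024.0 * 1024.0) / (mega_items_per_second * 1e6);
}
```

For the first row, `ImpliedBytesPerItem(94.0741, 12.3305)` comes out at roughly 8, which would be consistent with the `args.size / sizeof(int64_t)` sizing rather than with the actual lengths of the generated strings.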
[GitHub] [arrow] pitrou commented on pull request #15041: GH-14937: [C++] String Sort / Rank Benchmarks
Posted by GitBox <gi...@apache.org>.
pitrou commented on PR #15041:
URL: https://github.com/apache/arrow/pull/15041#issuecomment-1369866110
Hmm, this seems wrong:
```
ArraySortIndicesString/49152/1 5.20 us 5.20 us 133594 bytes_per_second=8.79969G/s items_per_second=590.537M/s null_percent=100 size=49.152k
```
For a string array with 100% nulls, the data buffer will probably be empty (meaning we should take the offsets into account after all?).
[GitHub] [arrow] pitrou commented on pull request #15041: GH-14937: [C++] String Sort / Rank Benchmarks
Posted by GitBox <gi...@apache.org>.
pitrou commented on PR #15041:
URL: https://github.com/apache/arrow/pull/15041#issuecomment-1370700462
> Wasn't expecting there to be much of a difference across these - I'm guessing the other types may be misreporting their benchmark since they don't take into account the proper sizing of an all null array?
Well, fixed-width types such as Int or Bool types don't need special care: for them, a null entry consumes the same amount of space as a non-null entry. String types are variable-width, though.
The differences look believable to me (different algorithms are used), though I admit the String numbers are better than I expected.
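The fixed-width versus variable-width distinction can be put into numbers with a short sketch. Both helpers are hypothetical, and the string case assumes that null slots are stored as empty strings, which the thread treats as likely but does not confirm:

```cpp
#include <cassert>
#include <cstdint>

// Fixed-width: every slot takes `value_width` bytes, null or not.
inline int64_t FixedWidthDataBytes(int64_t length, int64_t value_width) {
  assert(length >= 0 && value_width >= 0);
  return length * value_width;
}

// Variable-width: only valid entries contribute character data
// (assuming null slots are stored as empty strings).
inline int64_t StringDataBytes(int64_t length, int64_t null_count,
                               int64_t mean_length) {
  assert(length >= 0 && mean_length >= 0);
  assert(null_count >= 0 && null_count <= length);
  return (length - null_count) * mean_length;
}
```

An `int64` array of 6144 entries occupies 49152 data bytes whatever its null count, while a string array of 3072 entries with a mean length of 16 drops from 49152 data bytes to 0 as the null count goes from 0 to 3072.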
[GitHub] [arrow] WillAyd commented on pull request #15041: GH-14937: [C++] String Sort / Rank Benchmarks
Posted by GitBox <gi...@apache.org>.
WillAyd commented on PR #15041:
URL: https://github.com/apache/arrow/pull/15041#issuecomment-1371253933
Current benchmarks for reference
```sh
-----------------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations UserCounters...
-----------------------------------------------------------------------------------------------------------
ArraySortIndicesInt64Narrow/49152/10000 15.5 us 15.5 us 45676 bytes_per_second=2.94742G/s items_per_second=395.596M/s null_percent=0.01 size=49.152k
ArraySortIndicesInt64Narrow/49152/100 20.2 us 20.2 us 34423 bytes_per_second=2.26398G/s items_per_second=303.866M/s null_percent=1 size=49.152k
ArraySortIndicesInt64Narrow/49152/10 23.7 us 23.7 us 29206 bytes_per_second=1.9344G/s items_per_second=259.631M/s null_percent=10 size=49.152k
ArraySortIndicesInt64Narrow/49152/2 49.6 us 49.5 us 13676 bytes_per_second=947.059M/s items_per_second=124.133M/s null_percent=50 size=49.152k
ArraySortIndicesInt64Narrow/49152/1 8.14 us 8.13 us 88501 bytes_per_second=5.63026G/s items_per_second=755.681M/s null_percent=100 size=49.152k
ArraySortIndicesInt64Narrow/49152/0 15.9 us 15.9 us 43404 bytes_per_second=2.8793G/s items_per_second=386.453M/s null_percent=0 size=49.152k
ArraySortIndicesInt64Narrow/1048576/100 410 us 410 us 1642 bytes_per_second=2.38173G/s items_per_second=319.671M/s null_percent=1 size=1048.58k
ArraySortIndicesInt64Narrow/8388608/100 7132 us 7128 us 109 bytes_per_second=1122.4M/s items_per_second=147.115M/s null_percent=1 size=8.38861M
ArraySortIndicesInt64Wide/49152/10000 335 us 335 us 2057 bytes_per_second=139.945M/s items_per_second=18.3429M/s null_percent=0.01 size=49.152k
ArraySortIndicesInt64Wide/49152/100 335 us 335 us 2076 bytes_per_second=139.889M/s items_per_second=18.3355M/s null_percent=1 size=49.152k
ArraySortIndicesInt64Wide/49152/10 315 us 315 us 2224 bytes_per_second=149.004M/s items_per_second=19.5303M/s null_percent=10 size=49.152k
ArraySortIndicesInt64Wide/49152/2 190 us 190 us 3649 bytes_per_second=246.172M/s items_per_second=32.2662M/s null_percent=50 size=49.152k
ArraySortIndicesInt64Wide/49152/1 8.18 us 8.18 us 87137 bytes_per_second=5.59537G/s items_per_second=750.997M/s null_percent=100 size=49.152k
ArraySortIndicesInt64Wide/49152/0 333 us 333 us 2093 bytes_per_second=140.882M/s items_per_second=18.4657M/s null_percent=0 size=49.152k
ArraySortIndicesInt64Wide/1048576/100 12247 us 12234 us 64 bytes_per_second=81.7388M/s items_per_second=10.7137M/s null_percent=1 size=1048.58k
ArraySortIndicesInt64Wide/8388608/100 153351 us 153266 us 5 bytes_per_second=52.1967M/s items_per_second=6.84153M/s null_percent=1 size=8.38861M
ArraySortIndicesBool/49152/10000 570 us 569 us 1224 bytes_per_second=82.3192M/s items_per_second=690.543M/s null_percent=0.01 size=49.152k
ArraySortIndicesBool/49152/100 645 us 645 us 1050 bytes_per_second=72.6789M/s items_per_second=609.675M/s null_percent=1 size=49.152k
ArraySortIndicesBool/49152/10 956 us 955 us 732 bytes_per_second=49.0653M/s items_per_second=411.59M/s null_percent=10 size=49.152k
ArraySortIndicesBool/49152/2 2086 us 2086 us 334 bytes_per_second=22.4759M/s items_per_second=188.541M/s null_percent=50 size=49.152k
ArraySortIndicesBool/49152/1 205 us 205 us 3488 bytes_per_second=228.577M/s items_per_second=1.91744G/s null_percent=100 size=49.152k
ArraySortIndicesBool/49152/0 2028 us 2027 us 341 bytes_per_second=23.1225M/s items_per_second=193.966M/s null_percent=0 size=49.152k
ArraySortIndicesBool/1048576/100 43119 us 43115 us 16 bytes_per_second=23.1938M/s items_per_second=194.563M/s null_percent=1 size=1048.58k
ArraySortIndicesBool/8388608/100 345296 us 345236 us 2 bytes_per_second=23.1726M/s items_per_second=194.385M/s null_percent=1 size=8.38861M
ArraySortIndicesString/49152/10000 265 us 265 us 2673 bytes_per_second=176.8M/s items_per_second=11.5868M/s null_percent=0.01 size=49.152k
ArraySortIndicesString/49152/100 262 us 262 us 2686 bytes_per_second=179.18M/s items_per_second=11.7427M/s null_percent=1 size=49.152k
ArraySortIndicesString/49152/10 244 us 244 us 2916 bytes_per_second=192.263M/s items_per_second=12.6001M/s null_percent=10 size=49.152k
ArraySortIndicesString/49152/2 118 us 118 us 5974 bytes_per_second=397.285M/s items_per_second=26.0365M/s null_percent=50 size=49.152k
ArraySortIndicesString/49152/1 5.46 us 5.46 us 130624 bytes_per_second=8.38354G/s items_per_second=562.61M/s null_percent=100 size=49.152k
ArraySortIndicesString/49152/0 266 us 266 us 2709 bytes_per_second=176.125M/s items_per_second=11.5425M/s null_percent=0 size=49.152k
ArraySortIndicesString/1048576/100 9262 us 9259 us 74 bytes_per_second=108.001M/s items_per_second=7.07793M/s null_percent=1 size=1048.58k
ArraySortIndicesString/8388608/100 110329 us 110300 us 7 bytes_per_second=72.5294M/s items_per_second=4.75329M/s null_percent=1 size=8.38861M
ChunkedArraySortIndicesInt64Narrow/49152/10000 455 us 455 us 1531 bytes_per_second=102.967M/s items_per_second=13.4873M/s null_percent=0.01 size=49.152k
ChunkedArraySortIndicesInt64Narrow/49152/100 470 us 470 us 1505 bytes_per_second=99.6994M/s items_per_second=13.0593M/s null_percent=1 size=49.152k
ChunkedArraySortIndicesInt64Narrow/49152/10 436 us 436 us 1604 bytes_per_second=107.534M/s items_per_second=14.0855M/s null_percent=10 size=49.152k
ChunkedArraySortIndicesInt64Narrow/49152/2 293 us 293 us 2416 bytes_per_second=159.811M/s items_per_second=20.9331M/s null_percent=50 size=49.152k
ChunkedArraySortIndicesInt64Narrow/49152/1 10.5 us 10.5 us 67293 bytes_per_second=4.34396G/s items_per_second=582.657M/s null_percent=100 size=49.152k
ChunkedArraySortIndicesInt64Narrow/49152/0 463 us 463 us 1510 bytes_per_second=101.328M/s items_per_second=13.2727M/s null_percent=0 size=49.152k
ChunkedArraySortIndicesInt64Narrow/1048576/100 2554 us 2553 us 277 bytes_per_second=391.623M/s items_per_second=51.33M/s null_percent=1 size=1048.58k
ChunkedArraySortIndicesInt64Narrow/8388608/100 56881 us 56865 us 10 bytes_per_second=140.685M/s items_per_second=18.4398M/s null_percent=1 size=8.38861M
ChunkedArraySortIndicesInt64Wide/49152/10000 693 us 693 us 1023 bytes_per_second=67.6395M/s items_per_second=8.85988M/s null_percent=0.01 size=49.152k
ChunkedArraySortIndicesInt64Wide/49152/100 662 us 662 us 972 bytes_per_second=70.8239M/s items_per_second=9.27699M/s null_percent=1 size=49.152k
ChunkedArraySortIndicesInt64Wide/49152/10 609 us 609 us 1138 bytes_per_second=76.9268M/s items_per_second=10.0764M/s null_percent=10 size=49.152k
ChunkedArraySortIndicesInt64Wide/49152/2 389 us 388 us 1949 bytes_per_second=120.674M/s items_per_second=15.8067M/s null_percent=50 size=49.152k
ChunkedArraySortIndicesInt64Wide/49152/1 11.5 us 11.5 us 64381 bytes_per_second=3.96721G/s items_per_second=532.123M/s null_percent=100 size=49.152k
ChunkedArraySortIndicesInt64Wide/49152/0 646 us 646 us 979 bytes_per_second=72.5378M/s items_per_second=9.50148M/s null_percent=0 size=49.152k
ChunkedArraySortIndicesInt64Wide/1048576/100 18976 us 18966 us 40 bytes_per_second=52.7265M/s items_per_second=6.91086M/s null_percent=1 size=1048.58k
ChunkedArraySortIndicesInt64Wide/8388608/100 199157 us 199095 us 3 bytes_per_second=40.1817M/s items_per_second=5.26667M/s null_percent=1 size=8.38861M
ChunkedArraySortIndicesString/49152/10000 315 us 315 us 2193 bytes_per_second=148.941M/s items_per_second=9.75464M/s null_percent=0.01 size=49.152k
ChunkedArraySortIndicesString/49152/100 318 us 318 us 2184 bytes_per_second=147.247M/s items_per_second=9.6437M/s null_percent=1 size=49.152k
ChunkedArraySortIndicesString/49152/10 285 us 285 us 2438 bytes_per_second=164.451M/s items_per_second=10.7704M/s null_percent=10 size=49.152k
ChunkedArraySortIndicesString/49152/2 149 us 149 us 4685 bytes_per_second=315.57M/s items_per_second=20.6677M/s null_percent=50 size=49.152k
ChunkedArraySortIndicesString/49152/1 6.51 us 6.51 us 109635 bytes_per_second=7.03542G/s items_per_second=471.832M/s null_percent=100 size=49.152k
ChunkedArraySortIndicesString/49152/0 316 us 316 us 2194 bytes_per_second=148.195M/s items_per_second=9.70581M/s null_percent=0 size=49.152k
ChunkedArraySortIndicesString/1048576/100 10209 us 10208 us 70 bytes_per_second=97.9634M/s items_per_second=6.41954M/s null_percent=1 size=1048.58k
ChunkedArraySortIndicesString/8388608/100 123473 us 123454 us 6 bytes_per_second=64.8016M/s items_per_second=4.24677M/s null_percent=1 size=8.38861M
RecordBatchSortIndicesInt64Narrow/1048576/100/16 546161727 ns 536132413 ns 1 columns=16 items_per_second=1.95582M/s null_percent=1
RecordBatchSortIndicesInt64Narrow/1048576/4/16 729039256 ns 728891549 ns 1 columns=16 items_per_second=1.43859M/s null_percent=25
RecordBatchSortIndicesInt64Narrow/1048576/0/16 458123788 ns 458102128 ns 2 columns=16 items_per_second=2.28896M/s null_percent=0
RecordBatchSortIndicesInt64Narrow/1048576/100/8 311081842 ns 311045818 ns 2 columns=8 items_per_second=3.37113M/s null_percent=1
RecordBatchSortIndicesInt64Narrow/1048576/4/8 293342274 ns 293184202 ns 2 columns=8 items_per_second=3.57651M/s null_percent=25
RecordBatchSortIndicesInt64Narrow/1048576/0/8 291880160 ns 291805165 ns 2 columns=8 items_per_second=3.59341M/s null_percent=0
RecordBatchSortIndicesInt64Narrow/1048576/100/2 197146592 ns 197141030 ns 3 columns=2 items_per_second=5.31891M/s null_percent=1
RecordBatchSortIndicesInt64Narrow/1048576/4/2 157202426 ns 157176270 ns 5 columns=2 items_per_second=6.67134M/s null_percent=25
RecordBatchSortIndicesInt64Narrow/1048576/0/2 218459061 ns 218320080 ns 4 columns=2 items_per_second=4.80293M/s null_percent=0
RecordBatchSortIndicesInt64Narrow/1048576/100/1 9644101 ns 9628726 ns 65 columns=1 items_per_second=108.901M/s null_percent=1
RecordBatchSortIndicesInt64Narrow/1048576/4/1 14704447 ns 14702499 ns 47 columns=1 items_per_second=71.3196M/s null_percent=25
RecordBatchSortIndicesInt64Narrow/1048576/0/1 7269451 ns 7267066 ns 89 columns=1 items_per_second=144.292M/s null_percent=0
RecordBatchSortIndicesInt64Wide/1048576/100/16 162878934 ns 162861603 ns 4 columns=16 items_per_second=6.43845M/s null_percent=1
RecordBatchSortIndicesInt64Wide/1048576/4/16 206767491 ns 206723881 ns 3 columns=16 items_per_second=5.07235M/s null_percent=25
RecordBatchSortIndicesInt64Wide/1048576/0/16 158269187 ns 158253366 ns 4 columns=16 items_per_second=6.62593M/s null_percent=0
RecordBatchSortIndicesInt64Wide/1048576/100/8 135000467 ns 134965348 ns 5 columns=8 items_per_second=7.76922M/s null_percent=1
RecordBatchSortIndicesInt64Wide/1048576/4/8 141942171 ns 141856892 ns 5 columns=8 items_per_second=7.39179M/s null_percent=25
RecordBatchSortIndicesInt64Wide/1048576/0/8 137735971 ns 137671519 ns 5 columns=8 items_per_second=7.61651M/s null_percent=0
RecordBatchSortIndicesInt64Wide/1048576/100/2 134019244 ns 133990177 ns 5 columns=2 items_per_second=7.82577M/s null_percent=1
RecordBatchSortIndicesInt64Wide/1048576/4/2 139541245 ns 139521988 ns 5 columns=2 items_per_second=7.51549M/s null_percent=25
RecordBatchSortIndicesInt64Wide/1048576/0/2 137039816 ns 137016566 ns 5 columns=2 items_per_second=7.65291M/s null_percent=0
RecordBatchSortIndicesInt64Wide/1048576/100/1 130024193 ns 130002196 ns 5 columns=1 items_per_second=8.06583M/s null_percent=1
RecordBatchSortIndicesInt64Wide/1048576/4/1 103095595 ns 103082158 ns 7 columns=1 items_per_second=10.1722M/s null_percent=25
RecordBatchSortIndicesInt64Wide/1048576/0/1 133555623 ns 133540078 ns 5 columns=1 items_per_second=7.85214M/s null_percent=0
TableSortIndicesInt64Narrow/1048576/100/16/32 538922286 ns 538814541 ns 1 chunks=32 columns=16 items_per_second=1.94608M/s null_percent=1
TableSortIndicesInt64Narrow/1048576/4/16/32 607403337 ns 607225427 ns 1 chunks=32 columns=16 items_per_second=1.72683M/s null_percent=25
TableSortIndicesInt64Narrow/1048576/0/16/32 554842186 ns 554332751 ns 1 chunks=32 columns=16 items_per_second=1.8916M/s null_percent=0
TableSortIndicesInt64Narrow/1048576/100/8/32 575178126 ns 574971073 ns 1 chunks=32 columns=8 items_per_second=1.8237M/s null_percent=1
TableSortIndicesInt64Narrow/1048576/4/8/32 656657039 ns 656534559 ns 1 chunks=32 columns=8 items_per_second=1.59714M/s null_percent=25
TableSortIndicesInt64Narrow/1048576/0/8/32 610865260 ns 610781139 ns 1 chunks=32 columns=8 items_per_second=1.71678M/s null_percent=0
TableSortIndicesInt64Narrow/1048576/100/2/32 345761583 ns 345582480 ns 2 chunks=32 columns=2 items_per_second=3.03423M/s null_percent=1
TableSortIndicesInt64Narrow/1048576/4/2/32 303500856 ns 302936128 ns 2 chunks=32 columns=2 items_per_second=3.46138M/s null_percent=25
TableSortIndicesInt64Narrow/1048576/0/2/32 338505433 ns 338347509 ns 2 chunks=32 columns=2 items_per_second=3.09911M/s null_percent=0
TableSortIndicesInt64Narrow/1048576/100/1/32 52370141 ns 52356937 ns 14 chunks=32 columns=1 items_per_second=20.0275M/s null_percent=1
TableSortIndicesInt64Narrow/1048576/4/1/32 48395924 ns 48386814 ns 14 chunks=32 columns=1 items_per_second=21.6707M/s null_percent=25
TableSortIndicesInt64Narrow/1048576/0/1/32 46487213 ns 46478828 ns 15 chunks=32 columns=1 items_per_second=22.5603M/s null_percent=0
TableSortIndicesInt64Narrow/1048576/100/16/4 379664057 ns 379550752 ns 2 chunks=4 columns=16 items_per_second=2.76268M/s null_percent=1
TableSortIndicesInt64Narrow/1048576/4/16/4 434667384 ns 434507921 ns 2 chunks=4 columns=16 items_per_second=2.41325M/s null_percent=25
TableSortIndicesInt64Narrow/1048576/0/16/4 451171115 ns 451004582 ns 2 chunks=4 columns=16 items_per_second=2.32498M/s null_percent=0
TableSortIndicesInt64Narrow/1048576/100/8/4 514766509 ns 514689321 ns 1 chunks=4 columns=8 items_per_second=2.0373M/s null_percent=1
TableSortIndicesInt64Narrow/1048576/4/8/4 517120654 ns 517008925 ns 1 chunks=4 columns=8 items_per_second=2.02816M/s null_percent=25
TableSortIndicesInt64Narrow/1048576/0/8/4 328218902 ns 328209329 ns 2 chunks=4 columns=8 items_per_second=3.19484M/s null_percent=0
TableSortIndicesInt64Narrow/1048576/100/2/4 219996516 ns 219992041 ns 3 chunks=4 columns=2 items_per_second=4.76643M/s null_percent=1
TableSortIndicesInt64Narrow/1048576/4/2/4 182214649 ns 182184224 ns 4 chunks=4 columns=2 items_per_second=5.75558M/s null_percent=25
TableSortIndicesInt64Narrow/1048576/0/2/4 203589493 ns 203565137 ns 3 chunks=4 columns=2 items_per_second=5.15106M/s null_percent=0
TableSortIndicesInt64Narrow/1048576/100/1/4 27873663 ns 27871123 ns 25 chunks=4 columns=1 items_per_second=37.6223M/s null_percent=1
TableSortIndicesInt64Narrow/1048576/4/1/4 29274627 ns 29271625 ns 24 chunks=4 columns=1 items_per_second=35.8223M/s null_percent=25
TableSortIndicesInt64Narrow/1048576/0/1/4 26093800 ns 26091197 ns 27 chunks=4 columns=1 items_per_second=40.1889M/s null_percent=0
TableSortIndicesInt64Narrow/1048576/100/16/1 240431040 ns 240383326 ns 3 chunks=1 columns=16 items_per_second=4.3621M/s null_percent=1
TableSortIndicesInt64Narrow/1048576/4/16/1 244027959 ns 244024461 ns 3 chunks=1 columns=16 items_per_second=4.29701M/s null_percent=25
TableSortIndicesInt64Narrow/1048576/0/16/1 233650110 ns 233628518 ns 3 chunks=1 columns=16 items_per_second=4.48822M/s null_percent=0
TableSortIndicesInt64Narrow/1048576/100/8/1 250416410 ns 250292024 ns 3 chunks=1 columns=8 items_per_second=4.18941M/s null_percent=1
TableSortIndicesInt64Narrow/1048576/4/8/1 247212150 ns 247127546 ns 3 chunks=1 columns=8 items_per_second=4.24306M/s null_percent=25
TableSortIndicesInt64Narrow/1048576/0/8/1 246898831 ns 246735603 ns 3 chunks=1 columns=8 items_per_second=4.2498M/s null_percent=0
TableSortIndicesInt64Narrow/1048576/100/2/1 170236300 ns 170103354 ns 4 chunks=1 columns=2 items_per_second=6.16435M/s null_percent=1
TableSortIndicesInt64Narrow/1048576/4/2/1 129738934 ns 129680884 ns 5 chunks=1 columns=2 items_per_second=8.08582M/s null_percent=25
TableSortIndicesInt64Narrow/1048576/0/2/1 164768531 ns 164696132 ns 4 chunks=1 columns=2 items_per_second=6.36673M/s null_percent=0
TableSortIndicesInt64Narrow/1048576/100/1/1 7142175 ns 7137741 ns 109 chunks=1 columns=1 items_per_second=146.906M/s null_percent=1
TableSortIndicesInt64Narrow/1048576/4/1/1 13726694 ns 13724758 ns 53 chunks=1 columns=1 items_per_second=76.4003M/s null_percent=25
TableSortIndicesInt64Narrow/1048576/0/1/1 6291511 ns 6289629 ns 111 chunks=1 columns=1 items_per_second=166.715M/s null_percent=0
TableSortIndicesInt64Wide/1048576/100/16/32 282154779 ns 282102362 ns 2 chunks=32 columns=16 items_per_second=3.71701M/s null_percent=1
TableSortIndicesInt64Wide/1048576/4/16/32 323114414 ns 323038619 ns 2 chunks=32 columns=16 items_per_second=3.24598M/s null_percent=25
TableSortIndicesInt64Wide/1048576/0/16/32 283838788 ns 283779149 ns 2 chunks=32 columns=16 items_per_second=3.69504M/s null_percent=0
TableSortIndicesInt64Wide/1048576/100/8/32 274442755 ns 274391367 ns 3 chunks=32 columns=8 items_per_second=3.82146M/s null_percent=1
TableSortIndicesInt64Wide/1048576/4/8/32 319674204 ns 319640066 ns 2 chunks=32 columns=8 items_per_second=3.28049M/s null_percent=25
TableSortIndicesInt64Wide/1048576/0/8/32 283917345 ns 283882309 ns 3 chunks=32 columns=8 items_per_second=3.6937M/s null_percent=0
TableSortIndicesInt64Wide/1048576/100/2/32 288733571 ns 288667813 ns 2 chunks=32 columns=2 items_per_second=3.63247M/s null_percent=1
TableSortIndicesInt64Wide/1048576/4/2/32 291170632 ns 291109031 ns 2 chunks=32 columns=2 items_per_second=3.602M/s null_percent=25
TableSortIndicesInt64Wide/1048576/0/2/32 288625666 ns 288599262 ns 2 chunks=32 columns=2 items_per_second=3.63333M/s null_percent=0
TableSortIndicesInt64Wide/1048576/100/1/32 246251220 ns 246229459 ns 3 chunks=32 columns=1 items_per_second=4.25853M/s null_percent=1
TableSortIndicesInt64Wide/1048576/4/1/32 194856808 ns 194806344 ns 3 chunks=32 columns=1 items_per_second=5.38266M/s null_percent=25
TableSortIndicesInt64Wide/1048576/0/1/32 252999594 ns 252941949 ns 3 chunks=32 columns=1 items_per_second=4.14552M/s null_percent=0
TableSortIndicesInt64Wide/1048576/100/16/4 220412846 ns 220254527 ns 4 chunks=4 columns=16 items_per_second=4.76075M/s null_percent=1
TableSortIndicesInt64Wide/1048576/4/16/4 197479989 ns 197391243 ns 4 chunks=4 columns=16 items_per_second=5.31217M/s null_percent=25
TableSortIndicesInt64Wide/1048576/0/16/4 170714583 ns 170557557 ns 4 chunks=4 columns=16 items_per_second=6.14793M/s null_percent=0
TableSortIndicesInt64Wide/1048576/100/8/4 163640957 ns 163561290 ns 4 chunks=4 columns=8 items_per_second=6.41091M/s null_percent=1
TableSortIndicesInt64Wide/1048576/4/8/4 186365610 ns 186320992 ns 3 chunks=4 columns=8 items_per_second=5.62779M/s null_percent=25
TableSortIndicesInt64Wide/1048576/0/8/4 164496404 ns 164403010 ns 4 chunks=4 columns=8 items_per_second=6.37808M/s null_percent=0
TableSortIndicesInt64Wide/1048576/100/2/4 190723210 ns 190560487 ns 4 chunks=4 columns=2 items_per_second=5.50259M/s null_percent=1
TableSortIndicesInt64Wide/1048576/4/2/4 187648075 ns 187572701 ns 4 chunks=4 columns=2 items_per_second=5.59024M/s null_percent=25
TableSortIndicesInt64Wide/1048576/0/2/4 170689415 ns 170661996 ns 4 chunks=4 columns=2 items_per_second=6.14417M/s null_percent=0
TableSortIndicesInt64Wide/1048576/100/1/4 155087135 ns 155063896 ns 5 chunks=4 columns=1 items_per_second=6.76222M/s null_percent=1
TableSortIndicesInt64Wide/1048576/4/1/4 123836520 ns 123741591 ns 6 chunks=4 columns=1 items_per_second=8.47392M/s null_percent=25
TableSortIndicesInt64Wide/1048576/0/1/4 158552236 ns 158514442 ns 5 chunks=4 columns=1 items_per_second=6.61502M/s null_percent=0
TableSortIndicesInt64Wide/1048576/100/16/1 146697643 ns 146658003 ns 5 chunks=1 columns=16 items_per_second=7.1498M/s null_percent=1
TableSortIndicesInt64Wide/1048576/4/16/1 144550680 ns 144524302 ns 5 chunks=1 columns=16 items_per_second=7.25536M/s null_percent=25
TableSortIndicesInt64Wide/1048576/0/16/1 134424882 ns 134397374 ns 5 chunks=1 columns=16 items_per_second=7.80206M/s null_percent=0
TableSortIndicesInt64Wide/1048576/100/8/1 137243734 ns 137219829 ns 5 chunks=1 columns=8 items_per_second=7.64158M/s null_percent=1
TableSortIndicesInt64Wide/1048576/4/8/1 136096084 ns 136061325 ns 5 chunks=1 columns=8 items_per_second=7.70664M/s null_percent=25
TableSortIndicesInt64Wide/1048576/0/8/1 148885606 ns 148858543 ns 5 chunks=1 columns=8 items_per_second=7.04411M/s null_percent=0
TableSortIndicesInt64Wide/1048576/100/2/1 137272404 ns 137257964 ns 5 chunks=1 columns=2 items_per_second=7.63945M/s null_percent=1
TableSortIndicesInt64Wide/1048576/4/2/1 130897198 ns 130875750 ns 5 chunks=1 columns=2 items_per_second=8.012M/s null_percent=25
TableSortIndicesInt64Wide/1048576/0/2/1 135466136 ns 135438513 ns 5 chunks=1 columns=2 items_per_second=7.74208M/s null_percent=0
TableSortIndicesInt64Wide/1048576/100/1/1 134072903 ns 134064678 ns 5 chunks=1 columns=1 items_per_second=7.82142M/s null_percent=1
TableSortIndicesInt64Wide/1048576/4/1/1 111578330 ns 111556659 ns 6 chunks=1 columns=1 items_per_second=9.39949M/s null_percent=25
TableSortIndicesInt64Wide/1048576/0/1/1 138848183 ns 138800246 ns 6 chunks=1 columns=1 items_per_second=7.55457M/s null_percent=0
ArrayRankInt64Narrow/49152/10000/tiebreaker:0 23040 ns 23036 ns 29851 bytes_per_second=1.98713G/s items_per_second=266.708M/s null_percent=0.01 size=49.152k
ArrayRankInt64Narrow/49152/10000/tiebreaker:2 20612 ns 20608 ns 34086 bytes_per_second=2.22125G/s items_per_second=298.131M/s null_percent=0.01 size=49.152k
ArrayRankInt64Narrow/49152/10000/tiebreaker:3 23091 ns 23088 ns 30172 bytes_per_second=1.98265G/s items_per_second=266.107M/s null_percent=0.01 size=49.152k
ArrayRankInt64Narrow/49152/100/tiebreaker:0 27391 ns 27388 ns 25435 bytes_per_second=1.6714G/s items_per_second=224.331M/s null_percent=1 size=49.152k
ArrayRankInt64Narrow/49152/100/tiebreaker:2 25114 ns 25111 ns 27301 bytes_per_second=1.82293G/s items_per_second=244.669M/s null_percent=1 size=49.152k
ArrayRankInt64Narrow/49152/100/tiebreaker:3 27654 ns 27650 ns 25459 bytes_per_second=1.65555G/s items_per_second=222.204M/s null_percent=1 size=49.152k
ArrayRankInt64Narrow/49152/10/tiebreaker:0 29776 ns 29774 ns 23250 bytes_per_second=1.53745G/s items_per_second=206.353M/s null_percent=10 size=49.152k
ArrayRankInt64Narrow/49152/10/tiebreaker:2 27764 ns 27761 ns 25327 bytes_per_second=1.64894G/s items_per_second=221.317M/s null_percent=10 size=49.152k
ArrayRankInt64Narrow/49152/10/tiebreaker:3 30210 ns 30204 ns 23210 bytes_per_second=1.51559G/s items_per_second=203.42M/s null_percent=10 size=49.152k
ArrayRankInt64Narrow/49152/2/tiebreaker:0 55596 ns 55587 ns 12018 bytes_per_second=843.27M/s items_per_second=110.529M/s null_percent=50 size=49.152k
ArrayRankInt64Narrow/49152/2/tiebreaker:2 51985 ns 51979 ns 13227 bytes_per_second=901.812M/s items_per_second=118.202M/s null_percent=50 size=49.152k
ArrayRankInt64Narrow/49152/2/tiebreaker:3 56451 ns 56438 ns 12399 bytes_per_second=830.551M/s items_per_second=108.862M/s null_percent=50 size=49.152k
ArrayRankInt64Narrow/49152/1/tiebreaker:0 9531 ns 9530 ns 75879 bytes_per_second=4.80362G/s items_per_second=644.731M/s null_percent=100 size=49.152k
ArrayRankInt64Narrow/49152/1/tiebreaker:2 9371 ns 9370 ns 75221 bytes_per_second=4.88546G/s items_per_second=655.715M/s null_percent=100 size=49.152k
ArrayRankInt64Narrow/49152/1/tiebreaker:3 9706 ns 9703 ns 73698 bytes_per_second=4.71774G/s items_per_second=633.204M/s null_percent=100 size=49.152k
ArrayRankInt64Narrow/49152/0/tiebreaker:0 22030 ns 22029 ns 31224 bytes_per_second=2.07799G/s items_per_second=278.903M/s null_percent=0 size=49.152k
ArrayRankInt64Narrow/49152/0/tiebreaker:2 19646 ns 19644 ns 35662 bytes_per_second=2.33027G/s items_per_second=312.763M/s null_percent=0 size=49.152k
ArrayRankInt64Narrow/49152/0/tiebreaker:3 23672 ns 23668 ns 31362 bytes_per_second=1.93407G/s items_per_second=259.586M/s null_percent=0 size=49.152k
ArrayRankInt64Narrow/1048576/100/tiebreaker:0 888323 ns 888120 ns 621 bytes_per_second=1125.97M/s items_per_second=147.584M/s null_percent=1 size=1048.58k
ArrayRankInt64Narrow/1048576/100/tiebreaker:2 619920 ns 619820 ns 1122 bytes_per_second=1.57556G/s items_per_second=211.468M/s null_percent=1 size=1048.58k
ArrayRankInt64Narrow/1048576/100/tiebreaker:3 782186 ns 782118 ns 899 bytes_per_second=1.24861G/s items_per_second=167.586M/s null_percent=1 size=1048.58k
ArrayRankInt64Narrow/8388608/100/tiebreaker:0 20591010 ns 20585729 ns 35 bytes_per_second=388.619M/s items_per_second=50.937M/s null_percent=1 size=8.38861M
ArrayRankInt64Narrow/8388608/100/tiebreaker:2 12675380 ns 12672631 ns 54 bytes_per_second=631.282M/s items_per_second=82.7434M/s null_percent=1 size=8.38861M
ArrayRankInt64Narrow/8388608/100/tiebreaker:3 21603586 ns 21596951 ns 32 bytes_per_second=370.423M/s items_per_second=48.552M/s null_percent=1 size=8.38861M
ArrayRankInt64Wide/49152/10000/tiebreaker:0 362997 ns 362984 ns 1909 bytes_per_second=129.138M/s items_per_second=16.9264M/s null_percent=0.01 size=49.152k
ArrayRankInt64Wide/49152/10000/tiebreaker:2 357339 ns 357323 ns 1954 bytes_per_second=131.184M/s items_per_second=17.1945M/s null_percent=0.01 size=49.152k
ArrayRankInt64Wide/49152/10000/tiebreaker:3 363036 ns 363032 ns 1938 bytes_per_second=129.121M/s items_per_second=16.9241M/s null_percent=0.01 size=49.152k
ArrayRankInt64Wide/49152/100/tiebreaker:0 362013 ns 361995 ns 1937 bytes_per_second=129.491M/s items_per_second=16.9726M/s null_percent=1 size=49.152k
ArrayRankInt64Wide/49152/100/tiebreaker:2 359903 ns 359859 ns 1882 bytes_per_second=130.259M/s items_per_second=17.0733M/s null_percent=1 size=49.152k
ArrayRankInt64Wide/49152/100/tiebreaker:3 362354 ns 362257 ns 1940 bytes_per_second=129.397M/s items_per_second=16.9603M/s null_percent=1 size=49.152k
ArrayRankInt64Wide/49152/10/tiebreaker:0 346142 ns 346024 ns 2053 bytes_per_second=135.467M/s items_per_second=17.756M/s null_percent=10 size=49.152k
ArrayRankInt64Wide/49152/10/tiebreaker:2 329143 ns 329029 ns 2077 bytes_per_second=142.465M/s items_per_second=18.6731M/s null_percent=10 size=49.152k
ArrayRankInt64Wide/49152/10/tiebreaker:3 329129 ns 329057 ns 2104 bytes_per_second=142.452M/s items_per_second=18.6715M/s null_percent=10 size=49.152k
ArrayRankInt64Wide/49152/2/tiebreaker:0 197943 ns 197922 ns 3500 bytes_per_second=236.836M/s items_per_second=31.0426M/s null_percent=50 size=49.152k
ArrayRankInt64Wide/49152/2/tiebreaker:2 208030 ns 207852 ns 3548 bytes_per_second=225.521M/s items_per_second=29.5594M/s null_percent=50 size=49.152k
ArrayRankInt64Wide/49152/2/tiebreaker:3 199754 ns 199705 ns 3501 bytes_per_second=234.722M/s items_per_second=30.7654M/s null_percent=50 size=49.152k
ArrayRankInt64Wide/49152/1/tiebreaker:0 9471 ns 9470 ns 74005 bytes_per_second=4.83392G/s items_per_second=648.798M/s null_percent=100 size=49.152k
ArrayRankInt64Wide/49152/1/tiebreaker:2 9336 ns 9334 ns 74274 bytes_per_second=4.90436G/s items_per_second=658.253M/s null_percent=100 size=49.152k
ArrayRankInt64Wide/49152/1/tiebreaker:3 9536 ns 9533 ns 74531 bytes_per_second=4.8019G/s items_per_second=644.5M/s null_percent=100 size=49.152k
ArrayRankInt64Wide/49152/0/tiebreaker:0 345124 ns 345121 ns 2021 bytes_per_second=135.822M/s items_per_second=17.8024M/s null_percent=0 size=49.152k
ArrayRankInt64Wide/49152/0/tiebreaker:2 347009 ns 346879 ns 2021 bytes_per_second=135.134M/s items_per_second=17.7123M/s null_percent=0 size=49.152k
ArrayRankInt64Wide/49152/0/tiebreaker:3 367599 ns 367395 ns 2013 bytes_per_second=127.587M/s items_per_second=16.7231M/s null_percent=0 size=49.152k
ArrayRankInt64Wide/1048576/100/tiebreaker:0 11470269 ns 11467821 ns 46 bytes_per_second=87.2005M/s items_per_second=11.4295M/s null_percent=1 size=1048.58k
ArrayRankInt64Wide/1048576/100/tiebreaker:2 11284156 ns 11283298 ns 62 bytes_per_second=88.6266M/s items_per_second=11.6165M/s null_percent=1 size=1048.58k
ArrayRankInt64Wide/1048576/100/tiebreaker:3 11370761 ns 11369963 ns 62 bytes_per_second=87.951M/s items_per_second=11.5279M/s null_percent=1 size=1048.58k
ArrayRankInt64Wide/8388608/100/tiebreaker:0 138620083 ns 138584564 ns 5 bytes_per_second=57.7265M/s items_per_second=7.56633M/s null_percent=1 size=8.38861M
ArrayRankInt64Wide/8388608/100/tiebreaker:2 128852309 ns 128837561 ns 5 bytes_per_second=62.0937M/s items_per_second=8.13874M/s null_percent=1 size=8.38861M
ArrayRankInt64Wide/8388608/100/tiebreaker:3 136414474 ns 136392395 ns 5 bytes_per_second=58.6543M/s items_per_second=7.68794M/s null_percent=1 size=8.38861M
ArrayRankString/49152/10000/tiebreaker:0 252891 ns 252872 ns 2767 bytes_per_second=185.371M/s items_per_second=12.1484M/s null_percent=0.01 size=49.152k
ArrayRankString/49152/10000/tiebreaker:2 241034 ns 241000 ns 2909 bytes_per_second=194.502M/s items_per_second=12.7469M/s null_percent=0.01 size=49.152k
ArrayRankString/49152/10000/tiebreaker:3 262325 ns 262200 ns 2638 bytes_per_second=178.775M/s items_per_second=11.7162M/s null_percent=0.01 size=49.152k
ArrayRankString/49152/100/tiebreaker:0 261780 ns 261610 ns 2654 bytes_per_second=179.179M/s items_per_second=11.7427M/s null_percent=1 size=49.152k
ArrayRankString/49152/100/tiebreaker:2 258946 ns 258730 ns 2399 bytes_per_second=181.173M/s items_per_second=11.8734M/s null_percent=1 size=49.152k
ArrayRankString/49152/100/tiebreaker:3 251474 ns 251410 ns 2766 bytes_per_second=186.449M/s items_per_second=12.2191M/s null_percent=1 size=49.152k
ArrayRankString/49152/10/tiebreaker:0 250308 ns 249992 ns 3034 bytes_per_second=187.506M/s items_per_second=12.2884M/s null_percent=10 size=49.152k
ArrayRankString/49152/10/tiebreaker:2 260481 ns 260124 ns 2736 bytes_per_second=180.202M/s items_per_second=11.8097M/s null_percent=10 size=49.152k
ArrayRankString/49152/10/tiebreaker:3 272545 ns 272077 ns 2748 bytes_per_second=172.286M/s items_per_second=11.2909M/s null_percent=10 size=49.152k
ArrayRankString/49152/2/tiebreaker:0 142936 ns 142583 ns 5295 bytes_per_second=328.756M/s items_per_second=21.5453M/s null_percent=50 size=49.152k
ArrayRankString/49152/2/tiebreaker:2 123341 ns 123271 ns 4851 bytes_per_second=380.259M/s items_per_second=24.9206M/s null_percent=50 size=49.152k
ArrayRankString/49152/2/tiebreaker:3 142482 ns 142343 ns 5180 bytes_per_second=329.31M/s items_per_second=21.5817M/s null_percent=50 size=49.152k
ArrayRankString/49152/1/tiebreaker:0 6656 ns 6635 ns 118222 bytes_per_second=6.89882G/s items_per_second=462.972M/s null_percent=100 size=49.152k
ArrayRankString/49152/1/tiebreaker:2 6141 ns 6140 ns 93642 bytes_per_second=7.45517G/s items_per_second=500.308M/s null_percent=100 size=49.152k
ArrayRankString/49152/1/tiebreaker:3 5768 ns 5766 ns 127699 bytes_per_second=7.9397G/s items_per_second=532.824M/s null_percent=100 size=49.152k
ArrayRankString/49152/0/tiebreaker:0 289651 ns 289505 ns 2585 bytes_per_second=161.914M/s items_per_second=10.6112M/s null_percent=0 size=49.152k
ArrayRankString/49152/0/tiebreaker:2 295212 ns 294382 ns 2482 bytes_per_second=159.232M/s items_per_second=10.4354M/s null_percent=0 size=49.152k
ArrayRankString/49152/0/tiebreaker:3 283389 ns 283194 ns 2464 bytes_per_second=165.523M/s items_per_second=10.8477M/s null_percent=0 size=49.152k
ArrayRankString/1048576/100/tiebreaker:0 9886625 ns 9882906 ns 74 bytes_per_second=101.185M/s items_per_second=6.63125M/s null_percent=1 size=1048.58k
ArrayRankString/1048576/100/tiebreaker:2 10081503 ns 10079435 ns 70 bytes_per_second=99.2119M/s items_per_second=6.50195M/s null_percent=1 size=1048.58k
ArrayRankString/1048576/100/tiebreaker:3 10734076 ns 10717697 ns 76 bytes_per_second=93.3036M/s items_per_second=6.11475M/s null_percent=1 size=1048.58k
ArrayRankString/8388608/100/tiebreaker:0 220248563 ns 219022350 ns 3 bytes_per_second=36.526M/s items_per_second=2.39376M/s null_percent=1 size=8.38861M
ArrayRankString/8388608/100/tiebreaker:2 185752042 ns 185437409 ns 4 bytes_per_second=43.1412M/s items_per_second=2.8273M/s null_percent=1 size=8.38861M
ArrayRankString/8388608/100/tiebreaker:3 297635837 ns 296775194 ns 3 bytes_per_second=26.9564M/s items_per_second=1.76662M/s null_percent=1 size=8.38861M
```
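A note on reading the counters above: Google Benchmark prints `bytes_per_second` with 1024-based prefixes and `items_per_second` with 1000-based prefixes, so the two columns are not related by a plain factor of 8 for Int64. The sketch below shows the conversion for a fixed-width type; the helper name `ItemsToGiBPerSecond` is hypothetical and not part of the PR.

```cpp
#include <cassert>
#include <cstdint>

// bytes_per_second is reported with 1024-based prefixes (G = 2^30),
// while items_per_second uses 1000-based prefixes (M = 10^6).
constexpr double kGiB = 1024.0 * 1024.0 * 1024.0;

// Hypothetical helper: convert an item rate (items/s) into a byte rate in
// GiB/s for a fixed-width type occupying `bytes_per_item` bytes per value.
inline double ItemsToGiBPerSecond(double items_per_second, int64_t bytes_per_item) {
  assert(bytes_per_item > 0);
  return items_per_second * static_cast<double>(bytes_per_item) / kGiB;
}
```

For the first row, `ItemsToGiBPerSecond(395.596e6, 8)` comes out at roughly 2.947, in line with the reported `bytes_per_second=2.94742G/s`. The same shortcut does not apply to the string rows, where the byte count depends on the generated string lengths.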
[GitHub] [arrow] amol- commented on pull request #15041: GH-14937: [C++] String Sort / Rank Benchmarks
Posted by "amol- (via GitHub)" <gi...@apache.org>.
amol- commented on PR #15041:
URL: https://github.com/apache/arrow/pull/15041#issuecomment-1508884759
> @amol @westonpace I took the code from this PR and fixed the remaining issues in #34811
Good, so I think we can close this one and continue the discussion in the other one 👍
[GitHub] [arrow] amol- closed pull request #15041: GH-14937: [C++] String Sort / Rank Benchmarks
Posted by "amol- (via GitHub)" <gi...@apache.org>.
amol- closed pull request #15041: GH-14937: [C++] String Sort / Rank Benchmarks
URL: https://github.com/apache/arrow/pull/15041
[GitHub] [arrow] cyb70289 commented on a diff in pull request #15041: GH-14937: [C++] String Sort / Rank Benchmarks
Posted by GitBox <gi...@apache.org>.
cyb70289 commented on code in PR #15041:
URL: https://github.com/apache/arrow/pull/15041#discussion_r1052891524
##########
cpp/src/arrow/compute/kernels/vector_sort_benchmark.cc:
##########
@@ -95,6 +95,25 @@ static void ChunkedArraySortFuncInt64Benchmark(benchmark::State& state,
ArraySortFuncBenchmark(state, runner, std::make_shared<ChunkedArray>(chunks));
}
+template <typename Runner>
+static void ChunkedArraySortFuncStringBenchmark(benchmark::State& state,
+ const Runner& runner, int64_t min_length,
+ int64_t max_length) {
+ RegressionArgs args(state);
+
+ const int64_t n_chunks = 10;
+ const int64_t array_size = args.size / n_chunks / sizeof(int64_t);
+ auto rand = random::RandomArrayGenerator(kSeed);
+
+ ArrayVector chunks;
+ for (int64_t i = 0; i < n_chunks; ++i) {
+ chunks.push_back(std::static_pointer_cast<StringArray>(
+ rand.String(array_size, min_length, max_length, args.null_proportion)));
Review Comment:
Is this pointer cast necessary?
[GitHub] [arrow] pitrou commented on a diff in pull request #15041: GH-14937: [C++] String Sort / Rank Benchmarks
Posted by GitBox <gi...@apache.org>.
pitrou commented on code in PR #15041:
URL: https://github.com/apache/arrow/pull/15041#discussion_r1053091881
##########
cpp/src/arrow/compute/kernels/vector_sort_benchmark.cc:
##########
@@ -106,6 +125,19 @@ static void ArraySortFuncBoolBenchmark(benchmark::State& state, const Runner& ru
ArraySortFuncBenchmark(state, runner, values);
}
+template <typename Runner>
+static void ArraySortFuncStringBenchmark(benchmark::State& state, const Runner& runner,
+ int64_t min_length, int64_t max_length) {
+ RegressionArgs args(state);
+
+ const int64_t array_size = args.size / sizeof(int64_t);
Review Comment:
This leads to incorrect bytes/second and items/second reporting. See `RegressionSetArgs`.
[GitHub] [arrow] pitrou commented on a diff in pull request #15041: GH-14937: [C++] String Sort / Rank Benchmarks
Posted by GitBox <gi...@apache.org>.
pitrou commented on code in PR #15041:
URL: https://github.com/apache/arrow/pull/15041#discussion_r1053774592
##########
cpp/src/arrow/compute/kernels/vector_sort_benchmark.cc:
##########
@@ -106,6 +125,19 @@ static void ArraySortFuncBoolBenchmark(benchmark::State& state, const Runner& ru
ArraySortFuncBenchmark(state, runner, values);
}
+template <typename Runner>
+static void ArraySortFuncStringBenchmark(benchmark::State& state, const Runner& runner,
+ int64_t min_length, int64_t max_length) {
+ RegressionArgs args(state);
+
+ const int64_t array_size = args.size / sizeof(int64_t);
Review Comment:
No, if we want a bytes/second metric, we need to give Google Benchmark the actual byte size of the string data.
Again, I'll let you look in a bit more detail at what `RegressionSetArgs` does. I'm on vacation currently.
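As a rough illustration of the point about actual byte size, the sketch below sums the bytes held by a variable-width string column. `StringColumn`, `MakeColumn`, `CharacterBytes`, and `TotalBytes` are hypothetical stand-ins written for this example, not Arrow APIs; the layout (32-bit offsets, concatenated character data, optional one-bit-per-value validity bitmap) mirrors how Arrow's `StringArray` is laid out. Whether the offsets and bitmap should count toward the metric is a choice left to the benchmark author.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a variable-width string column (not an Arrow type).
struct StringColumn {
  std::vector<int32_t> offsets{0};   // always length + 1 entries
  std::string data;                  // concatenated character bytes
  bool has_validity_bitmap = false;  // true when nulls are tracked
};

// Builds a column from plain strings (illustration only).
StringColumn MakeColumn(const std::vector<std::string>& values,
                        bool has_validity_bitmap) {
  StringColumn col;
  for (const std::string& v : values) {
    col.data += v;
    col.offsets.push_back(static_cast<int32_t>(col.data.size()));
  }
  col.has_validity_bitmap = has_validity_bitmap;
  return col;
}

// Bytes of character data only.
int64_t CharacterBytes(const StringColumn& col) {
  return static_cast<int64_t>(col.data.size());
}

// Character data plus offsets, plus the validity bitmap when present.
int64_t TotalBytes(const StringColumn& col) {
  assert(!col.offsets.empty());
  const int64_t length = static_cast<int64_t>(col.offsets.size()) - 1;
  int64_t bytes = CharacterBytes(col);
  bytes += static_cast<int64_t>(col.offsets.size() * sizeof(int32_t));
  if (col.has_validity_bitmap) bytes += (length + 7) / 8;  // one bit per value
  return bytes;
}
```

In a Google Benchmark body, a value like this would typically be reported with `state.SetBytesProcessed(state.iterations() * bytes)`.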
[GitHub] [arrow] WillAyd commented on a diff in pull request #15041: GH-14937: [C++] String Sort / Rank Benchmarks
Posted by GitBox <gi...@apache.org>.
WillAyd commented on code in PR #15041:
URL: https://github.com/apache/arrow/pull/15041#discussion_r1055664267
##########
cpp/src/arrow/compute/kernels/vector_sort_benchmark.cc:
##########
@@ -106,6 +125,19 @@ static void ArraySortFuncBoolBenchmark(benchmark::State& state, const Runner& ru
ArraySortFuncBenchmark(state, runner, values);
}
+template <typename Runner>
+static void ArraySortFuncStringBenchmark(benchmark::State& state, const Runner& runner,
+ int64_t min_length, int64_t max_length) {
+ RegressionArgs args(state);
+
+ const int64_t array_size = args.size / sizeof(int64_t);
Review Comment:
My read is that `RegressionSetArgs` parametrizes the benchmark with the different CPU cache sizes, plus one size that exceeds the cache, and that we try to fit an array into each of those sizes. That is easy to control for the primitive types. For variable-length random strings, though, I don't see how the array is guaranteed to fit into the same cache, even when estimating with `(min_length + max_length) / 2`, without querying the size of the array after creating it. It is also unclear to me whether the offsets buffer needs to count toward the cache bound.
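One way to reason about the sizing question is to estimate the element count from the expected bytes per value, then check the realized size after generation. The sketch below assumes string lengths are drawn uniformly from `[min_length, max_length]` and that each value also costs one 32-bit offset; `EstimateStringCount` is a hypothetical helper, not part of Arrow or this PR.

```cpp
#include <cassert>
#include <cstdint>

// Estimates how many strings fit in `budget_bytes`, assuming uniformly
// distributed lengths and one int32 offset per value (both are assumptions).
int64_t EstimateStringCount(int64_t budget_bytes, int64_t min_length,
                            int64_t max_length) {
  assert(budget_bytes >= 0);
  assert(0 <= min_length && min_length <= max_length);
  // Expected character bytes per value under a uniform length distribution.
  const double mean_length = static_cast<double>(min_length + max_length) / 2.0;
  // Add the per-value cost of the offsets buffer.
  const double bytes_per_value = mean_length + static_cast<double>(sizeof(int32_t));
  return static_cast<int64_t>(static_cast<double>(budget_bytes) / bytes_per_value);
}
```

With an illustrative budget of 49152 bytes and lengths in `[0, 24]`, this yields 3072 values. The result is only an expectation, so the generated array can land slightly above or below the budget; the realized byte size still has to be measured after creation.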
[GitHub] [arrow] WillAyd commented on a diff in pull request #15041: GH-14937: [C++] String Sort / Rank Benchmarks
Posted by GitBox <gi...@apache.org>.
WillAyd commented on code in PR #15041:
URL: https://github.com/apache/arrow/pull/15041#discussion_r1055767344
##########
cpp/src/arrow/compute/kernels/vector_sort_benchmark.cc:
##########
@@ -106,6 +125,19 @@ static void ArraySortFuncBoolBenchmark(benchmark::State& state, const Runner& ru
ArraySortFuncBenchmark(state, runner, values);
}
+template <typename Runner>
+static void ArraySortFuncStringBenchmark(benchmark::State& state, const Runner& runner,
+ int64_t min_length, int64_t max_length) {
+ RegressionArgs args(state);
+
+ const int64_t array_size = args.size / sizeof(int64_t);
Review Comment:
Here are the updated values with that change:
```sh
Running ./release/arrow-compute-vector-sort-benchmark
Run on (12 X 4310.45 MHz CPU s)
CPU Caches:
L1 Data 48 KiB (x6)
L1 Instruction 32 KiB (x6)
L2 Unified 1280 KiB (x6)
L3 Unified 12288 KiB (x1)
Load Average: 7.90, 5.70, 3.53
----------------------------------------------------------------------------------------------------
Benchmark                                          Time           CPU  Iterations  UserCounters...
----------------------------------------------------------------------------------------------------
ArraySortIndicesString/49152/10000               249 us        249 us        2985  bytes_per_second=188.346M/s items_per_second=12.3435M/s null_percent=0.01 size=49.152k
ArraySortIndicesString/49152/100                 242 us        242 us        2897  bytes_per_second=194.014M/s items_per_second=12.7149M/s null_percent=1 size=49.152k
ArraySortIndicesString/49152/10                  221 us        221 us        3165  bytes_per_second=212.233M/s items_per_second=13.9089M/s null_percent=10 size=49.152k
ArraySortIndicesString/49152/2                   112 us        112 us        6060  bytes_per_second=418.805M/s items_per_second=27.4468M/s null_percent=50 size=49.152k
ArraySortIndicesString/49152/1                  5.20 us       5.20 us      133594  bytes_per_second=8.79969G/s items_per_second=590.537M/s null_percent=100 size=49.152k
ArraySortIndicesString/49152/0                   240 us        240 us        2826  bytes_per_second=195.617M/s items_per_second=12.82M/s null_percent=0 size=49.152k
ArraySortIndicesString/1048576/100              8447 us       8447 us          83  bytes_per_second=118.391M/s items_per_second=7.7589M/s null_percent=1 size=1048.58k
ArraySortIndicesString/8388608/100             91818 us      91795 us           7  bytes_per_second=87.151M/s items_per_second=5.71153M/s null_percent=1 size=8.38861M
ChunkedArraySortIndicesString/49152/10000        328 us        328 us        2139  bytes_per_second=142.782M/s items_per_second=9.35129M/s null_percent=0.01 size=49.152k
ChunkedArraySortIndicesString/49152/100          319 us        319 us        2179  bytes_per_second=146.763M/s items_per_second=9.61198M/s null_percent=1 size=49.152k
ChunkedArraySortIndicesString/49152/10           284 us        284 us        2433  bytes_per_second=165.067M/s items_per_second=10.8108M/s null_percent=10 size=49.152k
ChunkedArraySortIndicesString/49152/2            152 us        152 us        4687  bytes_per_second=307.783M/s items_per_second=20.1577M/s null_percent=50 size=49.152k
ChunkedArraySortIndicesString/49152/1           6.70 us       6.70 us      104605  bytes_per_second=6.83105G/s items_per_second=458.126M/s null_percent=100 size=49.152k
ChunkedArraySortIndicesString/49152/0            326 us        325 us        1990  bytes_per_second=144.018M/s items_per_second=9.4322M/s null_percent=0 size=49.152k
ChunkedArraySortIndicesString/1048576/100      10470 us      10466 us          69  bytes_per_second=95.5466M/s items_per_second=6.26117M/s null_percent=1 size=1048.58k
ChunkedArraySortIndicesString/8388608/100     110628 us     110623 us           6  bytes_per_second=72.3174M/s items_per_second=4.73932M/s null_percent=1 size=8.38861M
ArrayRankString/49152/10000/tiebreaker:0      267473 ns     267229 ns        2733  bytes_per_second=175.412M/s items_per_second=11.4958M/s null_percent=0.01 size=49.152k
ArrayRankString/49152/10000/tiebreaker:2      245559 ns     245517 ns        2863  bytes_per_second=190.923M/s items_per_second=12.5124M/s null_percent=0.01 size=49.152k
ArrayRankString/49152/10000/tiebreaker:3      266399 ns     266340 ns        2737  bytes_per_second=175.997M/s items_per_second=11.5341M/s null_percent=0.01 size=49.152k
ArrayRankString/49152/100/tiebreaker:0        248463 ns     248425 ns        2845  bytes_per_second=188.689M/s items_per_second=12.3659M/s null_percent=1 size=49.152k
ArrayRankString/49152/100/tiebreaker:2        238895 ns     238844 ns        2936  bytes_per_second=196.257M/s items_per_second=12.8619M/s null_percent=1 size=49.152k
ArrayRankString/49152/100/tiebreaker:3        252202 ns     252153 ns        2851  bytes_per_second=185.899M/s items_per_second=12.1831M/s null_percent=1 size=49.152k
ArrayRankString/49152/10/tiebreaker:0         241138 ns     241029 ns        3056  bytes_per_second=194.479M/s items_per_second=12.7454M/s null_percent=10 size=49.152k
ArrayRankString/49152/10/tiebreaker:2         223276 ns     223249 ns        3172  bytes_per_second=209.968M/s items_per_second=13.7604M/s null_percent=10 size=49.152k
ArrayRankString/49152/10/tiebreaker:3         246466 ns     246129 ns        2786  bytes_per_second=190.449M/s items_per_second=12.4813M/s null_percent=10 size=49.152k
ArrayRankString/49152/2/tiebreaker:0          125514 ns     125467 ns        5796  bytes_per_second=373.605M/s items_per_second=24.4846M/s null_percent=50 size=49.152k
ArrayRankString/49152/2/tiebreaker:2          118292 ns     118211 ns        5230  bytes_per_second=396.538M/s items_per_second=25.9875M/s null_percent=50 size=49.152k
ArrayRankString/49152/2/tiebreaker:3          120854 ns     120825 ns        5912  bytes_per_second=387.958M/s items_per_second=25.4252M/s null_percent=50 size=49.152k
ArrayRankString/49152/1/tiebreaker:0            5764 ns       5753 ns      128137  bytes_per_second=7.95685G/s items_per_second=533.975M/s null_percent=100 size=49.152k
ArrayRankString/49152/1/tiebreaker:2            5499 ns       5498 ns      125044  bytes_per_second=8.32608G/s items_per_second=558.754M/s null_percent=100 size=49.152k
ArrayRankString/49152/1/tiebreaker:3            5357 ns       5355 ns      125042  bytes_per_second=8.54873G/s items_per_second=573.696M/s null_percent=100 size=49.152k
ArrayRankString/49152/0/tiebreaker:0          258138 ns     258109 ns        2766  bytes_per_second=181.609M/s items_per_second=11.9019M/s null_percent=0 size=49.152k
ArrayRankString/49152/0/tiebreaker:2          250541 ns     250474 ns        2792  bytes_per_second=187.145M/s items_per_second=12.2647M/s null_percent=0 size=49.152k
ArrayRankString/49152/0/tiebreaker:3          253012 ns     252992 ns        2776  bytes_per_second=185.283M/s items_per_second=12.1427M/s null_percent=0 size=49.152k
ArrayRankString/1048576/100/tiebreaker:0     8756054 ns    8755400 ns          79  bytes_per_second=114.215M/s items_per_second=7.48521M/s null_percent=1 size=1048.58k
ArrayRankString/1048576/100/tiebreaker:2     8528650 ns    8528126 ns          81  bytes_per_second=117.259M/s items_per_second=7.68469M/s null_percent=1 size=1048.58k
ArrayRankString/1048576/100/tiebreaker:3     8656824 ns    8655985 ns          80  bytes_per_second=115.527M/s items_per_second=7.57118M/s null_percent=1 size=1048.58k
ArrayRankString/8388608/100/tiebreaker:0    96252452 ns   96229618 ns           7  bytes_per_second=83.1345M/s items_per_second=5.4483M/s null_percent=1 size=8.38861M
ArrayRankString/8388608/100/tiebreaker:2    92078251 ns   92057802 ns           7  bytes_per_second=86.9019M/s items_per_second=5.6952M/s null_percent=1 size=8.38861M
ArrayRankString/8388608/100/tiebreaker:3    94816437 ns   94803384 ns           7  bytes_per_second=84.3852M/s items_per_second=5.53027M/s null_percent=1 size=8.38861M
```
[GitHub] [arrow] WillAyd commented on pull request #15041: GH-14937: [C++] String Sort / Rank Benchmarks
Posted by GitBox <gi...@apache.org>.
WillAyd commented on PR #15041:
URL: https://github.com/apache/arrow/pull/15041#issuecomment-1366344521
I don't think the current failure is related.
[GitHub] [arrow] cyb70289 commented on pull request #15041: GH-14937: [C++] String Sort / Rank Benchmarks
Posted by GitBox <gi...@apache.org>.
cyb70289 commented on PR #15041:
URL: https://github.com/apache/arrow/pull/15041#issuecomment-1365580861
Please fix the CI error from the "Windows 2019 C++17" job.
[GitHub] [arrow] WillAyd commented on pull request #15041: GH-14937: [C++] String Sort / Rank Benchmarks
Posted by GitBox <gi...@apache.org>.
WillAyd commented on PR #15041:
URL: https://github.com/apache/arrow/pull/15041#issuecomment-1370288509
The latest commit isn't idiomatic, but I'm pushing it to get what I think are the "correct" numbers first. Here's an interesting comparison of the benchmark across the different types:
```sh
willayd@willayd-Lemur-Pro:~/clones/arrow/cpp/release$ ./release/arrow-compute-vector-sort-benchmark --benchmark_filter="^ArraySortIndices.*/49152/1$"
2023-01-03T14:24:23-08:00
Running ./release/arrow-compute-vector-sort-benchmark
Run on (12 X 4700 MHz CPU s)
CPU Caches:
L1 Data 48 KiB (x6)
L1 Instruction 32 KiB (x6)
L2 Unified 1280 KiB (x6)
L3 Unified 12288 KiB (x1)
Load Average: 0.83, 3.36, 3.21
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
----------------------------------------------------------------------------------------------
Benchmark                                Time       CPU  Iterations  UserCounters...
----------------------------------------------------------------------------------------------
ArraySortIndicesInt64Narrow/49152/1   8.53 us   8.53 us       81671  bytes_per_second=5.36415G/s items_per_second=719.963M/s null_percent=100 size=49.152k
ArraySortIndicesInt64Wide/49152/1     8.53 us   8.53 us       81569  bytes_per_second=5.36582G/s items_per_second=720.188M/s null_percent=100 size=49.152k
ArraySortIndicesBool/49152/1           195 us    195 us        3555  bytes_per_second=240.604M/s items_per_second=2.01833G/s null_percent=100 size=49.152k
ArraySortIndicesString/49152/1        5.01 us   5.01 us      137257  bytes_per_second=9.13159G/s items_per_second=612.811M/s null_percent=100 size=49.152k
```
I wasn't expecting much of a difference across these. I'm guessing the other types may be misreporting their benchmark numbers since they don't take into account the proper sizing of an all-null array?
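A quick way to sanity-check counters like these is to divide the byte rate by the item rate, which gives the bytes per item that each benchmark implicitly reports. The sketch assumes `bytes_per_second` is printed with binary prefixes (G = 2^30, M = 2^20) and `items_per_second` with decimal ones, which appears to match Google Benchmark's defaults; `ImpliedBytesPerItem`, `kMiB`, and `kGiB` are hypothetical names for this example.

```cpp
#include <cassert>

// Binary unit multipliers, assumed to be what bytes_per_second is printed in.
constexpr double kMiB = 1024.0 * 1024.0;
constexpr double kGiB = 1024.0 * kMiB;

// Bytes attributed to each item, derived from the two reported rates.
double ImpliedBytesPerItem(double bytes_per_second, double items_per_second) {
  assert(items_per_second > 0.0);
  return bytes_per_second / items_per_second;
}
```

Plugging in the rows above: `ImpliedBytesPerItem(5.36415 * kGiB, 719.963e6)` is about 8 for Int64Narrow, `ImpliedBytesPerItem(240.604 * kMiB, 2.01833e9)` is about 0.125 for Bool, and `ImpliedBytesPerItem(9.13159 * kGiB, 612.811e6)` is about 16 for String.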
[GitHub] [arrow] felipecrv commented on pull request #15041: GH-14937: [C++] String Sort / Rank Benchmarks
Posted by "felipecrv (via GitHub)" <gi...@apache.org>.
felipecrv commented on PR #15041:
URL: https://github.com/apache/arrow/pull/15041#issuecomment-1507206926
@amol @westonpace I took the code from this PR and fixed the remaining issues in https://github.com/apache/arrow/pull/34811