You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/12/30 17:37:06 UTC

[GitHub] [arrow] pitrou commented on a change in pull request #12032: ARROW-15126: [C++] Support Null type as group keys

pitrou commented on a change in pull request #12032:
URL: https://github.com/apache/arrow/pull/12032#discussion_r776815147



##########
File path: cpp/src/arrow/compute/kernels/hash_aggregate_test.cc
##########
@@ -614,6 +616,11 @@ TEST(Grouper, RandomStringInt64DoubleInt32Keys) {
   }
 }
 
+TEST(Grouper, NullKeys) {
+  TestGrouper g({null()});
+  g.ExpectConsume("[[null], [null]]", "[0, 0]");

Review comment:
       Is it possible to add a test with several keys and where one of them has the null type?

##########
File path: cpp/src/arrow/compute/kernels/hash_aggregate.cc
##########
@@ -147,10 +152,13 @@ struct GrouperImpl : Grouper {
       if (it_success.second) {
         // new key; update offsets and key_bytes
         ++num_groups_;
-        auto next_key_offset = static_cast<int32_t>(key_bytes_.size());
-        key_bytes_.resize(next_key_offset + key_length);
-        offsets_.push_back(next_key_offset + key_length);
-        memcpy(key_bytes_.data() + next_key_offset, key.c_str(), key_length);
+        if (key_length != 0) {
+          // Skip if there are no keys

Review comment:
       I'm not sure I understand this comment, do you need to move it up just above the `if`?

##########
File path: cpp/src/arrow/compute/kernels/hash_aggregate.cc
##########
@@ -197,7 +205,7 @@ struct GrouperFastImpl : Grouper {
 #if ARROW_LITTLE_ENDIAN
     for (size_t i = 0; i < keys.size(); ++i) {
       const auto& key = keys[i].type;
-      if (is_large_binary_like(key->id())) {
+      if (is_large_binary_like(key->id()) || key->id() == Type::NA) {

Review comment:
       I'm not sure I understand, why cannot null keys be used here?
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org