You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/05/29 17:51:35 UTC

[GitHub] [arrow] fsaintjacques commented on a change in pull request #7285: ARROW-8843: [C++] Compare bitmaps in words

fsaintjacques commented on a change in pull request #7285:
URL: https://github.com/apache/arrow/pull/7285#discussion_r432527849



##########
File path: cpp/src/arrow/util/bit_util.cc
##########
@@ -246,6 +246,49 @@ bool BitmapEquals(const uint8_t* left, int64_t left_offset, const uint8_t* right
   }
 
   // Unaligned slow case
+  left += left_offset / 8;
+  right += right_offset / 8;
+  left_offset %= 8;
+  right_offset %= 8;
+
+  // process in 64 bits
+  int64_t nwords = bit_length / 64;

Review comment:
       Use snake_case for variables, e.g. `n_words`.

##########
File path: cpp/src/arrow/util/bit_util.cc
##########
@@ -246,6 +246,49 @@ bool BitmapEquals(const uint8_t* left, int64_t left_offset, const uint8_t* right
   }
 
   // Unaligned slow case
+  left += left_offset / 8;
+  right += right_offset / 8;
+  left_offset %= 8;
+  right_offset %= 8;
+
+  // process in 64 bits
+  int64_t nwords = bit_length / 64;
+  if (nwords > 1) {
+    bit_length -= (nwords - 1) * 64;
+
+    uint64_t left_word0 = BitUtil::ToLittleEndian(util::SafeLoadAs<uint64_t>(left));
+    uint64_t right_word0 = BitUtil::ToLittleEndian(util::SafeLoadAs<uint64_t>(right));
+
+    do {

Review comment:
       Make a shift lambda:
   
   ```
   auto shift = [](uint64_t word, uint64_t next, uint8_t shift) -> uint64_t {
     if (shift == 0) return word;
     return (word >> shift) | (next << (64 - shift);
   }; 
   
   auto next = load_word(left);
   auto left_word = shift(left_word0, next, left_offset);
   left_word0 = next;
   ```
   
   I'd also think it would be worth transforming the loop into a fixed for-loop.

##########
File path: cpp/src/arrow/util/bit_util.cc
##########
@@ -246,6 +246,49 @@ bool BitmapEquals(const uint8_t* left, int64_t left_offset, const uint8_t* right
   }
 
   // Unaligned slow case
+  left += left_offset / 8;
+  right += right_offset / 8;
+  left_offset %= 8;
+  right_offset %= 8;
+
+  // process in 64 bits
+  int64_t nwords = bit_length / 64;
+  if (nwords > 1) {
+    bit_length -= (nwords - 1) * 64;
+
+    uint64_t left_word0 = BitUtil::ToLittleEndian(util::SafeLoadAs<uint64_t>(left));

Review comment:
       It would be more readable if you wrap load into a lambda. The compiler will inline it.
   ```c++
   auto load_word = [](uint8_t* bytes) { return BitUtil::ToLittleEndian(util::SafeLoadAs<uint64_t>(bytes)); }
   
   auto left_word0 = load_word(left);
   auto right_word0 = load_word(right);
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org