You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/11/29 07:58:19 UTC

[GitHub] [arrow] sanjibansg opened a new pull request, #14758: ARROW-18119: [C++] Utility method to ensure an array object meetings an alignment requirement

sanjibansg opened a new pull request, #14758:
URL: https://github.com/apache/arrow/pull/14758

   This PR adds a utility function which is responsible for ensuring that all the buffers of an arrow object are properly aligned. It checks all the buffers in an arrow object for alignment, and if not aligned properly, then allocates a buffer by specifying the required alignment and copies data from the previous buffer.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] westonpace commented on a diff in pull request #14758: ARROW-18119: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "westonpace (via GitHub)" <gi...@apache.org>.
westonpace commented on code in PR #14758:
URL: https://github.com/apache/arrow/pull/14758#discussion_r1109067707


##########
cpp/src/arrow/util/align_util.cc:
##########
@@ -26,56 +26,63 @@ namespace arrow {
 
 namespace util {
 
-Result<std::shared_ptr<Array>> EnsureAlignment(const Array& object, int64_t alignment,
+Result<std::shared_ptr<Buffer>> EnsureAlignment(const std::shared_ptr<Buffer>& object, int64_t alignment,
                                                MemoryPool* memory_pool) {
-  std::vector<std::shared_ptr<Buffer>> buffers_ = object.data()->buffers;
+    auto buffer_address = object->address();
+    if ((buffer_address % alignment) != 0) {
+        ARROW_ASSIGN_OR_RAISE(
+                auto new_buffer, AllocateBuffer(object->size(), alignment, memory_pool));
+        std::memcpy(new_buffer->mutable_data(), object->data(), object->size());
+        return new_buffer;
+    } else {
+        return object;

Review Comment:
   ```suggestion
           return std::move(object);
   ```
   
   Needed if you follow the previous advice and change to `std::shared_ptr<Buffer> object` instead of `const std::shared_ptr<Buffer>&`



##########
cpp/src/arrow/util/align_util.cc:
##########
@@ -26,56 +26,63 @@ namespace arrow {
 
 namespace util {
 
-Result<std::shared_ptr<Array>> EnsureAlignment(const Array& object, int64_t alignment,
+Result<std::shared_ptr<Buffer>> EnsureAlignment(const std::shared_ptr<Buffer>& object, int64_t alignment,

Review Comment:
   ```suggestion
   Result<std::shared_ptr<Buffer>> EnsureAlignment(const std::shared_ptr<Buffer>& buffer, int64_t alignment,
   ```



##########
cpp/src/arrow/util/align_util.cc:
##########
@@ -26,56 +26,63 @@ namespace arrow {
 
 namespace util {
 
-Result<std::shared_ptr<Array>> EnsureAlignment(const Array& object, int64_t alignment,
+Result<std::shared_ptr<Buffer>> EnsureAlignment(const std::shared_ptr<Buffer>& object, int64_t alignment,
                                                MemoryPool* memory_pool) {
-  std::vector<std::shared_ptr<Buffer>> buffers_ = object.data()->buffers;
+    auto buffer_address = object->address();
+    if ((buffer_address % alignment) != 0) {
+        ARROW_ASSIGN_OR_RAISE(
+                auto new_buffer, AllocateBuffer(object->size(), alignment, memory_pool));
+        std::memcpy(new_buffer->mutable_data(), object->data(), object->size());
+        return new_buffer;
+    } else {
+        return object;
+    }
+}
+
+Result<std::shared_ptr<Array>> EnsureAlignment(const std::shared_ptr<Array>& object, int64_t alignment,
+                                               MemoryPool* memory_pool) {
+  std::vector<std::shared_ptr<Buffer>> buffers_ = object->data()->buffers;
   for (size_t i = 0; i < buffers_.size(); ++i) {
     if (buffers_[i]) {
-      auto buffer_address = buffers_[i]->address();
-      if ((buffer_address % alignment) != 0) {
-        ARROW_ASSIGN_OR_RAISE(
-            auto new_buffer, AllocateBuffer(buffers_[i]->size(), alignment, memory_pool));
-        std::memcpy(new_buffer->mutable_data(), buffers_[i]->data(), buffers_[i]->size());
-        buffers_[i] = std::move(new_buffer);
+        ARROW_ASSIGN_OR_RAISE(buffers_[i], EnsureAlignment(buffers_[i], alignment, memory_pool));
       }
-    }
   }
   auto new_array_data =
-      ArrayData::Make(object.data()->type, object.data()->length, std::move(buffers_),
-                      object.data()->GetNullCount(), object.data()->offset);
+      ArrayData::Make(object->data()->type, object->data()->length, std::move(buffers_),
+                      object->data()->GetNullCount(), object->data()->offset);
   return MakeArray(new_array_data);
 }
 
-Result<std::shared_ptr<ChunkedArray>> EnsureAlignment(const ChunkedArray& object,
+Result<std::shared_ptr<ChunkedArray>> EnsureAlignment(const std::shared_ptr<ChunkedArray>& object,

Review Comment:
   ```suggestion
   Result<std::shared_ptr<ChunkedArray>> EnsureAlignment(const std::shared_ptr<ChunkedArray>& array,
   ```



##########
cpp/src/arrow/util/align_util.cc:
##########
@@ -26,56 +26,63 @@ namespace arrow {
 
 namespace util {
 
-Result<std::shared_ptr<Array>> EnsureAlignment(const Array& object, int64_t alignment,
+Result<std::shared_ptr<Buffer>> EnsureAlignment(const std::shared_ptr<Buffer>& object, int64_t alignment,
                                                MemoryPool* memory_pool) {

Review Comment:
   ```suggestion
   Result<std::shared_ptr<Buffer>> EnsureAlignment(std::shared_ptr<Buffer> object, int64_t alignment,
                                                  MemoryPool* memory_pool) {
   ```
   
   Also, the method should use move appropriate so I can do...
   
   ```
   std::shared_ptr<Buffer> foo = ...
   foo = EnsureAlignment(std::move(foo), 64, default_memory_pool())
   // Should only make a copy of `foo` or the `shared_ptr` if the buffer was not aligned.
   ```



##########
cpp/src/arrow/util/align_util.h:
##########
@@ -68,19 +68,25 @@ inline BitmapWordAlignParams BitmapWordAlign(const uint8_t* data, int64_t bit_of
 
 namespace util {
 
-Result<std::shared_ptr<Array>> EnsureAlignment(const Array& object, int64_t alignment,
-                                               MemoryPool* memory_pool);
+Result<std::shared_ptr<Buffer>> EnsureAlignment(const std::shared_ptr<Buffer>& object,
+                                                int64_t alignment,
+                                                MemoryPool* memory_pool);
 
-Result<std::shared_ptr<ChunkedArray>> EnsureAlignment(const ChunkedArray& object,
-                                                      int64_t alignment,
-                                                      MemoryPool* memory_pool);
+Result<std::shared_ptr<Array>> EnsureAlignment(const std::shared_ptr<Array>& object,

Review Comment:
   ```suggestion
   Result<std::shared_ptr<Array>> EnsureAlignment(const std::shared_ptr<Array>& array,
   ```



##########
cpp/src/arrow/util/align_util.cc:
##########
@@ -26,56 +26,63 @@ namespace arrow {
 
 namespace util {
 
-Result<std::shared_ptr<Array>> EnsureAlignment(const Array& object, int64_t alignment,
+Result<std::shared_ptr<Buffer>> EnsureAlignment(const std::shared_ptr<Buffer>& object, int64_t alignment,
                                                MemoryPool* memory_pool) {
-  std::vector<std::shared_ptr<Buffer>> buffers_ = object.data()->buffers;
+    auto buffer_address = object->address();
+    if ((buffer_address % alignment) != 0) {
+        ARROW_ASSIGN_OR_RAISE(
+                auto new_buffer, AllocateBuffer(object->size(), alignment, memory_pool));
+        std::memcpy(new_buffer->mutable_data(), object->data(), object->size());
+        return new_buffer;
+    } else {
+        return object;
+    }
+}
+
+Result<std::shared_ptr<Array>> EnsureAlignment(const std::shared_ptr<Array>& object, int64_t alignment,
+                                               MemoryPool* memory_pool) {
+  std::vector<std::shared_ptr<Buffer>> buffers_ = object->data()->buffers;
   for (size_t i = 0; i < buffers_.size(); ++i) {
     if (buffers_[i]) {
-      auto buffer_address = buffers_[i]->address();
-      if ((buffer_address % alignment) != 0) {
-        ARROW_ASSIGN_OR_RAISE(
-            auto new_buffer, AllocateBuffer(buffers_[i]->size(), alignment, memory_pool));
-        std::memcpy(new_buffer->mutable_data(), buffers_[i]->data(), buffers_[i]->size());
-        buffers_[i] = std::move(new_buffer);
+        ARROW_ASSIGN_OR_RAISE(buffers_[i], EnsureAlignment(buffers_[i], alignment, memory_pool));
       }
-    }
   }
   auto new_array_data =
-      ArrayData::Make(object.data()->type, object.data()->length, std::move(buffers_),
-                      object.data()->GetNullCount(), object.data()->offset);
+      ArrayData::Make(object->data()->type, object->data()->length, std::move(buffers_),

Review Comment:
   You still need to handle the dictionary and the children.



##########
cpp/src/arrow/util/align_util.cc:
##########
@@ -26,56 +26,63 @@ namespace arrow {
 
 namespace util {
 
-Result<std::shared_ptr<Array>> EnsureAlignment(const Array& object, int64_t alignment,
+Result<std::shared_ptr<Buffer>> EnsureAlignment(const std::shared_ptr<Buffer>& object, int64_t alignment,
                                                MemoryPool* memory_pool) {
-  std::vector<std::shared_ptr<Buffer>> buffers_ = object.data()->buffers;
+    auto buffer_address = object->address();
+    if ((buffer_address % alignment) != 0) {
+        ARROW_ASSIGN_OR_RAISE(
+                auto new_buffer, AllocateBuffer(object->size(), alignment, memory_pool));
+        std::memcpy(new_buffer->mutable_data(), object->data(), object->size());
+        return new_buffer;
+    } else {
+        return object;
+    }
+}
+
+Result<std::shared_ptr<Array>> EnsureAlignment(const std::shared_ptr<Array>& object, int64_t alignment,
+                                               MemoryPool* memory_pool) {
+  std::vector<std::shared_ptr<Buffer>> buffers_ = object->data()->buffers;
   for (size_t i = 0; i < buffers_.size(); ++i) {
     if (buffers_[i]) {
-      auto buffer_address = buffers_[i]->address();
-      if ((buffer_address % alignment) != 0) {
-        ARROW_ASSIGN_OR_RAISE(
-            auto new_buffer, AllocateBuffer(buffers_[i]->size(), alignment, memory_pool));
-        std::memcpy(new_buffer->mutable_data(), buffers_[i]->data(), buffers_[i]->size());
-        buffers_[i] = std::move(new_buffer);
+        ARROW_ASSIGN_OR_RAISE(buffers_[i], EnsureAlignment(buffers_[i], alignment, memory_pool));
       }
-    }
   }
   auto new_array_data =
-      ArrayData::Make(object.data()->type, object.data()->length, std::move(buffers_),
-                      object.data()->GetNullCount(), object.data()->offset);
+      ArrayData::Make(object->data()->type, object->data()->length, std::move(buffers_),
+                      object->data()->GetNullCount(), object->data()->offset);
   return MakeArray(new_array_data);
 }
 
-Result<std::shared_ptr<ChunkedArray>> EnsureAlignment(const ChunkedArray& object,
+Result<std::shared_ptr<ChunkedArray>> EnsureAlignment(const std::shared_ptr<ChunkedArray>& object,
                                                       int64_t alignment,
                                                       MemoryPool* memory_pool) {
-  ArrayVector chunks_ = object.chunks();
-  for (int i = 0; i < object.num_chunks(); ++i) {
+  ArrayVector chunks_ = object->chunks();
+  for (int i = 0; i < object->num_chunks(); ++i) {
     ARROW_ASSIGN_OR_RAISE(chunks_[i],
-                          EnsureAlignment(*object.chunk(i), alignment, memory_pool));
+                          EnsureAlignment(object->chunk(i), alignment, memory_pool));
   }
-  return ChunkedArray::Make(std::move(chunks_), object.type());
+  return ChunkedArray::Make(std::move(chunks_), object->type());
 }
 
-Result<std::shared_ptr<RecordBatch>> EnsureAlignment(const RecordBatch& object,
+Result<std::shared_ptr<RecordBatch>> EnsureAlignment(const std::shared_ptr<RecordBatch>& object,

Review Comment:
   ```suggestion
   Result<std::shared_ptr<RecordBatch>> EnsureAlignment(const std::shared_ptr<RecordBatch>& batch,
   ```



##########
cpp/src/arrow/util/align_util.cc:
##########
@@ -26,56 +26,63 @@ namespace arrow {
 
 namespace util {
 
-Result<std::shared_ptr<Array>> EnsureAlignment(const Array& object, int64_t alignment,
+Result<std::shared_ptr<Buffer>> EnsureAlignment(const std::shared_ptr<Buffer>& object, int64_t alignment,
                                                MemoryPool* memory_pool) {
-  std::vector<std::shared_ptr<Buffer>> buffers_ = object.data()->buffers;
+    auto buffer_address = object->address();
+    if ((buffer_address % alignment) != 0) {
+        ARROW_ASSIGN_OR_RAISE(
+                auto new_buffer, AllocateBuffer(object->size(), alignment, memory_pool));
+        std::memcpy(new_buffer->mutable_data(), object->data(), object->size());
+        return new_buffer;
+    } else {
+        return object;
+    }
+}
+
+Result<std::shared_ptr<Array>> EnsureAlignment(const std::shared_ptr<Array>& object, int64_t alignment,
+                                               MemoryPool* memory_pool) {
+  std::vector<std::shared_ptr<Buffer>> buffers_ = object->data()->buffers;
   for (size_t i = 0; i < buffers_.size(); ++i) {
     if (buffers_[i]) {
-      auto buffer_address = buffers_[i]->address();
-      if ((buffer_address % alignment) != 0) {
-        ARROW_ASSIGN_OR_RAISE(
-            auto new_buffer, AllocateBuffer(buffers_[i]->size(), alignment, memory_pool));
-        std::memcpy(new_buffer->mutable_data(), buffers_[i]->data(), buffers_[i]->size());
-        buffers_[i] = std::move(new_buffer);
+        ARROW_ASSIGN_OR_RAISE(buffers_[i], EnsureAlignment(buffers_[i], alignment, memory_pool));
       }
-    }
   }
   auto new_array_data =
-      ArrayData::Make(object.data()->type, object.data()->length, std::move(buffers_),
-                      object.data()->GetNullCount(), object.data()->offset);
+      ArrayData::Make(object->data()->type, object->data()->length, std::move(buffers_),
+                      object->data()->GetNullCount(), object->data()->offset);
   return MakeArray(new_array_data);
 }
 
-Result<std::shared_ptr<ChunkedArray>> EnsureAlignment(const ChunkedArray& object,
+Result<std::shared_ptr<ChunkedArray>> EnsureAlignment(const std::shared_ptr<ChunkedArray>& object,
                                                       int64_t alignment,
                                                       MemoryPool* memory_pool) {
-  ArrayVector chunks_ = object.chunks();
-  for (int i = 0; i < object.num_chunks(); ++i) {
+  ArrayVector chunks_ = object->chunks();
+  for (int i = 0; i < object->num_chunks(); ++i) {
     ARROW_ASSIGN_OR_RAISE(chunks_[i],
-                          EnsureAlignment(*object.chunk(i), alignment, memory_pool));
+                          EnsureAlignment(object->chunk(i), alignment, memory_pool));
   }
-  return ChunkedArray::Make(std::move(chunks_), object.type());
+  return ChunkedArray::Make(std::move(chunks_), object->type());
 }
 
-Result<std::shared_ptr<RecordBatch>> EnsureAlignment(const RecordBatch& object,
+Result<std::shared_ptr<RecordBatch>> EnsureAlignment(const std::shared_ptr<RecordBatch>& object,
                                                      int64_t alignment,
                                                      MemoryPool* memory_pool) {
-  ArrayVector columns_ = object.columns();
-  for (int i = 0; i < object.num_columns(); ++i) {
+  ArrayVector columns_ = object->columns();
+  for (int i = 0; i < object->num_columns(); ++i) {
     ARROW_ASSIGN_OR_RAISE(columns_[i],
-                          EnsureAlignment(*object.column(i), alignment, memory_pool));
+                          EnsureAlignment(object->column(i), alignment, memory_pool));
   }
-  return RecordBatch::Make(object.schema(), object.num_rows(), std::move(columns_));
+  return RecordBatch::Make(object->schema(), object->num_rows(), std::move(columns_));
 }
 
-Result<std::shared_ptr<Table>> EnsureAlignment(const Table& object, int64_t alignment,
+Result<std::shared_ptr<Table>> EnsureAlignment(const std::shared_ptr<Table>& object, int64_t alignment,

Review Comment:
   ```suggestion
   Result<std::shared_ptr<Table>> EnsureAlignment(const std::shared_ptr<Table>& table, int64_t alignment,
   ```



##########
cpp/src/arrow/util/align_util.h:
##########
@@ -68,19 +68,25 @@ inline BitmapWordAlignParams BitmapWordAlign(const uint8_t* data, int64_t bit_of
 
 namespace util {
 
-Result<std::shared_ptr<Array>> EnsureAlignment(const Array& object, int64_t alignment,
-                                               MemoryPool* memory_pool);
+Result<std::shared_ptr<Buffer>> EnsureAlignment(const std::shared_ptr<Buffer>& object,
+                                                int64_t alignment,
+                                                MemoryPool* memory_pool);
 
-Result<std::shared_ptr<ChunkedArray>> EnsureAlignment(const ChunkedArray& object,
-                                                      int64_t alignment,
-                                                      MemoryPool* memory_pool);
+Result<std::shared_ptr<Array>> EnsureAlignment(const std::shared_ptr<Array>& object,
+                                               int64_t alignment,
+                                                MemoryPool* memory_pool);
 
-Result<std::shared_ptr<RecordBatch>> EnsureAlignment(const RecordBatch& object,
-                                                     int64_t alignment,
-                                                     MemoryPool* memory_pool);
+Result<std::shared_ptr<ChunkedArray>> EnsureAlignment(
+    const std::shared_ptr<ChunkedArray>& object, int64_t alignment,
+    MemoryPool* memory_pool);
 
-Result<std::shared_ptr<Table>> EnsureAlignment(const Table& object, int64_t alignment,
-                                               MemoryPool* memory_pool);
+Result<std::shared_ptr<RecordBatch>> EnsureAlignment(
+    const std::shared_ptr<RecordBatch>& object, int64_t alignment,

Review Comment:
   ```suggestion
       const std::shared_ptr<RecordBatch>& batch, int64_t alignment,
   ```



##########
cpp/src/arrow/util/align_util.cc:
##########
@@ -26,56 +26,63 @@ namespace arrow {
 
 namespace util {
 
-Result<std::shared_ptr<Array>> EnsureAlignment(const Array& object, int64_t alignment,
+Result<std::shared_ptr<Buffer>> EnsureAlignment(const std::shared_ptr<Buffer>& object, int64_t alignment,
                                                MemoryPool* memory_pool) {
-  std::vector<std::shared_ptr<Buffer>> buffers_ = object.data()->buffers;
+    auto buffer_address = object->address();
+    if ((buffer_address % alignment) != 0) {
+        ARROW_ASSIGN_OR_RAISE(
+                auto new_buffer, AllocateBuffer(object->size(), alignment, memory_pool));
+        std::memcpy(new_buffer->mutable_data(), object->data(), object->size());
+        return new_buffer;
+    } else {
+        return object;
+    }
+}
+
+Result<std::shared_ptr<Array>> EnsureAlignment(const std::shared_ptr<Array>& object, int64_t alignment,

Review Comment:
   ```suggestion
   Result<std::shared_ptr<Array>> EnsureAlignment(const std::shared_ptr<Array>& array, int64_t alignment,
   ```



##########
cpp/src/arrow/util/align_util.h:
##########
@@ -68,19 +68,25 @@ inline BitmapWordAlignParams BitmapWordAlign(const uint8_t* data, int64_t bit_of
 
 namespace util {
 
-Result<std::shared_ptr<Array>> EnsureAlignment(const Array& object, int64_t alignment,
-                                               MemoryPool* memory_pool);
+Result<std::shared_ptr<Buffer>> EnsureAlignment(const std::shared_ptr<Buffer>& object,
+                                                int64_t alignment,
+                                                MemoryPool* memory_pool);
 
-Result<std::shared_ptr<ChunkedArray>> EnsureAlignment(const ChunkedArray& object,
-                                                      int64_t alignment,
-                                                      MemoryPool* memory_pool);
+Result<std::shared_ptr<Array>> EnsureAlignment(const std::shared_ptr<Array>& object,
+                                               int64_t alignment,
+                                                MemoryPool* memory_pool);
 
-Result<std::shared_ptr<RecordBatch>> EnsureAlignment(const RecordBatch& object,
-                                                     int64_t alignment,
-                                                     MemoryPool* memory_pool);
+Result<std::shared_ptr<ChunkedArray>> EnsureAlignment(
+    const std::shared_ptr<ChunkedArray>& object, int64_t alignment,

Review Comment:
   ```suggestion
       const std::shared_ptr<ChunkedArray>& array, int64_t alignment,
   ```



##########
cpp/src/arrow/util/align_util.h:
##########
@@ -68,19 +68,25 @@ inline BitmapWordAlignParams BitmapWordAlign(const uint8_t* data, int64_t bit_of
 
 namespace util {
 
-Result<std::shared_ptr<Array>> EnsureAlignment(const Array& object, int64_t alignment,
-                                               MemoryPool* memory_pool);
+Result<std::shared_ptr<Buffer>> EnsureAlignment(const std::shared_ptr<Buffer>& object,

Review Comment:
   ```suggestion
   Result<std::shared_ptr<Buffer>> EnsureAlignment(const std::shared_ptr<Buffer>& buffer,
   ```



##########
cpp/src/arrow/util/align_util.h:
##########
@@ -68,19 +68,25 @@ inline BitmapWordAlignParams BitmapWordAlign(const uint8_t* data, int64_t bit_of
 
 namespace util {
 
-Result<std::shared_ptr<Array>> EnsureAlignment(const Array& object, int64_t alignment,
-                                               MemoryPool* memory_pool);
+Result<std::shared_ptr<Buffer>> EnsureAlignment(const std::shared_ptr<Buffer>& object,
+                                                int64_t alignment,
+                                                MemoryPool* memory_pool);
 
-Result<std::shared_ptr<ChunkedArray>> EnsureAlignment(const ChunkedArray& object,
-                                                      int64_t alignment,
-                                                      MemoryPool* memory_pool);
+Result<std::shared_ptr<Array>> EnsureAlignment(const std::shared_ptr<Array>& object,
+                                               int64_t alignment,
+                                                MemoryPool* memory_pool);
 
-Result<std::shared_ptr<RecordBatch>> EnsureAlignment(const RecordBatch& object,
-                                                     int64_t alignment,
-                                                     MemoryPool* memory_pool);
+Result<std::shared_ptr<ChunkedArray>> EnsureAlignment(
+    const std::shared_ptr<ChunkedArray>& object, int64_t alignment,
+    MemoryPool* memory_pool);
 
-Result<std::shared_ptr<Table>> EnsureAlignment(const Table& object, int64_t alignment,
-                                               MemoryPool* memory_pool);
+Result<std::shared_ptr<RecordBatch>> EnsureAlignment(
+    const std::shared_ptr<RecordBatch>& object, int64_t alignment,
+    MemoryPool* memory_pool);
+
+Result<std::shared_ptr<Table>> EnsureAlignment(const std::shared_ptr<Table>& object,

Review Comment:
   ```suggestion
   Result<std::shared_ptr<Table>> EnsureAlignment(const std::shared_ptr<Table>& table,
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] github-actions[bot] commented on pull request #14758: GH-33317: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #14758:
URL: https://github.com/apache/arrow/pull/14758#issuecomment-1483194533

   :warning: GitHub issue #33317 **has been automatically assigned in GitHub** to PR creator.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] github-actions[bot] commented on pull request #14758: ARROW-18119: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #14758:
URL: https://github.com/apache/arrow/pull/14758#issuecomment-1330227379

   https://issues.apache.org/jira/browse/ARROW-18119


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] sanjibansg commented on a diff in pull request #14758: ARROW-18119: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "sanjibansg (via GitHub)" <gi...@apache.org>.
sanjibansg commented on code in PR #14758:
URL: https://github.com/apache/arrow/pull/14758#discussion_r1107857289


##########
cpp/src/arrow/util/align_util.h:
##########
@@ -64,5 +66,23 @@ inline BitmapWordAlignParams BitmapWordAlign(const uint8_t* data, int64_t bit_of
   return p;
 }
 
+template <typename T>
+Status EnsureAlignment(std::shared_ptr<T> object, int64_t alignment,
+                       MemoryPool* memory_pool) {
+  std::vector<std::shared_ptr<Buffer>> buffers_ = object->data()->buffers;
+  for (size_t i = 0; i < buffers_.size(); ++i) {
+    if (buffers_[i]) {
+      auto buffer_address = buffers_[i]->address();
+      if ((buffer_address % alignment) != 0) {
+        ARROW_ASSIGN_OR_RAISE(
+            auto new_buffer, AllocateBuffer(buffers_[i]->size(), alignment, memory_pool));
+        std::memcpy(new_buffer->mutable_data(), buffers_[i]->data(), buffers_[i]->size());
+        object->data()->buffers[i] = std::move(new_buffer);

Review Comment:
   Made the change, thanks!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] sanjibansg commented on a diff in pull request #14758: ARROW-18119: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "sanjibansg (via GitHub)" <gi...@apache.org>.
sanjibansg commented on code in PR #14758:
URL: https://github.com/apache/arrow/pull/14758#discussion_r1144935112


##########
cpp/src/arrow/util/align_util.cc:
##########
@@ -0,0 +1,215 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/util/align_util.h"
+
+#include "arrow/array.h"
+#include "arrow/chunked_array.h"
+#include "arrow/record_batch.h"
+#include "arrow/table.h"
+
+namespace arrow {
+
+namespace util {
+
+bool CheckAlignment(const Buffer& buffer, const int64_t& alignment) {
+  return buffer.address() % alignment == 0;
+}
+
+bool CheckAlignment(const ArrayData& array, const int64_t& alignment) {
+  for (const auto& buffer : array.buffers) {
+    if (buffer) {
+      if (!CheckAlignment(*buffer, alignment)) return false;
+    }
+  }
+  return true;
+}
+
+bool CheckAlignment(const Array& array, const int64_t& alignment) {
+  bool align_dictionary = true, align_children = true;
+
+  if (array.type()->id() == Type::DICTIONARY) {
+    align_dictionary = CheckAlignment(*array.data()->dictionary, alignment);
+  }
+
+  for (const auto& child : array.data()->child_data) {
+    if (child) {
+      if (!CheckAlignment(*child, alignment)) align_children = false;
+    }
+  }
+  return CheckAlignment(*array.data(), alignment) && align_dictionary && align_children;

Review Comment:
   Moved the checking to the `CheckAlignment` function for `ArrayData`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] sanjibansg commented on a diff in pull request #14758: ARROW-18119: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "sanjibansg (via GitHub)" <gi...@apache.org>.
sanjibansg commented on code in PR #14758:
URL: https://github.com/apache/arrow/pull/14758#discussion_r1144971084


##########
cpp/src/arrow/util/align_util.h:
##########
@@ -63,6 +64,43 @@ inline BitmapWordAlignParams BitmapWordAlign(const uint8_t* data, int64_t bit_of
   p.aligned_start = data + (bit_offset + p.leading_bits) / 8;
   return p;
 }
-
 }  // namespace internal
+
+namespace util {
+
+ARROW_EXPORT bool CheckAlignment(const Buffer& buffer, const int64_t& alignment);
+ARROW_EXPORT bool CheckAlignment(const ArrayData& array, const int64_t& alignment);
+ARROW_EXPORT bool CheckAlignment(const Array& array, const int64_t& alignment);
+ARROW_EXPORT bool CheckAlignment(const ChunkedArray& array, const int64_t& alignment,
+                                 std::vector<bool>& needs_alignment,
+                                 const int& offset = 0);
+ARROW_EXPORT bool CheckAlignment(const RecordBatch& batch, const int64_t& alignment,
+                                 std::vector<bool>& needs_alignment);
+ARROW_EXPORT bool CheckAlignment(const Table& table, const int64_t& alignment,
+                                 std::vector<bool>& needs_alignment);

Review Comment:
   Added documentation for CheckAlignment functions. Thanks for the suggestion.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] sanjibansg commented on a diff in pull request #14758: ARROW-18119: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "sanjibansg (via GitHub)" <gi...@apache.org>.
sanjibansg commented on code in PR #14758:
URL: https://github.com/apache/arrow/pull/14758#discussion_r1144933779


##########
cpp/src/arrow/util/align_util.h:
##########
@@ -63,6 +64,43 @@ inline BitmapWordAlignParams BitmapWordAlign(const uint8_t* data, int64_t bit_of
   p.aligned_start = data + (bit_offset + p.leading_bits) / 8;
   return p;
 }
-
 }  // namespace internal
+
+namespace util {
+
+ARROW_EXPORT bool CheckAlignment(const Buffer& buffer, const int64_t& alignment);

Review Comment:
   Oh yes, sorry, I did that by mistake. Corrected that. Thanks for pointing it out.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] sanjibansg commented on a diff in pull request #14758: ARROW-18119: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "sanjibansg (via GitHub)" <gi...@apache.org>.
sanjibansg commented on code in PR #14758:
URL: https://github.com/apache/arrow/pull/14758#discussion_r1140061051


##########
cpp/src/arrow/util/align_util.cc:
##########
@@ -26,56 +26,63 @@ namespace arrow {
 
 namespace util {
 
-Result<std::shared_ptr<Array>> EnsureAlignment(const Array& object, int64_t alignment,
+Result<std::shared_ptr<Buffer>> EnsureAlignment(const std::shared_ptr<Buffer>& object, int64_t alignment,

Review Comment:
   Made the change. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] sanjibansg commented on a diff in pull request #14758: ARROW-18119: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "sanjibansg (via GitHub)" <gi...@apache.org>.
sanjibansg commented on code in PR #14758:
URL: https://github.com/apache/arrow/pull/14758#discussion_r1147782486


##########
cpp/src/arrow/util/align_util_test.cc:
##########
@@ -59,6 +63,18 @@ void CheckBitmapWordAlign(const uint8_t* data, int64_t bit_offset, int64_t lengt
   }
 }
 
+arrow::Result<std::shared_ptr<Array>> CreateUnalignedArray(std::shared_ptr<Array> array) {
+  BufferVector sliced_buffers(array->data()->buffers.size(), nullptr);
+  for (std::size_t i = 0; i < array->data()->buffers.size(); ++i) {
+    if (array->data()->buffers[i]) {
+      sliced_buffers[i] = array->data()->buffers[i]->CopySlice(0, 2).MoveValueUnsafe();

Review Comment:
   made the change, thanks for the suggestion.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] westonpace commented on a diff in pull request #14758: ARROW-18119: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "westonpace (via GitHub)" <gi...@apache.org>.
westonpace commented on code in PR #14758:
URL: https://github.com/apache/arrow/pull/14758#discussion_r1144027067


##########
cpp/src/arrow/util/align_util.h:
##########
@@ -19,6 +19,7 @@
 
 #include <algorithm>
 
+#include "arrow/memory_pool.h"

Review Comment:
   It must be getting included transitively but we should explicitly include `arrow/type_fwd.h` since we are relying on it with this new code.



##########
cpp/src/arrow/util/align_util.cc:
##########
@@ -0,0 +1,215 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/util/align_util.h"
+
+#include "arrow/array.h"
+#include "arrow/chunked_array.h"
+#include "arrow/record_batch.h"
+#include "arrow/table.h"
+
+namespace arrow {
+
+namespace util {
+
+bool CheckAlignment(const Buffer& buffer, const int64_t& alignment) {
+  return buffer.address() % alignment == 0;
+}
+
+bool CheckAlignment(const ArrayData& array, const int64_t& alignment) {
+  for (const auto& buffer : array.buffers) {
+    if (buffer) {
+      if (!CheckAlignment(*buffer, alignment)) return false;
+    }
+  }
+  return true;
+}
+
+bool CheckAlignment(const Array& array, const int64_t& alignment) {
+  bool align_dictionary = true, align_children = true;
+
+  if (array.type()->id() == Type::DICTIONARY) {
+    align_dictionary = CheckAlignment(*array.data()->dictionary, alignment);
+  }
+
+  for (const auto& child : array.data()->child_data) {
+    if (child) {
+      if (!CheckAlignment(*child, alignment)) align_children = false;
+    }
+  }
+  return CheckAlignment(*array.data(), alignment) && align_dictionary && align_children;

Review Comment:
   Why are you checking dictionary and child arrays here instead of in `bool CheckAlignment(const ArrayData& array, const int64_t& alignment)`?



##########
cpp/src/arrow/util/align_util.h:
##########
@@ -63,6 +64,43 @@ inline BitmapWordAlignParams BitmapWordAlign(const uint8_t* data, int64_t bit_of
   p.aligned_start = data + (bit_offset + p.leading_bits) / 8;
   return p;
 }
-
 }  // namespace internal
+
+namespace util {
+
+ARROW_EXPORT bool CheckAlignment(const Buffer& buffer, const int64_t& alignment);
+ARROW_EXPORT bool CheckAlignment(const ArrayData& array, const int64_t& alignment);
+ARROW_EXPORT bool CheckAlignment(const Array& array, const int64_t& alignment);
+ARROW_EXPORT bool CheckAlignment(const ChunkedArray& array, const int64_t& alignment,
+                                 std::vector<bool>& needs_alignment,
+                                 const int& offset = 0);
+ARROW_EXPORT bool CheckAlignment(const RecordBatch& batch, const int64_t& alignment,
+                                 std::vector<bool>& needs_alignment);
+ARROW_EXPORT bool CheckAlignment(const Table& table, const int64_t& alignment,
+                                 std::vector<bool>& needs_alignment);

Review Comment:
   These methods will need to be documented.  Specifically the `needs_alignment` argument is not obvious.



##########
cpp/src/arrow/util/align_util.cc:
##########
@@ -0,0 +1,215 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/util/align_util.h"
+
+#include "arrow/array.h"
+#include "arrow/chunked_array.h"
+#include "arrow/record_batch.h"
+#include "arrow/table.h"
+
+namespace arrow {
+
+namespace util {
+
+bool CheckAlignment(const Buffer& buffer, const int64_t& alignment) {
+  return buffer.address() % alignment == 0;
+}
+
+bool CheckAlignment(const ArrayData& array, const int64_t& alignment) {
+  for (const auto& buffer : array.buffers) {
+    if (buffer) {
+      if (!CheckAlignment(*buffer, alignment)) return false;
+    }
+  }
+  return true;
+}
+
+bool CheckAlignment(const Array& array, const int64_t& alignment) {
+  bool align_dictionary = true, align_children = true;
+
+  if (array.type()->id() == Type::DICTIONARY) {
+    align_dictionary = CheckAlignment(*array.data()->dictionary, alignment);
+  }
+
+  for (const auto& child : array.data()->child_data) {
+    if (child) {
+      if (!CheckAlignment(*child, alignment)) align_children = false;
+    }
+  }
+  return CheckAlignment(*array.data(), alignment) && align_dictionary && align_children;
+}
+
+bool CheckAlignment(const ChunkedArray& array, const int64_t& alignment,
+                    std::vector<bool>& needs_alignment, const int& offset) {
+  needs_alignment.resize(needs_alignment.size() + array.num_chunks(), false);
+  for (auto i = 0; i < array.num_chunks(); ++i) {
+    if (array.chunk(i) && !CheckAlignment(*array.chunk(i), alignment))
+      needs_alignment[i + offset] = true;
+  }
+  return std::find(needs_alignment.begin() + offset,

Review Comment:
   Instead of doing `std::find` at the end (which will loop through `needs_alignment`) could you do something like...
   
   ```
   bool at_least_one_needs_alignment = false;
   for (...) {
     if (...) {
       needs_alignment[i + offset] = true;
       at_least_one_needs_alignment = true;
     }
   }
   return at_least_one_needs_alignment;
   ```



##########
cpp/src/arrow/util/align_util_test.cc:
##########
@@ -145,6 +150,84 @@ TEST(BitmapWordAlign, UnalignedDataStart) {
   CheckBitmapWordAlign<8>(P, 1017, 64, {63, 1, 1080, A, 0, 0});
   CheckBitmapWordAlign<8>(P, 1017, 128, {63, 1, 1144, A + 128, 64, 1});
 }
-
 }  // namespace internal
+
+TEST(EnsureAlignment, Array) {
+  MemoryPool* pool = default_memory_pool();
+  auto rand = ::arrow::random::RandomArrayGenerator(1923);
+  auto random_array = rand.UInt8(/*size*/ 50, /*min*/ 0, /*max*/ 100,
+                                 /*null_probability*/ 0, /*alignment*/ 512, pool);
+  ASSERT_OK_AND_ASSIGN(auto aligned_array,

Review Comment:
   Maybe assert that `CheckAlignment(..., 1024)` returns `false` before you call `EnsureAlignment`?



##########
cpp/src/arrow/util/align_util.h:
##########
@@ -63,6 +64,43 @@ inline BitmapWordAlignParams BitmapWordAlign(const uint8_t* data, int64_t bit_of
   p.aligned_start = data + (bit_offset + p.leading_bits) / 8;
   return p;
 }
-
 }  // namespace internal
+
+namespace util {
+
+ARROW_EXPORT bool CheckAlignment(const Buffer& buffer, const int64_t& alignment);
+ARROW_EXPORT bool CheckAlignment(const ArrayData& array, const int64_t& alignment);
+ARROW_EXPORT bool CheckAlignment(const Array& array, const int64_t& alignment);
+ARROW_EXPORT bool CheckAlignment(const ChunkedArray& array, const int64_t& alignment,
+                                 std::vector<bool>& needs_alignment,
+                                 const int& offset = 0);

Review Comment:
   ```suggestion
   ARROW_EXPORT bool CheckAlignment(const ChunkedArray& array, const int64_t& alignment,
                                    std::vector<bool>* needs_alignment,
                                    const int& offset = 0);
   ```
   This works but the Arrow style guidelines prefer passing mutable parameters by pointer instead of mutable reference.
   
   For example:
   
   ```
   void Foo(int & value);
   void Bar(int * value);
   
   int x = 7;
   Foo(x); // BAD: not obvious that `x` is being modified
   Bar(&x); // GOOD: the `&` helps us know that `x` is being modified
   ```



##########
cpp/src/arrow/util/align_util.cc:
##########
@@ -0,0 +1,215 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/util/align_util.h"
+
+#include "arrow/array.h"
+#include "arrow/chunked_array.h"
+#include "arrow/record_batch.h"
+#include "arrow/table.h"
+
+namespace arrow {
+
+namespace util {
+
+bool CheckAlignment(const Buffer& buffer, const int64_t& alignment) {
+  return buffer.address() % alignment == 0;
+}
+
+bool CheckAlignment(const ArrayData& array, const int64_t& alignment) {
+  for (const auto& buffer : array.buffers) {
+    if (buffer) {
+      if (!CheckAlignment(*buffer, alignment)) return false;
+    }
+  }
+  return true;
+}
+
+bool CheckAlignment(const Array& array, const int64_t& alignment) {
+  bool align_dictionary = true, align_children = true;
+
+  if (array.type()->id() == Type::DICTIONARY) {
+    align_dictionary = CheckAlignment(*array.data()->dictionary, alignment);
+  }
+
+  for (const auto& child : array.data()->child_data) {
+    if (child) {
+      if (!CheckAlignment(*child, alignment)) align_children = false;

Review Comment:
   Could you just return false here and break out of the loop early?



##########
cpp/src/arrow/util/align_util.h:
##########
@@ -63,6 +64,43 @@ inline BitmapWordAlignParams BitmapWordAlign(const uint8_t* data, int64_t bit_of
   p.aligned_start = data + (bit_offset + p.leading_bits) / 8;
   return p;
 }
-
 }  // namespace internal
+
+namespace util {
+
+ARROW_EXPORT bool CheckAlignment(const Buffer& buffer, const int64_t& alignment);

Review Comment:
   ```suggestion
   ARROW_EXPORT bool CheckAlignment(const Buffer& buffer, int64_t alignment);
   ```
   Passing `int64_t` by const reference is not a good idea. Primitive / simple types should be passed by value (here and in all the other functions)



##########
cpp/src/arrow/util/align_util.cc:
##########
@@ -0,0 +1,215 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/util/align_util.h"
+
+#include "arrow/array.h"
+#include "arrow/chunked_array.h"
+#include "arrow/record_batch.h"
+#include "arrow/table.h"
+
+namespace arrow {
+
+namespace util {
+
+bool CheckAlignment(const Buffer& buffer, const int64_t& alignment) {
+  return buffer.address() % alignment == 0;
+}
+
+bool CheckAlignment(const ArrayData& array, const int64_t& alignment) {
+  for (const auto& buffer : array.buffers) {
+    if (buffer) {
+      if (!CheckAlignment(*buffer, alignment)) return false;
+    }
+  }
+  return true;
+}
+
+bool CheckAlignment(const Array& array, const int64_t& alignment) {
+  bool align_dictionary = true, align_children = true;
+
+  if (array.type()->id() == Type::DICTIONARY) {
+    align_dictionary = CheckAlignment(*array.data()->dictionary, alignment);
+  }
+
+  for (const auto& child : array.data()->child_data) {
+    if (child) {
+      if (!CheckAlignment(*child, alignment)) align_children = false;
+    }
+  }
+  return CheckAlignment(*array.data(), alignment) && align_dictionary && align_children;
+}
+
+bool CheckAlignment(const ChunkedArray& array, const int64_t& alignment,
+                    std::vector<bool>& needs_alignment, const int& offset) {
+  needs_alignment.resize(needs_alignment.size() + array.num_chunks(), false);

Review Comment:
   Why `needs_alignment.size() + array.num_chunks()` and not `offset + array.num_chunks()`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] github-actions[bot] commented on pull request #14758: GH-33317: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #14758:
URL: https://github.com/apache/arrow/pull/14758#issuecomment-1483194473

   * Closes: #33317


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] sanjibansg commented on a diff in pull request #14758: ARROW-18119: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "sanjibansg (via GitHub)" <gi...@apache.org>.
sanjibansg commented on code in PR #14758:
URL: https://github.com/apache/arrow/pull/14758#discussion_r1140060860


##########
cpp/src/arrow/util/align_util.cc:
##########
@@ -26,56 +26,63 @@ namespace arrow {
 
 namespace util {
 
-Result<std::shared_ptr<Array>> EnsureAlignment(const Array& object, int64_t alignment,
+Result<std::shared_ptr<Buffer>> EnsureAlignment(const std::shared_ptr<Buffer>& object, int64_t alignment,
                                                MemoryPool* memory_pool) {

Review Comment:
   Made the change, thanks!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] sanjibansg commented on a diff in pull request #14758: ARROW-18119: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "sanjibansg (via GitHub)" <gi...@apache.org>.
sanjibansg commented on code in PR #14758:
URL: https://github.com/apache/arrow/pull/14758#discussion_r1144933081


##########
cpp/src/arrow/util/align_util.h:
##########
@@ -19,6 +19,7 @@
 
 #include <algorithm>
 
+#include "arrow/memory_pool.h"

Review Comment:
   Included the header file explicitly.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] sanjibansg commented on a diff in pull request #14758: ARROW-18119: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "sanjibansg (via GitHub)" <gi...@apache.org>.
sanjibansg commented on code in PR #14758:
URL: https://github.com/apache/arrow/pull/14758#discussion_r1144935492


##########
cpp/src/arrow/util/align_util.cc:
##########
@@ -0,0 +1,215 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/util/align_util.h"
+
+#include "arrow/array.h"
+#include "arrow/chunked_array.h"
+#include "arrow/record_batch.h"
+#include "arrow/table.h"
+
+namespace arrow {
+
+namespace util {
+
+bool CheckAlignment(const Buffer& buffer, const int64_t& alignment) {
+  return buffer.address() % alignment == 0;
+}
+
+bool CheckAlignment(const ArrayData& array, const int64_t& alignment) {
+  for (const auto& buffer : array.buffers) {
+    if (buffer) {
+      if (!CheckAlignment(*buffer, alignment)) return false;
+    }
+  }
+  return true;
+}
+
+bool CheckAlignment(const Array& array, const int64_t& alignment) {
+  bool align_dictionary = true, align_children = true;
+
+  if (array.type()->id() == Type::DICTIONARY) {
+    align_dictionary = CheckAlignment(*array.data()->dictionary, alignment);
+  }
+
+  for (const auto& child : array.data()->child_data) {
+    if (child) {
+      if (!CheckAlignment(*child, alignment)) align_children = false;

Review Comment:
   Made the change, thanks.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] sanjibansg commented on a diff in pull request #14758: ARROW-18119: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "sanjibansg (via GitHub)" <gi...@apache.org>.
sanjibansg commented on code in PR #14758:
URL: https://github.com/apache/arrow/pull/14758#discussion_r1144950111


##########
cpp/src/arrow/util/align_util.cc:
##########
@@ -0,0 +1,215 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/util/align_util.h"
+
+#include "arrow/array.h"
+#include "arrow/chunked_array.h"
+#include "arrow/record_batch.h"
+#include "arrow/table.h"
+
+namespace arrow {
+
+namespace util {
+
+bool CheckAlignment(const Buffer& buffer, const int64_t& alignment) {
+  return buffer.address() % alignment == 0;
+}
+
+bool CheckAlignment(const ArrayData& array, const int64_t& alignment) {
+  for (const auto& buffer : array.buffers) {
+    if (buffer) {
+      if (!CheckAlignment(*buffer, alignment)) return false;
+    }
+  }
+  return true;
+}
+
+bool CheckAlignment(const Array& array, const int64_t& alignment) {
+  bool align_dictionary = true, align_children = true;
+
+  if (array.type()->id() == Type::DICTIONARY) {
+    align_dictionary = CheckAlignment(*array.data()->dictionary, alignment);
+  }
+
+  for (const auto& child : array.data()->child_data) {
+    if (child) {
+      if (!CheckAlignment(*child, alignment)) align_children = false;
+    }
+  }
+  return CheckAlignment(*array.data(), alignment) && align_dictionary && align_children;
+}
+
+bool CheckAlignment(const ChunkedArray& array, const int64_t& alignment,
+                    std::vector<bool>& needs_alignment, const int& offset) {
+  needs_alignment.resize(needs_alignment.size() + array.num_chunks(), false);

Review Comment:
   The resize is basically to add space for the extra elements that should be added for the check. 
   For example, if we have a table made up of 2 chunked arrays, which then are made of 2 arrays each. Then, initially `needs_alignment` will be of size 2 (because of 2 chunked arrays). When the `CheckAlignment()` is called for the first chunked array, `needs_alignment` will be resized to 4 `(/*needs_alignment.size()/* 2 + /*array.num_chunks()*/ 2), thus now in needs_alignment, first two bits indicate status of the 2 arrays of the first chunk array, the third one for the ChunkArray as a whole, and the last one for the other chunk array which yet to be checked. The similar operation gets repeated for the other chunked array.
   
   Instead if we use offset, then for the first iteration here, offset will be 0. So, the resize will enforce its size to 2, which will not be enough for storing the alignment  check bits of the first two arrays of the first chunk array, and the second chunk array.
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] github-actions[bot] commented on pull request #14758: GH-33317: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #14758:
URL: https://github.com/apache/arrow/pull/14758#issuecomment-1485869332

   Revision: 221426d3f031853b472796be07350a11402d2466
   
   Submitted crossbow builds: [ursacomputing/crossbow @ actions-a124ab2af3](https://github.com/ursacomputing/crossbow/branches/all?query=actions-a124ab2af3)
   
   |Task|Status|
   |----|------|
   |amazon-linux-2-amd64|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-a124ab2af3-github-amazon-linux-2-amd64)](https://github.com/ursacomputing/crossbow/actions/runs/4536645848/jobs/7993588551)|


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kou commented on pull request #14758: GH-33317: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on PR #14758:
URL: https://github.com/apache/arrow/pull/14758#issuecomment-1485864011

   Thanks!
   Could you open a new issue?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] sanjibansg commented on a diff in pull request #14758: ARROW-18119: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "sanjibansg (via GitHub)" <gi...@apache.org>.
sanjibansg commented on code in PR #14758:
URL: https://github.com/apache/arrow/pull/14758#discussion_r1140061814


##########
cpp/src/arrow/util/align_util.cc:
##########
@@ -26,56 +26,63 @@ namespace arrow {
 
 namespace util {
 
-Result<std::shared_ptr<Array>> EnsureAlignment(const Array& object, int64_t alignment,
+Result<std::shared_ptr<Buffer>> EnsureAlignment(const std::shared_ptr<Buffer>& object, int64_t alignment,
                                                MemoryPool* memory_pool) {
-  std::vector<std::shared_ptr<Buffer>> buffers_ = object.data()->buffers;
+    auto buffer_address = object->address();
+    if ((buffer_address % alignment) != 0) {
+        ARROW_ASSIGN_OR_RAISE(
+                auto new_buffer, AllocateBuffer(object->size(), alignment, memory_pool));
+        std::memcpy(new_buffer->mutable_data(), object->data(), object->size());
+        return new_buffer;
+    } else {
+        return object;
+    }
+}
+
+Result<std::shared_ptr<Array>> EnsureAlignment(const std::shared_ptr<Array>& object, int64_t alignment,

Review Comment:
   Made the change.



##########
cpp/src/arrow/util/align_util.h:
##########
@@ -68,19 +68,25 @@ inline BitmapWordAlignParams BitmapWordAlign(const uint8_t* data, int64_t bit_of
 
 namespace util {
 
-Result<std::shared_ptr<Array>> EnsureAlignment(const Array& object, int64_t alignment,
-                                               MemoryPool* memory_pool);
+Result<std::shared_ptr<Buffer>> EnsureAlignment(const std::shared_ptr<Buffer>& object,

Review Comment:
   Made the change.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] sanjibansg commented on a diff in pull request #14758: ARROW-18119: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "sanjibansg (via GitHub)" <gi...@apache.org>.
sanjibansg commented on code in PR #14758:
URL: https://github.com/apache/arrow/pull/14758#discussion_r1144950111


##########
cpp/src/arrow/util/align_util.cc:
##########
@@ -0,0 +1,215 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/util/align_util.h"
+
+#include "arrow/array.h"
+#include "arrow/chunked_array.h"
+#include "arrow/record_batch.h"
+#include "arrow/table.h"
+
+namespace arrow {
+
+namespace util {
+
+bool CheckAlignment(const Buffer& buffer, const int64_t& alignment) {
+  return buffer.address() % alignment == 0;
+}
+
+bool CheckAlignment(const ArrayData& array, const int64_t& alignment) {
+  for (const auto& buffer : array.buffers) {
+    if (buffer) {
+      if (!CheckAlignment(*buffer, alignment)) return false;
+    }
+  }
+  return true;
+}
+
+bool CheckAlignment(const Array& array, const int64_t& alignment) {
+  bool align_dictionary = true, align_children = true;
+
+  if (array.type()->id() == Type::DICTIONARY) {
+    align_dictionary = CheckAlignment(*array.data()->dictionary, alignment);
+  }
+
+  for (const auto& child : array.data()->child_data) {
+    if (child) {
+      if (!CheckAlignment(*child, alignment)) align_children = false;
+    }
+  }
+  return CheckAlignment(*array.data(), alignment) && align_dictionary && align_children;
+}
+
+bool CheckAlignment(const ChunkedArray& array, const int64_t& alignment,
+                    std::vector<bool>& needs_alignment, const int& offset) {
+  needs_alignment.resize(needs_alignment.size() + array.num_chunks(), false);

Review Comment:
   The resize is basically to add space for the extra elements that should be added for the check. 
   For example, if we have a table made up of 2 chunked arrays, which then are made of 2 arrays each. Then, initially `needs_alignment` will be of size 2 (because of 2 chunked arrays). When the `CheckAlignment()` is called for the first chunked array, `needs_alignment` will be resized to 4 `(/*needs_alignment.size()/* 2 + /*array.num_chunks()*/ 2)`, thus now in needs_alignment, first two bits indicate status of the 2 arrays of the first chunk array, the third one for the ChunkArray as a whole, and the last one for the other chunk array which yet to be checked. The similar operation gets repeated for the other chunked array.
   
   Instead if we use offset, then for the first iteration here, offset will be 0. So, the resize will enforce its size to 2, which will not be enough for storing the alignment  check bits of the first two arrays of the first chunk array, and the second chunk array.
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] westonpace commented on a diff in pull request #14758: ARROW-18119: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "westonpace (via GitHub)" <gi...@apache.org>.
westonpace commented on code in PR #14758:
URL: https://github.com/apache/arrow/pull/14758#discussion_r1091291897


##########
cpp/src/arrow/util/align_util.h:
##########
@@ -64,5 +66,23 @@ inline BitmapWordAlignParams BitmapWordAlign(const uint8_t* data, int64_t bit_of
   return p;
 }
 
+template <typename T>
+Status EnsureAlignment(std::shared_ptr<T> object, int64_t alignment,

Review Comment:
   This method doesn't recurse into child arrays.  For example, a `StructArray` is going to have `object->children()` that will need to be visited as well.



##########
cpp/src/arrow/util/align_util.h:
##########
@@ -64,5 +66,23 @@ inline BitmapWordAlignParams BitmapWordAlign(const uint8_t* data, int64_t bit_of
   return p;
 }
 
+template <typename T>

Review Comment:
   Why is this templated?  Right now this relies on `T::data()` existing.  What other types of `T` do you expect to accept?



##########
cpp/src/arrow/util/align_util.h:
##########
@@ -64,5 +66,23 @@ inline BitmapWordAlignParams BitmapWordAlign(const uint8_t* data, int64_t bit_of
   return p;
 }
 
+template <typename T>
+Status EnsureAlignment(std::shared_ptr<T> object, int64_t alignment,
+                       MemoryPool* memory_pool) {
+  std::vector<std::shared_ptr<Buffer>> buffers_ = object->data()->buffers;
+  for (size_t i = 0; i < buffers_.size(); ++i) {
+    if (buffers_[i]) {
+      auto buffer_address = buffers_[i]->address();
+      if ((buffer_address % alignment) != 0) {
+        ARROW_ASSIGN_OR_RAISE(
+            auto new_buffer, AllocateBuffer(buffers_[i]->size(), alignment, memory_pool));
+        std::memcpy(new_buffer->mutable_data(), buffers_[i]->data(), buffers_[i]->size());
+        object->data()->buffers[i] = std::move(new_buffer);

Review Comment:
   It'd be nice (and more consistent with the rest of the code base) if we could do this by returning a new `std::shared_ptr<Array>` instead of modifying the existing array in-place.  You should be able to create a new `ArrayData` with the new buffers and then create a new `Array` from that.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] sanjibansg commented on a diff in pull request #14758: ARROW-18119: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "sanjibansg (via GitHub)" <gi...@apache.org>.
sanjibansg commented on code in PR #14758:
URL: https://github.com/apache/arrow/pull/14758#discussion_r1140062995


##########
cpp/src/arrow/util/align_util.cc:
##########
@@ -26,56 +26,63 @@ namespace arrow {
 
 namespace util {
 
-Result<std::shared_ptr<Array>> EnsureAlignment(const Array& object, int64_t alignment,
+Result<std::shared_ptr<Buffer>> EnsureAlignment(const std::shared_ptr<Buffer>& object, int64_t alignment,
                                                MemoryPool* memory_pool) {
-  std::vector<std::shared_ptr<Buffer>> buffers_ = object.data()->buffers;
+    auto buffer_address = object->address();
+    if ((buffer_address % alignment) != 0) {
+        ARROW_ASSIGN_OR_RAISE(
+                auto new_buffer, AllocateBuffer(object->size(), alignment, memory_pool));
+        std::memcpy(new_buffer->mutable_data(), object->data(), object->size());
+        return new_buffer;
+    } else {
+        return object;
+    }
+}
+
+Result<std::shared_ptr<Array>> EnsureAlignment(const std::shared_ptr<Array>& object, int64_t alignment,
+                                               MemoryPool* memory_pool) {
+  std::vector<std::shared_ptr<Buffer>> buffers_ = object->data()->buffers;
   for (size_t i = 0; i < buffers_.size(); ++i) {
     if (buffers_[i]) {
-      auto buffer_address = buffers_[i]->address();
-      if ((buffer_address % alignment) != 0) {
-        ARROW_ASSIGN_OR_RAISE(
-            auto new_buffer, AllocateBuffer(buffers_[i]->size(), alignment, memory_pool));
-        std::memcpy(new_buffer->mutable_data(), buffers_[i]->data(), buffers_[i]->size());
-        buffers_[i] = std::move(new_buffer);
+        ARROW_ASSIGN_OR_RAISE(buffers_[i], EnsureAlignment(buffers_[i], alignment, memory_pool));
       }
-    }
   }
   auto new_array_data =
-      ArrayData::Make(object.data()->type, object.data()->length, std::move(buffers_),
-                      object.data()->GetNullCount(), object.data()->offset);
+      ArrayData::Make(object->data()->type, object->data()->length, std::move(buffers_),
+                      object->data()->GetNullCount(), object->data()->offset);
   return MakeArray(new_array_data);
 }
 
-Result<std::shared_ptr<ChunkedArray>> EnsureAlignment(const ChunkedArray& object,
+Result<std::shared_ptr<ChunkedArray>> EnsureAlignment(const std::shared_ptr<ChunkedArray>& object,
                                                       int64_t alignment,
                                                       MemoryPool* memory_pool) {
-  ArrayVector chunks_ = object.chunks();
-  for (int i = 0; i < object.num_chunks(); ++i) {
+  ArrayVector chunks_ = object->chunks();
+  for (int i = 0; i < object->num_chunks(); ++i) {
     ARROW_ASSIGN_OR_RAISE(chunks_[i],
-                          EnsureAlignment(*object.chunk(i), alignment, memory_pool));
+                          EnsureAlignment(object->chunk(i), alignment, memory_pool));
   }
-  return ChunkedArray::Make(std::move(chunks_), object.type());
+  return ChunkedArray::Make(std::move(chunks_), object->type());
 }
 
-Result<std::shared_ptr<RecordBatch>> EnsureAlignment(const RecordBatch& object,
+Result<std::shared_ptr<RecordBatch>> EnsureAlignment(const std::shared_ptr<RecordBatch>& object,

Review Comment:
   Made the change.



##########
cpp/src/arrow/util/align_util.cc:
##########
@@ -26,56 +26,63 @@ namespace arrow {
 
 namespace util {
 
-Result<std::shared_ptr<Array>> EnsureAlignment(const Array& object, int64_t alignment,
+Result<std::shared_ptr<Buffer>> EnsureAlignment(const std::shared_ptr<Buffer>& object, int64_t alignment,
                                                MemoryPool* memory_pool) {
-  std::vector<std::shared_ptr<Buffer>> buffers_ = object.data()->buffers;
+    auto buffer_address = object->address();
+    if ((buffer_address % alignment) != 0) {
+        ARROW_ASSIGN_OR_RAISE(
+                auto new_buffer, AllocateBuffer(object->size(), alignment, memory_pool));
+        std::memcpy(new_buffer->mutable_data(), object->data(), object->size());
+        return new_buffer;
+    } else {
+        return object;
+    }
+}
+
+Result<std::shared_ptr<Array>> EnsureAlignment(const std::shared_ptr<Array>& object, int64_t alignment,
+                                               MemoryPool* memory_pool) {
+  std::vector<std::shared_ptr<Buffer>> buffers_ = object->data()->buffers;
   for (size_t i = 0; i < buffers_.size(); ++i) {
     if (buffers_[i]) {
-      auto buffer_address = buffers_[i]->address();
-      if ((buffer_address % alignment) != 0) {
-        ARROW_ASSIGN_OR_RAISE(
-            auto new_buffer, AllocateBuffer(buffers_[i]->size(), alignment, memory_pool));
-        std::memcpy(new_buffer->mutable_data(), buffers_[i]->data(), buffers_[i]->size());
-        buffers_[i] = std::move(new_buffer);
+        ARROW_ASSIGN_OR_RAISE(buffers_[i], EnsureAlignment(buffers_[i], alignment, memory_pool));
       }
-    }
   }
   auto new_array_data =
-      ArrayData::Make(object.data()->type, object.data()->length, std::move(buffers_),
-                      object.data()->GetNullCount(), object.data()->offset);
+      ArrayData::Make(object->data()->type, object->data()->length, std::move(buffers_),
+                      object->data()->GetNullCount(), object->data()->offset);
   return MakeArray(new_array_data);
 }
 
-Result<std::shared_ptr<ChunkedArray>> EnsureAlignment(const ChunkedArray& object,
+Result<std::shared_ptr<ChunkedArray>> EnsureAlignment(const std::shared_ptr<ChunkedArray>& object,
                                                       int64_t alignment,
                                                       MemoryPool* memory_pool) {
-  ArrayVector chunks_ = object.chunks();
-  for (int i = 0; i < object.num_chunks(); ++i) {
+  ArrayVector chunks_ = object->chunks();
+  for (int i = 0; i < object->num_chunks(); ++i) {
     ARROW_ASSIGN_OR_RAISE(chunks_[i],
-                          EnsureAlignment(*object.chunk(i), alignment, memory_pool));
+                          EnsureAlignment(object->chunk(i), alignment, memory_pool));
   }
-  return ChunkedArray::Make(std::move(chunks_), object.type());
+  return ChunkedArray::Make(std::move(chunks_), object->type());
 }
 
-Result<std::shared_ptr<RecordBatch>> EnsureAlignment(const RecordBatch& object,
+Result<std::shared_ptr<RecordBatch>> EnsureAlignment(const std::shared_ptr<RecordBatch>& object,
                                                      int64_t alignment,
                                                      MemoryPool* memory_pool) {
-  ArrayVector columns_ = object.columns();
-  for (int i = 0; i < object.num_columns(); ++i) {
+  ArrayVector columns_ = object->columns();
+  for (int i = 0; i < object->num_columns(); ++i) {
     ARROW_ASSIGN_OR_RAISE(columns_[i],
-                          EnsureAlignment(*object.column(i), alignment, memory_pool));
+                          EnsureAlignment(object->column(i), alignment, memory_pool));
   }
-  return RecordBatch::Make(object.schema(), object.num_rows(), std::move(columns_));
+  return RecordBatch::Make(object->schema(), object->num_rows(), std::move(columns_));
 }
 
-Result<std::shared_ptr<Table>> EnsureAlignment(const Table& object, int64_t alignment,
+Result<std::shared_ptr<Table>> EnsureAlignment(const std::shared_ptr<Table>& object, int64_t alignment,

Review Comment:
   Made the change.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] sanjibansg commented on pull request #14758: ARROW-18119: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "sanjibansg (via GitHub)" <gi...@apache.org>.
sanjibansg commented on PR #14758:
URL: https://github.com/apache/arrow/pull/14758#issuecomment-1473639765

   > In most cases the alignment will already be correct. We want this method to be really fast in that scenario. So this means that methods should look like...
   > 
   > ```
   > std::shared_ptr<Array> Foo(std::shared_ptr<Array> thing) {
   >   if (needs_alignment(...)) {
   >     // a copy needs to be made
   >   } else {
   >     return std::move(thing);
   >   }
   > }
   > ```
   > 
   > This way we can call it like so...
   > 
   > ```
   > std::shared_ptr<Array> aligned = EnsureAlignment(std::move(unaligned), ...);
   > ```
   > 
   > Then, if `unaligned` is already aligned it will be a straight move into `aligned` without ever making any copies (not even copies of `shared_ptr`)
   
   The current implementation might be a bit complicated, so comments and suggestions for optimized approaches will be very helpful. Thanks in advance! 
   
   Currently, we have `Check` functions for various Arrow objects, which check their constituents for alignments. For `ChunkedArray`, `RecordBatch`, and `Table`, we use a boolean vector to track the constituents which require alignment thus avoiding unnecessary function calls.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] westonpace merged pull request #14758: GH-33317: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "westonpace (via GitHub)" <gi...@apache.org>.
westonpace merged PR #14758:
URL: https://github.com/apache/arrow/pull/14758


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] sanjibansg commented on pull request #14758: GH-33317: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "sanjibansg (via GitHub)" <gi...@apache.org>.
sanjibansg commented on PR #14758:
URL: https://github.com/apache/arrow/pull/14758#issuecomment-1486389260

   @kou It is passing now! https://github.com/apache/arrow/pull/34754
   
   cc: @westonpace 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] sanjibansg commented on a diff in pull request #14758: ARROW-18119: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "sanjibansg (via GitHub)" <gi...@apache.org>.
sanjibansg commented on code in PR #14758:
URL: https://github.com/apache/arrow/pull/14758#discussion_r1144936093


##########
cpp/src/arrow/util/align_util.cc:
##########
@@ -0,0 +1,215 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/util/align_util.h"
+
+#include "arrow/array.h"
+#include "arrow/chunked_array.h"
+#include "arrow/record_batch.h"
+#include "arrow/table.h"
+
+namespace arrow {
+
+namespace util {
+
+bool CheckAlignment(const Buffer& buffer, const int64_t& alignment) {
+  return buffer.address() % alignment == 0;
+}
+
+bool CheckAlignment(const ArrayData& array, const int64_t& alignment) {
+  for (const auto& buffer : array.buffers) {
+    if (buffer) {
+      if (!CheckAlignment(*buffer, alignment)) return false;
+    }
+  }
+  return true;
+}
+
+bool CheckAlignment(const Array& array, const int64_t& alignment) {
+  bool align_dictionary = true, align_children = true;
+
+  if (array.type()->id() == Type::DICTIONARY) {
+    align_dictionary = CheckAlignment(*array.data()->dictionary, alignment);
+  }
+
+  for (const auto& child : array.data()->child_data) {
+    if (child) {
+      if (!CheckAlignment(*child, alignment)) align_children = false;
+    }
+  }
+  return CheckAlignment(*array.data(), alignment) && align_dictionary && align_children;
+}
+
+bool CheckAlignment(const ChunkedArray& array, const int64_t& alignment,
+                    std::vector<bool>& needs_alignment, const int& offset) {
+  needs_alignment.resize(needs_alignment.size() + array.num_chunks(), false);
+  for (auto i = 0; i < array.num_chunks(); ++i) {
+    if (array.chunk(i) && !CheckAlignment(*array.chunk(i), alignment))
+      needs_alignment[i + offset] = true;
+  }
+  return std::find(needs_alignment.begin() + offset,

Review Comment:
   Made the change, thanks.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] sanjibansg commented on a diff in pull request #14758: ARROW-18119: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "sanjibansg (via GitHub)" <gi...@apache.org>.
sanjibansg commented on code in PR #14758:
URL: https://github.com/apache/arrow/pull/14758#discussion_r1144934228


##########
cpp/src/arrow/util/align_util.h:
##########
@@ -63,6 +64,43 @@ inline BitmapWordAlignParams BitmapWordAlign(const uint8_t* data, int64_t bit_of
   p.aligned_start = data + (bit_offset + p.leading_bits) / 8;
   return p;
 }
-
 }  // namespace internal
+
+namespace util {
+
+ARROW_EXPORT bool CheckAlignment(const Buffer& buffer, const int64_t& alignment);
+ARROW_EXPORT bool CheckAlignment(const ArrayData& array, const int64_t& alignment);
+ARROW_EXPORT bool CheckAlignment(const Array& array, const int64_t& alignment);
+ARROW_EXPORT bool CheckAlignment(const ChunkedArray& array, const int64_t& alignment,
+                                 std::vector<bool>& needs_alignment,
+                                 const int& offset = 0);

Review Comment:
   Made the change, thanks for the suggestion.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] westonpace commented on a diff in pull request #14758: ARROW-18119: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "westonpace (via GitHub)" <gi...@apache.org>.
westonpace commented on code in PR #14758:
URL: https://github.com/apache/arrow/pull/14758#discussion_r1147835236


##########
cpp/src/arrow/util/align_util_test.cc:
##########
@@ -59,6 +63,18 @@ void CheckBitmapWordAlign(const uint8_t* data, int64_t bit_offset, int64_t lengt
   }
 }
 
+arrow::Result<std::shared_ptr<Array>> CreateUnalignedArray(const Array& array) {
+  BufferVector sliced_buffers(array.data()->buffers.size(), nullptr);

Review Comment:
   ```suggestion
     // Slicing by 1 would create an invalid array if the type was wider than
     // 1 byte so double-check that the array is a 1-byte type
     EXPECT_EQ(array.type_id(), Type::UINT8);
     BufferVector sliced_buffers(array.data()->buffers.size(), nullptr);
   ```



##########
cpp/src/arrow/util/align_util_test.cc:
##########
@@ -59,6 +63,18 @@ void CheckBitmapWordAlign(const uint8_t* data, int64_t bit_offset, int64_t lengt
   }
 }
 
+arrow::Result<std::shared_ptr<Array>> CreateUnalignedArray(const Array& array) {
+  BufferVector sliced_buffers(array.data()->buffers.size(), nullptr);
+  for (std::size_t i = 0; i < array.data()->buffers.size(); ++i) {
+    if (array.data()->buffers[i]) {
+      sliced_buffers[i] = SliceBuffer(array.data()->buffers[i], 1, 2);
+    }
+  }
+  auto sliced_array_data =
+      ArrayData::Make(array.type(), array.length(), std::move(sliced_buffers));

Review Comment:
   ```suggestion
     auto sliced_array_data =
         ArrayData::Make(array.type(), /*length=*/2, std::move(sliced_buffers));
   ```
   
   This works today, but only because you never call `Validate` or any kind of compute function on the returned array.  This could be a time bomb for a future maintainer.  Better to make sure the array is properly formed which means the length of the array must match the length of its buffers.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kou commented on pull request #14758: GH-33317: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on PR #14758:
URL: https://github.com/apache/arrow/pull/14758#issuecomment-1485866456

   FYI: We can test this case by commenting `@github-actions crossbow submit amazon-linux-2-amd64`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] sanjibansg commented on a diff in pull request #14758: ARROW-18119: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "sanjibansg (via GitHub)" <gi...@apache.org>.
sanjibansg commented on code in PR #14758:
URL: https://github.com/apache/arrow/pull/14758#discussion_r1140062709


##########
cpp/src/arrow/util/align_util.h:
##########
@@ -68,19 +68,25 @@ inline BitmapWordAlignParams BitmapWordAlign(const uint8_t* data, int64_t bit_of
 
 namespace util {
 
-Result<std::shared_ptr<Array>> EnsureAlignment(const Array& object, int64_t alignment,
-                                               MemoryPool* memory_pool);
+Result<std::shared_ptr<Buffer>> EnsureAlignment(const std::shared_ptr<Buffer>& object,
+                                                int64_t alignment,
+                                                MemoryPool* memory_pool);
 
-Result<std::shared_ptr<ChunkedArray>> EnsureAlignment(const ChunkedArray& object,
-                                                      int64_t alignment,
-                                                      MemoryPool* memory_pool);
+Result<std::shared_ptr<Array>> EnsureAlignment(const std::shared_ptr<Array>& object,
+                                               int64_t alignment,
+                                                MemoryPool* memory_pool);
 
-Result<std::shared_ptr<RecordBatch>> EnsureAlignment(const RecordBatch& object,
-                                                     int64_t alignment,
-                                                     MemoryPool* memory_pool);
+Result<std::shared_ptr<ChunkedArray>> EnsureAlignment(
+    const std::shared_ptr<ChunkedArray>& object, int64_t alignment,
+    MemoryPool* memory_pool);
 
-Result<std::shared_ptr<Table>> EnsureAlignment(const Table& object, int64_t alignment,
-                                               MemoryPool* memory_pool);
+Result<std::shared_ptr<RecordBatch>> EnsureAlignment(
+    const std::shared_ptr<RecordBatch>& object, int64_t alignment,
+    MemoryPool* memory_pool);
+
+Result<std::shared_ptr<Table>> EnsureAlignment(const std::shared_ptr<Table>& object,

Review Comment:
   Made the change.



##########
cpp/src/arrow/util/align_util.cc:
##########
@@ -26,56 +26,63 @@ namespace arrow {
 
 namespace util {
 
-Result<std::shared_ptr<Array>> EnsureAlignment(const Array& object, int64_t alignment,
+Result<std::shared_ptr<Buffer>> EnsureAlignment(const std::shared_ptr<Buffer>& object, int64_t alignment,
                                                MemoryPool* memory_pool) {
-  std::vector<std::shared_ptr<Buffer>> buffers_ = object.data()->buffers;
+    auto buffer_address = object->address();
+    if ((buffer_address % alignment) != 0) {
+        ARROW_ASSIGN_OR_RAISE(
+                auto new_buffer, AllocateBuffer(object->size(), alignment, memory_pool));
+        std::memcpy(new_buffer->mutable_data(), object->data(), object->size());
+        return new_buffer;
+    } else {
+        return object;
+    }
+}
+
+Result<std::shared_ptr<Array>> EnsureAlignment(const std::shared_ptr<Array>& object, int64_t alignment,
+                                               MemoryPool* memory_pool) {
+  std::vector<std::shared_ptr<Buffer>> buffers_ = object->data()->buffers;
   for (size_t i = 0; i < buffers_.size(); ++i) {
     if (buffers_[i]) {
-      auto buffer_address = buffers_[i]->address();
-      if ((buffer_address % alignment) != 0) {
-        ARROW_ASSIGN_OR_RAISE(
-            auto new_buffer, AllocateBuffer(buffers_[i]->size(), alignment, memory_pool));
-        std::memcpy(new_buffer->mutable_data(), buffers_[i]->data(), buffers_[i]->size());
-        buffers_[i] = std::move(new_buffer);
+        ARROW_ASSIGN_OR_RAISE(buffers_[i], EnsureAlignment(buffers_[i], alignment, memory_pool));
       }
-    }
   }
   auto new_array_data =
-      ArrayData::Make(object.data()->type, object.data()->length, std::move(buffers_),
-                      object.data()->GetNullCount(), object.data()->offset);
+      ArrayData::Make(object->data()->type, object->data()->length, std::move(buffers_),
+                      object->data()->GetNullCount(), object->data()->offset);
   return MakeArray(new_array_data);
 }
 
-Result<std::shared_ptr<ChunkedArray>> EnsureAlignment(const ChunkedArray& object,
+Result<std::shared_ptr<ChunkedArray>> EnsureAlignment(const std::shared_ptr<ChunkedArray>& object,

Review Comment:
   Made the change.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] sanjibansg commented on a diff in pull request #14758: ARROW-18119: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "sanjibansg (via GitHub)" <gi...@apache.org>.
sanjibansg commented on code in PR #14758:
URL: https://github.com/apache/arrow/pull/14758#discussion_r1140062274


##########
cpp/src/arrow/util/align_util.h:
##########
@@ -68,19 +68,25 @@ inline BitmapWordAlignParams BitmapWordAlign(const uint8_t* data, int64_t bit_of
 
 namespace util {
 
-Result<std::shared_ptr<Array>> EnsureAlignment(const Array& object, int64_t alignment,
-                                               MemoryPool* memory_pool);
+Result<std::shared_ptr<Buffer>> EnsureAlignment(const std::shared_ptr<Buffer>& object,
+                                                int64_t alignment,
+                                                MemoryPool* memory_pool);
 
-Result<std::shared_ptr<ChunkedArray>> EnsureAlignment(const ChunkedArray& object,
-                                                      int64_t alignment,
-                                                      MemoryPool* memory_pool);
+Result<std::shared_ptr<Array>> EnsureAlignment(const std::shared_ptr<Array>& object,
+                                               int64_t alignment,
+                                                MemoryPool* memory_pool);
 
-Result<std::shared_ptr<RecordBatch>> EnsureAlignment(const RecordBatch& object,
-                                                     int64_t alignment,
-                                                     MemoryPool* memory_pool);
+Result<std::shared_ptr<ChunkedArray>> EnsureAlignment(
+    const std::shared_ptr<ChunkedArray>& object, int64_t alignment,

Review Comment:
   Made the change.



##########
cpp/src/arrow/util/align_util.h:
##########
@@ -68,19 +68,25 @@ inline BitmapWordAlignParams BitmapWordAlign(const uint8_t* data, int64_t bit_of
 
 namespace util {
 
-Result<std::shared_ptr<Array>> EnsureAlignment(const Array& object, int64_t alignment,
-                                               MemoryPool* memory_pool);
+Result<std::shared_ptr<Buffer>> EnsureAlignment(const std::shared_ptr<Buffer>& object,
+                                                int64_t alignment,
+                                                MemoryPool* memory_pool);
 
-Result<std::shared_ptr<ChunkedArray>> EnsureAlignment(const ChunkedArray& object,
-                                                      int64_t alignment,
-                                                      MemoryPool* memory_pool);
+Result<std::shared_ptr<Array>> EnsureAlignment(const std::shared_ptr<Array>& object,
+                                               int64_t alignment,
+                                                MemoryPool* memory_pool);
 
-Result<std::shared_ptr<RecordBatch>> EnsureAlignment(const RecordBatch& object,
-                                                     int64_t alignment,
-                                                     MemoryPool* memory_pool);
+Result<std::shared_ptr<ChunkedArray>> EnsureAlignment(
+    const std::shared_ptr<ChunkedArray>& object, int64_t alignment,
+    MemoryPool* memory_pool);
 
-Result<std::shared_ptr<Table>> EnsureAlignment(const Table& object, int64_t alignment,
-                                               MemoryPool* memory_pool);
+Result<std::shared_ptr<RecordBatch>> EnsureAlignment(
+    const std::shared_ptr<RecordBatch>& object, int64_t alignment,

Review Comment:
   Made the change.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kou commented on pull request #14758: GH-33317: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on PR #14758:
URL: https://github.com/apache/arrow/pull/14758#issuecomment-1485866554

   @github-actions crossbow submit amazon-linux-2-amd64


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] sanjibansg commented on a diff in pull request #14758: ARROW-18119: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "sanjibansg (via GitHub)" <gi...@apache.org>.
sanjibansg commented on code in PR #14758:
URL: https://github.com/apache/arrow/pull/14758#discussion_r1145371872


##########
cpp/src/arrow/util/align_util_test.cc:
##########
@@ -145,6 +150,84 @@ TEST(BitmapWordAlign, UnalignedDataStart) {
   CheckBitmapWordAlign<8>(P, 1017, 64, {63, 1, 1080, A, 0, 0});
   CheckBitmapWordAlign<8>(P, 1017, 128, {63, 1, 1144, A + 128, 64, 1});
 }
-
 }  // namespace internal
+
+TEST(EnsureAlignment, Array) {
+  MemoryPool* pool = default_memory_pool();
+  auto rand = ::arrow::random::RandomArrayGenerator(1923);
+  auto random_array = rand.UInt8(/*size*/ 50, /*min*/ 0, /*max*/ 100,
+                                 /*null_probability*/ 0, /*alignment*/ 512, pool);
+  ASSERT_OK_AND_ASSIGN(auto aligned_array,

Review Comment:
   Made the change, thanks.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] westonpace commented on a diff in pull request #14758: ARROW-18119: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "westonpace (via GitHub)" <gi...@apache.org>.
westonpace commented on code in PR #14758:
URL: https://github.com/apache/arrow/pull/14758#discussion_r1147137038


##########
cpp/src/arrow/util/align_util_test.cc:
##########
@@ -59,6 +63,18 @@ void CheckBitmapWordAlign(const uint8_t* data, int64_t bit_offset, int64_t lengt
   }
 }
 
+arrow::Result<std::shared_ptr<Array>> CreateUnalignedArray(std::shared_ptr<Array> array) {
+  BufferVector sliced_buffers(array->data()->buffers.size(), nullptr);
+  for (std::size_t i = 0; i < array->data()->buffers.size(); ++i) {
+    if (array->data()->buffers[i]) {
+      sliced_buffers[i] = array->data()->buffers[i]->CopySlice(0, 2).MoveValueUnsafe();
+    }
+  }
+  auto sliced_array_data =
+      ArrayData::Make(array->type(), array->length(), std::move(sliced_buffers));

Review Comment:
   The length should be smaller here correct?  Since your arrays are all `uint8` you should decrease the length to match the slice.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] sanjibansg commented on pull request #14758: GH-33317: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "sanjibansg (via GitHub)" <gi...@apache.org>.
sanjibansg commented on PR #14758:
URL: https://github.com/apache/arrow/pull/14758#issuecomment-1484774769

   > @sanjibansg It seems that this broke our nightly CI. Could you check this?
   > 
   > https://github.com/ursacomputing/crossbow/actions/runs/4518357593/jobs/7958180632#step:6:1375
   > 
   > ```
   > [ 34%] Building CXX object src/arrow/CMakeFiles/arrow_objlib.dir/util/align_util.cc.o
   > cd /root/rpmbuild/BUILD/apache-arrow-12.0.0.dev290/cpp/build/src/arrow && /usr/lib64/ccache/c++  -DARROW_EXPORTING -DARROW_HAVE_RUNTIME_AVX2 -DARROW_HAVE_RUNTIME_AVX512 -DARROW_HAVE_RUNTIME_BMI2 -DARROW_HAVE_RUNTIME_SSE4_2 -DARROW_HAVE_SSE4_2 -DARROW_HDFS -DARROW_WITH_BACKTRACE -DARROW_WITH_BROTLI -DARROW_WITH_BZ2 -DARROW_WITH_LZ4 -DARROW_WITH_RE2 -DARROW_WITH_SNAPPY -DARROW_WITH_TIMING_TESTS -DARROW_WITH_UTF8PROC -DARROW_WITH_ZLIB -DARROW_WITH_ZSTD -DURI_STATIC_BUILD -DUTF8PROC_STATIC -I/root/rpmbuild/BUILD/apache-arrow-12.0.0.dev290/cpp/build/src -I/root/rpmbuild/BUILD/apache-arrow-12.0.0.dev290/cpp/src -I/root/rpmbuild/BUILD/apache-arrow-12.0.0.dev290/cpp/src/generated -isystem /root/rpmbuild/BUILD/apache-arrow-12.0.0.dev290/cpp/thirdparty/flatbuffers/include -isystem /root/rpmbuild/BUILD/apache-arrow-12.0.0.dev290/cpp/thirdparty/hadoop/include -isystem /root/rpmbuild/BUILD/apache-arrow-12.0.0.dev290/cpp/build/boost_ep-prefix/src/boost_ep -isystem /root/rpmbuild/BUILD/apache-
 arrow-12.0.0.dev290/cpp/build/orc_ep-install/include -isystem /root/rpmbuild/BUILD/apache-arrow-12.0.0.dev290/cpp/build/zstd_ep-install/include -isystem /root/rpmbuild/BUILD/apache-arrow-12.0.0.dev290/cpp/build/protobuf_ep-install/include -isystem /root/rpmbuild/BUILD/apache-arrow-12.0.0.dev290/cpp/build/utf8proc_ep-install/include -isystem /root/rpmbuild/BUILD/apache-arrow-12.0.0.dev290/cpp/build/re2_ep-install/include -isystem /root/rpmbuild/BUILD/apache-arrow-12.0.0.dev290/cpp/build/xsimd_ep/src/xsimd_ep-install/include -isystem /root/rpmbuild/BUILD/apache-arrow-12.0.0.dev290/cpp/build/jemalloc_ep-prefix/src  -Wno-noexcept-type -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches    -m64 -mtune=generic -fdiagnostics-color=always  -Wall -fno-semantic-interposition -msse4.2  -DNDEBUG -ftree-vectorize -fPIC   -pthread -std=c++1z -o CMakeFiles/arrow_objlib.dir/util/align_util.cc.o -c /root/rpmbuild/BUILD/apac
 he-arrow-12.0.0.dev290/cpp/src/arrow/util/align_util.cc
   > ...
   > /root/rpmbuild/BUILD/apache-arrow-12.0.0.dev290/cpp/src/arrow/util/align_util.cc: In function 'arrow::Result<std::shared_ptr<arrow::Buffer> > arrow::util::EnsureAlignment(std::shared_ptr<arrow::Buffer>, int64_t, arrow::MemoryPool*)':
   > /root/rpmbuild/BUILD/apache-arrow-12.0.0.dev290/cpp/src/arrow/util/align_util.cc:104:12: error: could not convert 'new_buffer' from 'std::unique_ptr<arrow::Buffer>' to 'arrow::Result<std::shared_ptr<arrow::Buffer> >'
   >      return new_buffer;
   >             ^~~~~~~~~~
   > make[2]: *** [src/arrow/CMakeFiles/arrow_objlib.dir/util/align_util.cc.o] Error 1
   > ```
   
   Thanks for letting me know. I will make a fix.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] sanjibansg commented on a diff in pull request #14758: ARROW-18119: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "sanjibansg (via GitHub)" <gi...@apache.org>.
sanjibansg commented on code in PR #14758:
URL: https://github.com/apache/arrow/pull/14758#discussion_r1147846089


##########
cpp/src/arrow/util/align_util_test.cc:
##########
@@ -59,6 +63,18 @@ void CheckBitmapWordAlign(const uint8_t* data, int64_t bit_offset, int64_t lengt
   }
 }
 
+arrow::Result<std::shared_ptr<Array>> CreateUnalignedArray(const Array& array) {
+  BufferVector sliced_buffers(array.data()->buffers.size(), nullptr);

Review Comment:
   made the change, thanks for the suggestion.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] sanjibansg commented on a diff in pull request #14758: ARROW-18119: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "sanjibansg (via GitHub)" <gi...@apache.org>.
sanjibansg commented on code in PR #14758:
URL: https://github.com/apache/arrow/pull/14758#discussion_r1140060609


##########
cpp/src/arrow/util/align_util.h:
##########
@@ -64,5 +66,23 @@ inline BitmapWordAlignParams BitmapWordAlign(const uint8_t* data, int64_t bit_of
   return p;
 }
 
+template <typename T>
+Status EnsureAlignment(std::shared_ptr<T> object, int64_t alignment,

Review Comment:
   Now checking the children as well



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] sanjibansg commented on a diff in pull request #14758: ARROW-18119: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "sanjibansg (via GitHub)" <gi...@apache.org>.
sanjibansg commented on code in PR #14758:
URL: https://github.com/apache/arrow/pull/14758#discussion_r1140061314


##########
cpp/src/arrow/util/align_util.cc:
##########
@@ -26,56 +26,63 @@ namespace arrow {
 
 namespace util {
 
-Result<std::shared_ptr<Array>> EnsureAlignment(const Array& object, int64_t alignment,
+Result<std::shared_ptr<Buffer>> EnsureAlignment(const std::shared_ptr<Buffer>& object, int64_t alignment,
                                                MemoryPool* memory_pool) {
-  std::vector<std::shared_ptr<Buffer>> buffers_ = object.data()->buffers;
+    auto buffer_address = object->address();
+    if ((buffer_address % alignment) != 0) {
+        ARROW_ASSIGN_OR_RAISE(
+                auto new_buffer, AllocateBuffer(object->size(), alignment, memory_pool));
+        std::memcpy(new_buffer->mutable_data(), object->data(), object->size());
+        return new_buffer;
+    } else {
+        return object;
+    }
+}
+
+Result<std::shared_ptr<Array>> EnsureAlignment(const std::shared_ptr<Array>& object, int64_t alignment,
+                                               MemoryPool* memory_pool) {
+  std::vector<std::shared_ptr<Buffer>> buffers_ = object->data()->buffers;
   for (size_t i = 0; i < buffers_.size(); ++i) {
     if (buffers_[i]) {
-      auto buffer_address = buffers_[i]->address();
-      if ((buffer_address % alignment) != 0) {
-        ARROW_ASSIGN_OR_RAISE(
-            auto new_buffer, AllocateBuffer(buffers_[i]->size(), alignment, memory_pool));
-        std::memcpy(new_buffer->mutable_data(), buffers_[i]->data(), buffers_[i]->size());
-        buffers_[i] = std::move(new_buffer);
+        ARROW_ASSIGN_OR_RAISE(buffers_[i], EnsureAlignment(buffers_[i], alignment, memory_pool));
       }
-    }
   }
   auto new_array_data =
-      ArrayData::Make(object.data()->type, object.data()->length, std::move(buffers_),
-                      object.data()->GetNullCount(), object.data()->offset);
+      ArrayData::Make(object->data()->type, object->data()->length, std::move(buffers_),

Review Comment:
   Handling children and dictionary.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] sanjibansg commented on a diff in pull request #14758: ARROW-18119: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "sanjibansg (via GitHub)" <gi...@apache.org>.
sanjibansg commented on code in PR #14758:
URL: https://github.com/apache/arrow/pull/14758#discussion_r1140061585


##########
cpp/src/arrow/util/align_util.cc:
##########
@@ -26,56 +26,63 @@ namespace arrow {
 
 namespace util {
 
-Result<std::shared_ptr<Array>> EnsureAlignment(const Array& object, int64_t alignment,
+Result<std::shared_ptr<Buffer>> EnsureAlignment(const std::shared_ptr<Buffer>& object, int64_t alignment,
                                                MemoryPool* memory_pool) {
-  std::vector<std::shared_ptr<Buffer>> buffers_ = object.data()->buffers;
+    auto buffer_address = object->address();
+    if ((buffer_address % alignment) != 0) {
+        ARROW_ASSIGN_OR_RAISE(
+                auto new_buffer, AllocateBuffer(object->size(), alignment, memory_pool));
+        std::memcpy(new_buffer->mutable_data(), object->data(), object->size());
+        return new_buffer;
+    } else {
+        return object;

Review Comment:
   Made the change, thanks!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] sanjibansg commented on a diff in pull request #14758: ARROW-18119: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "sanjibansg (via GitHub)" <gi...@apache.org>.
sanjibansg commented on code in PR #14758:
URL: https://github.com/apache/arrow/pull/14758#discussion_r1147846513


##########
cpp/src/arrow/util/align_util_test.cc:
##########
@@ -59,6 +63,18 @@ void CheckBitmapWordAlign(const uint8_t* data, int64_t bit_offset, int64_t lengt
   }
 }
 
+arrow::Result<std::shared_ptr<Array>> CreateUnalignedArray(const Array& array) {
+  BufferVector sliced_buffers(array.data()->buffers.size(), nullptr);
+  for (std::size_t i = 0; i < array.data()->buffers.size(); ++i) {
+    if (array.data()->buffers[i]) {
+      sliced_buffers[i] = SliceBuffer(array.data()->buffers[i], 1, 2);
+    }
+  }
+  auto sliced_array_data =
+      ArrayData::Make(array.type(), array.length(), std::move(sliced_buffers));

Review Comment:
   Ah, okay, understood. Thanks for letting me know. I have made the change.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] westonpace commented on a diff in pull request #14758: ARROW-18119: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "westonpace (via GitHub)" <gi...@apache.org>.
westonpace commented on code in PR #14758:
URL: https://github.com/apache/arrow/pull/14758#discussion_r1147135735


##########
cpp/src/arrow/util/align_util_test.cc:
##########
@@ -59,6 +63,18 @@ void CheckBitmapWordAlign(const uint8_t* data, int64_t bit_offset, int64_t lengt
   }
 }
 
+arrow::Result<std::shared_ptr<Array>> CreateUnalignedArray(std::shared_ptr<Array> array) {
+  BufferVector sliced_buffers(array->data()->buffers.size(), nullptr);
+  for (std::size_t i = 0; i < array->data()->buffers.size(); ++i) {
+    if (array->data()->buffers[i]) {
+      sliced_buffers[i] = array->data()->buffers[i]->CopySlice(0, 2).MoveValueUnsafe();

Review Comment:
   Sorry, this comment was meant to accompany the review but it seems Github swallowed it.  `CopySafe` allocates a new buffer and copies the slice into this new buffer.  The newly allocated buffer will have at least 64 byte alignment but, depending on the system allocator (or just bad luck), this alignment could be greater.
   
   Instead you should use [`SliceBuffer`](https://github.com/apache/arrow/blob/main/cpp/src/arrow/buffer.h#L322).  However, `SliceBuffer` returns a view only (this is inevitable here) so you need to make sure to keep the original array alive.  To do that change this method to accept `const Array&` instead of `std::shared_ptr<Array>`.  Then you should be ok.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kou commented on pull request #14758: GH-33317: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on PR #14758:
URL: https://github.com/apache/arrow/pull/14758#issuecomment-1483915977

   @sanjibansg It seems that this breaks our nightly CI. Could you check this?
   
   https://github.com/ursacomputing/crossbow/actions/runs/4518357593/jobs/7958180632#step:6:1378
   
   ```text
   [ 34%] Building CXX object src/arrow/CMakeFiles/arrow_objlib.dir/util/async_util.cc.o
   cd /root/rpmbuild/BUILD/apache-arrow-12.0.0.dev290/cpp/build/src/arrow && /usr/lib64/ccache/c++  -DARROW_EXPORTING -DARROW_HAVE_RUNTIME_AVX2 -DARROW_HAVE_RUNTIME_AVX512 -DARROW_HAVE_RUNTIME_BMI2 -DARROW_HAVE_RUNTIME_SSE4_2 -DARROW_HAVE_SSE4_2 -DARROW_HDFS -DARROW_WITH_BACKTRACE -DARROW_WITH_BROTLI -DARROW_WITH_BZ2 -DARROW_WITH_LZ4 -DARROW_WITH_RE2 -DARROW_WITH_SNAPPY -DARROW_WITH_TIMING_TESTS -DARROW_WITH_UTF8PROC -DARROW_WITH_ZLIB -DARROW_WITH_ZSTD -DURI_STATIC_BUILD -DUTF8PROC_STATIC -I/root/rpmbuild/BUILD/apache-arrow-12.0.0.dev290/cpp/build/src -I/root/rpmbuild/BUILD/apache-arrow-12.0.0.dev290/cpp/src -I/root/rpmbuild/BUILD/apache-arrow-12.0.0.dev290/cpp/src/generated -isystem /root/rpmbuild/BUILD/apache-arrow-12.0.0.dev290/cpp/thirdparty/flatbuffers/include -isystem /root/rpmbuild/BUILD/apache-arrow-12.0.0.dev290/cpp/thirdparty/hadoop/include -isystem /root/rpmbuild/BUILD/apache-arrow-12.0.0.dev290/cpp/build/boost_ep-prefix/src/boost_ep -isystem /root/rpmbuild/BUILD/apache-ar
 row-12.0.0.dev290/cpp/build/orc_ep-install/include -isystem /root/rpmbuild/BUILD/apache-arrow-12.0.0.dev290/cpp/build/zstd_ep-install/include -isystem /root/rpmbuild/BUILD/apache-arrow-12.0.0.dev290/cpp/build/protobuf_ep-install/include -isystem /root/rpmbuild/BUILD/apache-arrow-12.0.0.dev290/cpp/build/utf8proc_ep-install/include -isystem /root/rpmbuild/BUILD/apache-arrow-12.0.0.dev290/cpp/build/re2_ep-install/include -isystem /root/rpmbuild/BUILD/apache-arrow-12.0.0.dev290/cpp/build/xsimd_ep/src/xsimd_ep-install/include -isystem /root/rpmbuild/BUILD/apache-arrow-12.0.0.dev290/cpp/build/jemalloc_ep-prefix/src  -Wno-noexcept-type -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches    -m64 -mtune=generic -fdiagnostics-color=always  -Wall -fno-semantic-interposition -msse4.2  -DNDEBUG -ftree-vectorize -fPIC   -pthread -std=c++1z -o CMakeFiles/arrow_objlib.dir/util/async_util.cc.o -c /root/rpmbuild/BUILD/apache
 -arrow-12.0.0.dev290/cpp/src/arrow/util/async_util.cc
   /root/rpmbuild/BUILD/apache-arrow-12.0.0.dev290/cpp/src/arrow/util/align_util.cc: In function 'arrow::Result<std::shared_ptr<arrow::Buffer> > arrow::util::EnsureAlignment(std::shared_ptr<arrow::Buffer>, int64_t, arrow::MemoryPool*)':
   /root/rpmbuild/BUILD/apache-arrow-12.0.0.dev290/cpp/src/arrow/util/align_util.cc:104:12: error: could not convert 'new_buffer' from 'std::unique_ptr<arrow::Buffer>' to 'arrow::Result<std::shared_ptr<arrow::Buffer> >'
        return new_buffer;
               ^~~~~~~~~~
   make[2]: *** [src/arrow/CMakeFiles/arrow_objlib.dir/util/align_util.cc.o] Error 1
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] ursabot commented on pull request #14758: GH-33317: [C++] Utility method to ensure an array object meetings an alignment requirement

Posted by "ursabot (via GitHub)" <gi...@apache.org>.
ursabot commented on PR #14758:
URL: https://github.com/apache/arrow/pull/14758#issuecomment-1483771180

   Benchmark runs are scheduled for baseline = f4680cd7eeb6f379da81300afc9bc5daae491ac6 and contender = 1b439b0768776499bc2ad9e3beeba6ca14eb19cd. 1b439b0768776499bc2ad9e3beeba6ca14eb19cd is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/3e0fe4717d874560b6cc7eec0382d047...69f170db3ac1485b8bce6bfac247a6d7/)
   [Failed :arrow_down:0.86% :arrow_up:0.03%] [test-mac-arm](https://conbench.ursa.dev/compare/runs/e9e85e16d79440ec9adaab135f28712f...b127bfc01bd849cfaaafa2725782e879/)
   [Finished :arrow_down:0.51% :arrow_up:0.0%] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/30fc0d555b7649589c28755401b3b2ee...4ffc544ed43f451ba01f4fd3b0c50c24/)
   [Finished :arrow_down:0.25% :arrow_up:0.65%] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/136538dbe615418485a32ff4246250b4...b6c1a4934c0f4554b138de952a7fcb18/)
   Buildkite builds:
   [Finished] [`1b439b07` ec2-t3-xlarge-us-east-2](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/2576)
   [Finished] [`1b439b07` test-mac-arm](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/2606)
   [Finished] [`1b439b07` ursa-i9-9960x](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/2574)
   [Finished] [`1b439b07` ursa-thinkcentre-m75q](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/2597)
   [Finished] [`f4680cd7` ec2-t3-xlarge-us-east-2](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/2575)
   [Failed] [`f4680cd7` test-mac-arm](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/2605)
   [Finished] [`f4680cd7` ursa-i9-9960x](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/2573)
   [Finished] [`f4680cd7` ursa-thinkcentre-m75q](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/2596)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org