You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/07/02 15:50:05 UTC

[GitHub] [arrow] pitrou commented on a change in pull request #7598: ARROW-9278: [C++][Python][DONOTMERGE] Remove validity bitmap from Union types, update IPC read/write and integration tests

pitrou commented on a change in pull request #7598:
URL: https://github.com/apache/arrow/pull/7598#discussion_r449095262



##########
File path: cpp/src/arrow/array/array_struct_test.cc
##########
@@ -256,16 +256,6 @@ TEST_F(TestStructBuilder, TestAppendNull) {
   ASSERT_OK(builder_->AppendNull());
   ASSERT_EQ(2, static_cast<int>(builder_->num_fields()));
 
-  ListBuilder* list_vb = checked_cast<ListBuilder*>(builder_->field_builder(0));

Review comment:
       Is there any reason you removed these lines? The test doesn't look union-related.

##########
File path: cpp/src/arrow/ipc/util.h
##########
@@ -19,6 +19,8 @@
 
 #include <cstdint>
 
+#include "arrow/type.h"

Review comment:
       Hmm... why?

##########
File path: cpp/src/arrow/array/array_nested.cc
##########
@@ -678,6 +680,10 @@ Result<std::shared_ptr<Array>> DenseUnionArray::Make(
     return Status::TypeError("UnionArray type_ids must be signed int8");
   }
 
+  if (type_ids.null_count() != 0) {
+    return Status::Invalid("Union type ids may not have nulls");
+  }
+
   if (value_offsets.null_count() != 0) {
     return Status::Invalid("Make does not allow NAs in value_offsets");

Review comment:
       Slightly OT, but shouldn't we normalize "NA" vs "null" in our error messages?

##########
File path: cpp/src/arrow/array/builder_nested.h
##########
@@ -395,8 +395,17 @@ class ARROW_EXPORT StructBuilder : public ArrayBuilder {
     return Status::OK();
   }
 
-  Status AppendNull() final { return Append(false); }
+  /// \brief Append a null value. Automatically appends a null to each child
+  /// builder.

Review comment:
       Do we really want this? It seems it would create null bitmaps in the children even if all nulls are at the top level.

##########
File path: cpp/src/arrow/array/builder_nested.h
##########
@@ -395,8 +395,17 @@ class ARROW_EXPORT StructBuilder : public ArrayBuilder {
     return Status::OK();
   }
 
-  Status AppendNull() final { return Append(false); }
+  /// \brief Append a null value. Automatically appends a null to each child
+  /// builder.

Review comment:
       We may want to add some ArrayBuilder methods `AppendEmpty` or `AppendZero`/`AppendZeros` to append arbitrary non-null empty/zero-like values.

##########
File path: cpp/src/arrow/array/array_base.h
##########
@@ -86,16 +86,17 @@ class ARROW_EXPORT Array {
   std::shared_ptr<DataType> type() const { return data_->type; }
   Type::type type_id() const { return data_->type->id(); }
 
-  /// Buffer for the null bitmap.
+  /// Buffer for the validity (null) bitmap, if any. Note that Union types
+  /// never have a null bitmap.
   ///
-  /// Note that for `null_count == 0`, this can be null.
-  /// This buffer does not account for any slice offset
+  /// Note that for `null_count == 0` or for null or union types, this will be

Review comment:
       Well, `null_count` would be 0 for a union array anyway, no?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org