You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/08/03 09:21:37 UTC

[GitHub] [arrow] tianchen92 opened a new pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

tianchen92 opened a new pull request #7887:
URL: https://github.com/apache/arrow/pull/7887


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou closed pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
pitrou closed pull request #7887:
URL: https://github.com/apache/arrow/pull/7887


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] tianchen92 commented on a change in pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
tianchen92 commented on a change in pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#discussion_r486768419



##########
File path: cpp/src/arrow/array/builder_base.h
##########
@@ -98,6 +98,9 @@ class ARROW_EXPORT ArrayBuilder {
   virtual Status AppendNull() = 0;
   virtual Status AppendNulls(int64_t length) = 0;
 
+  virtual Status AppendEmptyValue() = 0;

Review comment:
       I am ok, let's hold this on.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] tianchen92 commented on a change in pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
tianchen92 commented on a change in pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#discussion_r486737805



##########
File path: cpp/src/arrow/array/builder_base.h
##########
@@ -98,6 +98,9 @@ class ARROW_EXPORT ArrayBuilder {
   virtual Status AppendNull() = 0;
   virtual Status AppendNulls(int64_t length) = 0;
 
+  virtual Status AppendEmptyValue() = 0;

Review comment:
       I add documentation for this API. The change meets the [spec](http://arrow.apache.org/docs/format/Columnar.html#struct-layout)(There is no need for the child arrays to be null when the top-level struct slot is null) .
   The difference between append empty and append null value is different.

##########
File path: cpp/src/arrow/array/builder_base.h
##########
@@ -98,6 +98,9 @@ class ARROW_EXPORT ArrayBuilder {
   virtual Status AppendNull() = 0;
   virtual Status AppendNulls(int64_t length) = 0;
 
+  virtual Status AppendEmptyValue() = 0;

Review comment:
       I am ok, let's hold this on.

##########
File path: cpp/src/arrow/array/builder_base.h
##########
@@ -98,6 +98,9 @@ class ARROW_EXPORT ArrayBuilder {
   virtual Status AppendNull() = 0;
   virtual Status AppendNulls(int64_t length) = 0;
 
+  virtual Status AppendEmptyValue() = 0;

Review comment:
       I am ok, let's hold this on.

##########
File path: cpp/src/arrow/array/builder_base.h
##########
@@ -98,6 +98,9 @@ class ARROW_EXPORT ArrayBuilder {
   virtual Status AppendNull() = 0;
   virtual Status AppendNulls(int64_t length) = 0;
 
+  virtual Status AppendEmptyValue() = 0;

Review comment:
       I am ok, let's hold this on.

##########
File path: cpp/src/arrow/array/builder_base.h
##########
@@ -98,6 +98,9 @@ class ARROW_EXPORT ArrayBuilder {
   virtual Status AppendNull() = 0;
   virtual Status AppendNulls(int64_t length) = 0;
 
+  virtual Status AppendEmptyValue() = 0;

Review comment:
       I am ok, let's hold this on.

##########
File path: cpp/src/arrow/array/builder_base.h
##########
@@ -98,6 +98,9 @@ class ARROW_EXPORT ArrayBuilder {
   virtual Status AppendNull() = 0;
   virtual Status AppendNulls(int64_t length) = 0;
 
+  virtual Status AppendEmptyValue() = 0;

Review comment:
       I am ok, let's hold this on.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] tianchen92 commented on a change in pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
tianchen92 commented on a change in pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#discussion_r486737805



##########
File path: cpp/src/arrow/array/builder_base.h
##########
@@ -98,6 +98,9 @@ class ARROW_EXPORT ArrayBuilder {
   virtual Status AppendNull() = 0;
   virtual Status AppendNulls(int64_t length) = 0;
 
+  virtual Status AppendEmptyValue() = 0;

Review comment:
       I add documentation for this API. The change meets the [spec](http://arrow.apache.org/docs/format/Columnar.html#struct-layout)(There is no need for the child arrays to be null when the top-level struct slot is null) .
   The difference between append empty and append null value is different.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#issuecomment-714499708


   Working on this.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] tianchen92 commented on a change in pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
tianchen92 commented on a change in pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#discussion_r484178023



##########
File path: cpp/src/arrow/array/builder_nested.h
##########
@@ -109,6 +109,25 @@ class BaseListBuilder : public ArrayBuilder {
     return Status::OK();
   }
 
+  Status AppendEmpty() final {
+    null_bitmap_builder_.Forward(1);

Review comment:
       Done

##########
File path: cpp/src/arrow/array/builder_nested.h
##########
@@ -109,6 +109,25 @@ class BaseListBuilder : public ArrayBuilder {
     return Status::OK();
   }
 
+  Status AppendEmpty() final {
+    null_bitmap_builder_.Forward(1);
+    ++length_;
+    ++null_count_;
+    return AppendNextOffset();
+  }
+
+  Status AppendEmpties(int64_t length) final {
+    ARROW_RETURN_NOT_OK(CheckNextOffset());

Review comment:
       Done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#discussion_r478514839



##########
File path: cpp/src/arrow/pretty_print.cc
##########
@@ -339,6 +339,17 @@ class ArrayPrinter : public PrettyPrinter {
     children.reserve(array.num_fields());
     for (int i = 0; i < array.num_fields(); ++i) {
       children.emplace_back(array.field(i));
+      // set field array null at index which parent struct is null,
+      // because StructBuilder#AppendNull dosen't set child null bitmap.
+      auto field_null_buf = array.field(i)->null_bitmap();
+      if (field_null_buf != nullptr) {
+        uint8_t* field_null_data = field_null_buf->mutable_data();

Review comment:
       Why exactly are you mutating a null bitmap in a pretty printer?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#issuecomment-682032875


   As for naming, according to various dictionaries, "empty" as a noun (and "empties") is rather informal.
   It seems `AppendEmptyValue` and `AppendEmptyValues` may be more appropriate (and more explicit), but I'm not a native English speaker.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] tianchen92 commented on a change in pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
tianchen92 commented on a change in pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#discussion_r471247138



##########
File path: cpp/src/arrow/array/builder_base.h
##########
@@ -98,6 +98,9 @@ class ARROW_EXPORT ArrayBuilder {
   virtual Status AppendNull() = 0;
   virtual Status AppendNulls(int64_t length) = 0;
 
+  virtual Status AppendEmpty() { return AppendNull(); }

Review comment:
       done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] tianchen92 commented on a change in pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
tianchen92 commented on a change in pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#discussion_r486768419



##########
File path: cpp/src/arrow/array/builder_base.h
##########
@@ -98,6 +98,9 @@ class ARROW_EXPORT ArrayBuilder {
   virtual Status AppendNull() = 0;
   virtual Status AppendNulls(int64_t length) = 0;
 
+  virtual Status AppendEmptyValue() = 0;

Review comment:
       I am ok, let's hold this on.

##########
File path: cpp/src/arrow/array/builder_base.h
##########
@@ -98,6 +98,9 @@ class ARROW_EXPORT ArrayBuilder {
   virtual Status AppendNull() = 0;
   virtual Status AppendNulls(int64_t length) = 0;
 
+  virtual Status AppendEmptyValue() = 0;

Review comment:
       I am ok, let's hold this on.

##########
File path: cpp/src/arrow/array/builder_base.h
##########
@@ -98,6 +98,9 @@ class ARROW_EXPORT ArrayBuilder {
   virtual Status AppendNull() = 0;
   virtual Status AppendNulls(int64_t length) = 0;
 
+  virtual Status AppendEmptyValue() = 0;

Review comment:
       I am ok, let's hold this on.

##########
File path: cpp/src/arrow/array/builder_base.h
##########
@@ -98,6 +98,9 @@ class ARROW_EXPORT ArrayBuilder {
   virtual Status AppendNull() = 0;
   virtual Status AppendNulls(int64_t length) = 0;
 
+  virtual Status AppendEmptyValue() = 0;

Review comment:
       I am ok, let's hold this on.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#discussion_r478516181



##########
File path: cpp/src/arrow/array/builder_nested.h
##########
@@ -109,6 +109,25 @@ class BaseListBuilder : public ArrayBuilder {
     return Status::OK();
   }
 
+  Status AppendEmpty() final {
+    null_bitmap_builder_.Forward(1);

Review comment:
       Shouldn't you `Reserve(1)`?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] emkornfield commented on a change in pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
emkornfield commented on a change in pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#discussion_r498616557



##########
File path: cpp/src/arrow/array/builder_base.h
##########
@@ -98,6 +98,9 @@ class ARROW_EXPORT ArrayBuilder {
   virtual Status AppendNull() = 0;
   virtual Status AppendNulls(int64_t length) = 0;
 
+  virtual Status AppendEmptyValue() = 0;

Review comment:
       the fix for arrow 9603 has been merged, could we try reverting change for parquet and seeing what passes/fails.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] bkietz commented on a change in pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
bkietz commented on a change in pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#discussion_r490318261



##########
File path: cpp/src/arrow/json/parser_test.cc
##########
@@ -188,7 +189,7 @@ TEST(BlockParserWithSchema, Nested) {
                       field("nuf", struct_({field("ps", utf8())}))},
                      {"[\"thing\", null, \"\xe5\xbf\x8d\", null]",
                       R"([["1", "2", "3"], ["2"], [], null])",
-                      R"([{"ps":null}, null, {"ps":"78"}, {"ps":"90"}])"});
+                      R"([{"ps":null}, {}, {"ps":"78"}, {"ps":"90"}])"});

Review comment:
       This seems fine




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#issuecomment-689569522


   @tianchen92 The Parquet failure is probably a bug in the nested Parquet reader. I've pushed changes to your branch, be careful to pick them up.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] tianchen92 commented on a change in pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
tianchen92 commented on a change in pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#discussion_r484177957



##########
File path: cpp/src/arrow/pretty_print.cc
##########
@@ -339,6 +339,17 @@ class ArrayPrinter : public PrettyPrinter {
     children.reserve(array.num_fields());
     for (int i = 0; i < array.num_fields(); ++i) {
       children.emplace_back(array.field(i));
+      // set field array null at index which parent struct is null,
+      // because StructBuilder#AppendNull dosen't set child null bitmap.
+      auto field_null_buf = array.field(i)->null_bitmap();
+      if (field_null_buf != nullptr) {
+        uint8_t* field_null_data = field_null_buf->mutable_data();

Review comment:
       Thanks for review, I have some questions, correct me if I am wrong:
   1 now seems AppendNull/AppendNulls set null bitmap and append empty value using data_builder_, if AppendEmpty/AppendEmpties also do this, what is the difference between this two API?
   2 From http://arrow.apache.org/docs/format/Columnar.html#struct-layout that there is no need for the child arrays to be null when the top-level struct slot is null, so I thought only 'empty' value should be appended (without setting null bitmap)。So in PrettyPrint, we need to change child null bitmap to ensure the correct output.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] tianchen92 commented on pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
tianchen92 commented on pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#issuecomment-688588498


   > As for naming, according to various dictionaries, "empty" as a noun (and "empties") is rather informal.
   > It seems `AppendEmptyValue` and `AppendEmptyValues` may be more appropriate (and more explicit), but I'm not a native English speaker.
   
   Fine, fixed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] tianchen92 commented on pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
tianchen92 commented on pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#issuecomment-688620320


   Hi, @emkornfield , please also take a look at this patch when you have time.
   This patch add AppendEmptyValue/AppendEmptyValues used in StructBuilder#AppendNull/AppendNulls, which would set child null bitmap slot to true rather than false( http://arrow.apache.org/docs/format/Columnar.html#struct-layout).
   And now this test ArrowReaderWriter#SingleColumnNullableStruct failed, could you please help fix this since you're export in parquet :) Thanks very much!  error message as below:
   
   >>Failed
   Unequal at absolute position 2
   @@ -1, +1 @@
   -{Backward: 0}
   +{Backward: 10}
   Expected:
     -- is_valid:
         [
         false,
         true
       ]
     -- child 0 type: int64
       [
         0,
         10
       ]
   Actual:
     -- is_valid:
         [
         false,
         true
       ]
     -- child 0 type: int64
       [
         null,
         0
       ]
   [  FAILED  ] ArrowReadWrite.SingleColumnNullableStruct (7 ms)
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#discussion_r470062976



##########
File path: cpp/src/arrow/array/builder_base.h
##########
@@ -98,6 +98,9 @@ class ARROW_EXPORT ArrayBuilder {
   virtual Status AppendNull() = 0;
   virtual Status AppendNulls(int64_t length) = 0;
 
+  virtual Status AppendEmpty() { return AppendNull(); }

Review comment:
       Well, we'd like AppendEmpty to precisely append a (arbitrary but defined) non-null value, so this is not correct.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] emkornfield commented on a change in pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
emkornfield commented on a change in pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#discussion_r486759286



##########
File path: cpp/src/arrow/array/builder_base.h
##########
@@ -98,6 +98,9 @@ class ARROW_EXPORT ArrayBuilder {
   virtual Status AppendNull() = 0;
   virtual Status AppendNulls(int64_t length) = 0;
 
+  virtual Status AppendEmptyValue() = 0;

Review comment:
       Yes, the change meets the spec but potentially breaks parquet writing for use-cases going through th builder per https://issues.apache.org/jira/browse/ARROW-9603?filter=-2 (which relies on the behavior of forcing nulls).  I would prefer not to change this behavior before fixing the referenced bug.  

##########
File path: cpp/src/arrow/array/builder_base.h
##########
@@ -98,6 +98,9 @@ class ARROW_EXPORT ArrayBuilder {
   virtual Status AppendNull() = 0;
   virtual Status AppendNulls(int64_t length) = 0;
 
+  virtual Status AppendEmptyValue() = 0;

Review comment:
       Yes, the change meets the spec but potentially breaks parquet writing for use-cases going through th builder per https://issues.apache.org/jira/browse/ARROW-9603?filter=-2 (which relies on the behavior of forcing nulls).  I would prefer not to change this behavior before fixing the referenced bug.  




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] tianchen92 commented on a change in pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
tianchen92 commented on a change in pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#discussion_r484178152



##########
File path: cpp/src/arrow/buffer_builder.h
##########
@@ -292,6 +292,11 @@ class TypedBufferBuilder<bool> {
     return Status::OK();
   }
 
+  void Forward(int64_t num_elements) {

Review comment:
       See comments above~




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] emkornfield commented on pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
emkornfield commented on pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#issuecomment-714212972


   @pitrou any objections to merging now (I would expect this might conflict with Decimal256 PR).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#discussion_r485588382



##########
File path: cpp/src/arrow/array/builder_dict.h
##########
@@ -233,6 +233,20 @@ class DictionaryBuilderBase : public ArrayBuilder {
     return indices_builder_.AppendNulls(length);
   }
 
+  Status AppendEmptyValue() final {
+    length_ += 1;
+    null_count_ += 1;
+
+    return indices_builder_.AppendEmptyValue();
+  }
+
+  Status AppendEmptyValues(int64_t length) final {
+    length_ += length;
+    null_count_ += length;

Review comment:
       Same here.

##########
File path: cpp/src/arrow/array/builder_dict.h
##########
@@ -233,6 +233,20 @@ class DictionaryBuilderBase : public ArrayBuilder {
     return indices_builder_.AppendNulls(length);
   }
 
+  Status AppendEmptyValue() final {
+    length_ += 1;
+    null_count_ += 1;

Review comment:
       Hmm, I don't think we should increase the null count here, or is it necessary?

##########
File path: cpp/src/arrow/array/builder_nested.h
##########
@@ -395,19 +416,35 @@ class ARROW_EXPORT StructBuilder : public ArrayBuilder {
     return Status::OK();
   }
 
-  /// \brief Append a null value. Automatically appends a null to each child
+  /// \brief Append a null value. Automatically appends a empty to each child
   /// builder.
   Status AppendNull() final {
     for (const auto& field : children_) {
-      ARROW_RETURN_NOT_OK(field->AppendNull());
+      ARROW_RETURN_NOT_OK(field->AppendEmptyValue());
     }
     return Append(false);
   }
 
-  /// \brief Append multiple null values. Automatically appends nulls to each
+  /// \brief Append multiple null values. Automatically appends empties to each

Review comment:
       "empty values"

##########
File path: cpp/src/arrow/array/builder_primitive.h
##########
@@ -43,6 +43,14 @@ class ARROW_EXPORT NullBuilder : public ArrayBuilder {
   /// \brief Append a single null element
   Status AppendNull() final { return AppendNulls(1); }
 
+  Status AppendEmptyValues(int64_t length) final {
+    if (length < 0) return Status::Invalid("length must be positive");
+    length_ += length;

Review comment:
       For the null type, `AppendEmptyValues` should simply be the same as `AppendNulls` (the only possible value is null).

##########
File path: cpp/src/arrow/array/builder_dict.h
##########
@@ -409,6 +423,20 @@ class DictionaryBuilderBase<BuilderType, NullType> : public ArrayBuilder {
     return indices_builder_.AppendNulls(length);
   }
 
+  Status AppendEmptyValue() final {
+    length_ += 1;
+    null_count_ += 1;

Review comment:
       Same here and below.

##########
File path: cpp/src/arrow/array/builder_adaptive.h
##########
@@ -64,6 +64,16 @@ class ARROW_EXPORT AdaptiveIntBuilderBase : public ArrayBuilder {
     return Status::OK();
   }
 
+  Status AppendEmptyValues(int64_t length) final {
+    ARROW_RETURN_NOT_OK(CommitPendingData());
+    ARROW_RETURN_NOT_OK(Reserve(length));
+    memset(data_->mutable_data() + length_ * int_size_, 0, int_size_ * length);
+    UnsafeSetNotNull(length);
+    return Status::OK();
+  }
+
+  Status AppendEmptyValue() final { return AppendEmptyValues(1); }

Review comment:
       This could actually be implemented exactly like `AppendNull()` above, except for the nulls part :-)

##########
File path: cpp/src/arrow/json/parser_test.cc
##########
@@ -188,7 +189,7 @@ TEST(BlockParserWithSchema, Nested) {
                       field("nuf", struct_({field("ps", utf8())}))},
                      {"[\"thing\", null, \"\xe5\xbf\x8d\", null]",
                       R"([["1", "2", "3"], ["2"], [], null])",
-                      R"([{"ps":null}, null, {"ps":"78"}, {"ps":"90"}])"});
+                      R"([{"ps":null}, {}, {"ps":"78"}, {"ps":"90"}])"});

Review comment:
       Hmm... @bkietz Can you validate this change is ok? It seems it only affects internal implementation details.

##########
File path: cpp/src/arrow/array/builder_nested.h
##########
@@ -395,19 +416,35 @@ class ARROW_EXPORT StructBuilder : public ArrayBuilder {
     return Status::OK();
   }
 
-  /// \brief Append a null value. Automatically appends a null to each child
+  /// \brief Append a null value. Automatically appends a empty to each child

Review comment:
       "an empty value" would be better.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nealrichardson commented on pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
nealrichardson commented on pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#issuecomment-704441085


   @tianchen92 do you want to pick this up again? the bug that @emkornfield mentioned has been fixed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] tianchen92 commented on a change in pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
tianchen92 commented on a change in pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#discussion_r484621822



##########
File path: cpp/src/arrow/pretty_print.cc
##########
@@ -339,6 +339,17 @@ class ArrayPrinter : public PrettyPrinter {
     children.reserve(array.num_fields());
     for (int i = 0; i < array.num_fields(); ++i) {
       children.emplace_back(array.field(i));
+      // set field array null at index which parent struct is null,
+      // because StructBuilder#AppendNull dosen't set child null bitmap.
+      auto field_null_buf = array.field(i)->null_bitmap();
+      if (field_null_buf != nullptr) {
+        uint8_t* field_null_data = field_null_buf->mutable_data();

Review comment:
       > The difference should be this:
   > 
   > * `AppendNull` appends a null entry in the struct
   > * `AppendEmptyValue` appends a non-null entry in the struct
   
   Thanks for your clarification, agree with you and I have updated this PR, please take a look, thanks!




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#discussion_r484347796



##########
File path: cpp/src/arrow/pretty_print.cc
##########
@@ -339,6 +339,17 @@ class ArrayPrinter : public PrettyPrinter {
     children.reserve(array.num_fields());
     for (int i = 0; i < array.num_fields(); ++i) {
       children.emplace_back(array.field(i));
+      // set field array null at index which parent struct is null,
+      // because StructBuilder#AppendNull dosen't set child null bitmap.
+      auto field_null_buf = array.field(i)->null_bitmap();
+      if (field_null_buf != nullptr) {
+        uint8_t* field_null_data = field_null_buf->mutable_data();

Review comment:
       The difference should be this:
   * `AppendNull` appends a null entry in the struct
   * `AppendEmptyValue` appends a non-null entry in the struct




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#discussion_r484348159



##########
File path: cpp/src/arrow/pretty_print.cc
##########
@@ -339,6 +339,17 @@ class ArrayPrinter : public PrettyPrinter {
     children.reserve(array.num_fields());
     for (int i = 0; i < array.num_fields(); ++i) {
       children.emplace_back(array.field(i));
+      // set field array null at index which parent struct is null,
+      // because StructBuilder#AppendNull dosen't set child null bitmap.
+      auto field_null_buf = array.field(i)->null_bitmap();
+      if (field_null_buf != nullptr) {
+        uint8_t* field_null_data = field_null_buf->mutable_data();

Review comment:
       > So in PrettyPrint, we need to change child null bitmap to ensure the correct output
   
   We certainly must not *change* the null bitmap. PrettyPrint just shows the child arrays as they are, it does not need to be changed here (such arrays can already be produced otherwise).




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] emkornfield commented on a change in pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
emkornfield commented on a change in pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#discussion_r486028649



##########
File path: cpp/src/arrow/array/builder_base.h
##########
@@ -98,6 +98,9 @@ class ARROW_EXPORT ArrayBuilder {
   virtual Status AppendNull() = 0;
   virtual Status AppendNulls(int64_t length) = 0;
 
+  virtual Status AppendEmptyValue() = 0;

Review comment:
       these could really use some documentation it isn't entirely clear to me there purpose, and why they are separate from append null and what functionality ends up changing by call this in structbuilder.AppendNull




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#discussion_r470062219



##########
File path: cpp/src/arrow/buffer_builder.h
##########
@@ -292,6 +292,11 @@ class TypedBufferBuilder<bool> {
     return Status::OK();
   }
 
+  void AppendEmpty(int64_t num_elements) {

Review comment:
       We must not leave elements undefined in an array, because it can reveal previous data. "Empty" values here should still be non-null values of arbitrary value (but still well-defined).




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] tianchen92 commented on a change in pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
tianchen92 commented on a change in pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#discussion_r471248061



##########
File path: cpp/src/arrow/buffer_builder.h
##########
@@ -292,6 +292,11 @@ class TypedBufferBuilder<bool> {
     return Status::OK();
   }
 
+  void AppendEmpty(int64_t num_elements) {

Review comment:
       Hmm,here is not "AppendEmpty" API, just move null_bit_map pointer to avoid set child arrays null when the top-level struct slot is null. I renamed to "Forward" here.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] emkornfield commented on a change in pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
emkornfield commented on a change in pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#discussion_r486759286



##########
File path: cpp/src/arrow/array/builder_base.h
##########
@@ -98,6 +98,9 @@ class ARROW_EXPORT ArrayBuilder {
   virtual Status AppendNull() = 0;
   virtual Status AppendNulls(int64_t length) = 0;
 
+  virtual Status AppendEmptyValue() = 0;

Review comment:
       Yes, the change meets the spec but potentially breaks parquet writing for use-cases going through th builder per https://issues.apache.org/jira/browse/ARROW-9603?filter=-2 (which relies on the behavior of forcing nulls).  I would prefer not to change this behavior before fixing the referenced bug.  




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] tianchen92 commented on pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
tianchen92 commented on pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#issuecomment-706527313


   @emkornfield @nealrichardson @pitrou Sorry for the delay due to National Day Holiday, I reverted parquet related changes and rebased this PR. Seems tests all passed (one unrelated failing). Please help take a look again, thanks!
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] emkornfield commented on a change in pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
emkornfield commented on a change in pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#discussion_r486759286



##########
File path: cpp/src/arrow/array/builder_base.h
##########
@@ -98,6 +98,9 @@ class ARROW_EXPORT ArrayBuilder {
   virtual Status AppendNull() = 0;
   virtual Status AppendNulls(int64_t length) = 0;
 
+  virtual Status AppendEmptyValue() = 0;

Review comment:
       Yes, the change meets the spec but potentially breaks parquet writing for use-cases going through th builder per https://issues.apache.org/jira/browse/ARROW-9603?filter=-2 (which relies on the behavior of forcing nulls).  I would prefer not to change this behavior before fixing the referenced bug.  

##########
File path: cpp/src/arrow/array/builder_base.h
##########
@@ -98,6 +98,9 @@ class ARROW_EXPORT ArrayBuilder {
   virtual Status AppendNull() = 0;
   virtual Status AppendNulls(int64_t length) = 0;
 
+  virtual Status AppendEmptyValue() = 0;

Review comment:
       Yes, the change meets the spec but potentially breaks parquet writing for use-cases going through th builder per https://issues.apache.org/jira/browse/ARROW-9603?filter=-2 (which relies on the behavior of forcing nulls).  I would prefer not to change this behavior before fixing the referenced bug.  

##########
File path: cpp/src/arrow/array/builder_base.h
##########
@@ -98,6 +98,9 @@ class ARROW_EXPORT ArrayBuilder {
   virtual Status AppendNull() = 0;
   virtual Status AppendNulls(int64_t length) = 0;
 
+  virtual Status AppendEmptyValue() = 0;

Review comment:
       Yes, the change meets the spec but potentially breaks parquet writing for use-cases going through th builder per https://issues.apache.org/jira/browse/ARROW-9603?filter=-2 (which relies on the behavior of forcing nulls).  I would prefer not to change this behavior before fixing the referenced bug.  

##########
File path: cpp/src/arrow/array/builder_base.h
##########
@@ -98,6 +98,9 @@ class ARROW_EXPORT ArrayBuilder {
   virtual Status AppendNull() = 0;
   virtual Status AppendNulls(int64_t length) = 0;
 
+  virtual Status AppendEmptyValue() = 0;

Review comment:
       Yes, the change meets the spec but potentially breaks parquet writing for use-cases going through th builder per https://issues.apache.org/jira/browse/ARROW-9603?filter=-2 (which relies on the behavior of forcing nulls).  I would prefer not to change this behavior before fixing the referenced bug.  

##########
File path: cpp/src/arrow/array/builder_base.h
##########
@@ -98,6 +98,9 @@ class ARROW_EXPORT ArrayBuilder {
   virtual Status AppendNull() = 0;
   virtual Status AppendNulls(int64_t length) = 0;
 
+  virtual Status AppendEmptyValue() = 0;

Review comment:
       Yes, the change meets the spec but potentially breaks parquet writing for use-cases going through th builder per https://issues.apache.org/jira/browse/ARROW-9603?filter=-2 (which relies on the behavior of forcing nulls).  I would prefer not to change this behavior before fixing the referenced bug.  

##########
File path: cpp/src/arrow/array/builder_base.h
##########
@@ -98,6 +98,9 @@ class ARROW_EXPORT ArrayBuilder {
   virtual Status AppendNull() = 0;
   virtual Status AppendNulls(int64_t length) = 0;
 
+  virtual Status AppendEmptyValue() = 0;

Review comment:
       Yes, the change meets the spec but potentially breaks parquet writing for use-cases going through th builder per https://issues.apache.org/jira/browse/ARROW-9603?filter=-2 (which relies on the behavior of forcing nulls).  I would prefer not to change this behavior before fixing the referenced bug.  

##########
File path: cpp/src/arrow/array/builder_base.h
##########
@@ -98,6 +98,9 @@ class ARROW_EXPORT ArrayBuilder {
   virtual Status AppendNull() = 0;
   virtual Status AppendNulls(int64_t length) = 0;
 
+  virtual Status AppendEmptyValue() = 0;

Review comment:
       Yes, the change meets the spec but potentially breaks parquet writing for use-cases going through th builder per https://issues.apache.org/jira/browse/ARROW-9603?filter=-2 (which relies on the behavior of forcing nulls).  I would prefer not to change this behavior before fixing the referenced bug.  




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#issuecomment-667917796


   https://issues.apache.org/jira/browse/ARROW-9304


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] emkornfield commented on pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
emkornfield commented on pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#issuecomment-681356508


   @pitrou does this seem ok now?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] tianchen92 commented on a change in pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
tianchen92 commented on a change in pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#discussion_r486737805



##########
File path: cpp/src/arrow/array/builder_base.h
##########
@@ -98,6 +98,9 @@ class ARROW_EXPORT ArrayBuilder {
   virtual Status AppendNull() = 0;
   virtual Status AppendNulls(int64_t length) = 0;
 
+  virtual Status AppendEmptyValue() = 0;

Review comment:
       I add documentation for this API. The change meets the [spec](http://arrow.apache.org/docs/format/Columnar.html#struct-layout)(There is no need for the child arrays to be null when the top-level struct slot is null) .
   The difference between append empty and append null value is different.

##########
File path: cpp/src/arrow/array/builder_base.h
##########
@@ -98,6 +98,9 @@ class ARROW_EXPORT ArrayBuilder {
   virtual Status AppendNull() = 0;
   virtual Status AppendNulls(int64_t length) = 0;
 
+  virtual Status AppendEmptyValue() = 0;

Review comment:
       I am ok, let's hold this on.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] emkornfield commented on pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
emkornfield commented on pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#issuecomment-689935755


   @tianchen92 sorry I missed the tag.  Is it possible the failure could be related to https://issues.apache.org/jira/browse/ARROW-9603?filter=-2


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] tianchen92 commented on a change in pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
tianchen92 commented on a change in pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#discussion_r484621904



##########
File path: cpp/src/arrow/pretty_print.cc
##########
@@ -339,6 +339,17 @@ class ArrayPrinter : public PrettyPrinter {
     children.reserve(array.num_fields());
     for (int i = 0; i < array.num_fields(); ++i) {
       children.emplace_back(array.field(i));
+      // set field array null at index which parent struct is null,
+      // because StructBuilder#AppendNull dosen't set child null bitmap.
+      auto field_null_buf = array.field(i)->null_bitmap();
+      if (field_null_buf != nullptr) {
+        uint8_t* field_null_data = field_null_buf->mutable_data();

Review comment:
       > > So in PrettyPrint, we need to change child null bitmap to ensure the correct output
   > 
   > We certainly must not _change_ the null bitmap. PrettyPrint just shows the child arrays as they are, it does not need to be changed here (such arrays can already be produced otherwise).
   
   Ok, reverted.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] tianchen92 commented on pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
tianchen92 commented on pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#issuecomment-689938956


   @pitrou Thanks for your patch and careful review, I updated PR according to the comments above.
   @emkornfield doesn't matter :), pitrou helped disabled the failed test and now seems unblocked.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] tianchen92 commented on a change in pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
tianchen92 commented on a change in pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#discussion_r484622055



##########
File path: cpp/src/arrow/buffer_builder.h
##########
@@ -292,6 +292,11 @@ class TypedBufferBuilder<bool> {
     return Status::OK();
   }
 
+  void Forward(int64_t num_elements) {

Review comment:
       This method was removed.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#discussion_r478516412



##########
File path: cpp/src/arrow/array/builder_nested.h
##########
@@ -109,6 +109,25 @@ class BaseListBuilder : public ArrayBuilder {
     return Status::OK();
   }
 
+  Status AppendEmpty() final {
+    null_bitmap_builder_.Forward(1);
+    ++length_;
+    ++null_count_;
+    return AppendNextOffset();
+  }
+
+  Status AppendEmpties(int64_t length) final {
+    ARROW_RETURN_NOT_OK(CheckNextOffset());

Review comment:
       Shouldn't you `Reserve(length)`?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #7887:
URL: https://github.com/apache/arrow/pull/7887#discussion_r478519678



##########
File path: cpp/src/arrow/buffer_builder.h
##########
@@ -292,6 +292,11 @@ class TypedBufferBuilder<bool> {
     return Status::OK();
   }
 
+  void Forward(int64_t num_elements) {

Review comment:
       This will leave some entries undefined unless they're written manually elsewhere, which I don't think happens here.
   I don't think this method is needed, instead you should just `Append(true)` or `Append(length, true)`, IMHO.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org