You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/05/27 13:12:23 UTC

[GitHub] [arrow] rymurr opened a new pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

rymurr opened a new pull request #7290:
URL: https://github.com/apache/arrow/pull/7290


   This fixes the integration tests for Union Arrays.
   
   Both Dense and Sparse unions now work:
   
   * validity buffer added to Sparse Union
   * both Union types now track logical types rather than MinorType enum ordinal
   
   Dependent on #7289 for duplicate field names.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] wesm commented on pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
wesm commented on pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#issuecomment-648873558


   I sent an e-mail to dev@ -- let's discuss there


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nealrichardson commented on pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
nealrichardson commented on pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#issuecomment-647589883


   Assuming CI is still passing, is this good to merge? What is left to do, or who needs to approve?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jacques-n commented on pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
jacques-n commented on pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#issuecomment-648427018


   I'm really struggling with these changes. I don't understand why there is a validity buffer at the union level as well as at the cell level. I'm not sure what it even means that a a union value is valid but the inner value is not. I think there was an inconsistent understanding on my part when that change was proposed in the spec over what the Java implementation was doing. @wesm why would we have validity at both the top level and the inner level? Or is the C++ impl only at the top level and not the inner level (which means that this java code is incorrect since the inner vectors have their own validity vectors.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] emkornfield commented on pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
emkornfield commented on pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#issuecomment-643320990


   @jacques-n I think you had some concerns the last time sparse unions were made.  Do these changes raise the same issues?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm commented on a change in pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
lidavidm commented on a change in pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#discussion_r451749003



##########
File path: java/vector/src/main/codegen/templates/DenseUnionVector.java
##########
@@ -268,11 +270,11 @@ public long getDataBufferAddress() {
 
   @Override
   public long getValidityBufferAddress() {
-    return validityBuffer.memoryAddress();
+    return typeBuffer.memoryAddress();

Review comment:
       It seems `NullVector` throws `UnsupportedOperationException` in this case - should we be consistent with that?

##########
File path: java/vector/src/main/codegen/templates/DenseUnionVector.java
##########
@@ -268,11 +270,11 @@ public long getDataBufferAddress() {
 
   @Override
   public long getValidityBufferAddress() {
-    return validityBuffer.memoryAddress();
+    return typeBuffer.memoryAddress();
   }
 
   @Override
-  public ArrowBuf getValidityBuffer() { return validityBuffer; }
+  public ArrowBuf getValidityBuffer() { return typeBuffer; }

Review comment:
       Ditto here.

##########
File path: java/vector/src/main/codegen/templates/DenseUnionVector.java
##########
@@ -283,7 +285,7 @@ public long getValidityBufferAddress() {
   public ArrowBuf getDataBuffer() { throw new UnsupportedOperationException(); }
 
   public StructVector getStruct(byte typeId) {
-    StructVector structVector = (StructVector) childVectors[typeId];
+    StructVector structVector = typeId < 0 ? null : (StructVector) childVectors[typeId];

Review comment:
       Should negative type IDs throw an explicit error?

##########
File path: java/vector/src/main/codegen/templates/DenseUnionVector.java
##########
@@ -812,12 +757,17 @@ public int getValueCount() {
   }
 
   public boolean isNull(int index) {

Review comment:
       This is inconsistent with getNullCount; this vector has nullCount == 0 but can have isNull return true regardless. C++ seems to always return false when there is no validity buffer.

##########
File path: java/vector/src/main/codegen/templates/DenseUnionVector.java
##########
@@ -283,7 +285,7 @@ public long getValidityBufferAddress() {
   public ArrowBuf getDataBuffer() { throw new UnsupportedOperationException(); }
 
   public StructVector getStruct(byte typeId) {
-    StructVector structVector = (StructVector) childVectors[typeId];
+    StructVector structVector = typeId < 0 ? null : (StructVector) childVectors[typeId];

Review comment:
       Ah, negative IDs are used as a sentinel for not yet allocated...




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] wesm edited a comment on pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
wesm edited a comment on pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#issuecomment-648435911


   > @wesm why would we have validity at both the top level and the inner level?
   
   Well, the way the specification is written
   
   * _All_ nested types including union are composed from well-formed child arrays which may have null values. 
   * Additionally, all array types, including all nested array types, have their own validity bitmap
   * In the case of union would indicate that the type of the child is not known. This seems algebraically consistent to me. 
   
   We can decide to stipulate that union types never have non-valid values at the Union cell level, only at the child cell level. But then a union value cannot be "made null" by changing the validity bitmap of the Union. From a purely algebraic / design perspective it isn't great. From the get-go in the project I've been striving for algebraic consistency, e.g. enabling well-formed arrays to be composed to created composite types without alteration. Since Unions are one of the more seldom used part of the project we can decide to explicitly deviate from that, but we need to do it now if we're going to do that and not delay longer in reconciling the issue. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rymurr commented on a change in pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
rymurr commented on a change in pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#discussion_r441052320



##########
File path: java/vector/src/main/codegen/templates/UnionVector.java
##########
@@ -325,12 +361,45 @@ private void allocateTypeBuffer() {
     typeBuffer.setZero(0, typeBuffer.capacity());
   }
 
+  private void allocateValidityBuffer() {
+    validityBuffer = allocator.buffer(validityBufferAllocationSizeInBytes);
+    validityBuffer.readerIndex(0);
+    validityBuffer.setZero(0, validityBuffer.capacity());
+  }
+
   @Override
   public void reAlloc() {
     internalStruct.reAlloc();
+    reallocValidityBuffer();
     reallocTypeBuffer();
   }
 
+  private void reallocValidityBuffer() {
+    final long currentBufferCapacity = validityBuffer.capacity();
+    long newAllocationSize = currentBufferCapacity * 2;
+    if (newAllocationSize == 0) {
+      if (validityBufferAllocationSizeInBytes > 0) {
+        newAllocationSize = validityBufferAllocationSizeInBytes;
+      } else {
+        newAllocationSize = DataSizeRoundingUtil.divideBy8Ceil(BaseValueVector.INITIAL_VALUE_ALLOCATION) * 2;
+      }

Review comment:
       removed. Probably a copy/paste error




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rymurr commented on a change in pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
rymurr commented on a change in pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#discussion_r441045982



##########
File path: java/vector/src/main/codegen/templates/DenseUnionVector.java
##########
@@ -104,6 +105,7 @@
    * The index is the type id, and the value is the type field.
    */
   private Field[] typeFields = new Field[Byte.MAX_VALUE + 1];
+  private byte[] typeMapFields = new byte[Byte.MAX_VALUE + 1];

Review comment:
       done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rymurr commented on pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
rymurr commented on pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#issuecomment-656636105


   Just rebased to include the memory related changes. Avoid any surprise broken master builds. @liyafan82 think its ready to go?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rymurr commented on a change in pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
rymurr commented on a change in pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#discussion_r452103046



##########
File path: java/vector/src/main/codegen/templates/DenseUnionVector.java
##########
@@ -812,12 +757,17 @@ public int getValueCount() {
   }
 
   public boolean isNull(int index) {

Review comment:
       I set `isNull` to always be false and all the tests seem to pass (with minor modifications). Take a look and let me know what you think.
   
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rymurr commented on pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
rymurr commented on pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#issuecomment-652516529


   This has been modified to incorporate the changes to Unions as proposed on the mailing list


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rymurr commented on pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
rymurr commented on pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#issuecomment-651630439


   I _think_ this is now inline with the spec. The Union/DenseUnion types now uses logical type ids in Java. Which is the same as in c++. The difference in a java created Union is that the type ids are chosen from a known index (minor type). But the java implementation doesn't rely on minor type ordinals strict ordering like before. Or at least the intention was to be fully in line with spec while maintaining a convenient default choice for type id. Drawing any meaning from type buffers alone in client code would be an error in my opinion. Hope that clarifies my intention w/ this change.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jacques-n edited a comment on pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
jacques-n edited a comment on pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#issuecomment-648427018


   I'm really struggling with these changes. I don't understand why there is a validity buffer at the union level as well as at the cell level. I'm not sure what it even means that a a union value is valid but the inner value is not. I think there was an inconsistent understanding on my part when that change was proposed in the format (versus what the Java implementation was already doing). @wesm why would we have validity at both the top level and the inner level? Or is the C++ impl only at the top level and not the inner level (which means that this java code is incorrect since the inner vectors have their own validity vectors.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] liyafan82 commented on a change in pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
liyafan82 commented on a change in pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#discussion_r439335068



##########
File path: java/vector/src/main/codegen/templates/UnionVector.java
##########
@@ -325,12 +361,45 @@ private void allocateTypeBuffer() {
     typeBuffer.setZero(0, typeBuffer.capacity());
   }
 
+  private void allocateValidityBuffer() {
+    validityBuffer = allocator.buffer(validityBufferAllocationSizeInBytes);
+    validityBuffer.readerIndex(0);
+    validityBuffer.setZero(0, validityBuffer.capacity());
+  }
+
   @Override
   public void reAlloc() {
     internalStruct.reAlloc();
+    reallocValidityBuffer();
     reallocTypeBuffer();
   }
 
+  private void reallocValidityBuffer() {
+    final long currentBufferCapacity = validityBuffer.capacity();
+    long newAllocationSize = currentBufferCapacity * 2;
+    if (newAllocationSize == 0) {
+      if (validityBufferAllocationSizeInBytes > 0) {
+        newAllocationSize = validityBufferAllocationSizeInBytes;
+      } else {
+        newAllocationSize = DataSizeRoundingUtil.divideBy8Ceil(BaseValueVector.INITIAL_VALUE_ALLOCATION) * 2;
+      }
+    }
+
+    newAllocationSize = CommonUtil.nextPowerOfTwo(newAllocationSize);
+    assert newAllocationSize >= 1;
+
+    if (newAllocationSize > BaseValueVector.MAX_ALLOCATION_SIZE) {
+      throw new OversizedAllocationException("Unable to expand the buffer");
+    }
+
+    final ArrowBuf newBuf = allocator.buffer((int)newAllocationSize);
+    newBuf.setBytes(0, validityBuffer, 0, currentBufferCapacity);
+    newBuf.setZero(currentBufferCapacity, newBuf.capacity() - currentBufferCapacity);
+    validityBuffer.getReferenceManager().release(1);
+    validityBuffer = newBuf;
+    validityBufferAllocationSizeInBytes = (int)newAllocationSize;

Review comment:
       there should be a space after the cast, or the style check would fail?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] wesm commented on pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
wesm commented on pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#issuecomment-648439435


   FTR I'm OK with dropping the top-level validity bitmap from Union, especially if it helps us move forward


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jacques-n commented on pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
jacques-n commented on pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#issuecomment-648428724


   Adding to my previous comments: if only at the top level, I'm not sure what the ramification of that would mean at the Java codebase. I think it would require a fairly massive refactoring.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] wesm commented on pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
wesm commented on pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#issuecomment-657125577


   Rebased since https://github.com/apache/arrow/pull/7685 was merged


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] wesm edited a comment on pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
wesm edited a comment on pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#issuecomment-648435911


   > @wesm why would we have validity at both the top level and the inner level?
   
   Well, the way the specification is written
   
   * _All_ nested types including union are composed from well-formed child arrays which may have null values. 
   * Additionally, all array types, including all nested array types, have their own validity bitmap
   * In the case of union, a null at the top level would indicate that the type of the child is not known. This seems algebraically consistent to me. 
   
   We can decide to stipulate that union types never have non-valid values at the Union cell level, only at the child cell level. But then a union value cannot be "made null" by changing the validity bitmap of the Union. From a purely algebraic / design perspective it isn't great. From the get-go in the project I've been striving for algebraic consistency, e.g. enabling well-formed arrays to be composed to created composite types without alteration. Since Unions are one of the more seldom used part of the project we can decide to explicitly deviate from that, but we need to do it now if we're going to do that and not delay longer in reconciling the issue. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] emkornfield commented on a change in pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
emkornfield commented on a change in pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#discussion_r453152953



##########
File path: dev/archery/archery/integration/runner.py
##########
@@ -126,6 +126,8 @@ def _gold_tests(self, gold_dir):
                             if f.name == name).skip
             except StopIteration:
                 skip = set()
+            if name == 'union' and prefix == '0.17.1':

Review comment:
       IMO I think this is OK for now.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rymurr commented on a change in pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
rymurr commented on a change in pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#discussion_r452934813



##########
File path: dev/archery/archery/integration/runner.py
##########
@@ -126,6 +126,8 @@ def _gold_tests(self, gold_dir):
                             if f.name == name).skip
             except StopIteration:
                 skip = set()
+            if name == 'union' and prefix == '0.17.1':

Review comment:
       This is a minor hack to remove Java from 0.17.1 union gold tests. Not the most elegant way to do it but doing it correctly would be a non-trivial change to archery imho. Thoughts?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] emkornfield commented on pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
emkornfield commented on pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#issuecomment-647903916


   It would be great if @jacques-n could confirm he is OK with these changes.  I believe on a previous PR the extra indirection was a concern.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] liyafan82 commented on a change in pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
liyafan82 commented on a change in pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#discussion_r439338218



##########
File path: java/vector/src/main/codegen/templates/UnionVector.java
##########
@@ -586,7 +686,9 @@ public ValueVector getVectorByType(int typeId) {
   }
 
     public ValueVector getVectorByType(int typeId, ArrowType arrowType) {
-      switch (MinorType.values()[typeId]) {
+      Types.MinorType type = (typeIds[typeId] != null) ?
+          Types.getMinorTypeForArrowType(typeIds[typeId].getType()) : Types.MinorType.values()[typeId];
+      switch (type) {

Review comment:
       Here we are still relying on the MinorType ordinal as the type ID?
   If so, the sparse union vector is still not aligning with the specification, and that causes the integration test to fail?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] liyafan82 commented on a change in pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
liyafan82 commented on a change in pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#discussion_r439325184



##########
File path: java/vector/src/main/codegen/templates/DenseUnionVector.java
##########
@@ -104,6 +105,7 @@
    * The index is the type id, and the value is the type field.
    */
   private Field[] typeFields = new Field[Byte.MAX_VALUE + 1];
+  private byte[] typeMapFields = new byte[Byte.MAX_VALUE + 1];

Review comment:
       Could you please write some comments about the indices & values of the array?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rymurr commented on a change in pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
rymurr commented on a change in pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#discussion_r441047163



##########
File path: java/vector/src/main/codegen/templates/UnionVector.java
##########
@@ -493,6 +576,19 @@ public void splitAndTransfer(int startIndex, int length) {
       Preconditions.checkArgument(startIndex >= 0 && length >= 0 && startIndex + length <= valueCount,
           "Invalid parameters startIndex: %s, length: %s for valueCount: %s", startIndex, length, valueCount);
       to.clear();
+      // transfer validity buffer
+      while (to.getValidityBufferValueCapacity() < length) {
+        to.reallocValidityBuffer();
+      }
+      for (int i = 0; i < length; i++) {
+        int validity = BitVectorHelper.get(validityBuffer, startIndex + i);
+        if (validity == 0) {
+          BitVectorHelper.unsetBit(to.validityBuffer, i);
+        } else {
+          BitVectorHelper.setBit(to.validityBuffer, i);
+        }

Review comment:
       done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] wesm closed pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
wesm closed pull request #7290:
URL: https://github.com/apache/arrow/pull/7290


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] emkornfield commented on pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
emkornfield commented on pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#issuecomment-651515749


   @liyafan82 I think that is a good point.  If it supports both modes I think that is a reasonable compromise for now as long as @jacques-n is OK with it.  But we can maybe discuss further if needed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm commented on a change in pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
lidavidm commented on a change in pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#discussion_r452172011



##########
File path: java/vector/src/main/codegen/templates/DenseUnionVector.java
##########
@@ -812,12 +757,17 @@ public int getValueCount() {
   }
 
   public boolean isNull(int index) {

Review comment:
       I think from the viewpoint of code generically manipulating vectors, this makes more sense - the entries of the union are never null, but they may reference something that is null. I agree it's less convenient when working with a union array explicitly though.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rymurr commented on pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
rymurr commented on pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#issuecomment-656664816


   > Seems like Java needs to be excluded from testing the 0.17.1 "golden" reference union file. Also, Java produces data that C++ can't read, but Java can read the C++ files; this fails the IPC tests with Java producing, and the Flight tests with both Java producing and consuming (since Flight tests upload and download).
   
   Wonder if its the MetadataVersion change that went in for c++ today? Looking now.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rymurr commented on a change in pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
rymurr commented on a change in pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#discussion_r452084718



##########
File path: java/vector/src/main/codegen/templates/DenseUnionVector.java
##########
@@ -283,7 +285,7 @@ public long getValidityBufferAddress() {
   public ArrowBuf getDataBuffer() { throw new UnsupportedOperationException(); }
 
   public StructVector getStruct(byte typeId) {
-    StructVector structVector = (StructVector) childVectors[typeId];
+    StructVector structVector = typeId < 0 ? null : (StructVector) childVectors[typeId];

Review comment:
       correct, its a bit strange as there is no way to tell between null and not yet allocated. One of hte only benefits of a separate validity buffer. I thought this was a good compromise though I dont love it.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm commented on pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
lidavidm commented on pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#issuecomment-656651984


   Looks like unions fail in integration again.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm commented on pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
lidavidm commented on pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#issuecomment-656660725


   Seems like Java needs to be excluded from testing the 0.17.1 "golden" reference union file. Also, Java produces data that C++ can't read, but Java can read the C++ files; this fails the IPC tests with Java producing, and the Flight tests with both Java producing and consuming (since Flight tests upload and download).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rymurr commented on a change in pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
rymurr commented on a change in pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#discussion_r441050322



##########
File path: java/vector/src/main/codegen/templates/UnionVector.java
##########
@@ -586,7 +686,9 @@ public ValueVector getVectorByType(int typeId) {
   }
 
     public ValueVector getVectorByType(int typeId, ArrowType arrowType) {
-      switch (MinorType.values()[typeId]) {
+      Types.MinorType type = (typeIds[typeId] != null) ?
+          Types.getMinorTypeForArrowType(typeIds[typeId].getType()) : Types.MinorType.values()[typeId];
+      switch (type) {

Review comment:
       This switch uses the `typeIds` array to extract the MinorType associated with the logical type. If it does not exist in the array (usually because a logical type wasn't previously assigned) it defaults to the id of the minor type. So Java always uses the MinorType as the type id and allows c++ (and others) to send arbitrary type mappings.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rymurr commented on pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
rymurr commented on pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#issuecomment-648810719


   Thanks @wesm and @jacques-n for the review. I will leave this up until consensus is reached on the format change. Please let me know if I can help w/ the c++ patch, would be happy to contribute!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] liyafan82 commented on a change in pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
liyafan82 commented on a change in pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#discussion_r439334686



##########
File path: java/vector/src/main/codegen/templates/UnionVector.java
##########
@@ -325,12 +361,45 @@ private void allocateTypeBuffer() {
     typeBuffer.setZero(0, typeBuffer.capacity());
   }
 
+  private void allocateValidityBuffer() {
+    validityBuffer = allocator.buffer(validityBufferAllocationSizeInBytes);
+    validityBuffer.readerIndex(0);
+    validityBuffer.setZero(0, validityBuffer.capacity());
+  }
+
   @Override
   public void reAlloc() {
     internalStruct.reAlloc();
+    reallocValidityBuffer();
     reallocTypeBuffer();
   }
 
+  private void reallocValidityBuffer() {
+    final long currentBufferCapacity = validityBuffer.capacity();
+    long newAllocationSize = currentBufferCapacity * 2;
+    if (newAllocationSize == 0) {
+      if (validityBufferAllocationSizeInBytes > 0) {
+        newAllocationSize = validityBufferAllocationSizeInBytes;
+      } else {
+        newAllocationSize = DataSizeRoundingUtil.divideBy8Ceil(BaseValueVector.INITIAL_VALUE_ALLOCATION) * 2;
+      }

Review comment:
       do we need to multiply by 2 here?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] wesm commented on pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
wesm commented on pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#issuecomment-648435911


   > @wesm why would we have validity at both the top level and the inner level?
   
   Well, the way the specification is written
   
   * _All_ nested types including union are composed from well-formed child arrays which may be nullable. 
   * Additionally, all array types, including all nested array types, have their own validity bitmap
   * In the case of union would indicate that the type of the child is not known. This seems algebraically consistent to me. 
   
   We can decide to stipulate that union types never have non-valid values at the Union cell level, only at the child cell level. But then a union value cannot be "made null" by changing the validity bitmap of the Union. From a purely algebraic / design perspective it isn't great. From the get-go in the project I've been striving for algebraic consistency, e.g. enabling well-formed arrays to be composed to created composite types without alteration. Since Unions are one of the more seldom used part of the project we can decide to explicitly deviate from that, but we need to do it now if we're going to do that and not delay longer in reconciling the issue. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rymurr commented on pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
rymurr commented on pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#issuecomment-656731970


   It is indeed the V4 patch that has been submitted on the c++ side and not on the Java side. I think this means we need to submit your patch first @lidavidm? 
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rymurr commented on a change in pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
rymurr commented on a change in pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#discussion_r452082717



##########
File path: java/vector/src/main/codegen/templates/DenseUnionVector.java
##########
@@ -268,11 +270,11 @@ public long getDataBufferAddress() {
 
   @Override
   public long getValidityBufferAddress() {
-    return validityBuffer.memoryAddress();
+    return typeBuffer.memoryAddress();
   }
 
   @Override
-  public ArrowBuf getValidityBuffer() { return validityBuffer; }
+  public ArrowBuf getValidityBuffer() { return typeBuffer; }

Review comment:
       same here
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] liyafan82 commented on a change in pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
liyafan82 commented on a change in pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#discussion_r439336805



##########
File path: java/vector/src/main/codegen/templates/UnionVector.java
##########
@@ -493,6 +576,19 @@ public void splitAndTransfer(int startIndex, int length) {
       Preconditions.checkArgument(startIndex >= 0 && length >= 0 && startIndex + length <= valueCount,
           "Invalid parameters startIndex: %s, length: %s for valueCount: %s", startIndex, length, valueCount);
       to.clear();
+      // transfer validity buffer
+      while (to.getValidityBufferValueCapacity() < length) {
+        to.reallocValidityBuffer();
+      }
+      for (int i = 0; i < length; i++) {
+        int validity = BitVectorHelper.get(validityBuffer, startIndex + i);
+        if (validity == 0) {
+          BitVectorHelper.unsetBit(to.validityBuffer, i);
+        } else {
+          BitVectorHelper.setBit(to.validityBuffer, i);
+        }

Review comment:
       Maybe we can use 
   ```
   setValidityBit(ArrowBuf validityBuffer, int index, int value)
   ```
   here to make the code more concise?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] wesm commented on pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
wesm commented on pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#issuecomment-657142013


   Tests pass so I'm merging. Hooray!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jacques-n commented on pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
jacques-n commented on pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#issuecomment-648507985


   > We can decide to stipulate that union types never have non-valid values at the Union cell level, only at the child cell level. But then a union value cannot be "made null" by changing the validity bitmap of the Union. 
   
   I believe a union can still express this with a child type null type, no? I think that is how we either modeled it or planned to model it no the java side.
   
   > All nested types including union are composed from well-formed child arrays which may have null values.
   
   I'm in agreement on this. Decomposing would be complex.
   
   > In the case of union, a null at the top level would indicate that the type of the child is not known. This seems algebraically consistent to me.
   
   I think it's where the model breaks down because of the weird situation where you actually need to evaluate two validity buffers to determine whether something is valid: the parent and the child. And an inconsistency would be really weird. As such, I'm think it would be better to avoid the top-level validity buffer.
   
   > FTR I'm OK with dropping the top-level validity bitmap from Union, especially if it helps us move forward
   
   That would be my preference. It seems to ultimately reduce the risk of inconsistency and doesn't seem to have any functional loss (given the use of null type to indicate a non-alternatively-typed value). I also think this works well in the most common case of union types, e.g. two files where one has fieldA with schemaA and another where you have fieldA with schemaB. Compositing those two doesn't require some kind of introspection and AND'ing of the individual children to build an additional validity buffer (or simply setting true for all and then having an inconsistency with the child array) and allows a fast set of the type vector for each independent chunk.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm commented on pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
lidavidm commented on pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#issuecomment-642630745


   It looks like the integration tests are still failing for union arrays between C++ and Java.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] liyafan82 edited a comment on pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
liyafan82 edited a comment on pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#issuecomment-651492666


   In addition to the problem of top level validity buffer, I think there is another problem to discuss: Java is using the ordinal of the minor type as the type id, and this is not aligning with the C++ implementation/specification. 
   
   This PR works around the problem by supporting two modes. However, the C++ and Java implementations are not really aligning, IMO. Do we have a plan to mitigate this?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rymurr commented on pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
rymurr commented on pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#issuecomment-657194095


   Thanks for the rebate @wesm! Glad we have finished this one!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] liyafan82 commented on a change in pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
liyafan82 commented on a change in pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#discussion_r439336805



##########
File path: java/vector/src/main/codegen/templates/UnionVector.java
##########
@@ -493,6 +576,19 @@ public void splitAndTransfer(int startIndex, int length) {
       Preconditions.checkArgument(startIndex >= 0 && length >= 0 && startIndex + length <= valueCount,
           "Invalid parameters startIndex: %s, length: %s for valueCount: %s", startIndex, length, valueCount);
       to.clear();
+      // transfer validity buffer
+      while (to.getValidityBufferValueCapacity() < length) {
+        to.reallocValidityBuffer();
+      }
+      for (int i = 0; i < length; i++) {
+        int validity = BitVectorHelper.get(validityBuffer, startIndex + i);
+        if (validity == 0) {
+          BitVectorHelper.unsetBit(to.validityBuffer, i);
+        } else {
+          BitVectorHelper.setBit(to.validityBuffer, i);
+        }

Review comment:
       nit: Maybe we can use 
   ```
   setValidityBit(ArrowBuf validityBuffer, int index, int value)
   ```
   here to make the code more concise?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rymurr commented on a change in pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
rymurr commented on a change in pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#discussion_r452082646



##########
File path: java/vector/src/main/codegen/templates/DenseUnionVector.java
##########
@@ -268,11 +270,11 @@ public long getDataBufferAddress() {
 
   @Override
   public long getValidityBufferAddress() {
-    return validityBuffer.memoryAddress();
+    return typeBuffer.memoryAddress();

Review comment:
       Ahh, yes. I copied UnionVector (which returns `typeBuffer`) but I didn't really like it. Fixed on both




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] liyafan82 commented on pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
liyafan82 commented on pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#issuecomment-651492666


   In addition to the problem of top level validity buffer, I think there is another problem to discuss: Java is using the ordinal of the minor type as the type id, and this is not aligning with the C++ implementation/specification. 
   
   This PR works around the problem by supporting two modes. However, the C++ and Java implementations are not really aligning, IMO. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#issuecomment-634652668


   https://issues.apache.org/jira/browse/ARROW-1692


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rymurr commented on a change in pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
rymurr commented on a change in pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#discussion_r452086650



##########
File path: java/vector/src/main/codegen/templates/DenseUnionVector.java
##########
@@ -812,12 +757,17 @@ public int getValueCount() {
   }
 
   public boolean isNull(int index) {

Review comment:
       I am ok with changing this to always return false. However I think this means anyone using a Union vector would have to extract the child and check for null as an extra step. All other nested vectors have a top level validity buffer. I don't know what the ramifications of changing this are?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm commented on pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
lidavidm commented on pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#issuecomment-656736755


   Hmm - Java doesn't check the metadata version and the error makes it sound like it's layout-related? Oh wait. So now C++'s view of the union layout depends on the metadata version. Ok, then let's merge the Java V5 metadata change first.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rymurr commented on pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
rymurr commented on pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#issuecomment-642647124


   > It looks like the integration tests are still failing for union arrays between C++ and Java.
   
   yeah, something w/ Flight, though the IPC works fine. Checking what is different in flight now...


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] wesm commented on pull request #7290: ARROW-1692: [Java] UnionArray round trip not working

Posted by GitBox <gi...@apache.org>.
wesm commented on pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#issuecomment-648510708


   > That would be my preference.
   
   I'm OK with this. We would need to act quickly to try to pull this off for the release. I can start a DISCUSS thread and work up a patch with the proposed changes to the specification


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org