Posted to commits@spark.apache.org by gu...@apache.org on 2020/06/24 13:18:02 UTC

[spark] branch branch-3.0 updated: [SPARK-32080][SPARK-31998][SQL] Simplify ArrowColumnVector ListArray accessor

This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new abcec13  [SPARK-32080][SPARK-31998][SQL] Simplify ArrowColumnVector ListArray accessor
abcec13 is described below

commit abcec13e509d7b6e2f2f249e19f01214a7d286a9
Author: Bryan Cutler <cu...@gmail.com>
AuthorDate: Wed Jun 24 22:13:54 2020 +0900

    [SPARK-32080][SPARK-31998][SQL] Simplify ArrowColumnVector ListArray accessor
    
    ### What changes were proposed in this pull request?
    
    This change simplifies the ArrowColumnVector ListArray accessor by using the Arrow APIs provided in v0.15.0 to calculate element indices, instead of reading the offset buffer directly.
    
    ### Why are the changes needed?
    
    This simplifies the code by avoiding manual calculations on the Arrow offset buffer and makes use of more stable APIs.
    
    ### Does this PR introduce _any_ user-facing change?
    
    No
    
    ### How was this patch tested?
    
    Existing tests
    
    Closes #28915 from BryanCutler/arrow-simplify-ArrowColumnVector-ListArray-SPARK-32080.
    
    Authored-by: Bryan Cutler <cu...@gmail.com>
    Signed-off-by: HyukjinKwon <gu...@apache.org>
    (cherry picked from commit df04107934241965199bd5454c62e1016bb3bdd9)
    Signed-off-by: HyukjinKwon <gu...@apache.org>
---
 .../java/org/apache/spark/sql/vectorized/ArrowColumnVector.java    | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/sql/catalyst/src/main/java/org/apache/spark/sql/vectorized/ArrowColumnVector.java b/sql/catalyst/src/main/java/org/apache/spark/sql/vectorized/ArrowColumnVector.java
index d2220dc..72fccd4 100644
--- a/sql/catalyst/src/main/java/org/apache/spark/sql/vectorized/ArrowColumnVector.java
+++ b/sql/catalyst/src/main/java/org/apache/spark/sql/vectorized/ArrowColumnVector.java
@@ -17,7 +17,6 @@
 
 package org.apache.spark.sql.vectorized;
 
-import io.netty.buffer.ArrowBuf;
 import org.apache.arrow.vector.*;
 import org.apache.arrow.vector.complex.*;
 import org.apache.arrow.vector.holders.NullableVarCharHolder;
@@ -458,10 +457,8 @@ public final class ArrowColumnVector extends ColumnVector {
 
     @Override
     final ColumnarArray getArray(int rowId) {
-      ArrowBuf offsets = accessor.getOffsetBuffer();
-      int index = rowId * ListVector.OFFSET_WIDTH;
-      int start = offsets.getInt(index);
-      int end = offsets.getInt(index + ListVector.OFFSET_WIDTH);
+      int start = accessor.getElementStartIndex(rowId);
+      int end = accessor.getElementEndIndex(rowId);
       return new ColumnarArray(arrayData, start, end - start);
     }
   }
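
For readers unfamiliar with list offset buffers, the following is a minimal, self-contained sketch of why the two forms in the diff above are equivalent. It does not use Arrow: the class ListOffsetsSketch, its java.nio.ByteBuffer backing store, and the helpers manualRange and accessorRange are hypothetical stand-ins written for illustration, and the bodies of getElementStartIndex and getElementEndIndex here are assumptions about the behaviour, not Arrow's actual implementation. Only the method names and the OFFSET_WIDTH constant come from the diff.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical stand-in for a list vector's offset buffer; NOT Arrow code.
final class ListOffsetsSketch {
  // Width in bytes of one offset entry, playing the role of ListVector.OFFSET_WIDTH.
  static final int OFFSET_WIDTH = 4;

  private final ByteBuffer offsets;

  // Builds an offset buffer with one leading 0 and a running total per list.
  ListOffsetsSketch(int[] listLengths) {
    offsets = ByteBuffer.allocate((listLengths.length + 1) * OFFSET_WIDTH)
        .order(ByteOrder.LITTLE_ENDIAN);
    int running = 0;
    offsets.putInt(0, running);
    for (int i = 0; i < listLengths.length; i++) {
      running += listLengths[i];
      offsets.putInt((i + 1) * OFFSET_WIDTH, running);
    }
  }

  // Assumed behaviour of the accessor-style API: first element index of a row.
  int getElementStartIndex(int rowId) {
    return offsets.getInt(rowId * OFFSET_WIDTH);
  }

  // Assumed behaviour of the accessor-style API: one past the last element index.
  int getElementEndIndex(int rowId) {
    return offsets.getInt((rowId + 1) * OFFSET_WIDTH);
  }

  // Removed form: manual byte arithmetic on the offset buffer.
  int[] manualRange(int rowId) {
    int index = rowId * OFFSET_WIDTH;
    int start = offsets.getInt(index);
    int end = offsets.getInt(index + OFFSET_WIDTH);
    return new int[] {start, end - start};
  }

  // Added form: the same (start, length) pair through the two accessors.
  int[] accessorRange(int rowId) {
    int start = getElementStartIndex(rowId);
    int end = getElementEndIndex(rowId);
    return new int[] {start, end - start};
  }

  public static void main(String[] args) {
    // Three rows holding lists of length 2, 0 and 3.
    ListOffsetsSketch sketch = new ListOffsetsSketch(new int[] {2, 0, 3});
    for (int rowId = 0; rowId < 3; rowId++) {
      int[] manual = sketch.manualRange(rowId);
      int[] viaApi = sketch.accessorRange(rowId);
      System.out.println("row " + rowId + ": start=" + viaApi[0] + " length=" + viaApi[1]
          + " matches manual=" + java.util.Arrays.equals(manual, viaApi));
    }
  }
}
```

In this sketch both forms return the same (start, length) pair for every row; the accessor form simply keeps the byte-offset arithmetic out of the caller, which is the simplification the commit message describes.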

