Posted to commits@spark.apache.org by gu...@apache.org on 2020/06/24 13:18:02 UTC
[spark] branch branch-3.0 updated: [SPARK-32080][SPARK-31998][SQL] Simplify ArrowColumnVector ListArray accessor
This is an automated email from the ASF dual-hosted git repository.
gurwls223 pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-3.0 by this push:
new abcec13 [SPARK-32080][SPARK-31998][SQL] Simplify ArrowColumnVector ListArray accessor
abcec13 is described below
commit abcec13e509d7b6e2f2f249e19f01214a7d286a9
Author: Bryan Cutler <cu...@gmail.com>
AuthorDate: Wed Jun 24 22:13:54 2020 +0900
[SPARK-32080][SPARK-31998][SQL] Simplify ArrowColumnVector ListArray accessor
### What changes were proposed in this pull request?
This change simplifies the ArrowColumnVector ListArray accessor by using the APIs available in Arrow v0.15.0 to calculate element indices.
### Why are the changes needed?
This simplifies the code by avoiding manual calculations on the Arrow offset buffer and by relying on more stable APIs.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Existing tests
Closes #28915 from BryanCutler/arrow-simplify-ArrowColumnVector-ListArray-SPARK-32080.
Authored-by: Bryan Cutler <cu...@gmail.com>
Signed-off-by: HyukjinKwon <gu...@apache.org>
(cherry picked from commit df04107934241965199bd5454c62e1016bb3bdd9)
Signed-off-by: HyukjinKwon <gu...@apache.org>
---
.../java/org/apache/spark/sql/vectorized/ArrowColumnVector.java | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/sql/catalyst/src/main/java/org/apache/spark/sql/vectorized/ArrowColumnVector.java b/sql/catalyst/src/main/java/org/apache/spark/sql/vectorized/ArrowColumnVector.java
index d2220dc..72fccd4 100644
--- a/sql/catalyst/src/main/java/org/apache/spark/sql/vectorized/ArrowColumnVector.java
+++ b/sql/catalyst/src/main/java/org/apache/spark/sql/vectorized/ArrowColumnVector.java
@@ -17,7 +17,6 @@
package org.apache.spark.sql.vectorized;
-import io.netty.buffer.ArrowBuf;
import org.apache.arrow.vector.*;
import org.apache.arrow.vector.complex.*;
import org.apache.arrow.vector.holders.NullableVarCharHolder;
@@ -458,10 +457,8 @@ public final class ArrowColumnVector extends ColumnVector {
     @Override
     final ColumnarArray getArray(int rowId) {
-      ArrowBuf offsets = accessor.getOffsetBuffer();
-      int index = rowId * ListVector.OFFSET_WIDTH;
-      int start = offsets.getInt(index);
-      int end = offsets.getInt(index + ListVector.OFFSET_WIDTH);
+      int start = accessor.getElementStartIndex(rowId);
+      int end = accessor.getElementEndIndex(rowId);
       return new ColumnarArray(arrayData, start, end - start);
     }
   }
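
For readers unfamiliar with list offset buffers, the arithmetic that the removed lines performed by hand can be sketched without Arrow at all. The example below is only an illustration under stated assumptions: it models the offset buffer with a `java.nio.ByteBuffer` holding little-endian 4-byte offsets, and the class `ListOffsetSketch` and its methods `buildOffsets`, `elementStartIndex`, and `elementEndIndex` are hypothetical names. It is not Arrow's implementation of `getElementStartIndex` or `getElementEndIndex`.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/**
 * Stand-alone sketch of list offset arithmetic. It does not use Arrow;
 * every name in this class is hypothetical and exists only for illustration.
 */
class ListOffsetSketch {
  // Assumption: each offset is a 4-byte int, mirroring ListVector.OFFSET_WIDTH in the diff.
  static final int OFFSET_WIDTH = 4;

  /** Packs per-row list lengths into an offset buffer of (lengths.length + 1) ints. */
  static ByteBuffer buildOffsets(int... lengths) {
    ByteBuffer buf = ByteBuffer.allocate((lengths.length + 1) * OFFSET_WIDTH)
        .order(ByteOrder.LITTLE_ENDIAN);
    int running = 0;
    buf.putInt(0, running);
    for (int i = 0; i < lengths.length; i++) {
      running += lengths[i];
      buf.putInt((i + 1) * OFFSET_WIDTH, running);
    }
    return buf;
  }

  /** Same arithmetic as the removed "start" lines: read the offset stored for rowId. */
  static int elementStartIndex(ByteBuffer offsets, int rowId) {
    return offsets.getInt(rowId * OFFSET_WIDTH);
  }

  /** Same arithmetic as the removed "end" lines: read the offset stored for rowId + 1. */
  static int elementEndIndex(ByteBuffer offsets, int rowId) {
    return offsets.getInt((rowId + 1) * OFFSET_WIDTH);
  }

  public static void main(String[] args) {
    // Three rows with list lengths 2, 0, and 3 give offsets [0, 2, 2, 5].
    ByteBuffer offsets = buildOffsets(2, 0, 3);
    for (int rowId = 0; rowId < 3; rowId++) {
      int start = elementStartIndex(offsets, rowId);
      int end = elementEndIndex(offsets, rowId);
      System.out.println("row " + rowId + ": start=" + start + ", length=" + (end - start));
    }
  }
}
```

In this sketch, row 1 is an empty list, so its start and end offsets coincide and its length is 0. Hiding the byte-index multiplication behind two named accessors is the same kind of simplification the patch obtains from the Arrow-provided methods.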