Posted to commits@spark.apache.org by gu...@apache.org on 2022/06/06 10:57:54 UTC

[spark] branch master updated: [SPARK-39387][BUILD] Upgrade hive-storage-api to 2.7.3

This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new 9e920782fd3 [SPARK-39387][BUILD] Upgrade hive-storage-api to 2.7.3
9e920782fd3 is described below

commit 9e920782fd34396dbdf31246d3b4a3c86c16f8f1
Author: sychen <sy...@ctrip.com>
AuthorDate: Mon Jun 6 19:57:35 2022 +0900

    [SPARK-39387][BUILD] Upgrade hive-storage-api to 2.7.3
    
    ### What changes were proposed in this pull request?
    This PR upgrades the Apache Hive `hive-storage-api` library from 2.7.2 to 2.7.3.
    
    ### Why are the changes needed?
    
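    `hive-storage-api` 2.7.3 includes [HIVE-25190](https://issues.apache.org/jira/browse/HIVE-25190) (Fix many small allocations in BytesColumnVector). With 2.7.2, writing to ORC can fail with the following exception: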
    
    ```
    Caused by: java.lang.RuntimeException: Overflow of newLength. smallBuffer.length=1073741824, nextElemLength=408101
            at org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.increaseBufferSpace(BytesColumnVector.java:311)
            at org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.setVal(BytesColumnVector.java:182)
            at org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:179)
            at org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:268)
            at org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:223)
            at org.apache.hadoop.hive.ql.io.orc.WriterImpl.addRow(WriterImpl.java:294)
            at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.write(OrcOutputFormat.java:105)
            at org.apache.spark.sql.hive.execution.HiveOutputWriter.write(HiveFileFormat.scala:157)
            at org.apache.spark.sql.execution.datasources.SingleDirectoryDataWriter.write(FileFormatDataWriter.scala:176)
            at org.apache.spark.sql.execution.datasources.FileFormatDataWriter.writeWithMetrics(FileFormatDataWriter.scala:86)
            at org.apache.spark.sql.execution.datasources.FileFormatDataWriter.writeWithIterator(FileFormatDataWriter.scala:93)
            at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$executeTask$1(FileFormatWriter.scala:312)
            at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1534)
            at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:319)
    ```
    
    ### Does this PR introduce _any_ user-facing change?
    No
    
    ### How was this patch tested?
    Verified in a production environment: SQL that previously failed when writing to ORC runs successfully after the upgrade.
    
    Closes #36772 from cxzl25/SPARK-39387.
    
    Authored-by: sychen <sy...@ctrip.com>
    Signed-off-by: Hyukjin Kwon <gu...@apache.org>
---
 dev/deps/spark-deps-hadoop-2-hive-2.3 | 2 +-
 dev/deps/spark-deps-hadoop-3-hive-2.3 | 2 +-
 pom.xml                               | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/dev/deps/spark-deps-hadoop-2-hive-2.3 b/dev/deps/spark-deps-hadoop-2-hive-2.3
index 6f9b068180f..02819f1f6c5 100644
--- a/dev/deps/spark-deps-hadoop-2-hive-2.3
+++ b/dev/deps/spark-deps-hadoop-2-hive-2.3
@@ -102,7 +102,7 @@ hive-shims-0.23/2.3.9//hive-shims-0.23-2.3.9.jar
 hive-shims-common/2.3.9//hive-shims-common-2.3.9.jar
 hive-shims-scheduler/2.3.9//hive-shims-scheduler-2.3.9.jar
 hive-shims/2.3.9//hive-shims-2.3.9.jar
-hive-storage-api/2.7.2//hive-storage-api-2.7.2.jar
+hive-storage-api/2.7.3//hive-storage-api-2.7.3.jar
 hive-vector-code-gen/2.3.9//hive-vector-code-gen-2.3.9.jar
 hk2-api/2.6.1//hk2-api-2.6.1.jar
 hk2-locator/2.6.1//hk2-locator-2.6.1.jar
diff --git a/dev/deps/spark-deps-hadoop-3-hive-2.3 b/dev/deps/spark-deps-hadoop-3-hive-2.3
index 46559b1fa27..d8f8c2025fc 100644
--- a/dev/deps/spark-deps-hadoop-3-hive-2.3
+++ b/dev/deps/spark-deps-hadoop-3-hive-2.3
@@ -92,7 +92,7 @@ hive-shims-0.23/2.3.9//hive-shims-0.23-2.3.9.jar
 hive-shims-common/2.3.9//hive-shims-common-2.3.9.jar
 hive-shims-scheduler/2.3.9//hive-shims-scheduler-2.3.9.jar
 hive-shims/2.3.9//hive-shims-2.3.9.jar
-hive-storage-api/2.7.2//hive-storage-api-2.7.2.jar
+hive-storage-api/2.7.3//hive-storage-api-2.7.3.jar
 hive-vector-code-gen/2.3.9//hive-vector-code-gen-2.3.9.jar
 hk2-api/2.6.1//hk2-api-2.6.1.jar
 hk2-locator/2.6.1//hk2-locator-2.6.1.jar
diff --git a/pom.xml b/pom.xml
index ce7aa0d5d70..4bce557484b 100644
--- a/pom.xml
+++ b/pom.xml
@@ -247,7 +247,7 @@
     -->
     <hadoop.deps.scope>compile</hadoop.deps.scope>
     <hive.deps.scope>compile</hive.deps.scope>
-    <hive.storage.version>2.7.2</hive.storage.version>
+    <hive.storage.version>2.7.3</hive.storage.version>
     <hive.storage.scope>compile</hive.storage.scope>
     <hive.common.scope>compile</hive.common.scope>
     <hive.llap.scope>compile</hive.llap.scope>
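
The numbers in the exception message quoted in the commit description (`smallBuffer.length=1073741824`, `nextElemLength=408101`) suggest why a length computation can overflow even though the data itself would still fit in a Java array. The sketch below only illustrates that arithmetic. It assumes a buffer that grows by doubling its length with `int` arithmetic, which is an assumption made for illustration and is not taken from the Hive source; the class and method names are hypothetical.

```java
public class BufferGrowthSketch {
    // Values copied from the exception message in the commit description.
    public static final int SMALL_BUFFER_LENGTH = 1073741824; // 2^30 bytes
    public static final int NEXT_ELEM_LENGTH = 408101;

    // Hypothetical growth rule (an assumption, not Hive's code):
    // double the current length using int arithmetic.
    // The result silently wraps when it exceeds Integer.MAX_VALUE.
    public static int doubledLength(int currentLength) {
        return currentLength * 2;
    }

    // Space needed to append one more element, computed in long
    // so that the sum cannot wrap.
    public static long requiredLength(int currentLength, int nextElemLength) {
        return (long) currentLength + nextElemLength;
    }

    public static void main(String[] args) {
        // Doubling 2^30 as an int wraps to a negative number.
        System.out.println(doubledLength(SMALL_BUFFER_LENGTH));

        // The space actually required is still below Integer.MAX_VALUE.
        long required = requiredLength(SMALL_BUFFER_LENGTH, NEXT_ELEM_LENGTH);
        System.out.println(required);
        System.out.println(required <= Integer.MAX_VALUE);
    }
}
```

Under that assumption, a doubled length of 2^31 cannot be represented as an `int`, while the 1074149925 bytes needed for the next element can. How `BytesColumnVector` actually sizes its buffers before and after HIVE-25190 should be checked against the Hive source.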

