Posted to reviews@spark.apache.org by dongjoon-hyun <gi...@git.apache.org> on 2018/04/18 05:33:19 UTC
[GitHub] spark pull request #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC...
GitHub user dongjoon-hyun opened a pull request:
https://github.com/apache/spark/pull/21093
[SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4.3
## What changes were proposed in this pull request?
This PR upgrades the Apache ORC dependencies to 1.4.3, released on February 9th. The Apache ORC 1.4.2 release removed unnecessary dependencies, and 1.4.3 adds 5 more patches (https://s.apache.org/Fll8).
In particular, ORC-285, reproduced below, is fixed in 1.4.3.
```scala
scala> val df = Seq(Array.empty[Float]).toDF()
scala> df.write.format("orc").save("/tmp/floatarray")
scala> spark.read.orc("/tmp/floatarray")
res1: org.apache.spark.sql.DataFrame = [value: array<float>]
scala> spark.read.orc("/tmp/floatarray").show()
18/02/12 22:09:10 ERROR Executor: Exception in task 0.0 in stage 1.0 (TID 1)
java.io.IOException: Error reading file: file:/tmp/floatarray/part-00000-9c0b461b-4df1-4c23-aac1-3e4f349ac7d6-c000.snappy.orc
at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1191)
at org.apache.orc.mapreduce.OrcMapreduceRecordReader.ensureBatch(OrcMapreduceRecordReader.java:78)
...
Caused by: java.io.EOFException: Read past EOF for compressed stream Stream for column 2 kind DATA position: 0 length: 0 range: 0 offset: 0 limit: 0
```
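The reproduction above can also be turned into a small standalone round-trip check. The following is only a sketch under stated assumptions, not the test added by this PR: the object name `EmptyArrayOrcRoundTrip`, the temporary directory, and the choice of `spark.sql.orc.impl=native` (to exercise the Apache ORC based reader) are illustrative. It assumes Spark 2.3 or later on the classpath; with an ORC version affected by ORC-285, the read is expected to fail as shown above.

```scala
import java.nio.file.Files

import org.apache.spark.sql.SparkSession

// Hypothetical helper, not part of Spark or of this PR.
object EmptyArrayOrcRoundTrip {

  /** Writes one row holding an empty array of each element type to ORC
    * and returns the lengths of the arrays read back. */
  def roundTrip(spark: SparkSession, baseDir: String): Seq[Int] = {
    import spark.implicits._

    val floatPath = s"$baseDir/floatarray"
    Seq(Array.empty[Float]).toDF().write.format("orc").save(floatPath)
    val floats = spark.read.orc(floatPath).collect()

    val doublePath = s"$baseDir/doublearray"
    Seq(Array.empty[Double]).toDF().write.format("orc").save(doublePath)
    val doubles = spark.read.orc(doublePath).collect()

    // Column 0 is the single array column of each one-row DataFrame.
    (floats ++ doubles).map(_.getSeq[Any](0).length).toSeq
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("orc-285-roundtrip")
      // Assumption: use the Apache ORC based implementation, not the Hive one.
      .config("spark.sql.orc.impl", "native")
      .getOrCreate()
    try {
      val baseDir = Files.createTempDirectory("orc-285").toString
      val lengths = roundTrip(spark, baseDir)
      assert(lengths == Seq(0, 0), s"unexpected array lengths: $lengths")
    } finally {
      spark.stop()
    }
  }
}
```

Covering both `Float` and `Double` mirrors the name of the regression test in this PR, which mentions float and double array columns.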
## How was this patch tested?
Passes the Jenkins tests.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/dongjoon-hyun/spark SPARK-23340-2
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/21093.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #21093
----
commit fc5d976ffb33ebec996415ac1296196f8458a01f
Author: Dongjoon Hyun <do...@...>
Date: 2018-02-17T08:25:36Z
[SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4.3
Author: Dongjoon Hyun <do...@apache.org>
Closes #20511 from dongjoon-hyun/SPARK-23340.
----
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21093
**[Test build #89507 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89507/testReport)** for PR 21093 at commit [`fc5d976`](https://github.com/apache/spark/commit/fc5d976ffb33ebec996415ac1296196f8458a01f).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21093
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2445/
Test PASSed.
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21093
**[Test build #89532 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89532/testReport)** for PR 21093 at commit [`fc5d976`](https://github.com/apache/spark/commit/fc5d976ffb33ebec996415ac1296196f8458a01f).
---
[GitHub] spark pull request #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC...
Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/21093#discussion_r182814133
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/HiveOrcQuerySuite.scala ---
@@ -208,4 +208,14 @@ class HiveOrcQuerySuite extends OrcQueryTest with TestHiveSingleton {
}
}
}
+
+ test("SPARK-23340 Empty float/double array columns raise EOFException") {
--- End diff --
Never mind. I found that the original PR has them too, so this is just a backport. We normally refer to the original PR https://github.com/apache/spark/pull/20511 in the PR description.
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/21093
retest this please
---
[GitHub] spark pull request #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun closed the pull request at:
https://github.com/apache/spark/pull/21093
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21093
Merged build finished. Test FAILed.
---
[GitHub] spark pull request #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC...
Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/21093#discussion_r182813427
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/HiveOrcQuerySuite.scala ---
@@ -208,4 +208,14 @@ class HiveOrcQuerySuite extends OrcQueryTest with TestHiveSingleton {
}
}
}
+
+ test("SPARK-23340 Empty float/double array columns raise EOFException") {
--- End diff --
Just to confirm: these two tests are in the master branch, right?
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21093
**[Test build #89484 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89484/testReport)** for PR 21093 at commit [`fc5d976`](https://github.com/apache/spark/commit/fc5d976ffb33ebec996415ac1296196f8458a01f).
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21093
**[Test build #89489 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89489/testReport)** for PR 21093 at commit [`fc5d976`](https://github.com/apache/spark/commit/fc5d976ffb33ebec996415ac1296196f8458a01f).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21093
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2417/
Test PASSed.
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21093
**[Test build #89507 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89507/testReport)** for PR 21093 at commit [`fc5d976`](https://github.com/apache/spark/commit/fc5d976ffb33ebec996415ac1296196f8458a01f).
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:
https://github.com/apache/spark/pull/21093
@gatorsmile, sorry for the late response. I'm currently at DataWorks Summit Berlin.
I took a look. It seems that the last two failures come from `JsonInferSchema` in `BucketedWriteWithoutHiveSupportSuite`.
- https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89507/testReport/
- https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89522/testReport/
```
[info] - write bucketed data *** FAILED *** (4 seconds, 698 milliseconds)
[info] org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 84.0 failed 1 times, most recent failure: Lost task 0.0 in stage 84.0 (TID 86, localhost, executor driver): java.lang.IllegalStateException: LiveListenerBus is stopped.
[info] at org.apache.spark.scheduler.LiveListenerBus.addToQueue(LiveListenerBus.scala:97)
[info] at org.apache.spark.scheduler.LiveListenerBus.addToStatusQueue(LiveListenerBus.scala:80)
[info] at org.apache.spark.sql.internal.SharedState.<init>(SharedState.scala:93)
[info] at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:117)
[info] at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:117)
[info] at scala.Option.getOrElse(Option.scala:121)
[info] at org.apache.spark.sql.SparkSession.sharedState$lzycompute(SparkSession.scala:117)
[info] at org.apache.spark.sql.SparkSession.sharedState(SparkSession.scala:116)
[info] at org.apache.spark.sql.internal.BaseSessionStateBuilder.build(BaseSessionStateBuilder.scala:286)
[info] at org.apache.spark.sql.test.TestSparkSession.sessionState$lzycompute(TestSQLContext.scala:42)
[info] at org.apache.spark.sql.test.TestSparkSession.sessionState(TestSQLContext.scala:41)
[info] at org.apache.spark.sql.SparkSession$$anonfun$1$$anonfun$apply$1.apply(SparkSession.scala:92)
[info] at org.apache.spark.sql.SparkSession$$anonfun$1$$anonfun$apply$1.apply(SparkSession.scala:92)
[info] at scala.Option.map(Option.scala:146)
[info] at org.apache.spark.sql.SparkSession$$anonfun$1.apply(SparkSession.scala:92)
[info] at org.apache.spark.sql.SparkSession$$anonfun$1.apply(SparkSession.scala:91)
[info] at org.apache.spark.sql.internal.SQLConf$.get(SQLConf.scala:110)
[info] at org.apache.spark.sql.types.DataType.sameType(DataType.scala:84)
[info] at org.apache.spark.sql.catalyst.analysis.TypeCoercion$$anonfun$1.apply(TypeCoercion.scala:105)
[info] at org.apache.spark.sql.catalyst.analysis.TypeCoercion$$anonfun$1.apply(TypeCoercion.scala:86)
[info] at org.apache.spark.sql.execution.datasources.json.JsonInferSchema$.compatibleType(JsonInferSchema.scala:271)
[info] at org.apache.spark.sql.execution.datasources.json.JsonInferSchema$$anonfun$org$apache$spark$sql$execution$datasources$json$JsonInferSchema$$compatibleRootType$1.apply(JsonInferSchema.scala:262)
[info] at
```
I compared this with `branch-2.3` itself. Unfortunately, recent `branch-2.3` itself is unstable: none of the last 18 SBT build runs succeeded.
- https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.3-test-sbt-hadoop-2.6/
- https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.3-test-sbt-hadoop-2.7/
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21093
Merged build finished. Test PASSed.
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21093
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89489/
Test FAILed.
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21093
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2454/
Test PASSed.
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21093
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89484/
Test FAILed.
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21093
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89522/
Test FAILed.
---
[GitHub] spark pull request #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun closed the pull request at:
https://github.com/apache/spark/pull/21093
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21093
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2422/
Test PASSed.
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21093
LGTM
Also cc @omalley
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:
https://github.com/apache/spark/pull/21093
Oops. I mistakenly clicked the `close and comment` button.
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21093
Merged build finished. Test PASSed.
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:
https://github.com/apache/spark/pull/21093
Thank you all! :D
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21093
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89507/
Test FAILed.
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21093
**[Test build #89489 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89489/testReport)** for PR 21093 at commit [`fc5d976`](https://github.com/apache/spark/commit/fc5d976ffb33ebec996415ac1296196f8458a01f).
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:
https://github.com/apache/spark/pull/21093
`BucketedWriteWithoutHiveSupportSuite` runs against `Seq("parquet", "json")`, and the test suite fails after the `insertInto` test, at `write bucketed data`. Neither source involves ORC, so the flakiness seems to be unrelated to this patch. Also, the final run passed 8 hours ago.
```
[info] BucketedWriteWithoutHiveSupportSuite:
[info] - bucketed by non-existing column (28 milliseconds)
[info] - numBuckets be greater than 0 but less than 100000 (10 milliseconds)
[info] - specify sorting columns without bucketing columns (8 milliseconds)
[info] - sorting by non-orderable column (34 milliseconds)
[info] - write bucketed data using save() (9 milliseconds)
[info] - write bucketed data using insertInto() (9 milliseconds)
... Error starts
[info] - write bucketed data *** FAILED *** (4 seconds, 601 milliseconds)
```
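The triage above (find which suite and which test failed in the sbt output) can be sketched as a small log scan. This is an illustrative sketch only: `FailedTestScan` and `failuresBySuite` are hypothetical names, and the two patterns assume the `[info]` line layout shown in the excerpt above.

```scala
import scala.collection.mutable

// Hypothetical helper, not part of Spark or its build tooling.
object FailedTestScan {
  // A suite header such as "[info] BucketedWriteWithoutHiveSupportSuite:".
  private val Suite = """\[info\] (\w+):""".r
  // A failed test such as "[info] - write bucketed data *** FAILED *** (4 seconds, 601 milliseconds)".
  private val Failed = """\[info\] - (.+) \*\*\* FAILED \*\*\*.*""".r

  /** Groups the names of failed tests by the suite header that precedes them. */
  def failuresBySuite(lines: Seq[String]): Map[String, Vector[String]] = {
    var currentSuite = "<unknown>"
    val failures = mutable.LinkedHashMap.empty[String, Vector[String]]
    lines.foreach {
      case Suite(name) =>
        currentSuite = name
      case Failed(test) =>
        failures(currentSuite) = failures.getOrElse(currentSuite, Vector.empty) :+ test
      case _ => // passing tests, stack traces, and other output are ignored
    }
    failures.toMap
  }
}
```

Applied to the excerpt above, the scan would attribute `write bucketed data` to `BucketedWriteWithoutHiveSupportSuite`, which makes it easier to compare failures across builds.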
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21093
Merged build finished. Test PASSed.
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21093
@dongjoon-hyun Do you have the bandwidth to look into why these tests are flaky?
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21093
retest this please
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21093
Merged build finished. Test PASSed.
---
[GitHub] spark pull request #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC...
Posted by dongjoon-hyun <gi...@git.apache.org>.
GitHub user dongjoon-hyun reopened a pull request:
https://github.com/apache/spark/pull/21093
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21093
Merged build finished. Test FAILed.
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/21093
LGTM
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21093
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89532/
Test PASSed.
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21093
**[Test build #89522 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89522/testReport)** for PR 21093 at commit [`fc5d976`](https://github.com/apache/spark/commit/fc5d976ffb33ebec996415ac1296196f8458a01f).
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21093
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2433/
Test PASSed.
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/21093
retest this please
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21093
**[Test build #89522 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89522/testReport)** for PR 21093 at commit [`fc5d976`](https://github.com/apache/spark/commit/fc5d976ffb33ebec996415ac1296196f8458a01f).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21093
retest this please
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21093
Merged build finished. Test PASSed.
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:
https://github.com/apache/spark/pull/21093
Thank you for the review, @cloud-fan and @gatorsmile.
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21093
Merged build finished. Test FAILed.
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21093
Merged build finished. Test FAILed.
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21093
**[Test build #89532 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89532/testReport)** for PR 21093 at commit [`fc5d976`](https://github.com/apache/spark/commit/fc5d976ffb33ebec996415ac1296196f8458a01f).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21093
Merged build finished. Test PASSed.
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21093
**[Test build #89484 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89484/testReport)** for PR 21093 at commit [`fc5d976`](https://github.com/apache/spark/commit/fc5d976ffb33ebec996415ac1296196f8458a01f).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark issue #21093: [SPARK-23340][SQL][BRANCH-2.3] Upgrade Apache ORC to 1.4...
Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21093
Thanks! Merged to 2.3
---