Posted to reviews@spark.apache.org by smurakozi <gi...@git.apache.org> on 2017/12/06 08:17:56 UTC
[GitHub] spark pull request #19906: [SPARK-22516][SQL] Bump up Univocity version to 2...
GitHub user smurakozi opened a pull request:
https://github.com/apache/spark/pull/19906
[SPARK-22516][SQL] Bump up Univocity version to 2.5.9
## What changes were proposed in this pull request?
A bug in the Univocity parser caused the issue reported in SPARK-22516. Upgrading the library from version 2.5.4 to 2.5.9 fixes it:
**Executing**
```
spark.read.option("header","true").option("inferSchema", "true").option("multiLine", "true").option("comment", "g").csv("test_file_without_eof_char.csv").show()
```
**Before**
```
ERROR Executor: Exception in task 0.0 in stage 6.0 (TID 6)
com.univocity.parsers.common.TextParsingException: java.lang.IllegalArgumentException - Unable to skip 1 lines from line 2. End of input reached
...
Internal state when error was thrown: line=3, column=0, record=2, charIndex=31
at com.univocity.parsers.common.AbstractParser.handleException(AbstractParser.java:339)
at com.univocity.parsers.common.AbstractParser.parseNext(AbstractParser.java:475)
at org.apache.spark.sql.execution.datasources.csv.UnivocityParser$$anon$1.next(UnivocityParser.scala:281)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
```
**After**
```
+-------+-------+
|column1|column2|
+-------+-------+
| abc| def|
+-------+-------+
```
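The input file `test_file_without_eof_char.csv` is not attached to this description. As a rough sketch for reproducing the scenario, the Python snippet below writes a comparable file: a header, one data row, and a final comment line with no trailing line terminator. The file contents, the helper name `write_csv_without_trailing_newline`, and the comment text are assumptions chosen to match the `comment` option (`"g"`) and the output shown above; they are not taken from the original file.

```python
import os
import tempfile


def write_csv_without_trailing_newline(path, comment_char="g"):
    """Write a small CSV whose last line is a comment with no line terminator.

    Hypothetical helper: the real test file's contents are not known.
    """
    lines = [
        "column1,column2",                    # header row, as in the output above
        "abc,def",                            # single data row, as in the output above
        comment_char + " trailing comment",   # assumed comment text on the last line
    ]
    # Joining with "\n" puts a separator between lines but none after the last one.
    with open(path, "w", newline="") as f:
        f.write("\n".join(lines))
    return path


path = write_csv_without_trailing_newline(
    os.path.join(tempfile.mkdtemp(), "test_file_without_eof_char.csv")
)

# Read the file back to confirm it does not end with a newline.
with open(path, newline="") as f:
    content = f.read()
print(repr(content))
```

Pointing the `spark.read` call above at a file like this one should exercise the code path in question, assuming the comment on the last line without a line terminator is what triggers the parser error.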
## How was this patch tested?
The existing `CSVSuite.commented lines in CSV data` test was extended to also parse the file in multiLine mode, and the test input file was modified to include a comment on its last line.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/smurakozi/spark SPARK-22516
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/19906.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #19906
----
commit 8bc6a9ce9f6eeb854261d26dabaf04052eb8b5b2
Author: smurakozi <sm...@gmail.com>
Date: 2017-11-27T08:30:25Z
[SPARK-22516][SQL] Bump up Univocity version to 2.5.9
----
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #19906: [SPARK-22516][SQL] Bump up Univocity version to 2...
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/19906#discussion_r155184221
--- Diff: sql/core/src/test/resources/test-data/comments.csv ---
@@ -4,3 +4,4 @@
6,7,8,9,0,2015-08-21 16:58:01
~0,9,8,7,6,2015-08-22 17:59:02
1,2,3,4,5,2015-08-23 18:00:42
+~ comment in last line to test SPARK-22516 - do not add empty line at the end of this file!
--- End diff --
nice
---
[GitHub] spark issue #19906: [SPARK-22516][SQL] Bump up Univocity version to 2.5.9
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19906
**[Test build #84539 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84539/testReport)** for PR 19906 at commit [`8bc6a9c`](https://github.com/apache/spark/commit/8bc6a9ce9f6eeb854261d26dabaf04052eb8b5b2).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark issue #19906: [SPARK-22516][SQL] Bump up Univocity version to 2.5.9
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/19906
retest this please
---
[GitHub] spark issue #19906: [SPARK-22516][SQL] Bump up Univocity version to 2.5.9
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/19906
ok to test
---
[GitHub] spark issue #19906: [SPARK-22516][SQL] Bump up Univocity version to 2.5.9
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19906
**[Test build #84549 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84549/testReport)** for PR 19906 at commit [`8bc6a9c`](https://github.com/apache/spark/commit/8bc6a9ce9f6eeb854261d26dabaf04052eb8b5b2).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark pull request #19906: [SPARK-22516][SQL] Bump up Univocity version to 2...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/19906
---
[GitHub] spark issue #19906: [SPARK-22516][SQL] Bump up Univocity version to 2.5.9
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19906
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84539/
Test FAILed.
---
[GitHub] spark issue #19906: [SPARK-22516][SQL] Bump up Univocity version to 2.5.9
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19906
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84549/
Test PASSed.
---
[GitHub] spark issue #19906: [SPARK-22516][SQL] Bump up Univocity version to 2.5.9
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19906
Can one of the admins verify this patch?
---
[GitHub] spark issue #19906: [SPARK-22516][SQL] Bump up Univocity version to 2.5.9
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/19906
ok to test
---
[GitHub] spark issue #19906: [SPARK-22516][SQL] Bump up Univocity version to 2.5.9
Posted by smurakozi <gi...@git.apache.org>.
Github user smurakozi commented on the issue:
https://github.com/apache/spark/pull/19906
Thanks for your help and reviews @HyukjinKwon, @vanzin
---
[GitHub] spark issue #19906: [SPARK-22516][SQL] Bump up Univocity version to 2.5.9
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19906
**[Test build #84539 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84539/testReport)** for PR 19906 at commit [`8bc6a9c`](https://github.com/apache/spark/commit/8bc6a9ce9f6eeb854261d26dabaf04052eb8b5b2).
---
[GitHub] spark issue #19906: [SPARK-22516][SQL] Bump up Univocity version to 2.5.9
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19906
Merged build finished. Test PASSed.
---
[GitHub] spark issue #19906: [SPARK-22516][SQL] Bump up Univocity version to 2.5.9
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19906
**[Test build #84549 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84549/testReport)** for PR 19906 at commit [`8bc6a9c`](https://github.com/apache/spark/commit/8bc6a9ce9f6eeb854261d26dabaf04052eb8b5b2).
---
[GitHub] spark issue #19906: [SPARK-22516][SQL] Bump up Univocity version to 2.5.9
Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:
https://github.com/apache/spark/pull/19906
LGTM, merging to master.
---
[GitHub] spark issue #19906: [SPARK-22516][SQL] Bump up Univocity version to 2.5.9
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19906
Merged build finished. Test FAILed.
---