Posted to reviews@spark.apache.org by smurakozi <gi...@git.apache.org> on 2017/12/06 08:17:56 UTC

[GitHub] spark pull request #19906: [SPARK-22516][SQL] Bump up Univocity version to 2...

GitHub user smurakozi opened a pull request:

    https://github.com/apache/spark/pull/19906

    [SPARK-22516][SQL] Bump up Univocity version to 2.5.9

    ## What changes were proposed in this pull request?
    
    A bug in the Univocity parser causes the issue reported in SPARK-22516. Upgrading the library from version 2.5.4 to 2.5.9 fixes it:
    
    **Executing**
    ```
    spark.read.option("header","true").option("inferSchema", "true").option("multiLine", "true").option("comment", "g").csv("test_file_without_eof_char.csv").show()
    ```
    **Before**
    ```
    ERROR Executor: Exception in task 0.0 in stage 6.0 (TID 6)
    com.univocity.parsers.common.TextParsingException: java.lang.IllegalArgumentException - Unable to skip 1 lines from line 2. End of input reached
    ...
    Internal state when error was thrown: line=3, column=0, record=2, charIndex=31
    	at com.univocity.parsers.common.AbstractParser.handleException(AbstractParser.java:339)
    	at com.univocity.parsers.common.AbstractParser.parseNext(AbstractParser.java:475)
    	at org.apache.spark.sql.execution.datasources.csv.UnivocityParser$$anon$1.next(UnivocityParser.scala:281)
    	at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
    ```
    **After**
    ```
    +-------+-------+
    |column1|column2|
    +-------+-------+
    |    abc|    def|
    +-------+-------+
    ```
    
    ## How was this patch tested?
    The existing `CSVSuite.commented lines in CSV data` test was extended to also parse the file in multiline mode. The test input file was modified to include a comment in its last line.
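
The condition the extended test guards against (a comment on the last line with no newline after it) can be reproduced with a small helper. The following is a minimal Python sketch and not part of the patch: the helper name `write_without_trailing_newline` and the sample rows are hypothetical, and the comment character `g` is taken from the `option("comment", "g")` call in the example above.

```python
import tempfile
from pathlib import Path


def write_without_trailing_newline(path, lines):
    # Join the lines with a single "\n" and write raw bytes, so that no
    # newline follows the last line and no platform newline translation occurs.
    data = "\n".join(lines).encode("utf-8")
    Path(path).write_bytes(data)
    return data


# Hypothetical sample rows: a header, one record, and a final comment line
# that starts with the comment character "g".
sample_lines = ["column1,column2", "abc,def", "g this last line is a comment"]

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = Path(tmp_dir) / "test_file_without_eof_char.csv"
    written = write_without_trailing_newline(csv_path, sample_lines)
    read_back = csv_path.read_bytes()
```

A file produced this way could then be passed to the `spark.read ... .csv(...)` call shown under **Executing** to check the behavior against a given Univocity version.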

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/smurakozi/spark SPARK-22516

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19906.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19906
    
----
commit 8bc6a9ce9f6eeb854261d26dabaf04052eb8b5b2
Author: smurakozi <sm...@gmail.com>
Date:   2017-11-27T08:30:25Z

    [SPARK-22516][SQL] Bump up Univocity version to 2.5.9

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19906: [SPARK-22516][SQL] Bump up Univocity version to 2...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19906#discussion_r155184221
  
    --- Diff: sql/core/src/test/resources/test-data/comments.csv ---
    @@ -4,3 +4,4 @@
     6,7,8,9,0,2015-08-21 16:58:01
     ~0,9,8,7,6,2015-08-22 17:59:02
     1,2,3,4,5,2015-08-23 18:00:42
    +~ comment in last line to test SPARK-22516 - do not add empty line at the end of this file!
    --- End diff --
    
    nice




[GitHub] spark issue #19906: [SPARK-22516][SQL] Bump up Univocity version to 2.5.9

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19906
  
    **[Test build #84539 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84539/testReport)** for PR 19906 at commit [`8bc6a9c`](https://github.com/apache/spark/commit/8bc6a9ce9f6eeb854261d26dabaf04052eb8b5b2).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #19906: [SPARK-22516][SQL] Bump up Univocity version to 2.5.9

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/19906
  
    retest this please




[GitHub] spark issue #19906: [SPARK-22516][SQL] Bump up Univocity version to 2.5.9

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/19906
  
    ok to test




[GitHub] spark issue #19906: [SPARK-22516][SQL] Bump up Univocity version to 2.5.9

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19906
  
    **[Test build #84549 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84549/testReport)** for PR 19906 at commit [`8bc6a9c`](https://github.com/apache/spark/commit/8bc6a9ce9f6eeb854261d26dabaf04052eb8b5b2).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request #19906: [SPARK-22516][SQL] Bump up Univocity version to 2...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/19906




[GitHub] spark issue #19906: [SPARK-22516][SQL] Bump up Univocity version to 2.5.9

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19906
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84539/
    Test FAILed.




[GitHub] spark issue #19906: [SPARK-22516][SQL] Bump up Univocity version to 2.5.9

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19906
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84549/
    Test PASSed.




[GitHub] spark issue #19906: [SPARK-22516][SQL] Bump up Univocity version to 2.5.9

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19906
  
    Can one of the admins verify this patch?




[GitHub] spark issue #19906: [SPARK-22516][SQL] Bump up Univocity version to 2.5.9

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/19906
  
    ok to test




[GitHub] spark issue #19906: [SPARK-22516][SQL] Bump up Univocity version to 2.5.9

Posted by smurakozi <gi...@git.apache.org>.
Github user smurakozi commented on the issue:

    https://github.com/apache/spark/pull/19906
  
    Thanks for your help and reviews @HyukjinKwon, @vanzin 




[GitHub] spark issue #19906: [SPARK-22516][SQL] Bump up Univocity version to 2.5.9

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19906
  
    **[Test build #84539 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84539/testReport)** for PR 19906 at commit [`8bc6a9c`](https://github.com/apache/spark/commit/8bc6a9ce9f6eeb854261d26dabaf04052eb8b5b2).




[GitHub] spark issue #19906: [SPARK-22516][SQL] Bump up Univocity version to 2.5.9

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19906
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #19906: [SPARK-22516][SQL] Bump up Univocity version to 2.5.9

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19906
  
    **[Test build #84549 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84549/testReport)** for PR 19906 at commit [`8bc6a9c`](https://github.com/apache/spark/commit/8bc6a9ce9f6eeb854261d26dabaf04052eb8b5b2).




[GitHub] spark issue #19906: [SPARK-22516][SQL] Bump up Univocity version to 2.5.9

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/19906
  
    LGTM, merging to master.




[GitHub] spark issue #19906: [SPARK-22516][SQL] Bump up Univocity version to 2.5.9

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19906
  
    Merged build finished. Test FAILed.

