You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/02/13 11:29:31 UTC

[GitHub] [spark] HyukjinKwon opened a new pull request #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API

HyukjinKwon opened a new pull request #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API
URL: https://github.com/apache/spark/pull/27561
 
 
   ### What changes were proposed in this pull request?
   
   This PR fixes `DataFrameReader.csv(dataset: Dataset[String])` API to take a `Dataset[String]` originated from a column name different from `value`.
   
   `CSVUtils.filterCommentAndEmpty` assumed the `Dataset[String]` to be originated with `value` column. This PR changes to use the first column name in the schema.
   
   ### Why are the changes needed?
   
   For  `DataFrameReader.csv(dataset: Dataset[String])` to support any `Dataset[String]` as the signature indicates.
   
   ### Does this PR introduce any user-facing change?
   Yes,
   
   ```scala
   val ds = spark.range(2).selectExpr("concat('a,b,', id) AS text").as[String]
   spark.read.option("header", true).option("inferSchema", true).csv(ds).show()
   ```
   
   Before:
   
   ```
   org.apache.spark.sql.AnalysisException: cannot resolve '`value`' given input columns: [text];;
   'Filter (length(trim('value, None)) > 0)
   +- Project [concat(a,b,, cast(id#0L as string)) AS text#2]
      +- Range (0, 2, step=1, splits=Some(2))
   ```
   
   After:
   
   ```
   +---+---+---+
   |  a|  b|  0|
   +---+---+---+
   |  a|  b|  1|
   +---+---+---+
   ```
   
   
   ### How was this patch tested?
   
   Unittest was added.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API
URL: https://github.com/apache/spark/pull/27561#issuecomment-585860075
 
 
   Merged build finished. Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan closed pull request #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API

Posted by GitBox <gi...@apache.org>.
cloud-fan closed pull request #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API
URL: https://github.com/apache/spark/pull/27561
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API
URL: https://github.com/apache/spark/pull/27561#issuecomment-586135212
 
 
   Merged build finished. Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API
URL: https://github.com/apache/spark/pull/27561#issuecomment-586072585
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23143/
   Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API
URL: https://github.com/apache/spark/pull/27561#issuecomment-586204573
 
 
   Thanks!

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API
URL: https://github.com/apache/spark/pull/27561#issuecomment-586071737
 
 
   **[Test build #118386 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118386/testReport)** for PR 27561 at commit [`4e9ddf1`](https://github.com/apache/spark/commit/4e9ddf1ecc82a9d43d50c261dadf2be2d2b60395).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API
URL: https://github.com/apache/spark/pull/27561#issuecomment-586135217
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118386/
   Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API
URL: https://github.com/apache/spark/pull/27561#issuecomment-585693755
 
 
   **[Test build #118356 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118356/testReport)** for PR 27561 at commit [`adb9b22`](https://github.com/apache/spark/commit/adb9b2251faa4e847dbdc954b5f8f04b84d8c8aa).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API
URL: https://github.com/apache/spark/pull/27561#issuecomment-585695812
 
 
   Merged build finished. Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API
URL: https://github.com/apache/spark/pull/27561#issuecomment-585856921
 
 
   **[Test build #118356 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118356/testReport)** for PR 27561 at commit [`adb9b22`](https://github.com/apache/spark/commit/adb9b2251faa4e847dbdc954b5f8f04b84d8c8aa).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API
URL: https://github.com/apache/spark/pull/27561#issuecomment-585695870
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23113/
   Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API
URL: https://github.com/apache/spark/pull/27561#issuecomment-586072585
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23143/
   Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API
URL: https://github.com/apache/spark/pull/27561#issuecomment-585695812
 
 
   Merged build finished. Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API
URL: https://github.com/apache/spark/pull/27561#issuecomment-585860084
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118356/
   Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API
URL: https://github.com/apache/spark/pull/27561#issuecomment-585695870
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23113/
   Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API
URL: https://github.com/apache/spark/pull/27561#issuecomment-586135212
 
 
   Merged build finished. Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API
URL: https://github.com/apache/spark/pull/27561#issuecomment-586135217
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118386/
   Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API
URL: https://github.com/apache/spark/pull/27561#issuecomment-586072578
 
 
   Merged build finished. Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API
URL: https://github.com/apache/spark/pull/27561#issuecomment-585693755
 
 
   **[Test build #118356 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118356/testReport)** for PR 27561 at commit [`adb9b22`](https://github.com/apache/spark/commit/adb9b2251faa4e847dbdc954b5f8f04b84d8c8aa).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API
URL: https://github.com/apache/spark/pull/27561#issuecomment-585860075
 
 
   Merged build finished. Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API
URL: https://github.com/apache/spark/pull/27561#issuecomment-585860084
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118356/
   Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a change in pull request #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API
URL: https://github.com/apache/spark/pull/27561#discussion_r379225857
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVUtils.scala
 ##########
 @@ -33,11 +33,12 @@ object CSVUtils {
     // with the one below, `filterCommentAndEmpty` but execution path is different. One of them
     // might have to be removed in the near future if possible.
     import lines.sqlContext.implicits._
-    val nonEmptyLines = lines.filter(length(trim($"value")) > 0)
 
 Review comment:
   @MaxGekk and @cloud-fan, I came up with a better idea to avoid relying on string format in `col`. Can you take a look again? I think this way is safer.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API
URL: https://github.com/apache/spark/pull/27561#issuecomment-586071737
 
 
   **[Test build #118386 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118386/testReport)** for PR 27561 at commit [`4e9ddf1`](https://github.com/apache/spark/commit/4e9ddf1ecc82a9d43d50c261dadf2be2d2b60395).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API
URL: https://github.com/apache/spark/pull/27561#issuecomment-586072578
 
 
   Merged build finished. Test PASSed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API
URL: https://github.com/apache/spark/pull/27561#issuecomment-586134493
 
 
   **[Test build #118386 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118386/testReport)** for PR 27561 at commit [`4e9ddf1`](https://github.com/apache/spark/commit/4e9ddf1ecc82a9d43d50c261dadf2be2d2b60395).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on issue #27561: [SPARK-30810][SQL] Parses and convert a CSV Dataset having different column from 'value' in csv(dataset) API
URL: https://github.com/apache/spark/pull/27561#issuecomment-586196442
 
 
   LGTM, merging to master/3.0!

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org