Posted to reviews@spark.apache.org by mikkokupsu <gi...@git.apache.org> on 2017/05/13 06:49:07 UTC

[GitHub] spark pull request #17973: [SPARK-20731][SQL] Add ability to change or omit ....

GitHub user mikkokupsu opened a pull request:

    https://github.com/apache/spark/pull/17973

    [SPARK-20731][SQL] Add ability to change or omit .csv file extension in CSV Data Source

    ## What changes were proposed in this pull request?
    
    Add a new option to the CSV Data Source that allows changing the hard-coded ".csv" file extension to something else, or omitting it.
    https://issues.apache.org/jira/browse/SPARK-20731
    
    ## How was this patch tested?
    
    Unit tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mikkokupsu/spark master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/17973.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #17973
    
----
commit 6dacbb5f4725cef4b26f0569ec527b5e0df8f897
Author: Mikko Kupsu <mi...@gmail.com>
Date:   2017-04-11T18:51:09Z

    Merge remote-tracking branch 'upstream/master'

commit df82d510f140651e5b1c6f8101a66f84c2575b56
Author: Mikko Kupsu <mi...@gmail.com>
Date:   2017-04-11T19:02:35Z

    Add option to specify custom file extension with csv

commit 3839cf440097dc560a8a55f0874703e75441edf8
Author: Mikko Kupsu <mi...@gmail.com>
Date:   2017-04-18T19:47:38Z

    Merge remote-tracking branch 'upstream/master'

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17973: [SPARK-20731][SQL] Add ability to change or omit .csv fi...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/17973
  
    ok to test




[GitHub] spark pull request #17973: [SPARK-20731][SQL] Add ability to change or omit ...

Posted by mikkokupsu <gi...@git.apache.org>.
Github user mikkokupsu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17973#discussion_r116353919
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala ---
    @@ -622,6 +622,31 @@ class CSVSuite extends QueryTest with SharedSQLContext with SQLTestUtils {
         }
       }
     
    +  test("save tsv with tsv suffix") {
    +    withTempDir { dir =>
    +      val csvDir = new File(dir, "csv").getCanonicalPath
    +      val cars = spark.read
    +        .format("csv")
    +        .option("header", "true")
    +        .load(testFile(carsFile))
    +
    +      cars.coalesce(1).write
    +        .option("header", "true")
    +        .option("fileExtension", ".tsv")
    +        .option("delimiter", "\t")
    --- End diff --
    
    Sorry, I am having a really hard time understanding what you are asking here. Could you be more specific?




[GitHub] spark pull request #17973: [SPARK-20731][SQL] Add ability to change or omit ...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17973#discussion_r116358020
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala ---
    @@ -622,6 +622,31 @@ class CSVSuite extends QueryTest with SharedSQLContext with SQLTestUtils {
         }
       }
     
    +  test("save tsv with tsv suffix") {
    +    withTempDir { dir =>
    +      val csvDir = new File(dir, "csv").getCanonicalPath
    +      val cars = spark.read
    +        .format("csv")
    +        .option("header", "true")
    +        .load(testFile(carsFile))
    +
    +      cars.coalesce(1).write
    +        .option("header", "true")
    +        .option("fileExtension", ".tsv")
    +        .option("delimiter", "\t")
    --- End diff --
    
    I would suggest leaving this out unless there is a better reason for it. The downside is that it appears to allow an arbitrary name, and it does not guarantee that the extension is, say, tsv when the delimiter is a tab. It is purely up to the user.
    
    I added those extensions long ago, and one of the motivations was auto-detection of the data source, as Hadoop does (which we ended up not adding yet due to the cost of listing files, etc.).




[GitHub] spark pull request #17973: [SPARK-20731][SQL] Add ability to change or omit ...

Posted by mikkokupsu <gi...@git.apache.org>.
Github user mikkokupsu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17973#discussion_r116354610
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala ---
    @@ -622,6 +622,31 @@ class CSVSuite extends QueryTest with SharedSQLContext with SQLTestUtils {
         }
       }
     
    +  test("save tsv with tsv suffix") {
    +    withTempDir { dir =>
    +      val csvDir = new File(dir, "csv").getCanonicalPath
    +      val cars = spark.read
    +        .format("csv")
    +        .option("header", "true")
    +        .load(testFile(carsFile))
    +
    +      cars.coalesce(1).write
    +        .option("header", "true")
    +        .option("fileExtension", ".tsv")
    +        .option("delimiter", "\t")
    --- End diff --
    
    @srowen and @gatorsmile, the reason why I pushed this pull request is actually to omit the file extension completely. I guess we can discuss the semantics of different delimiters and file formats but the whole point of this pull request was to give users the option to change a hard coded value.




[GitHub] spark issue #17973: [SPARK-20731][SQL] Add ability to change or omit .csv fi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17973
  
    Can one of the admins verify this patch?




[GitHub] spark pull request #17973: [SPARK-20731][SQL] Add ability to change or omit ...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17973#discussion_r116408851
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala ---
    @@ -622,6 +622,31 @@ class CSVSuite extends QueryTest with SharedSQLContext with SQLTestUtils {
         }
       }
     
    +  test("save tsv with tsv suffix") {
    +    withTempDir { dir =>
    +      val csvDir = new File(dir, "csv").getCanonicalPath
    +      val cars = spark.read
    +        .format("csv")
    +        .option("header", "true")
    +        .load(testFile(carsFile))
    +
    +      cars.coalesce(1).write
    +        .option("header", "true")
    +        .option("fileExtension", ".tsv")
    +        .option("delimiter", "\t")
    --- End diff --
    
    Just curious: what is the reason you need to omit the extension?




[GitHub] spark pull request #17973: [SPARK-20731][SQL] Add ability to change or omit ...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/17973




[GitHub] spark issue #17973: [SPARK-20731][SQL] Add ability to change or omit .csv fi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17973
  
    Can one of the admins verify this patch?




[GitHub] spark pull request #17973: [SPARK-20731][SQL] Add ability to change or omit ...

Posted by mikkokupsu <gi...@git.apache.org>.
Github user mikkokupsu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17973#discussion_r116359395
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala ---
    @@ -622,6 +622,31 @@ class CSVSuite extends QueryTest with SharedSQLContext with SQLTestUtils {
         }
       }
     
    +  test("save tsv with tsv suffix") {
    +    withTempDir { dir =>
    +      val csvDir = new File(dir, "csv").getCanonicalPath
    +      val cars = spark.read
    +        .format("csv")
    +        .option("header", "true")
    +        .load(testFile(carsFile))
    +
    +      cars.coalesce(1).write
    +        .option("header", "true")
    +        .option("fileExtension", ".tsv")
    +        .option("delimiter", "\t")
    --- End diff --
    
    Hi @dongjoon-hyun,
    Yes, the original goal was to remove the file extension, but I decided to let the user decide.




[GitHub] spark issue #17973: [SPARK-20731][SQL] Add ability to change or omit .csv fi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17973
  
    Can one of the admins verify this patch?




[GitHub] spark issue #17973: [SPARK-20731][SQL] Add ability to change or omit .csv fi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17973
  
    **[Test build #76896 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76896/testReport)** for PR 17973 at commit [`3839cf4`](https://github.com/apache/spark/commit/3839cf440097dc560a8a55f0874703e75441edf8).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `abstract class SerializationStream extends Closeable `
      * `abstract class DeserializationStream extends Closeable `
      * `case class UnresolvedMapObjects(`
      * `case class OrderedJoin(`
      * `case class JoinGraphInfo (starJoins: Set[Int], nonStarJoins: Set[Int])`
      * `case class NumericRange(min: Decimal, max: Decimal) extends Range `




[GitHub] spark pull request #17973: [SPARK-20731][SQL] Add ability to change or omit ...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17973#discussion_r116359183
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala ---
    @@ -622,6 +622,31 @@ class CSVSuite extends QueryTest with SharedSQLContext with SQLTestUtils {
         }
       }
     
    +  test("save tsv with tsv suffix") {
    +    withTempDir { dir =>
    +      val csvDir = new File(dir, "csv").getCanonicalPath
    +      val cars = spark.read
    +        .format("csv")
    +        .option("header", "true")
    +        .load(testFile(carsFile))
    +
    +      cars.coalesce(1).write
    +        .option("header", "true")
    +        .option("fileExtension", ".tsv")
    +        .option("delimiter", "\t")
    --- End diff --
    
    Hi, @mikkokupsu 
    Is the original goal to support the many existing files (without the `.csv` extension)?




[GitHub] spark issue #17973: [SPARK-20731][SQL] Add ability to change or omit .csv fi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17973
  
    Can one of the admins verify this patch?




[GitHub] spark pull request #17973: [SPARK-20731][SQL] Add ability to change or omit ...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17973#discussion_r116353035
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala ---
    @@ -622,6 +622,31 @@ class CSVSuite extends QueryTest with SharedSQLContext with SQLTestUtils {
         }
       }
     
    +  test("save tsv with tsv suffix") {
    +    withTempDir { dir =>
    +      val csvDir = new File(dir, "csv").getCanonicalPath
    +      val cars = spark.read
    +        .format("csv")
    +        .option("header", "true")
    +        .load(testFile(carsFile))
    +
    +      cars.coalesce(1).write
    +        .option("header", "true")
    +        .option("fileExtension", ".tsv")
    +        .option("delimiter", "\t")
    --- End diff --
    
    When `delimiter` is set to `\t`, is it still a CSV? : )
    
    Ref: https://en.wikipedia.org/wiki/Tab-separated_values




[GitHub] spark issue #17973: [SPARK-20731][SQL] Add ability to change or omit .csv fi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17973
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76896/
    Test PASSed.




[GitHub] spark pull request #17973: [SPARK-20731][SQL] Add ability to change or omit ...

Posted by mikkokupsu <gi...@git.apache.org>.
Github user mikkokupsu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17973#discussion_r116548858
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala ---
    @@ -622,6 +622,31 @@ class CSVSuite extends QueryTest with SharedSQLContext with SQLTestUtils {
         }
       }
     
    +  test("save tsv with tsv suffix") {
    +    withTempDir { dir =>
    +      val csvDir = new File(dir, "csv").getCanonicalPath
    +      val cars = spark.read
    +        .format("csv")
    +        .option("header", "true")
    +        .load(testFile(carsFile))
    +
    +      cars.coalesce(1).write
    +        .option("header", "true")
    +        .option("fileExtension", ".tsv")
    +        .option("delimiter", "\t")
    --- End diff --
    
    @gatorsmile, we provide data files to our clients and specify the file format as TAB-separated. I want to avoid any confusion where someone receives just the dataset and assumes the data is COMMA-separated.




[GitHub] spark pull request #17973: [SPARK-20731][SQL] Add ability to change or omit ...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17973#discussion_r116370085
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala ---
    @@ -622,6 +622,31 @@ class CSVSuite extends QueryTest with SharedSQLContext with SQLTestUtils {
         }
       }
     
    +  test("save tsv with tsv suffix") {
    +    withTempDir { dir =>
    +      val csvDir = new File(dir, "csv").getCanonicalPath
    +      val cars = spark.read
    +        .format("csv")
    +        .option("header", "true")
    +        .load(testFile(carsFile))
    +
    +      cars.coalesce(1).write
    +        .option("header", "true")
    +        .option("fileExtension", ".tsv")
    +        .option("delimiter", "\t")
    --- End diff --
    
    What is the reason why Hive introduced the conf `hive.output.file.extension`?




[GitHub] spark pull request #17973: [SPARK-20731][SQL] Add ability to change or omit ...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17973#discussion_r116354253
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala ---
    @@ -622,6 +622,31 @@ class CSVSuite extends QueryTest with SharedSQLContext with SQLTestUtils {
         }
       }
     
    +  test("save tsv with tsv suffix") {
    +    withTempDir { dir =>
    +      val csvDir = new File(dir, "csv").getCanonicalPath
    +      val cars = spark.read
    +        .format("csv")
    +        .option("header", "true")
    +        .load(testFile(carsFile))
    +
    +      cars.coalesce(1).write
    +        .option("header", "true")
    +        .option("fileExtension", ".tsv")
    +        .option("delimiter", "\t")
    --- End diff --
    
    I guess the whole API and implementation is still called "CSV". Of course, you can already read/write files with different names and different delimiters. Does this matter enough to warrant a new option? What if I delimit with, eh, null characters?
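
Editorial note: one way to get differently named output files without a new writer option is to rename the part files once the write has finished. The following is only a minimal sketch under several assumptions that are not part of this thread: the output directory is on the local filesystem, data files have names starting with `part-`, the output is uncompressed (so names end in `.csv`), and the helper names `swapExtension` and `renamePartFiles` are hypothetical. Nothing here is taken from the patch under review.

```scala
import java.io.File
import java.nio.file.{Files, Path, StandardCopyOption}

object RenamePartFiles {

  /** Replace a trailing `from` suffix with `to`; names without that suffix are returned unchanged. */
  def swapExtension(name: String, from: String, to: String): String =
    if (name.endsWith(from)) name.stripSuffix(from) + to else name

  /**
   * Rename every regular file in `dir` whose name starts with "part-" and ends with `from`
   * so that it ends with `to` instead. Pass an empty `to` to drop the extension entirely.
   * Marker files such as `_SUCCESS` and hidden checksum files are left untouched.
   */
  def renamePartFiles(dir: Path, from: String, to: String): Seq[Path] = {
    // listFiles() returns null when `dir` is not a directory, hence the Option wrapper.
    val entries = Option(dir.toFile.listFiles()).getOrElse(Array.empty[File])
    entries.toList
      .filter(f => f.isFile && f.getName.startsWith("part-") && f.getName.endsWith(from))
      .map { f =>
        val source = f.toPath
        val target = source.resolveSibling(swapExtension(f.getName, from, to))
        Files.move(source, target, StandardCopyOption.REPLACE_EXISTING)
      }
  }
}
```

For the tab-separated use case discussed in this thread, a call such as `RenamePartFiles.renamePartFiles(outputDir, ".csv", ".tsv")` would run after the write completes, and `renamePartFiles(outputDir, ".csv", "")` would omit the extension. Output on HDFS or an object store would need the corresponding filesystem API instead of `java.nio`.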




[GitHub] spark issue #17973: [SPARK-20731][SQL] Add ability to change or omit .csv fi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17973
  
    **[Test build #76896 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76896/testReport)** for PR 17973 at commit [`3839cf4`](https://github.com/apache/spark/commit/3839cf440097dc560a8a55f0874703e75441edf8).




[GitHub] spark issue #17973: [SPARK-20731][SQL] Add ability to change or omit .csv fi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17973
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #17973: [SPARK-20731][SQL] Add ability to change or omit .csv fi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17973
  
    Can one of the admins verify this patch?




[GitHub] spark pull request #17973: [SPARK-20731][SQL] Add ability to change or omit ...

Posted by mikkokupsu <gi...@git.apache.org>.
Github user mikkokupsu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17973#discussion_r116388332
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala ---
    @@ -622,6 +622,31 @@ class CSVSuite extends QueryTest with SharedSQLContext with SQLTestUtils {
         }
       }
     
    +  test("save tsv with tsv suffix") {
    +    withTempDir { dir =>
    +      val csvDir = new File(dir, "csv").getCanonicalPath
    +      val cars = spark.read
    +        .format("csv")
    +        .option("header", "true")
    +        .load(testFile(carsFile))
    +
    +      cars.coalesce(1).write
    +        .option("header", "true")
    +        .option("fileExtension", ".tsv")
    +        .option("delimiter", "\t")
    --- End diff --
    
    @gatorsmile, Yes, my use case is to omit the extension, but I decided to make the implementation flexible i.e. `.option("fileExtension", "")`




[GitHub] spark pull request #17973: [SPARK-20731][SQL] Add ability to change or omit ...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17973#discussion_r116372498
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala ---
    @@ -622,6 +622,31 @@ class CSVSuite extends QueryTest with SharedSQLContext with SQLTestUtils {
         }
       }
     
    +  test("save tsv with tsv suffix") {
    +    withTempDir { dir =>
    +      val csvDir = new File(dir, "csv").getCanonicalPath
    +      val cars = spark.read
    +        .format("csv")
    +        .option("header", "true")
    +        .load(testFile(carsFile))
    +
    +      cars.coalesce(1).write
    +        .option("header", "true")
    +        .option("fileExtension", ".tsv")
    +        .option("delimiter", "\t")
    --- End diff --
    
    Also, what is your usage scenario? It sounds like you want to omit the extension?




[GitHub] spark issue #17973: [SPARK-20731][SQL] Add ability to change or omit .csv fi...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/17973
  
    @mikkokupsu We do not plan to support this after an offline discussion with the other Committers. Thanks for your contribution. Could you close this PR please?

