Posted to reviews@spark.apache.org by HyukjinKwon <gi...@git.apache.org> on 2016/01/18 11:39:55 UTC

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

GitHub user HyukjinKwon opened a pull request:

    https://github.com/apache/spark/pull/10805

    [SPARK-12871][SQL] Support to specify the option for compression codec.

    https://issues.apache.org/jira/browse/SPARK-12871
    This PR adds an option for specifying the compression codec.
    It adds the option `codec`, with `compression` as an alias, as filed in [SPARK-12668](https://issues.apache.org/jira/browse/SPARK-12668).
    
    Note that I did not add configurations for Hadoop 1.x, as `CSVRelation` uses the Hadoop 2.x API.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HyukjinKwon/spark SPARK-12420

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10805.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #10805
    
----
commit 06774ad552f8cc39a437a1caa1ef2aceb174b0de
Author: hyukjinkwon <gu...@gmail.com>
Date:   2016-01-18T09:53:35Z

    Support for compress option

commit 5e1611d74f3ccc33705c3aa134f41382e2150508
Author: hyukjinkwon <gu...@gmail.com>
Date:   2016-01-18T10:26:01Z

    Correct Scala style and add an alias for codec as compression.

commit d154f025ccfb46c809598e1d688754e1e9603f0e
Author: hyukjinkwon <gu...@gmail.com>
Date:   2016-01-18T10:29:31Z

    Move back some codes changed unintentionally.

commit 5b57fc246b11f90f082104dc04e53f194123ab35
Author: hyukjinkwon <gu...@gmail.com>
Date:   2016-01-18T10:33:46Z

    Remove Hadoop 1.x configurations

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the pull request:

    https://github.com/apache/spark/pull/10805#issuecomment-172766510
  
    The supported short names for compression codecs are listed below (case-insensitive):
    
    `bzip2` -> `org.apache.hadoop.io.compress.BZip2Codec`
    `gzip` -> `org.apache.hadoop.io.compress.GzipCodec`
    `lz4` -> `org.apache.hadoop.io.compress.Lz4Codec`
    `snappy` -> `org.apache.hadoop.io.compress.SnappyCodec`
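
    A minimal sketch of how such a case-insensitive lookup could work, assuming a plain Scala `Map`. The names `ShortCodecNames` and `resolve` are hypothetical and not taken from the PR:

    ```scala
    import java.util.Locale

    // Hypothetical helper, not the PR's code: maps the short names listed
    // above to Hadoop codec class names, ignoring the case of the input.
    object ShortCodecNames {
      private val shortNames: Map[String, String] = Map(
        "bzip2" -> "org.apache.hadoop.io.compress.BZip2Codec",
        "gzip" -> "org.apache.hadoop.io.compress.GzipCodec",
        "lz4" -> "org.apache.hadoop.io.compress.Lz4Codec",
        "snappy" -> "org.apache.hadoop.io.compress.SnappyCodec")

      // Lower-case the input before the lookup. Anything that is not a
      // known short name is passed through unchanged, since it may already
      // be a fully qualified class name.
      def resolve(name: String): String =
        shortNames.getOrElse(name.toLowerCase(Locale.ROOT), name)
    }
    ```

    With this shape, `ShortCodecNames.resolve("GZIP")` and `ShortCodecNames.resolve("gzip")` give the same class name, while an unknown string is returned as-is.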




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/10805#issuecomment-172944028
  
    Oh one thing: this doesn't support reading with compression yet, does it? 




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10805#discussion_r50081670
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVParameters.scala ---
    @@ -44,6 +46,13 @@ private[sql] case class CSVParameters(@transient parameters: Map[String, String]
         }
       }
     
    +  // Available compression codec list
    +  val shortCompressionCodecNames = Map(
    --- End diff --
    
    this should go into the object rather than in the case class




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10805#discussion_r50085424
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVParameters.scala ---
    @@ -107,3 +117,28 @@ private[csv] object ParseModes {
         true // We default to permissive is the mode string is not valid
       }
     }
    +
    +private[csv] object CSVCompressionCodecs {
    +  val shortCompressionCodecNames = Map(
    +    "bzip2" -> classOf[BZip2Codec].getName,
    +    "gzip" -> classOf[GzipCodec].getName,
    +    "lz4" -> classOf[Lz4Codec].getName,
    +    "snappy" -> classOf[SnappyCodec].getName)
    +
    +  /**
    +   * Return the full version of the given codec class.
    +   * If it is already a class name, just return it.
    +   */
    +  def getCodecClassName(name: String): String = {
    +    val codecName = shortCompressionCodecNames.getOrElse(name.toLowerCase, name)
    +    val codecClassName = try {
    +      // Validate the codec name
    +      Utils.classForName(codecName)
    +      Some(codecName)
    +    } catch {
    +      case e: ClassNotFoundException => None
    +    }
    +    codecClassName.getOrElse(throw new IllegalArgumentException(s"Codec [$codecName] " +
    +      s"is not available. Available codecs are ${shortCompressionCodecNames.keys.mkString(",")}."))
    --- End diff --
    
    add a space after the comma
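
    To illustrate the difference in the rendered message, a small sketch. `SeparatorDemo` and `knownCodecs` are hypothetical; a `Seq` stands in for the map keys so the order is fixed:

    ```scala
    object SeparatorDemo {
      // Hypothetical stand-in for shortCompressionCodecNames.keys
      val knownCodecs: Seq[String] = Seq("bzip2", "gzip", "lz4", "snappy")

      // Without a space the names run together in the error message.
      val cramped: String = knownCodecs.mkString(",")

      // With ", " as the separator they read as a normal list.
      val spaced: String = knownCodecs.mkString(", ")
    }
    ```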




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10805#issuecomment-173065443
  
    **[Test build #49750 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49750/consoleFull)** for PR 10805 at commit [`cd9f742`](https://github.com/apache/spark/commit/cd9f7429cfb2de4f0e82cc5134fe221fb135c01f).




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10805#discussion_r50085474
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVParameters.scala ---
    @@ -107,3 +117,28 @@ private[csv] object ParseModes {
         true // We default to permissive is the mode string is not valid
       }
     }
    +
    +private[csv] object CSVCompressionCodecs {
    +  val shortCompressionCodecNames = Map(
    --- End diff --
    
    private?




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10805#discussion_r50085643
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVParameters.scala ---
    @@ -73,6 +76,14 @@ private[sql] case class CSVParameters(@transient parameters: Map[String, String]
     
       val nullValue = parameters.getOrElse("nullValue", "")
     
    +  val compressionCodecName =
    --- End diff --
    
    don't expose this variable. e.g. you can do
    ```scala
    val compressionCodec: Option[String] = {
      val name = parameters.get("compression").orElse(parameters.get("codec"))
      name.map(CSVCompressionCodecs.getCodecClassName)
    }
    ```




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10805#issuecomment-172709720
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49643/
    Test PASSed.




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10805#issuecomment-172813603
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49676/
    Test PASSed.




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by aarondav <gi...@git.apache.org>.
Github user aarondav commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10805#discussion_r50206079
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVParameters.scala ---
    @@ -107,3 +114,28 @@ private[csv] object ParseModes {
         true // We default to permissive is the mode string is not valid
       }
     }
    +
    +private[csv] object CSVCompressionCodecs {
    +  private val shortCompressionCodecNames = Map(
    +    "bzip2" -> classOf[BZip2Codec].getName,
    +    "gzip" -> classOf[GzipCodec].getName,
    +    "lz4" -> classOf[Lz4Codec].getName,
    +    "snappy" -> classOf[SnappyCodec].getName)
    +
    +  /**
    +   * Return the full version of the given codec class.
    +   * If it is already a class name, just return it.
    +   */
    +  def getCodecClassName(name: String): String = {
    +    val codecName = shortCompressionCodecNames.getOrElse(name.toLowerCase, name)
    +    val codecClassName = try {
    +      // Validate the codec name
    +      Utils.classForName(codecName)
    +      Some(codecName)
    +    } catch {
    +      case e: ClassNotFoundException => None
    +    }
    +    codecClassName.getOrElse(throw new IllegalArgumentException(s"Codec [$codecName] " +
    +      s"is not available. Available codecs are ${shortCompressionCodecNames.keys.mkString(", ")}."))
    --- End diff --
    
    Probably change Available -> Known (since there may be more available)




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10805#issuecomment-172790523
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10805#issuecomment-172789851
  
    **[Test build #49671 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49671/consoleFull)** for PR 10805 at commit [`adb9eb2`](https://github.com/apache/spark/commit/adb9eb22a256895ad4bad11893222c485c7afa37).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10805#issuecomment-172709347
  
    **[Test build #49643 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49643/consoleFull)** for PR 10805 at commit [`e7ebddd`](https://github.com/apache/spark/commit/e7ebddd68c3b772b7476432b4e1ba30d6ba2eb22).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10805#issuecomment-172780088
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the pull request:

    https://github.com/apache/spark/pull/10805#issuecomment-172747397
  
    I will resolve conflicts and update this soon.




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10805#issuecomment-172780089
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49674/
    Test FAILed.




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10805#issuecomment-172787690
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49678/
    Test FAILed.




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the pull request:

    https://github.com/apache/spark/pull/10805#issuecomment-172777734
  
    Although `CSVCompressionCodecs` could be shared with the JSON data source, I will make that change in a separate PR for JSON.




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10805#discussion_r50085687
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVRelation.scala ---
    @@ -99,6 +100,15 @@ private[csv] class CSVRelation(
       }
     
       override def prepareJobForWrite(job: Job): OutputWriterFactory = {
    +    val conf = job.getConfiguration
    +    Option(params.compressionCodec).foreach { codec =>
    --- End diff --
    
    as suggested above, I'd make "compressionCodec" itself an Option, rather than nullable.
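
    A rough sketch of that shape, with a `mutable.Map` standing in for the Hadoop `Configuration` so the snippet has no dependencies. `WriteConf` and `configureCompression` are hypothetical names, and the two property keys are the usual Hadoop 2.x output-compression settings, assumed here rather than quoted from this diff:

    ```scala
    import scala.collection.mutable

    object WriteConf {
      // Hadoop 2.x output-compression keys (an assumption: the diff above
      // does not show which keys the PR actually sets).
      val CompressKey = "mapreduce.output.fileoutputformat.compress"
      val CodecKey = "mapreduce.output.fileoutputformat.compress.codec"

      // With an Option-typed parameter there is no null to wrap:
      // None leaves the configuration untouched, Some(codec) enables
      // compression and records the codec class name.
      def configureCompression(
          conf: mutable.Map[String, String],
          compressionCodec: Option[String]): Unit = {
        compressionCodec.foreach { codec =>
          conf(CompressKey) = "true"
          conf(CodecKey) = codec
        }
      }
    }
    ```

    The caller then never needs `Option(params.compressionCodec)`; the absence of a codec is already encoded in the type.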





[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/10805#issuecomment-173068930
  
    LGTM




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10805#discussion_r50074461
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVParameters.scala ---
    @@ -71,6 +71,8 @@ private[sql] case class CSVParameters(parameters: Map[String, String]) extends L
     
       val nullValue = parameters.getOrElse("nullValue", "")
     
    +  val codec = parameters.getOrElse("compression", parameters.getOrElse("codec", null))
    --- End diff --
    
    the other thing is that i'd create short-form names for the common options, e.g. "gzip" should become GzipCodec. You'd need to look into what the commonly supported formats are and come up with their short names. We should also make sure this is case insensitive.





[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10805#discussion_r50081902
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVParameters.scala ---
    @@ -73,6 +82,12 @@ private[sql] case class CSVParameters(@transient parameters: Map[String, String]
     
       val nullValue = parameters.getOrElse("nullValue", "")
     
    +  val compressionCodec: String = {
    --- End diff --
    
    maybe we should do some data validation here, i.e. throwing exceptions if class not found.
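
    One way this validation could look, sketched with plain `Class.forName` rather than Spark's `Utils.classForName` so that it runs without Spark on the classpath. `CodecValidation` and `validate` are hypothetical names:

    ```scala
    import scala.util.{Failure, Success, Try}

    object CodecValidation {
      // Returns the class name unchanged if the class can be loaded;
      // otherwise fails fast with an IllegalArgumentException instead of
      // letting the error surface later, when the write job runs.
      def validate(className: String): String =
        Try(Class.forName(className)) match {
          case Success(_) => className
          case Failure(_: ClassNotFoundException) =>
            throw new IllegalArgumentException(
              s"Codec [$className] is not available.")
          case Failure(other) => throw other
        }
    }
    ```

    Any class on the classpath passes, for example `CodecValidation.validate("java.lang.String")`, while a misspelled codec class name is rejected when the options are parsed.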




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10805#issuecomment-172691164
  
    **[Test build #49643 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49643/consoleFull)** for PR 10805 at commit [`e7ebddd`](https://github.com/apache/spark/commit/e7ebddd68c3b772b7476432b4e1ba30d6ba2eb22).




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10805#discussion_r50053254
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVParameters.scala ---
    @@ -71,6 +71,8 @@ private[sql] case class CSVParameters(parameters: Map[String, String]) extends L
     
       val nullValue = parameters.getOrElse("nullValue", "")
     
    +  val codec = parameters.getOrElse("codec", parameters.getOrElse("compression", null))
    --- End diff --
    
    I will correct this to look up `compression` first.
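
    The intended precedence could be expressed like this; a sketch over a plain `Map`, with `CodecLookup` and `codecOption` as hypothetical names. It returns an `Option` instead of the `null` default used in the diff line above:

    ```scala
    object CodecLookup {
      // "compression" wins when both keys are present; "codec" is only
      // consulted as a fallback. None means no codec was requested.
      def codecOption(parameters: Map[String, String]): Option[String] =
        parameters.get("compression").orElse(parameters.get("codec"))
    }
    ```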




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10805#issuecomment-172813154
  
    **[Test build #49676 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49676/consoleFull)** for PR 10805 at commit [`6400b76`](https://github.com/apache/spark/commit/6400b767bd3ad6235b3b9f2291e4135a09f5d7ae).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10805#issuecomment-172816804
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49679/
    Test PASSed.




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10805#issuecomment-172787688
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10805#issuecomment-172785578
  
    **[Test build #49679 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49679/consoleFull)** for PR 10805 at commit [`0245eea`](https://github.com/apache/spark/commit/0245eea2508f9cde30b91e2619d79dbafa18f845).




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the pull request:

    https://github.com/apache/spark/pull/10805#issuecomment-173016289
  
    Oh yes, it does. Actually, I am reading compressed files in the test I added [here](https://github.com/HyukjinKwon/spark/blob/SPARK-12420/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala#L361-L373).
    
    As you know, it recognises the compression codec by file extension, so if you meant manually setting the compression codec for reading, it does not support that.




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by aarondav <gi...@git.apache.org>.
Github user aarondav commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10805#discussion_r50206066
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVParameters.scala ---
    @@ -107,3 +114,28 @@ private[csv] object ParseModes {
         true // We default to permissive if the mode string is not valid
       }
     }
    +
    +private[csv] object CSVCompressionCodecs {
    +  private val shortCompressionCodecNames = Map(
    +    "bzip2" -> classOf[BZip2Codec].getName,
    +    "gzip" -> classOf[GzipCodec].getName,
    +    "lz4" -> classOf[Lz4Codec].getName,
    +    "snappy" -> classOf[SnappyCodec].getName)
    +
    +  /**
    +   * Return the full version of the given codec class.
    +   * If it is already a class name, just return it.
    +   */
    +  def getCodecClassName(name: String): String = {
    +    val codecName = shortCompressionCodecNames.getOrElse(name.toLowerCase, name)
    +    val codecClassName = try {
    +      // Validate the codec name
    +      Utils.classForName(codecName)
    +      Some(codecName)
    +    } catch {
    +      case e: ClassNotFoundException => None
    +    }
    +    codecClassName.getOrElse(throw new IllegalArgumentException(s"Codec [$codecName] " +
    --- End diff --
    
    This should just put the throw inside the catch block and not bother with the Option stuff.
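
A minimal sketch of what this suggestion could look like, not the code that was eventually merged. To keep it self-contained it makes a few assumptions: the object name `CompressionCodecsSketch` is hypothetical, the Hadoop codec classes are referenced by their class-name strings instead of `classOf[...]`, plain `Class.forName` stands in for Spark's `Utils.classForName`, and the wording of the error message is made up because the quoted diff is cut off at that point.

```scala
object CompressionCodecsSketch {
  // Short names mapped to Hadoop codec class names. Strings are used here
  // (an assumption of this sketch) so it compiles without Hadoop on the classpath.
  private val shortCompressionCodecNames = Map(
    "bzip2" -> "org.apache.hadoop.io.compress.BZip2Codec",
    "gzip" -> "org.apache.hadoop.io.compress.GzipCodec",
    "lz4" -> "org.apache.hadoop.io.compress.Lz4Codec",
    "snappy" -> "org.apache.hadoop.io.compress.SnappyCodec")

  /**
   * Return the full class name for the given codec name.
   * If it is already a class name, just return it.
   */
  def getCodecClassName(name: String): String = {
    val codecName = shortCompressionCodecNames.getOrElse(name.toLowerCase, name)
    try {
      // Validate the codec name by trying to load the class.
      Class.forName(codecName)
      codecName
    } catch {
      case e: ClassNotFoundException =>
        // Throw directly from the catch block; no intermediate Option is needed.
        // The message text is illustrative only.
        throw new IllegalArgumentException(s"Codec [$codecName] is not available.", e)
    }
  }
}
```

Throwing from the catch block keeps the happy path and the failure path next to each other, and passing `e` as the cause preserves the original stack trace.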




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/10805#issuecomment-172745372
  
    Yup we are dropping Hadoop 1.x support, so it is OK to have it only for Hadoop 2.x.





[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10805#discussion_r50081941
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVParameters.scala ---
    @@ -73,6 +82,12 @@ private[sql] case class CSVParameters(@transient parameters: Map[String, String]
     
       val nullValue = parameters.getOrElse("nullValue", "")
     
    +  val compressionCodec: String = {
    +    val maybeCodecName =
    +      Option(parameters.getOrElse("compression", parameters.getOrElse("codec", null)))
    +    maybeCodecName.map(_.toLowerCase).map(shortCompressionCodecNames).orNull
    --- End diff --
    
    This logic is becoming confusing with so many levels of nesting. We should rewrite it (even with more LOC) to make it more readable.
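
One way the lookup could be flattened, offered as a sketch under stated assumptions and not as the version that was merged. The object name `CompressionOptionSketch` and the method form are hypothetical, and the codec classes are written as strings so the example has no Hadoop dependency. As in the quoted diff, `compression` takes precedence over `codec`, the result is `null` when neither is set, and an unknown short name fails in the map lookup.

```scala
object CompressionOptionSketch {
  // Same short names as in the diff, with class names as strings (assumption).
  private val shortCompressionCodecNames = Map(
    "bzip2" -> "org.apache.hadoop.io.compress.BZip2Codec",
    "gzip" -> "org.apache.hadoop.io.compress.GzipCodec",
    "lz4" -> "org.apache.hadoop.io.compress.Lz4Codec",
    "snappy" -> "org.apache.hadoop.io.compress.SnappyCodec")

  /** Returns the codec class name, or null when no compression option is given. */
  def compressionCodec(parameters: Map[String, String]): String = {
    // Step 1: pick the option value; "compression" wins over its alias "codec".
    val maybeCodecName: Option[String] =
      parameters.get("compression").orElse(parameters.get("codec"))

    // Step 2: translate the short name into a class name.
    maybeCodecName match {
      case Some(name) => shortCompressionCodecNames(name.toLowerCase)
      case None => null
    }
  }
}
```

Splitting the option lookup from the name translation costs a few more lines, but each step can be read on its own.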




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10805#issuecomment-173080189
  
    **[Test build #49750 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49750/consoleFull)** for PR 10805 at commit [`cd9f742`](https://github.com/apache/spark/commit/cd9f7429cfb2de4f0e82cc5134fe221fb135c01f).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10805#discussion_r50074385
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVParameters.scala ---
    @@ -71,6 +71,8 @@ private[sql] case class CSVParameters(parameters: Map[String, String]) extends L
     
       val nullValue = parameters.getOrElse("nullValue", "")
     
    +  val codec = parameters.getOrElse("compression", parameters.getOrElse("codec", null))
    --- End diff --
    
    For this one, I'd name it `compression` or `compressionCodec` internally, since codec can mean a lot of different things.




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10805#issuecomment-172517812
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49592/




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10805#issuecomment-173080338
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10805#issuecomment-172816803
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/10805#issuecomment-173085873
  
    I've merged this in master. Thanks.





[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the pull request:

    https://github.com/apache/spark/pull/10805#issuecomment-173022112
  
    I see. I will try to figure this out anyway. I somehow feel this might be a bit too much, as almost all files would have proper extensions, and I think (almost) the only exception might be files initially uploaded by users to a file system.
    
    Maybe I am missing something though. I don't think users would give the files wrong extensions but then set the compression codec for reading properly, in particular when they use HDFS, because AFAIK Hadoop supports reading compressed files by extension. I feel like they would have to give proper extensions.




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10805#issuecomment-172816508
  
    **[Test build #49679 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49679/consoleFull)** for PR 10805 at commit [`0245eea`](https://github.com/apache/spark/commit/0245eea2508f9cde30b91e2619d79dbafa18f845).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10805#issuecomment-172790525
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49671/




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10805#issuecomment-172496706
  
    **[Test build #49592 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49592/consoleFull)** for PR 10805 at commit [`5b57fc2`](https://github.com/apache/spark/commit/5b57fc246b11f90f082104dc04e53f194123ab35).




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10805#issuecomment-172517811
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10805#issuecomment-172517601
  
    **[Test build #49592 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49592/consoleFull)** for PR 10805 at commit [`5b57fc2`](https://github.com/apache/spark/commit/5b57fc246b11f90f082104dc04e53f194123ab35).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/10805




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10805#issuecomment-172813600
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/10805#issuecomment-173017720
  
    Yeah, I'm thinking we should also support specifying the option for reading, with "auto" as the default, which decides based on file extensions.
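
A rough sketch of how such an "auto" default might behave; nothing in this thread confirms that it was implemented this way. The object `ReadCodecSketch`, the method `resolve`, and the idea of returning a short codec name are all hypothetical. The extension table assumes the default extensions of the Hadoop codecs named in the diff (`.gz`, `.bz2`, `.lz4`, `.snappy`).

```scala
object ReadCodecSketch {
  // Assumed mapping from file extension to the short codec names used in this PR.
  private val codecByExtension = Map(
    ".gz" -> "gzip",
    ".bz2" -> "bzip2",
    ".lz4" -> "lz4",
    ".snappy" -> "snappy")

  /**
   * Resolve the codec to use when reading `path`.
   * "auto" infers the codec from the file extension (None means uncompressed);
   * any other value is taken as an explicit choice by the user.
   */
  def resolve(option: String, path: String): Option[String] =
    option.toLowerCase match {
      case "auto" =>
        codecByExtension.collectFirst {
          case (extension, codec) if path.endsWith(extension) => codec
        }
      case explicit => Some(explicit)
    }
}
```

With this shape, a file with a missing or misleading extension can still be read by passing the codec explicitly, while the common case needs no option at all.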





[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10805#issuecomment-172768286
  
    **[Test build #49671 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49671/consoleFull)** for PR 10805 at commit [`adb9eb2`](https://github.com/apache/spark/commit/adb9eb22a256895ad4bad11893222c485c7afa37).




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10805#issuecomment-173080339
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49750/




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10805#issuecomment-172780590
  
    **[Test build #49676 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49676/consoleFull)** for PR 10805 at commit [`6400b76`](https://github.com/apache/spark/commit/6400b767bd3ad6235b3b9f2291e4135a09f5d7ae).




[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10805#issuecomment-172709719
  
    Merged build finished. Test PASSed.

