Posted to issues@spark.apache.org by "Steve Loughran (JIRA)" <ji...@apache.org> on 2016/11/07 11:02:58 UTC

[jira] [Commented] (SPARK-18017) Changing Hadoop parameter through sparkSession.sparkContext.hadoopConfiguration doesn't work

    [ https://issues.apache.org/jira/browse/SPARK-18017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15643864#comment-15643864 ] 

Steve Loughran commented on SPARK-18017:
----------------------------------------

You can check what's been picked up by grabbing a copy of the filesystem instance and then logging the value returned by {{getDefaultBlockSize()}}.
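
For example, a rough sketch of that check, assuming a {{SparkSession}} named {{spark}} and a hypothetical bucket/path:

{code:scala}
import org.apache.hadoop.fs.{FileSystem, Path}

val hadoopConf = spark.sparkContext.hadoopConfiguration
val path = new Path("s3a://my-bucket/data/")        // hypothetical bucket and path
val fs = FileSystem.get(path.toUri, hadoopConf)     // grab the FS instance actually used

// log the block size the filesystem will report for this path
println(s"default block size: ${fs.getDefaultBlockSize(path)} bytes")
{code}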

If you switch to S3A, which you should, calling {{toString()}} on the FS instance is generally enough to dump the block size and plenty of other useful information. Its relevant property is {{fs.s3a.block.size}}.
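
Something like the following sketch (the 128 MB value and bucket name are just illustrative; note that FS instances are cached, so set the property before the filesystem is first created):

{code:scala}
import java.net.URI
import org.apache.hadoop.fs.FileSystem

val hadoopConf = spark.sparkContext.hadoopConfiguration
hadoopConf.set("fs.s3a.block.size", (128 * 1024 * 1024).toString)  // illustrative value

val fs = FileSystem.get(new URI("s3a://my-bucket/"), hadoopConf)    // hypothetical bucket
println(fs)  // S3A's toString() reports the block size and other settings it picked up
{code}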



> Changing Hadoop parameter through sparkSession.sparkContext.hadoopConfiguration doesn't work
> --------------------------------------------------------------------------------------------
>
>                 Key: SPARK-18017
>                 URL: https://issues.apache.org/jira/browse/SPARK-18017
>             Project: Spark
>          Issue Type: Bug
>    Affects Versions: 2.0.0
>         Environment: Scala version 2.11.8; Java 1.8.0_91; com.databricks:spark-csv_2.11:1.2.0
>            Reporter: Yuehua Zhang
>
> My Spark job tries to read CSV files on S3. I need to control the number of partitions created, so I set the Hadoop parameter fs.s3n.block.size. However, it stopped working after we upgraded Spark from 1.6.1 to 2.0.0. Not sure if it is related to https://issues.apache.org/jira/browse/SPARK-15991. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org