Posted to issues@spark.apache.org by "Sree Vaddi (JIRA)" <ji...@apache.org> on 2015/04/12 16:55:12 UTC

[jira] [Commented] (SPARK-6151) schemaRDD to parquetfile with saveAsParquetFile control the HDFS block size

    [ https://issues.apache.org/jira/browse/SPARK-6151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14491502#comment-14491502 ] 

Sree Vaddi commented on SPARK-6151:
-----------------------------------

[~cnstar9988]
The HDFS block size is set once, when Hadoop is first installed.
It is possible to change the default block size (dfs.blocksize in hdfs-site.xml) and restart Hadoop for the change to take effect; read up on the implications and be comfortable before making this change.
After that, saveAsParquetFile() will use the new HDFS block size.
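
For reference, a minimal sketch in Scala of what this looks like from the Spark side. The 256 MB size and the input/output paths are hypothetical; since dfs.blocksize is read by the HDFS client at write time, it can also be overridden per job through the SparkContext's Hadoop configuration, without touching the cluster-wide default:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    val sc = new SparkContext(new SparkConf().setAppName("parquet-block-size"))
    val sqlContext = new SQLContext(sc)

    // dfs.blocksize is read by the HDFS client when a file is created,
    // so setting it here only affects files written by this job.
    sc.hadoopConfiguration.setLong("dfs.blocksize", 256L * 1024 * 1024)
    // Parquet row-group size; commonly set to match the HDFS block size
    // so each row group stays within a single block.
    sc.hadoopConfiguration.setLong("parquet.block.size", 256L * 1024 * 1024)

    // Hypothetical input path, for illustration only.
    val schemaRDD = sqlContext.jsonFile("hdfs:///tmp/input.json")
    schemaRDD.saveAsParquetFile("hdfs:///tmp/output.parquet")

Keeping row groups aligned with HDFS blocks is the usual recommendation for readers like Impala, which is what the linked threads below are about.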


> schemaRDD to parquetfile with saveAsParquetFile control the HDFS block size
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-6151
>                 URL: https://issues.apache.org/jira/browse/SPARK-6151
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 1.2.1
>            Reporter: Littlestar
>            Priority: Trivial
>
> How can a SchemaRDD saved to a Parquet file with saveAsParquetFile() control the HDFS block size? A configuration option may be needed.
> Related questions asked by others:
> http://apache-spark-user-list.1001560.n3.nabble.com/HDFS-block-size-for-parquet-output-tt21183.html
> http://qnalist.com/questions/5054892/spark-sql-parquet-and-impala


