Posted to dev@spark.apache.org by "Ulanov, Alexander" <al...@hp.com> on 2015/06/23 03:44:29 UTC
Force Spark to save Parquet files with a replication factor other than 3 (the default)
Hi,
My Hadoop is configured with a replication factor of 2. I've added $HADOOP_HOME/config to the PATH, as suggested in http://apache-spark-user-list.1001560.n3.nabble.com/hdfs-replication-on-saving-RDD-td289.html. With that in place, Spark (1.4) runs rdd.saveAsTextFile with replication = 2. However, DataFrame.saveAsParquet still writes with replication = 3. How can I force a Spark DataFrame to save Parquet files with a replication factor other than 3 (the default)?
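One override that might apply here, sketched under assumptions and not verified against the Parquet write path in 1.4: set dfs.replication directly on the SparkContext's Hadoop configuration before writing. The object name, sample data, and output path below are hypothetical placeholders, and whether the Parquet writer picks up this setting is exactly the open question.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical test app: the name, data, and output path are placeholders.
object ParquetReplicationSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("parquet-replication-sketch")
    val sc = new SparkContext(conf)

    // Override the HDFS client-side replication factor for this application.
    // Assumption: the Parquet writer uses this Hadoop configuration.
    sc.hadoopConfiguration.set("dfs.replication", "2")

    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // Small throwaway DataFrame, used only to produce some Parquet output.
    val df = sc.parallelize(1 to 100).map(i => (i, i.toString)).toDF("id", "value")

    // Spark 1.4 DataFrameWriter API; the path is a placeholder.
    df.write.parquet("hdfs:///tmp/replication_test.parquet")

    sc.stop()
  }
}
```

The replication factor actually applied can then be checked with hdfs dfs -ls on the output directory, which shows it in the second column for each file.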
Best regards, Alexander