Posted to user@spark.apache.org by valgrind_girl <12...@qq.com> on 2014/07/15 05:06:55 UTC

Re: hdfs replication on saving RDD

I am eager to know about this issue too. Does anyone know how to do it?



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/hdfs-replication-on-saving-RDD-tp289p9700.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: hdfs replication on saving RDD

Posted by Kan Zhang <kz...@apache.org>.

Andrew, there are overloaded versions of saveAsHadoopFile or
saveAsNewAPIHadoopFile that allow you to pass in a per-job Hadoop conf.
saveAsTextFile is just a convenience wrapper on top of saveAsHadoopFile.
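
A minimal sketch of the per-job approach described above, not a tested recipe. It assumes the setting in question is the standard HDFS client property dfs.replication; the object name, the helper jobConfWithReplication, the output path, and the local master are hypothetical and only there for illustration.

```scala
import org.apache.hadoop.io.{NullWritable, Text}
import org.apache.hadoop.mapred.{JobConf, TextOutputFormat}
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._ // pair-RDD implicits; only needed on older Spark releases

object PerJobReplication {
  // Copies the shared Hadoop conf and overrides the replication factor on the copy only.
  // "dfs.replication" is assumed to be the property the thread is asking about.
  def jobConfWithReplication(sc: SparkContext, replication: Int): JobConf = {
    val jobConf = new JobConf(sc.hadoopConfiguration)
    jobConf.setInt("dfs.replication", replication)
    jobConf
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical output path; pass a real one as the first argument.
    val outputPath = args.headOption.getOrElse("hdfs:///tmp/per-job-replication-demo")
    // The local master is only for trying the sketch out.
    val conf = new SparkConf().setAppName("per-job-replication").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Same key/value shape that saveAsTextFile writes: (NullWritable, Text).
      val pairs = sc.parallelize(Seq("a", "b", "c"))
        .map(line => (NullWritable.get(), new Text(line)))
      pairs.saveAsHadoopFile(
        outputPath,
        classOf[NullWritable],
        classOf[Text],
        classOf[TextOutputFormat[NullWritable, Text]],
        jobConfWithReplication(sc, 2)) // per-job conf, used for this save only
    } finally {
      sc.stop()
    }
  }
}
```

Because the JobConf is a copy, other jobs on the same SparkContext should keep whatever replication the shared configuration specifies.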


On Mon, Jul 14, 2014 at 11:22 PM, Andrew Ash <an...@andrewash.com> wrote:

> In general it would be nice to be able to configure replication on a
> per-job basis.  Is there a way to do that without changing the config
> values in the Hadoop conf/ directory between jobs?  Maybe by modifying
> OutputFormats or the JobConf?
>
>
> On Mon, Jul 14, 2014 at 11:12 PM, Matei Zaharia <ma...@gmail.com>
> wrote:
>
>> You can change this setting through SparkContext.hadoopConfiguration, or
>> put the conf/ directory of your Hadoop installation on the CLASSPATH when
>> you launch your app so that it reads the config values from there.
>>
>> Matei
>>
>> On Jul 14, 2014, at 8:06 PM, valgrind_girl <12...@qq.com> wrote:
>>
>> > I am eager to know about this issue too. Does anyone know how to do it?
>>
>>
>

Re: hdfs replication on saving RDD

Posted by Andrew Ash <an...@andrewash.com>.

In general it would be nice to be able to configure replication on a
per-job basis.  Is there a way to do that without changing the config
values in the Hadoop conf/ directory between jobs?  Maybe by modifying
OutputFormats or the JobConf?

On Mon, Jul 14, 2014 at 11:12 PM, Matei Zaharia <ma...@gmail.com>
wrote:

> You can change this setting through SparkContext.hadoopConfiguration, or
> put the conf/ directory of your Hadoop installation on the CLASSPATH when
> you launch your app so that it reads the config values from there.
>
> Matei
>
> On Jul 14, 2014, at 8:06 PM, valgrind_girl <12...@qq.com> wrote:
>
> > I am eager to know about this issue too. Does anyone know how to do it?
>
>

Re: hdfs replication on saving RDD

Posted by Matei Zaharia <ma...@gmail.com>.

You can change this setting through SparkContext.hadoopConfiguration, or put the conf/ directory of your Hadoop installation on the CLASSPATH when you launch your app so that it reads the config values from there.

Matei
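
A minimal sketch of the first option (SparkContext.hadoopConfiguration), assuming the setting being discussed is the standard HDFS client property dfs.replication. The object name, the helper setDefaultReplication, the output path, and the local master are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object ReplicationViaContext {
  // Sets the replication factor on the Hadoop conf shared by this SparkContext.
  // "dfs.replication" is assumed to be the property the thread is asking about.
  def setDefaultReplication(sc: SparkContext, replication: Int): Unit = {
    sc.hadoopConfiguration.setInt("dfs.replication", replication)
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical output path; pass a real one as the first argument.
    val outputPath = args.headOption.getOrElse("hdfs:///tmp/replication-demo")
    // The local master is only for trying the sketch out.
    val conf = new SparkConf().setAppName("replication-demo").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      setDefaultReplication(sc, 2)
      // Saves issued after this point build their JobConf from the shared conf.
      sc.parallelize(1 to 100).map(_.toString).saveAsTextFile(outputPath)
    } finally {
      sc.stop()
    }
  }
}
```

Note that this changes the shared configuration, so it applies to every later save on the same SparkContext rather than to a single job.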

On Jul 14, 2014, at 8:06 PM, valgrind_girl <12...@qq.com> wrote:

> I am eager to know about this issue too. Does anyone know how to do it?