Posted to mapreduce-user@hadoop.apache.org by WANG Shicai <Ev...@yahoo.cn> on 2010/06/11 04:48:08 UTC

Which parameters can jobs use differently in the same cluster?

Hi,

I have read the "Shuffle and Sort Configuration Tuning" section of "Hadoop: The Definitive Guide", which suggests that each job in the same cluster can use different values for the parameters below without restarting the cluster. However, one of my colleagues told me otherwise. For various reasons I have no Linux cluster at hand, so I cannot test this myself. I would like to know whether it is possible to use different values for the parameters below in different jobs without restarting the cluster. A simple example follows to illustrate what I mean.

If it is possible, which parameters can be set differently per job? All of the parameters below? Are there any others? Thank you!

e.g. I start a Hadoop cluster normally and submit Job A with "io.sort.mb" set to 100, "io.sort.record.percent" set to 0.05, etc. Before Job A finishes, I want to submit Job B to the same cluster with "io.sort.mb" set to 120, "io.sort.record.percent" set to 0.08, etc.
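As a sketch of what I have in mind (assuming a driver class that goes through ToolRunner/GenericOptionsParser, so that -D key=value pairs on the command line become per-job configuration; the jar name "myjob.jar", the class name "MyDriver", and the paths are hypothetical):

```shell
# Job A: submitted with its own shuffle/sort settings
hadoop jar myjob.jar MyDriver \
    -D io.sort.mb=100 \
    -D io.sort.record.percent=0.05 \
    input/ outputA/

# Job B: submitted while Job A is still running, with different settings
hadoop jar myjob.jar MyDriver \
    -D io.sort.mb=120 \
    -D io.sort.record.percent=0.08 \
    input/ outputB/
```

The question, in other words, is whether both jobs would honor their own -D values, or whether such parameters are fixed cluster-wide until a restart.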

parameters:
io.sort.mb
io.sort.record.percent
io.sort.spill.percent
io.sort.factor
min.num.spills.for.combine
mapred.compress.map.output
mapred.map.output.compression.codec
mapred.reduce.parallel.copies
mapred.reduce.copy.backoff
mapred.job.shuffle.input.buffer.percent
mapred.job.shuffle.merge.percent
mapred.inmem.merge.threshold
mapred.job.reduce.input.buffer.percent

Best regards,

Evan
