Posted to mapreduce-user@hadoop.apache.org by Hrishikesh Gadre <ga...@gmail.com> on 2010/11/03 00:26:41 UTC

map-reduce across data centers

Hello everyone,

I am curious to know whether anyone has tried running map-reduce across
multiple data centers. The use case I have in mind is one where the dataset
is geographically distributed across multiple data centers and it may not be
cost effective to move the data to a single site (e.g. due to limited
network bandwidth between sites). How is such a scenario handled today?

As per my understanding, there is a feature request filed against HDFS to
allow it to be distributed across data centers (e.g. for disaster recovery).
For details, please refer to the following link:
https://issues.apache.org/jira/browse/HDFS-1432

Can anyone share any thoughts regarding pros and cons of this approach?

Thanks
Hrishikesh

Hadoop Job Configuration XMLs

Posted by Deepika Khera <De...@avg.com>.
I am using hadoop v0.20.2.

Is there any configuration that I could set so that the job tracker cleans up the job configuration XML files in the hadoop logs directory, or do we need to delete them manually every time?

Deepika