Posted to hdfs-user@hadoop.apache.org by Brian Jeltema <br...@digitalenvoy.net> on 2014/09/12 17:56:23 UTC

long startup time for MR job

Running a Hadoop 2.4/HBase 0.98 MR job on a 12-node cluster, I’m seeing a long startup delay (about 2.5 minutes):

14/09/12 11:46:05 INFO client.RMProxy: Connecting to ResourceManager at prod-hdfs-14.hdfs.digitalenvoy.net/192.168.25.14:8050
14/09/12 11:48:31 INFO mapreduce.JobSubmitter: number of splits:650
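
The gap between those two log lines can be measured directly. Below is a minimal Python sketch, assuming the timestamp layout is the yy/MM/dd HH:mm:ss prefix seen in the two lines above; the helper names (parse_log_time, startup_delay_seconds) are made up for illustration and are not part of Hadoop.

```python
from datetime import datetime

# Assumed timestamp layout, taken from the two log lines quoted above.
LOG_TIME_FORMAT = "%y/%m/%d %H:%M:%S"


def parse_log_time(line):
    """Parse the timestamp at the start of a client log line."""
    # "14/09/12 11:46:05" is 17 characters long.
    return datetime.strptime(line[:17], LOG_TIME_FORMAT)


def startup_delay_seconds(first_line, second_line):
    """Seconds elapsed between two log lines (hypothetical helper)."""
    delta = parse_log_time(second_line) - parse_log_time(first_line)
    return delta.total_seconds()


connect = ("14/09/12 11:46:05 INFO client.RMProxy: Connecting to ResourceManager "
           "at prod-hdfs-14.hdfs.digitalenvoy.net/192.168.25.14:8050")
splits = "14/09/12 11:48:31 INFO mapreduce.JobSubmitter: number of splits:650"

print(startup_delay_seconds(connect, splits))  # 146.0 seconds, roughly 2.5 minutes
```

This only times the interval between the ResourceManager connection and the split count being logged; it does not say which step inside that interval is slow.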

This seems like a long time. Is it due to the overhead of moving all of the JAR files into place, or is there
other overhead involved? I’m using the -libjars option with a list of JAR files that is automatically generated by
a home-grown tool and is not optimized. I’m wondering if I need to make the tool smarter.

Thanks
Brian