Posted to mapreduce-dev@hadoop.apache.org by Kay Ousterhout <ke...@eecs.berkeley.edu> on 2013/05/29 05:00:29 UTC

5 second minimum shuffle time

Hi,

I'm running Hadoop 0.23 on a large cluster, and have found that the shuffle
time for reduce tasks is always at least 5 seconds, even when the amount of
data read by the reduce task is tiny (e.g., just 18 bytes).  This floor on
shuffle time suggests that some heartbeat or polling interval has to elapse
before the shuffle begins, but I can't find any sign of such a delay in the
code base.  Can anyone shed some light on why this is occurring?

Thanks,
Kay