Posted to common-user@hadoop.apache.org by Nick Jones <ni...@amd.com> on 2010/03/10 15:07:21 UTC
Uneven DBInputFormat Splits
Hi all,
I've set up a job that pulls, say, 250 records from MySQL and splits them
across several mappers. Each mapper (with the exception of attempt_*_0)
gets roughly 250/(n mappers) records. However, attempt 0 always ends up
with ~5x the workload of the others. Is there something I'm missing, or
is this normal?
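
For reference, here is a minimal sketch of the even split I would expect. It
assumes the rows are divided by row count into contiguous chunks, with any
remainder going to the last chunk. The class and method names are hypothetical
and purely illustrative; this is not Hadoop's actual DBInputFormat code.

```java
import java.util.Arrays;

public class EvenSplitSketch {
    // Hypothetical helper (not a Hadoop API): divides totalRows into
    // numSplits chunks by integer division, then gives the leftover
    // rows to the last chunk.
    static long[] splitSizes(long totalRows, int numSplits) {
        long chunk = totalRows / numSplits;
        long[] sizes = new long[numSplits];
        Arrays.fill(sizes, chunk);
        sizes[numSplits - 1] += totalRows % numSplits;
        return sizes;
    }

    public static void main(String[] args) {
        // 250 rows over 6 mappers, matching the numbers in the question.
        System.out.println(Arrays.toString(splitSizes(250, 6)));
    }
}
```

Under that assumption, 250 rows over 6 mappers gives five splits of 41 rows
and one of 45, so no single attempt should see anything close to 5x the rest.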
Thanks
Nick Jones