Posted to common-user@hadoop.apache.org by Vasilis Liaskovitis <vl...@gmail.com> on 2009/09/27 03:37:34 UTC

default job scheduler behaviour

Hi,

Given a single cluster running with the default job scheduler: does only
one job execute on the cluster at a time, regardless of how many
map/reduce task slots it can keep busy?
In other words, if a job does not use all of the task slots, would the
default scheduler consider scheduling map/reduce tasks from other jobs
that have already been submitted to the system?

I am using an 8-node cluster to run some test jobs based on gridmix
(the synthetic benchmark found in the hadoop distribution under
src/benchmarks/gridmix). The gridmix workload submits many different
jobs in parallel - 5 different kinds of jobs, with small, medium, and
large sizes of each kind. While it runs, I notice that at any time only
one job is making progress - at least according to the jobtracker web
ui. This seems to happen even for the small jobs, which don't take up
all the slots of the cluster's tasktrackers/nodes.

If the default scheduler is not capable of scheduling tasks from
multiple jobs concurrently, would I have to use the capacity
scheduler? Or something else?
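
In case it matters, here is what I assume switching schedulers would
look like in mapred-site.xml - property and class names taken from the
capacity scheduler docs for the 0.19/0.20 line, and I assume the
capacity-scheduler jar also has to be on the jobtracker's classpath:

```xml
<!-- mapred-site.xml: point the jobtracker at the capacity scheduler.
     Assumes the capacity-scheduler contrib jar is on the classpath. -->
<property>
  <name>mapred.jobtracker.taskScheduler</name>
  <value>org.apache.hadoop.mapred.CapacityTaskScheduler</value>
</property>
```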

thanks for any help,

- Vasilis