Posted to hdfs-user@hadoop.apache.org by Shay Rojansky <ro...@roji.org> on 2014/09/07 17:20:32 UTC

Fair scheduler queue and preemption question

Hi everyone. I'm pretty much a Hadoop newbie and want to make sure I
understand things correctly.

I set up my Hadoop cluster with the fair scheduler and 3 queues, each with
the same weight. The goal is for 3 users to get the same share of the
cluster.

Preemption is enabled, but I'm seeing some non-intuitive behavior. If one
user submits a job A that takes up the entire cluster, and then another
user submits job B, 33% of job A's containers are preempted and their
capacity is transferred to job B. I had expected 50%, since the two
*active* queues have the same weight.
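
To make the numbers concrete, here is a small Python sketch of the
arithmetic. It is purely illustrative and not Hadoop code; the function
`weighted_share` and the queue names are made up. Dividing the cluster by
weight over all three configured queues gives each one a third, which is
consistent with the 33% I observe. Dividing over only the two queues that
have jobs gives the 50% I expected.

```python
from fractions import Fraction

def weighted_share(weights, queues):
    """Fraction of the cluster each queue in `queues` gets, by weight."""
    total = sum(weights[q] for q in queues)
    return {q: Fraction(weights[q], total) for q in queues}

# Hypothetical queue names; all three queues have equal weight.
weights = {"queue1": 1, "queue2": 1, "queue3": 1}

# Share computed over every configured queue, active or not.
over_configured = weighted_share(weights, ["queue1", "queue2", "queue3"])
# Share computed over only the queues that currently have jobs.
over_active = weighted_share(weights, ["queue1", "queue2"])

print(over_configured["queue2"])  # 1/3
print(over_active["queue2"])      # 1/2
```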

What seems problematic here is that if the two users submitted their jobs
at the same time, they would receive 50% each, right? It seems very strange
that the *stable* scheduling situation of long-running jobs would be
influenced by a race condition such as the exact submission time. Or in
other words, that the scheduling policy for allocating new/empty containers
is different from the scheduling policy for preempting already-running ones.
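
As a toy model of the discrepancy (again illustrative Python, not the
scheduler's actual logic; `final_split` is a made-up name, and the even
split in the simultaneous case is my assumption rather than something I
measured), the end state for two jobs that can each fill the cluster
would depend on submission order like this:

```python
def final_split(total_containers, simultaneous, configured_queues=3):
    """Toy end state for jobs A and B, each able to fill the whole cluster."""
    if simultaneous:
        # Assumption: free containers are split evenly between the two
        # queues that are asking for them.
        b = total_containers // 2
    else:
        # Observed: A already holds everything, and B only gains what
        # preemption takes from A, i.e. one configured-queue share.
        b = total_containers // configured_queues
    return {"A": total_containers - b, "B": b}

print(final_split(90, simultaneous=True))   # {'A': 45, 'B': 45}
print(final_split(90, simultaneous=False))  # {'A': 60, 'B': 30}
```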

I do understand that this is how the fair scheduler works. I was wondering
if I'm missing something, or whether some other setup could provide my
expected behavior (perhaps with the capacity scheduler?).

Any input here would be greatly appreciated!

Shay