Posted to user@hadoop.apache.org by Evert Lammerts <Ev...@sara.nl> on 2012/08/14 12:39:33 UTC

Pending reducers

Hi list,

I have a cluster running Hadoop 0.20.205 with Kerberos enabled, exposing 528 map slots and 528 reduce slots. Currently somebody is running a NORMAL priority job with 7 mappers and 400 reducers. The mappers have finished and the system is processing the reducers. Another user is running a NORMAL priority job with 1 mapper and 26 reducers. The mapper has finished, but the reducers won't come out of "pending" state. There are no other jobs running right now. We've not yet installed a different scheduler, so right now the system is using the default scheduler. How can this behavior be explained? I see mappers of multiple jobs run concurrently, and I *thought* I've seen reducers of multiple jobs run concurrently, but I'm not completely sure. Any idea?

Evert

RE: Pending reducers

Posted by Evert Lammerts <Ev...@sara.nl>.
Alright, thanks; we're already busy rolling out the config for the Capacity Scheduler. Still, interesting behavior. The FIFO scheduler looks at the load on the cores? That seems unnecessary; the kernel is quite good at context switching.

Evert
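[Editor's note: since the reply above mentions rolling out the Capacity Scheduler, here is a minimal sketch of what that configuration looked like on 0.20-era Hadoop. The queue names and capacity percentages are placeholders, and the property names are taken from the 0.20 Capacity Scheduler documentation as best recalled; verify them against your distribution before deploying.]

```xml
<!-- mapred-site.xml: switch the JobTracker to the Capacity Scheduler
     and declare the job queues (queue names here are examples). -->
<property>
  <name>mapred.jobtracker.taskScheduler</name>
  <value>org.apache.hadoop.mapred.CapacityTaskScheduler</value>
</property>
<property>
  <name>mapred.queue.names</name>
  <value>default,research</value>
</property>

<!-- capacity-scheduler.xml: give each queue a share of the slots so a
     single job cannot hold all 528 reduce slots indefinitely. -->
<property>
  <name>mapred.capacity-scheduler.queue.default.capacity</name>
  <value>70</value>
</property>
<property>
  <name>mapred.capacity-scheduler.queue.research.capacity</name>
  <value>30</value>
</property>
```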
________________________________________
From: Harsh J [harsh@cloudera.com]
Sent: Tuesday, August 14, 2012 3:12 PM
To: user@hadoop.apache.org
Subject: Re: Pending reducers

I guess this is the regular behavior of the default FIFO task
scheduler. It takes into account the reducer load and that may be why
it refused to schedule the rest up immediately. You may have better
luck using either Fair or Capacity schedulers.

On Tue, Aug 14, 2012 at 5:56 PM, Evert Lammerts <Ev...@sara.nl> wrote:
>> whats the memory/cpu stats on the machines ? are they exhausted
>
> No, they're not. The nodes themselves have more than enough memory available, and the load on the cores sits between 0.8 and 0.9.
>
> Is current load in terms other than available slots even taken into account in the default scheduler? That would surprise me, actually... But it might explain this behavior.
>
> Evert
>
>>
>> On Tue, Aug 14, 2012 at 5:20 PM, Evert Lammerts
>> <Ev...@sara.nl> wrote:
>> >> reducers of multiple jobs do run con-currently as long as they have
>> >> the resources available.
>> >
>> > Yep, and that's what's not happening in my situation. 528 reduce
>> slots, 400 taken by one job, 26 of another job remain in pending state.
>> What could explain this behavior?
>> >
>> > Evert
>> >
>> >>
>> >> If you want to limit someone overtaking the cluster, then you can
>> >> create different job queues and assign quota to each queue. You also
>> >> have the flexibility of allocating max quota per user in a queue as
>> >> well.
>> >>
>> >>
>> >>
>> >> On Tue, Aug 14, 2012 at 4:09 PM, Evert Lammerts
>> >> <Ev...@sara.nl> wrote:
>> >> > Hi list,
>> >> >
>> >> > I have a cluster running Hadoop 0.20.205 with Kerberos enabled,
>> >> exposing 528 map slots and 528 reduce slots. Currently somebody is
>> >> running a NORMAL priority job with 7 mappers and 400 reducers. The
>> >> mappers have finished and the system is processing the reducers.
>> >> Another user is running a NORMAL priority job with 1 mapper and 26
>> >> reducers. The mapper has finished, but the reducers won't come out
>> of
>> >> "pending" state. There are no other jobs running right now. We've
>> not
>> >> yet installed a different scheduler, so right now the system is
>> using
>> >> the default scheduler. How can this behavior be explained? I see
>> >> mappers of multiple jobs run concurrently, and I *thought* I've seen
>> >> reducers of multiple jobs run concurrently, but I'm not completely
>> >> sure. Any idea?
>> >> >
>> >> > Evert
>> >>
>> >>
>> >>
>> >> --
>> >> Nitin Pawar
>>
>>
>>
>> --
>> Nitin Pawar



--
Harsh J

Re: Pending reducers

Posted by Harsh J <ha...@cloudera.com>.
I guess this is the regular behavior of the default FIFO task
scheduler: it takes the current reducer load into account, which may
be why it refuses to schedule the remaining reducers immediately. You
may have better luck using either the Fair or Capacity scheduler.
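[Editor's note: the load-based behavior described above can be illustrated with a simplified sketch. The 0.20-era default scheduler (JobQueueTaskScheduler) caps the reduces it hands each TaskTracker by a cluster-wide load factor rather than by free slots alone. This is not the actual Hadoop code; the function and parameter names are illustrative only.]

```python
def reduces_to_assign(remaining_reduces, cluster_reduce_capacity,
                      tracker_reduce_slots, tracker_running_reduces):
    """Rough sketch of load-factor-based reduce scheduling."""
    # Load factor: what fraction of the cluster's total reduce capacity
    # the outstanding reduce work would need.
    load_factor = min(1.0, remaining_reduces / cluster_reduce_capacity)
    # Cap this tracker at its proportional share of that load (rounded),
    # not at its raw number of free slots.
    cap = min(int(load_factor * tracker_reduce_slots + 0.5),
              tracker_reduce_slots)
    return max(0, cap - tracker_running_reduces)
```

Under this kind of policy, a tracker with 4 free reduce slots on a half-loaded cluster would still only be assigned 2 reduces, which is consistent with slots sitting idle while tasks stay pending.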

On Tue, Aug 14, 2012 at 5:56 PM, Evert Lammerts <Ev...@sara.nl> wrote:
>> whats the memory/cpu stats on the machines ? are they exhausted
>
> No, they're not. The nodes themselves have more than enough memory available, and the load on the cores sits between 0.8 and 0.9.
>
> Is current load in terms other than available slots even taken into account in the default scheduler? That would surprise me, actually... But it might explain this behavior.
>
> Evert
>
>>
>> On Tue, Aug 14, 2012 at 5:20 PM, Evert Lammerts
>> <Ev...@sara.nl> wrote:
>> >> reducers of multiple jobs do run con-currently as long as they have
>> >> the resources available.
>> >
>> > Yep, and that's what's not happening in my situation. 528 reduce
>> slots, 400 taken by one job, 26 of another job remain in pending state.
>> What could explain this behavior?
>> >
>> > Evert
>> >
>> >>
>> >> If you want to limit someone overtaking the cluster, then you can
>> >> create different job queues and assign quota to each queue. You also
>> >> have the flexibility of allocating max quota per user in a queue as
>> >> well.
>> >>
>> >>
>> >>
>> >> On Tue, Aug 14, 2012 at 4:09 PM, Evert Lammerts
>> >> <Ev...@sara.nl> wrote:
>> >> > Hi list,
>> >> >
>> >> > I have a cluster running Hadoop 0.20.205 with Kerberos enabled,
>> >> exposing 528 map slots and 528 reduce slots. Currently somebody is
>> >> running a NORMAL priority job with 7 mappers and 400 reducers. The
>> >> mappers have finished and the system is processing the reducers.
>> >> Another user is running a NORMAL priority job with 1 mapper and 26
>> >> reducers. The mapper has finished, but the reducers won't come out
>> of
>> >> "pending" state. There are no other jobs running right now. We've
>> not
>> >> yet installed a different scheduler, so right now the system is
>> using
>> >> the default scheduler. How can this behavior be explained? I see
>> >> mappers of multiple jobs run concurrently, and I *thought* I've seen
>> >> reducers of multiple jobs run concurrently, but I'm not completely
>> >> sure. Any idea?
>> >> >
>> >> > Evert
>> >>
>> >>
>> >>
>> >> --
>> >> Nitin Pawar
>>
>>
>>
>> --
>> Nitin Pawar



-- 
Harsh J

RE: Pending reducers

Posted by Evert Lammerts <Ev...@sara.nl>.
> whats the memory/cpu stats on the machines ? are they exhausted

No, they're not. The nodes themselves have more than enough memory available, and the load on the cores sits between 0.8 and 0.9.

Is current load in terms other than available slots even taken into account in the default scheduler? That would surprise me, actually... But it might explain this behavior.

Evert

> 
> On Tue, Aug 14, 2012 at 5:20 PM, Evert Lammerts
> <Ev...@sara.nl> wrote:
> >> reducers of multiple jobs do run con-currently as long as they have
> >> the resources available.
> >
> > Yep, and that's what's not happening in my situation. 528 reduce
> slots, 400 taken by one job, 26 of another job remain in pending state.
> What could explain this behavior?
> >
> > Evert
> >
> >>
> >> If you want to limit someone overtaking the cluster, then you can
> >> create different job queues and assign quota to each queue. You also
> >> have the flexibility of allocating max quota per user in a queue as
> >> well.
> >>
> >>
> >>
> >> On Tue, Aug 14, 2012 at 4:09 PM, Evert Lammerts
> >> <Ev...@sara.nl> wrote:
> >> > Hi list,
> >> >
> >> > I have a cluster running Hadoop 0.20.205 with Kerberos enabled,
> >> exposing 528 map slots and 528 reduce slots. Currently somebody is
> >> running a NORMAL priority job with 7 mappers and 400 reducers. The
> >> mappers have finished and the system is processing the reducers.
> >> Another user is running a NORMAL priority job with 1 mapper and 26
> >> reducers. The mapper has finished, but the reducers won't come out
> of
> >> "pending" state. There are no other jobs running right now. We've
> not
> >> yet installed a different scheduler, so right now the system is
> using
> >> the default scheduler. How can this behavior be explained? I see
> >> mappers of multiple jobs run concurrently, and I *thought* I've seen
> >> reducers of multiple jobs run concurrently, but I'm not completely
> >> sure. Any idea?
> >> >
> >> > Evert
> >>
> >>
> >>
> >> --
> >> Nitin Pawar
> 
> 
> 
> --
> Nitin Pawar

Re: Pending reducers

Posted by Nitin Pawar <ni...@gmail.com>.
What are the memory/CPU stats on the machines? Are they exhausted?

On Tue, Aug 14, 2012 at 5:20 PM, Evert Lammerts <Ev...@sara.nl> wrote:
>> reducers of multiple jobs do run con-currently as long as they have the
>> resources available.
>
> Yep, and that's what's not happening in my situation. 528 reduce slots, 400 taken by one job, 26 of another job remain in pending state. What could explain this behavior?
>
> Evert
>
>>
>> If you want to limit someone overtaking the cluster, then you can
>> create different job queues and assign quota to each queue. You also
>> have the flexibility of allocating max quota per user in a queue as
>> well.
>>
>>
>>
>> On Tue, Aug 14, 2012 at 4:09 PM, Evert Lammerts
>> <Ev...@sara.nl> wrote:
>> > Hi list,
>> >
>> > I have a cluster running Hadoop 0.20.205 with Kerberos enabled,
>> exposing 528 map slots and 528 reduce slots. Currently somebody is
>> running a NORMAL priority job with 7 mappers and 400 reducers. The
>> mappers have finished and the system is processing the reducers.
>> Another user is running a NORMAL priority job with 1 mapper and 26
>> reducers. The mapper has finished, but the reducers won't come out of
>> "pending" state. There are no other jobs running right now. We've not
>> yet installed a different scheduler, so right now the system is using
>> the default scheduler. How can this behavior be explained? I see
>> mappers of multiple jobs run concurrently, and I *thought* I've seen
>> reducers of multiple jobs run concurrently, but I'm not completely
>> sure. Any idea?
>> >
>> > Evert
>>
>>
>>
>> --
>> Nitin Pawar



-- 
Nitin Pawar

RE: Pending reducers

Posted by Evert Lammerts <Ev...@sara.nl>.
> reducers of multiple jobs do run con-currently as long as they have the
> resources available.

Yep, and that's what's not happening in my situation. 528 reduce slots, 400 taken by one job, 26 of another job remain in pending state. What could explain this behavior?

Evert

> 
> If you want to limit someone overtaking the cluster, then you can
> create different job queues and assign quota to each queue. You also
> have the flexibility of allocating max quota per user in a queue as
> well.
> 
> 
> 
> On Tue, Aug 14, 2012 at 4:09 PM, Evert Lammerts
> <Ev...@sara.nl> wrote:
> > Hi list,
> >
> > I have a cluster running Hadoop 0.20.205 with Kerberos enabled,
> exposing 528 map slots and 528 reduce slots. Currently somebody is
> running a NORMAL priority job with 7 mappers and 400 reducers. The
> mappers have finished and the system is processing the reducers.
> Another user is running a NORMAL priority job with 1 mapper and 26
> reducers. The mapper has finished, but the reducers won't come out of
> "pending" state. There are no other jobs running right now. We've not
> yet installed a different scheduler, so right now the system is using
> the default scheduler. How can this behavior be explained? I see
> mappers of multiple jobs run concurrently, and I *thought* I've seen
> reducers of multiple jobs run concurrently, but I'm not completely
> sure. Any idea?
> >
> > Evert
> 
> 
> 
> --
> Nitin Pawar
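
[Editor's note] The load-based gating Harsh describes in the default FIFO scheduler can be made concrete. The sketch below is a simplified, hypothetical model of the kind of load-factor check the 0.20 JobQueueTaskScheduler applies (function and parameter names are illustrative, not Hadoop API): each TaskTracker is filled only up to its proportional share of the cluster-wide reduce load, not up to its raw slot count, which is one way free slots can sit idle while reducers stay pending.

```python
import math

def reduces_to_schedule(pending_reduces, running_reduces,
                        cluster_reduce_capacity,
                        tt_reduce_slots, tt_running_reduces):
    """How many new reduces a heartbeating TaskTracker may start.

    Sketch of a load-factor gate: the tracker is filled only up to its
    proportional share of the cluster-wide reduce load, so free slots
    can remain unused even while tasks are pending.
    """
    remaining = pending_reduces + running_reduces
    load_factor = min(1.0, remaining / cluster_reduce_capacity)
    # This tracker's cap is its share of the current load, not its slot count.
    cap = math.ceil(load_factor * tt_reduce_slots)
    return max(0, cap - tt_running_reduces)

# Numbers from this thread: 528 reduce slots, 400 reduces running, 26 pending.
# A tracker with 4 reduce slots that is already running 4 reduces is offered
# nothing, even though the cluster as a whole still has over 100 free slots.
print(reduces_to_schedule(26, 400, 528, 4, 4))   # 0
print(reduces_to_schedule(26, 400, 528, 4, 3))   # 1
```

The real scheduler adds further throttles (padding for speculative execution, at most one reduce assigned per heartbeat), so ramp-up of a second job's reducers can be slower still.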

Re: Pending reducers

Posted by Nitin Pawar <ni...@gmail.com>.
Reducers of multiple jobs do run concurrently as long as they have
the resources available.

If you want to prevent someone from taking over the cluster, you can
create separate job queues and assign a quota to each queue. You also
have the flexibility of allocating a maximum quota per user within a
queue.



On Tue, Aug 14, 2012 at 4:09 PM, Evert Lammerts <Ev...@sara.nl> wrote:
> Hi list,
>
> I have a cluster running Hadoop 0.20.205 with Kerberos enabled, exposing 528 map slots and 528 reduce slots. Currently somebody is running a NORMAL priority job with 7 mappers and 400 reducers. The mappers have finished and the system is processing the reducers. Another user is running a NORMAL priority job with 1 mapper and 26 reducers. The mapper has finished, but the reducers won't come out of "pending" state. There are no other jobs running right now. We've not yet installed a different scheduler, so right now the system is using the default scheduler. How can this behavior be explained? I see mappers of multiple jobs run concurrently, and I *thought* I've seen reducers of multiple jobs run concurrently, but I'm not completely sure. Any idea?
>
> Evert



-- 
Nitin Pawar
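
[Editor's note] On 0.20.x, the queue/quota setup Nitin describes maps to the Capacity Scheduler. A rough sketch of the relevant configuration follows; the queue names and percentages are illustrative examples, not recommendations. The scheduler is enabled and queues named in mapred-site.xml, while per-queue capacities and per-user limits go in capacity-scheduler.xml:

```xml
<!-- mapred-site.xml: replace the default FIFO scheduler -->
<property>
  <name>mapred.jobtracker.taskScheduler</name>
  <value>org.apache.hadoop.mapred.CapacityTaskScheduler</value>
</property>
<property>
  <name>mapred.queue.names</name>
  <value>default,research</value>
</property>

<!-- capacity-scheduler.xml: percentage of cluster slots per queue -->
<property>
  <name>mapred.capacity-scheduler.queue.default.capacity</name>
  <value>70</value>
</property>
<property>
  <name>mapred.capacity-scheduler.queue.research.capacity</name>
  <value>30</value>
</property>
<!-- limit any single user's share of a queue while others are waiting -->
<property>
  <name>mapred.capacity-scheduler.queue.default.minimum-user-limit-percent</name>
  <value>25</value>
</property>
```

Jobs then choose a queue at submission time, e.g. with `-Dmapred.job.queue.name=research`.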
