Posted to common-user@hadoop.apache.org by "W.P. McNeill" <bi...@gmail.com> on 2010/12/23 20:15:17 UTC

How frequently can I set status?

I have a loop that runs over a large number of iterations (order of 100,000)
very quickly.  It is nice to do context.setStatus() with an indication of
where I am in the loop.  Currently I'm only calling setStatus() every 10,000
iterations because I don't want to overwhelm the task trackers with lots of
status messages.  Is this something I should be worried about, or is Hadoop
designed to handle a high volume of status messages?  If so, I'll just call
setStatus() every iteration.
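
A minimal sketch of the every-N-iterations approach described above. The class name IterationThrottledStatus and the Consumer<String> callback are assumptions made for illustration, not part of Hadoop; the callback stands in for context.setStatus() so the sketch compiles without Hadoop on the classpath.

```java
import java.util.function.Consumer;

/**
 * Reports loop progress through a status callback, but only every
 * `interval` iterations (and once more on the final iteration).
 * The callback is a hypothetical stand-in for context.setStatus().
 */
class IterationThrottledStatus {
    private final Consumer<String> statusSink;
    private final long interval;

    IterationThrottledStatus(Consumer<String> statusSink, long interval) {
        if (interval <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        this.statusSink = statusSink;
        this.interval = interval;
    }

    /** Call once per iteration; `done` is the count of iterations completed so far. */
    void update(long done, long total) {
        if (done % interval == 0 || done == total) {
            statusSink.accept("Processed " + done + " of " + total);
        }
    }
}
```

Inside a task this could be constructed as new IterationThrottledStatus(context::setStatus, 10000), with update() called on every pass through the loop.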

Re: How frequently can I set status?

Posted by "W.P. McNeill" <bi...@gmail.com>.
I figured that was the case and it's okay if I don't see every status
message, as long as it doesn't hurt anything to send them.

On Thu, Dec 23, 2010 at 12:51 PM, Ken <ke...@gmail.com> wrote:

> If I remember correctly, status is only sent on heartbeat. Which means if
> you are setting inside a fast running loop, you won't see every status
> message, only the status message that was current when the heartbeat was
> sent to the jobtracker.
>
> On Dec 23, 2010, at 11:41 AM, Ted Dunning <td...@maprtech.com> wrote:
>
> > It is reasonable to update counters often, but I think you are right to
> > limit the number of status updates.
> >
> > On Thu, Dec 23, 2010 at 11:15 AM, W.P. McNeill <bi...@gmail.com> wrote:
> >
> >> I have a loop that runs over a large number of iterations (order of
> >> 100,000) very quickly.  It is nice to do context.setStatus() with an
> >> indication of where I am in the loop.  Currently I'm only calling
> >> setStatus() every 10,000 iterations because I don't want to overwhelm
> >> the task trackers with lots of status messages.  Is this something I
> >> should be worried about, or is Hadoop designed to handle a high volume
> >> of status messages?  If so, I'll just call setStatus() every iteration.
> >>
>

Re: How frequently can I set status?

Posted by Ken <ke...@gmail.com>.
If I remember correctly, status is only sent on heartbeat, which means that if you set it inside a fast-running loop, you won't see every status message, only the one that was current when the heartbeat was sent to the jobtracker.
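
If status really is only picked up on heartbeat, one option is to throttle by elapsed time rather than by iteration count. The following is a sketch under that assumption; the class name TimeThrottledStatus, the injected clock, and the minimum interval are all hypothetical and chosen for illustration, and the callback again stands in for context.setStatus().

```java
import java.util.function.Consumer;
import java.util.function.LongSupplier;

/**
 * Forwards a status message only if at least `minIntervalMillis` have
 * passed since the last forwarded message. Messages that arrive sooner
 * are dropped, on the assumption that they would never be seen anyway.
 */
class TimeThrottledStatus {
    private final Consumer<String> statusSink;   // stand-in for context.setStatus()
    private final LongSupplier clockMillis;      // injected so the clock can be faked
    private final long minIntervalMillis;
    private long lastSentMillis;
    private boolean sentOnce = false;

    TimeThrottledStatus(Consumer<String> statusSink,
                        LongSupplier clockMillis,
                        long minIntervalMillis) {
        this.statusSink = statusSink;
        this.clockMillis = clockMillis;
        this.minIntervalMillis = minIntervalMillis;
    }

    /** Safe to call on every iteration; most calls return without forwarding. */
    void update(String message) {
        long now = clockMillis.getAsLong();
        if (!sentOnce || now - lastSentMillis >= minIntervalMillis) {
            statusSink.accept(message);
            lastSentMillis = now;
            sentOnce = true;
        }
    }
}
```

In a task this might be built as new TimeThrottledStatus(context::setStatus, System::currentTimeMillis, 1000). The 1000 ms figure is an arbitrary example, not the actual heartbeat interval.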

On Dec 23, 2010, at 11:41 AM, Ted Dunning <td...@maprtech.com> wrote:

> It is reasonable to update counters often, but I think you are right to
> limit the number of status updates.
> 
> On Thu, Dec 23, 2010 at 11:15 AM, W.P. McNeill <bi...@gmail.com> wrote:
> 
>> I have a loop that runs over a large number of iterations (order of
>> 100,000)
>> very quickly.  It is nice to do context.setStatus() with an indication of
>> where I am in the loop.  Currently I'm only calling setStatus() every
>> 10,000
>> iterations because I don't want to overwhelm the task trackers with lots of
>> status messages.  Is this something I should be worried about, or is Hadoop
>> designed to handle a high volume of status messages?  If so, I'll just call
>> setStatus() every iteration.
>> 

Re: How frequently can I set status?

Posted by Ted Dunning <td...@maprtech.com>.
It is reasonable to update counters often, but I think you are right to
limit the number of status updates.

On Thu, Dec 23, 2010 at 11:15 AM, W.P. McNeill <bi...@gmail.com> wrote:

> I have a loop that runs over a large number of iterations (order of
> 100,000)
> very quickly.  It is nice to do context.setStatus() with an indication of
> where I am in the loop.  Currently I'm only calling setStatus() every
> 10,000
> iterations because I don't want to overwhelm the task trackers with lots of
> status messages.  Is this something I should be worried about, or is Hadoop
> designed to handle a high volume of status messages?  If so, I'll just call
> setStatus() every iteration.
>