Posted to mapreduce-user@hadoop.apache.org by Jean-Daniel Cryans <jd...@apache.org> on 2010/09/27 21:18:17 UTC

Client hanging 20 seconds after job's over (WAS: Re: Can I run HBase 0.20.6 on Hadoop 0.21?)

(adding mapreduce-user@ and re-scoping title)

Can you jstack the client while it's waiting 20 seconds? Is it still
waiting for the job to come back, or is it something else? Is the job
itself done cleaning up 20 seconds before the call returns on the client
side (check the web UI)?
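
For reference, jstack is normally run from a shell as `jstack <pid>` against the client's process id. If attaching to the process is inconvenient, a similar snapshot can be taken from inside the client JVM. The following is only a minimal sketch: the class name `ThreadDump` is hypothetical and not part of Hadoop, and it assumes you can call it from the client (for example from a second thread) while the main thread is blocked waiting on the job.

```java
import java.util.Map;

public class ThreadDump {
    /** Builds a jstack-like text snapshot of every live thread in this JVM. */
    static String dump() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            // Thread name and state (RUNNABLE, WAITING, TIMED_WAITING, ...)
            sb.append('"').append(t.getName()).append("\" state=")
              .append(t.getState()).append('\n');
            // One line per stack frame, innermost first
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

A thread that sits in a sleep or wait inside a polling loop during the 20 seconds would suggest the client is still waiting on the job rather than doing work of its own.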

J-D

On Mon, Sep 27, 2010 at 12:10 PM, Pete Tyler <pe...@gmail.com> wrote:
> Thanks for the offer, much appreciated. I have a very simple mapreduce job on a pseudo-distributed system, with a very small amount of persisted data.
>
> Running locally the mapreduce job runs very quickly, less than three seconds.
>
> When I run the job against the pseudo-distributed Hadoop, still on the same machine as the client, I see the following:
> - the map and reduce classes run very quickly, a matter of milliseconds in total ... sweet
> - the client blocks waiting for the job to finish for about 20 seconds ... very slow
>
> I'm trying to understand why I have this 20 second overhead and what I can do about it.
>
> My map and reduce classes are in my Hadoop classpath.
>
> On Sep 27, 2010, at 11:32 AM, Jean-Daniel Cryans <jd...@apache.org> wrote:
>
>> Using 0.21.0 may reveal newer bugs rather than fix your older ones.
>> Maybe we can help you debug 0.20.2; what are you seeing?
>>
>> J-D
>>

Re: Client hanging 20 seconds after job's over (WAS: Re: Can I run HBase 0.20.6 on Hadoop 0.21?)

Posted by Pete Tyler <pe...@gmail.com>.

Oops .... Sorry, hit send by mistake.

The stack is pretty large. Since the jobtracker web UI shows cleanup completing about 1 sec before the client finishes, this does not look like a client issue. Is that a reasonable assumption?

The jobtracker web UI shows the map taking 2 secs but the reduce taking from 9 to 12 secs.

Summary:

0 secs: client submits job
+0 secs: jobtracker web ui shows job started
+6 secs: web ui shows map started
+9 secs: web ui shows map complete
+9 secs: web ui shows reduce started
+21 secs: web ui shows reduce complete
+24 secs: web ui shows job cleanup successful
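
Differencing the offsets in the summary above shows where the 24 seconds go: 6 seconds before the map starts, 3 seconds of map, 12 seconds of reduce, and 3 seconds of cleanup. A minimal sketch of that arithmetic follows; the class name `PhaseDurations` is hypothetical and the numbers are just the web UI readings listed above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PhaseDurations {
    /**
     * Returns the seconds elapsed between consecutive events,
     * keyed by the label of the later event, in timeline order.
     */
    static Map<String, Integer> durations(String[] labels, int[] offsets) {
        Map<String, Integer> out = new LinkedHashMap<>();
        for (int i = 1; i < labels.length; i++) {
            out.put(labels[i], offsets[i] - offsets[i - 1]);
        }
        return out;
    }

    public static void main(String[] args) {
        // Offsets in seconds since the client submitted the job
        String[] labels = {
            "job started", "map started", "map complete",
            "reduce started", "reduce complete", "job cleanup successful"
        };
        int[] offsets = {0, 6, 9, 9, 21, 24};
        durations(labels, offsets).forEach(
            (label, secs) -> System.out.println("+" + secs + "s until " + label));
    }
}
```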

If I change the map class so that it passes zero records to the reduce, i.e. reduce input records = 0, the reduce step still takes 9 seconds.

On Sep 27, 2010, at 7:23 PM, Pete Tyler <pe...@gmail.com> wrote:

> The stack is pretty large and as the job tracker webui shows cleanup completes about 1 sec before the client finishes this does not look like a client issue.
> 
> The jobtracker webui shows map

Re: Client hanging 20 seconds after job's over (WAS: Re: Can I run HBase 0.20.6 on Hadoop 0.21?)

Posted by Pete Tyler <pe...@gmail.com>.

The stack is pretty large and as the job tracker webui shows cleanup completes about 1 sec before the client finishes this does not look like a client issue.

The jobtracker webui shows map 


Re: Client hanging 20 seconds after job's over (WAS: Re: Can I run HBase 0.20.6 on Hadoop 0.21?)

Posted by Andrey Stepachev <oc...@gmail.com>.

Perhaps the reasons for those slowdowns are:
1. copying and unpacking the job jar
2. starting the child Java process
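
To get a feel for the second item, one could time how long a bare child JVM takes to start and exit on the same machine. This is only a rough sketch under stated assumptions: the class name `ChildJvmStartup` is hypothetical, it launches `java -version` from the running JVM's own `java.home`, and it does not reproduce what a real Hadoop task child does (classpath setup, unpacking the jar, talking to the tasktracker), so the number is at best a lower bound.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;

public class ChildJvmStartup {
    /** Launches "java -version" as a child process and returns the wall-clock time in ms. */
    static long timeChildJvmMillis() {
        String javaBin = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
        long start = System.nanoTime();
        try {
            Process p = new ProcessBuilder(javaBin, "-version")
                    .redirectErrorStream(true) // merge stderr into stdout
                    .start();
            // Drain the output so the child cannot block on a full pipe
            try (InputStream in = p.getInputStream()) {
                byte[] buf = new byte[4096];
                while (in.read(buf) != -1) {
                    // discard
                }
            }
            p.waitFor();
        } catch (IOException e) {
            throw new RuntimeException("could not launch child JVM", e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException("interrupted while waiting for child JVM", e);
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        System.out.println("child JVM start + exit: " + timeChildJvmMillis() + " ms");
    }
}
```

Comparing this figure with the gaps seen in the jobtracker web UI gives a first hint of how much of the delay JVM startup alone could account for.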