Posted to user@hbase.apache.org by Shuja Rehman <sh...@gmail.com> on 2011/06/30 17:07:48 UTC

Task attempt_201106301204_0007_m_000001_0 failed to report status for 600 seconds. Killing!

Hi,

I am doing a bulk insert into HBase using MapReduce, reading from a lot of
small (approximately 10 MB) files, so the number of mappers equals the number
of files. I am also monitoring performance using Ganglia. The machines are
c1.xlarge for processing the files (task trackers + data nodes) and m1.xlarge
for the HBase cluster (region servers + data nodes). CPU usage remains at
75%-100% on almost all of the servers, and RAM usage stays below 5 GB. But the
job fails because a lot of map tasks are killed. If I run the same job without
the insertion, processing completes in 9-10 minutes. So the question is: why
is it killing so many maps? Any clue?


-- 
Regards
Shuja-ur-Rehman Baig
<http://pk.linkedin.com/in/shujamughal>

Re: Task attempt_201106301204_0007_m_000001_0 failed to report status for 600 seconds. Killing!

Posted by Stack <st...@duboce.net>.
On Thu, Jun 30, 2011 at 8:07 AM, Shuja Rehman <sh...@gmail.com> wrote:
> I am doing a bulk insert into HBase using MapReduce, reading from a lot of
> small (approximately 10 MB) files, so the number of mappers equals the number
> of files. I am also monitoring performance using Ganglia. The machines are
> c1.xlarge for processing the files (task trackers + data nodes) and m1.xlarge
> for the HBase cluster (region servers + data nodes). CPU usage remains at
> 75%-100% on almost all of the servers, and RAM usage stays below 5 GB. But the
> job fails because a lot of map tasks are killed. If I run the same job without
> the insertion, processing completes in 9-10 minutes. So the question is: why
> is it killing so many maps? Any clue?
>

Can you figure out which region the map tasks are failing against?  And
once you have the region, which regionserver was hosting it (grep the
master log to figure this out)?  Thereafter, check the RS logs around
the time of the map task timeout.  See anything?  Long GC?  A region
split?  600 seconds is a long time for the server-side to be hung up.
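
For the "check the RS logs around the time of the map task timeout" step,
a small filter along these lines may help cut the log down to the relevant
minutes. It is only a sketch: it assumes each log line starts with a
timestamp like "2011-06-30 12:14:05,123" (adjust TS_FORMAT and TS_LENGTH if
your log4j layout differs), and the function name lines_near is made up for
this example.

```python
from datetime import datetime, timedelta

# Assumed timestamp layout at the start of each log line, for example
# "2011-06-30 12:14:05,123 INFO ...". Change both values if yours differs.
TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
TS_LENGTH = 23  # len("2011-06-30 12:14:05,123")


def lines_near(lines, center, window_seconds=600):
    """Yield log lines stamped within window_seconds of center.

    Lines without a parseable leading timestamp (stack traces, wrapped
    messages) are kept only if the preceding stamped line was kept.
    """
    window = timedelta(seconds=window_seconds)
    keep = False
    for line in lines:
        try:
            stamp = datetime.strptime(line[:TS_LENGTH], TS_FORMAT)
        except ValueError:
            # Continuation line: inherit the decision made for its parent.
            if keep:
                yield line
            continue
        keep = abs(stamp - center) <= window
        if keep:
            yield line
```

To use it, open the regionserver log, pass the file object as lines, and
pass the time the task was killed as center. The default window of 600
seconds matches the timeout in the error message. What to look for in the
surviving lines (GC pauses, splits) still has to be judged by eye.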

What version of hbase?

Thanks,
St.Ack