Posted to user@hbase.apache.org by jonh111 <jh...@gmail.com> on 2011/03/17 16:32:26 UTC

Task failed to report status exceptions

Hi,

I'm running jaql over a cluster of 6 machines.

When I run my jobs on small data, they run smoothly.

However, when I use larger data (~4G), the following occurs:

I can see that a lot of tasks that have completed go back to the "pending"
state.
When this happens I get exceptions that look like:

"Task attempt_201103161639_0002_m_000000_0 failed to report status for 607
seconds. Killing!"

Most of the time the cluster gets stuck after a while, apparently because it
runs out of memory, and has to be restarted.

What could be wrong?
Are there any parameters I should change?

Thanks,
Jonathan
-- 
View this message in context: http://old.nabble.com/Task-failed-to-report-status-exceptions-tp31173622p31173622.html
Sent from the HBase User mailing list archive at Nabble.com.


Re: Task failed to report status exceptions

Posted by Stack <st...@duboce.net>.
Can you add logging to your tasks?

Is it that HBase becomes unavailable, or is it that you are not calling
progress on the Progressable within the ten-minute timeout?

You are not trying to post 4G of data to a single cell, are you? (I know you
are not, but asking just in case.)

St.Ack
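
To illustrate the "Progressable" point above: a task that works for a long time without reading input, writing output, or reporting progress is killed once the task timeout expires (in Hadoop releases of that era this was, as far as I recall, the `mapred.task.timeout` property, 600000 ms by default, which would match the 607 seconds in the error). Below is a minimal sketch of periodic progress reporting. It is not taken from the original job: the `Progressable` interface here is a local stand-in for Hadoop's `org.apache.hadoop.util.Progressable` so the example compiles without Hadoop, and `ProgressSketch`, `processAll`, `expensiveStep`, and `REPORT_EVERY` are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Local stand-in (assumption) for org.apache.hadoop.util.Progressable,
// declared here only so the sketch compiles without Hadoop on the classpath.
interface Progressable {
    void progress();
}

final class ProgressSketch {
    // Hypothetical interval: report progress once per this many records.
    static final int REPORT_EVERY = 1000;

    // Processes every record and pings the reporter periodically so the
    // framework knows the task is still alive. Returns the record count.
    static int processAll(List<String> records, Progressable reporter) {
        int processed = 0;
        for (String record : records) {
            expensiveStep(record);
            processed++;
            if (processed % REPORT_EVERY == 0) {
                reporter.progress();
            }
        }
        return processed;
    }

    // Placeholder for the slow per-record work (for example, a write to HBase).
    static void expensiveStep(String record) {
        // intentionally empty in this sketch
    }

    public static void main(String[] args) {
        List<String> records = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            records.add("row-" + i);
        }
        int[] pings = {0}; // counts how often progress() was called
        int n = processAll(records, () -> pings[0]++);
        System.out.println(n + " records, " + pings[0] + " progress calls");
    }
}
```

In a real job the reporter would be whatever the framework hands to the task (the old `mapred` API passes a `Reporter`, which extends `Progressable`; the newer API exposes `progress()` on the task context). Counting records is only one possible trigger: if a single record can take minutes, reporting on elapsed time instead may be the safer choice. Raising the timeout only hides the symptom if the task is genuinely stuck.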


Re: Task failed to report status exceptions

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Pardon my ignorance about jaql, but where in your job is HBase used?

J-D
