Posted to user@accumulo.apache.org by Roshan Punnoose <ro...@gmail.com> on 2014/01/02 14:08:13 UTC

Commits held

We have a consistent (10-20 per second), large-document (~100MB) ingest that
has been running for a few weeks. However, every few days the tablet
servers seem to fall over with this error:

ERROR: Internal error processing closeUpdate
org.apache.accumulo.server.tabletserver.HoldTimeoutException: Commits are held

Could ingesting such large documents be causing too many splits and
freezing the ingest? Maybe increasing the split threshold for the
tablet would fix it. Any ideas?
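
A rough back-of-the-envelope check of the split hypothesis, as a sketch only.
It assumes the figures above mean 10-20 documents per second of roughly 100MB
each, and it assumes a split threshold of 1G (believed to be the default for
table.split.threshold, not confirmed anywhere in this thread). The class and
method names are made up for illustration, and the estimate ignores
compression, the number of tablets, and compaction timing.

```java
// Illustrative arithmetic only; nothing here talks to Accumulo.
class SplitEstimate {
    static final long MB = 1024L * 1024L;
    static final long GB = 1024L * MB;

    /** Aggregate ingest rate in bytes per second. */
    static long ingestBytesPerSecond(long docsPerSecond, long docBytes) {
        return docsPerSecond * docBytes;
    }

    /** Seconds of aggregate ingest needed to accumulate one threshold's worth of data. */
    static double secondsPerThreshold(long bytesPerSecond, long thresholdBytes) {
        return (double) thresholdBytes / bytesPerSecond;
    }

    public static void main(String[] args) {
        long docBytes = 100 * MB;   // "~100MB" from the post
        long threshold = 1 * GB;    // ASSUMED split threshold, not from the thread
        for (long docsPerSecond : new long[] {10, 20}) {
            long rate = ingestBytesPerSecond(docsPerSecond, docBytes);
            System.out.printf("%d docs/s -> %d MB/s, one threshold of data every %.3f s%n",
                    docsPerSecond, rate / MB, secondsPerThreshold(rate, threshold));
        }
    }
}
```

Under those assumptions the cluster as a whole takes in a threshold's worth of
data about every second or less, so frequent splits are at least plausible. A
larger threshold would scale that interval proportionally.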

Re: Commits held

Posted by Eric Newton <er...@gmail.com>.

Your JVM could be in a low-memory state, spending so much time in
garbage collection that it does little useful work on your behalf just
before it gives up and crashes.  The error from the JVM will go to
stderr.

-Eric
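
One way to check the "too much time in GC" theory is to compare accumulated
collection time with JVM uptime. The sketch below uses the standard
java.lang.management MXBeans; the class and method names are made up for
illustration. It reports on the JVM it runs in, so inspecting a tablet server
this way would mean reading the same beans over remote JMX, which is assumed
here rather than shown.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Illustrative sketch: estimates what share of JVM uptime went to garbage collection.
class GcOverhead {
    /** Fraction of uptime spent in GC, clamped to [0, 1]; 0 when uptime is not positive. */
    static double gcFraction(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 0.0;
        }
        double fraction = (double) Math.max(0L, gcMillis) / uptimeMillis;
        return Math.min(1.0, fraction);
    }

    /** Sum of accumulated collection time over all collectors, in milliseconds. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        long gc = totalGcMillis();
        System.out.printf("uptime=%d ms, gc=%d ms, gc fraction=%.3f%n",
                uptime, gc, gcFraction(gc, uptime));
    }
}
```

A fraction that climbs toward 1 shortly before a crash would be consistent
with the low-memory explanation; a low fraction would point elsewhere.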

Re: Commits held

Posted by Roshan Punnoose <ro...@gmail.com>.

Hadoop 2.1.0.2.0.5.0-67
Accumulo 1.5.0

The client is failing to ingest because the tablet servers are crashing.
However, the only log message seems to be the "Commits held" exception
repeated over and over again.

As for the .out/.err files, I can't seem to find them, just the .log files.
We have an init.d script that starts Accumulo and redirects stdout/stderr
to /dev/null. If you think these might shed more light on the problem, I
can redirect them properly and wait for the issue to arise again.

Roshan

Re: Commits held

Posted by Eric Newton <er...@gmail.com>.

Which Hadoop and Accumulo versions are you running?

Is your client failing to ingest because commits are being held for a long
time, or are your tablet servers crashing?

Are you seeing any warnings or errors in the tablet server .log (and
.out/.err) files?

-Eric
