Posted to user@accumulo.apache.org by Jared winick <ja...@gmail.com> on 2011/10/28 21:14:28 UTC

Overloading Tablet Servers

We are running into an issue where it appears we are attempting to
ingest more data than one of our tablet servers can handle and we are
trying to understand how a client can maximize ingest without
“overloading” the cluster. A few minutes after starting our ingest, we
see the “Hold Time” for the tablet server begin to increase. It
continues to increase until we stop our ingest, and then it takes
several minutes for the “Hold Time” to clear. We also end up seeing a
HoldTimeoutException: Commits are held. We have some custom Iterators
that are running at minor compaction time that are likely slowing the
performance of the minor compaction.  While I know my cluster may be
underprovisioned or not configured ideally, is there a way a client
can intelligently throttle its ingest to push as much as the servers
can handle without exceeding that? Thanks a lot.

Jared

Re: Overloading Tablet Servers

Posted by Jared winick <ja...@gmail.com>.
> If you are unsure if your iterators are slowing things down, then try
> removing them from the table and ingesting.  Hopefully you can then
> tell whether it's the iterators or something else that is causing
> problems.
>

Yeah, the Aggregators I had running seem to be the cause of the very
long minor compactions. I think some of my initial confusion has been
cleared up by what you guys said and by looking at the
TabletServerBatchWriter.  Unit tests of my Aggregators don't show them
to be abnormally slow, but I think I just need to better understand
how Accumulo is instantiating and using them internally so I can
figure out what is going on. I may also be able to take Eric's
suggestion of not running them at minc time but that might cause my
queries to slow down too much, so I will have to see. Thanks for the
help.

Re: Overloading Tablet Servers

Posted by Keith Turner <ke...@deenlo.com>.
On Fri, Oct 28, 2011 at 3:14 PM, Jared winick <ja...@gmail.com> wrote:
> HoldTimeoutException: Commits are held. We have some custom Iterators
> that are running at minor compaction time that are likely slowing the
> performance of the minor compaction.
>

If you are unsure if your iterators are slowing things down, then try
removing them from the table and ingesting.  Hopefully you can then
tell whether it's the iterators or something else that is causing
problems.

Re: Overloading Tablet Servers

Posted by Eric Newton <er...@gmail.com>.
Hold is the mechanism by which ingest is throttled.

Can you move your iterators to scan / majc only?

You might be experiencing other write-related issues... sometimes a single
node can work, and yet work so slowly that the rest of the cluster suffers.

Recently, we found a node that had many networking errors, none of which
resulted in failed connections, just poor performance.  Removing that node
resulted in ingest speeds tripling for several minutes.

Is the same node always held?  Watch the "Tablet Servers" view on the
monitoring page, and not the tables view.
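
One rough way to answer the "same node" question is to copy the hold
times off the monitoring page a few times during an ingest run and
tally them. The Python sketch below assumes a hand-collected input
format (one dict per observation, mapping server name to hold time in
seconds); the function name, the format, and the server names are
hypothetical and not part of Accumulo.

```python
from collections import Counter


def consistently_held(samples, threshold=0.0):
    """Return servers whose hold time exceeded threshold in every sample.

    samples is a list of dicts mapping server name -> hold time in
    seconds, one dict per observation (hypothetical input format).
    """
    held_counts = Counter()
    for sample in samples:
        for server, hold_seconds in sample.items():
            if hold_seconds > threshold:
                held_counts[server] += 1
    # Keep only the servers that were held in every single observation.
    return sorted(s for s, n in held_counts.items() if n == len(samples))


# Made-up observations: ts2 is held every time, ts1 only once.
observations = [
    {"ts1": 0.0, "ts2": 12.5},
    {"ts1": 3.0, "ts2": 40.0},
    {"ts1": 0.0, "ts2": 95.0},
]
print(consistently_held(observations))
```

If one server shows up every time, that points at the node (or the
tablets it hosts) rather than at the ingest rate as a whole.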

-Eric
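
On the original question of client-side throttling: since hold is how
the servers push back, one option is for the client to slow down when
it sees that signal instead of continuing to send at full rate. The
Python sketch below shows the general idea with exponential backoff.
It does not use the Accumulo client API: CommitsHeldError and
write_batch are hypothetical stand-ins for whatever error and write
call the real client exposes, and the delay values are arbitrary
assumptions.

```python
import time


class CommitsHeldError(Exception):
    """Hypothetical stand-in for a 'commits are held' error."""


def ingest_with_backoff(batches, write_batch, sleep=time.sleep,
                        initial_delay=1.0, max_delay=60.0, max_retries=8):
    """Write each batch, backing off exponentially while commits are held.

    write_batch is a hypothetical callable that raises CommitsHeldError
    when the server is holding commits. Returns the total seconds slept.
    """
    total_slept = 0.0
    for batch in batches:
        delay = initial_delay
        for attempt in range(max_retries + 1):
            try:
                write_batch(batch)
                break  # batch accepted; move on to the next one
            except CommitsHeldError:
                if attempt == max_retries:
                    raise  # give up rather than retry forever
                sleep(delay)
                total_slept += delay
                # Double the wait each time, up to max_delay.
                delay = min(delay * 2, max_delay)
    return total_slept
```

Backing off like this only keeps the client from piling more work onto
a server that is already behind. It does not make slow minor
compactions any faster, so the iterator question above still matters.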
