Posted to user@accumulo.apache.org by "Hider, Sandy" <Sa...@jhuapl.edu> on 2013/06/26 01:06:04 UTC

Minor compaction occurring often with fairly long delays during ingest.

I recently set up Accumulo 1.4.2 on a rack of boxes, each with 24 processors and 43 GB of RAM.  I set them up using the 3GB example templates but then increased the max size of the Tserver and a few other components to 5GB.

I am doing some initial tests importing roughly 7,000 records; each record has approximately 7 small fields and 1 large field holding between 200 KB and 1 MB of data.  After 2,000-3,000 records I see the server hold writes and start minor compactions, which seem to take quite a while, and then they occur again fairly frequently.

I am wondering what options I have to minimize the frequency of minor compactions during ingest.  Which component memory sizes and config properties would help me avoid this problem?  If anyone has other ideas for me to try, please let me know.
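
For reference, the main change was the tserver JVM size in accumulo-env.sh, roughly like the line below (flags are approximate and from memory; the other components were bumped similarly):

  export ACCUMULO_TSERVER_OPTS="-Xmx5g -Xms5g"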

Thanks in advance,

Sandy




Re: Minor compaction occurring often with fairly long delays during ingest.

Posted by Eric Newton <er...@gmail.com>.
Increase the size of the in-memory map (24-30G), and break your table down
into multiple tablets (if you can predict your split points).  This will
allow long minor compactions to start in parallel.

We have found the sweet spot for live ingest of small records to be 50-200
tablets per server.
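
For example, something along these lines (the table name and split points
below are placeholders; pick values that match your key space and the RAM
you can spare).  In accumulo-site.xml, read by the tservers at startup:

  <!-- assumes the native in-memory map is enabled, as in the
       native-standalone example configs -->
  <property>
    <name>tserver.memory.maps.max</name>
    <value>24G</value>
  </property>

Then pre-split the table from the shell:

  root@instance> createtable mytable
  root@instance mytable> addsplits 01 02 03 04 05 06 07 08 09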

You could increase the number of minor compaction threads, assuming you
have enough disks to support multiple writers.
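
In 1.4 that is tserver.compaction.minor.concurrent.max; for example, from
the shell (the value here is only an illustration, size it to your disks):

  root@instance> config -s tserver.compaction.minor.concurrent.max=8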

Look for patterns: you could have an ingest hot-spot, or a slow disk/node.

-Eric



On Tue, Jun 25, 2013 at 7:06 PM, Hider, Sandy <Sa...@jhuapl.edu> wrote:

> I recently set up Accumulo 1.4.2 on a rack of boxes, each with 24
> processors and 43 GB of RAM.  I set them up using the 3GB example
> templates but then increased the max size of the Tserver and a few other
> components to 5GB.
>
> I am doing some initial tests importing roughly 7,000 records; each
> record has approximately 7 small fields and 1 large field holding between
> 200 KB and 1 MB of data.  After 2,000-3,000 records I see the server hold
> writes and start minor compactions, which seem to take quite a while, and
> then they occur again fairly frequently.
>
> I am wondering what options I have to minimize the frequency of minor
> compactions during ingest.  Which component memory sizes and config
> properties would help me avoid this problem?  If anyone has other ideas
> for me to try, please let me know.
>
> Thanks in advance,
>
> Sandy
>
>
>
>

Re: Minor compaction occurring often with fairly long delays during ingest.

Posted by Keith Turner <ke...@deenlo.com>.
It's possible that you are running into ACCUMULO-1062 [1], although I
suspect your large values may be the issue.  Eric's suggestion of having
more tablets can help work around this, in addition to allowing more
minc parallelism.

[1] : https://issues.apache.org/jira/browse/ACCUMULO-1062
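
If you can't predict split points up front, one rough way to end up with
more tablets (table name and threshold below are placeholders) is to lower
the table's split threshold so it splits sooner as data arrives:

  root@instance> config -t mytable -s table.split.threshold=256M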


On Tue, Jun 25, 2013 at 7:06 PM, Hider, Sandy <Sa...@jhuapl.edu> wrote:

> I recently set up Accumulo 1.4.2 on a rack of boxes, each with 24
> processors and 43 GB of RAM.  I set them up using the 3GB example
> templates but then increased the max size of the Tserver and a few other
> components to 5GB.
>
> I am doing some initial tests importing roughly 7,000 records; each
> record has approximately 7 small fields and 1 large field holding between
> 200 KB and 1 MB of data.  After 2,000-3,000 records I see the server hold
> writes and start minor compactions, which seem to take quite a while, and
> then they occur again fairly frequently.
>
> I am wondering what options I have to minimize the frequency of minor
> compactions during ingest.  Which component memory sizes and config
> properties would help me avoid this problem?  If anyone has other ideas
> for me to try, please let me know.
>
> Thanks in advance,
>
> Sandy
>
>
>
>