Posted to dev@accumulo.apache.org by "Eric Newton (JIRA)" <ji...@apache.org> on 2012/04/30 18:35:48 UTC

[jira] [Resolved] (ACCUMULO-294) tablet servers are losing zookeeper locks due to garbage collection even when there is lots of free memory

     [ https://issues.apache.org/jira/browse/ACCUMULO-294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Newton resolved ACCUMULO-294.
----------------------------------

    Resolution: Not A Problem
    
> tablet servers are losing zookeeper locks due to garbage collection even when there is lots of free memory
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: ACCUMULO-294
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-294
>             Project: Accumulo
>          Issue Type: Bug
>          Components: tserver
>    Affects Versions: 1.3.5
>         Environment: tablet servers on a large cluster are losing their locks
>            Reporter: Eric Newton
>            Assignee: Eric Newton
>            Priority: Minor
>
> Five tablet servers stopped on a large cluster.  Each server had lost its lock because its ZooKeeper session timed out; the ZooKeeper timeout is set to 40 seconds.  In every case, the lost lock was preceded by the ejection of blocks from the block cache and by a garbage collection that recovered more than 4G of memory.  The tablet servers were running with 8G, and generally had 4G free.  Very little time was attributed to garbage collection, at least according to the debug log.  The in-memory map is small (256M) and uses the native version.  Will experiment with more aggressive concurrent GC settings, changing:
> {noformat}
> -XX:CMSInitiatingOccupancyFraction=75
> {noformat}
> to
> {noformat}
> -XX:CMSInitiatingOccupancyFraction=60
> {noformat}
> ZooKeeper has already been configured with this:
> {noformat}
> globalOutstandingLimit=10000
> {noformat}
> That setting helped enormously.  Each ZooKeeper server has between 500 and 1700 clients.
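
The report notes that the debug log attributed very little time to garbage collection even though the session timed out. One way to investigate a gap like that is to compare wall-clock stalls against the GC time the JVM itself reports. The following is a minimal sketch of that idea, not code from Accumulo: the class name `PauseMonitorSketch`, its method names, and the 500 ms probe interval are hypothetical, and treating "pause at least as long as the session timeout" as the danger threshold is a simplification of how ZooKeeper expires sessions.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class PauseMonitorSketch {
    // Assumption: mirrors the 40-second ZooKeeper session timeout mentioned in the report.
    static final long SESSION_TIMEOUT_MS = 40_000;

    /** Sleeps for sleepMs and returns how many extra milliseconds passed beyond the request. */
    public static long measureOverrunMs(long sleepMs) throws InterruptedException {
        long start = System.nanoTime();
        Thread.sleep(sleepMs);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        // A large overrun means the whole process stalled (GC, swapping, CPU starvation, ...).
        return Math.max(0, elapsedMs - sleepMs);
    }

    /** Sums the collection time the JVM attributes to all garbage collectors so far. */
    public static long totalGcTimeMs() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // returns -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Simplified check: a stall at least as long as the timeout puts the session at risk. */
    public static boolean risksSessionLoss(long pauseMs, long timeoutMs) {
        return pauseMs >= timeoutMs;
    }

    public static void main(String[] args) throws InterruptedException {
        long gcBefore = totalGcTimeMs();
        long overrunMs = measureOverrunMs(500);
        long gcDeltaMs = totalGcTimeMs() - gcBefore;
        // If overrunMs is large while gcDeltaMs stays small, the stall was not reported as GC time.
        System.out.println("overrun=" + overrunMs + "ms gc=" + gcDeltaMs + "ms atRisk="
                + risksSessionLoss(overrunMs, SESSION_TIMEOUT_MS));
    }
}
```

Running the probe in a loop on a background thread would show whether long stalls line up with reported GC time or come from somewhere else on the host.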
