Posted to dev@oozie.apache.org by "Ryota Egashira (JIRA)" <ji...@apache.org> on 2015/04/16 01:08:58 UTC

[jira] [Updated] (OOZIE-2206) Change Reaper mode on ChildReaper in ZKLocksService

     [ https://issues.apache.org/jira/browse/OOZIE-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryota Egashira updated OOZIE-2206:
----------------------------------
    Description: 
OOZIE-1906 added a znode cleanup thread.
It currently passes Reaper.Mode.REAP_INDEFINITELY, which forces the Oozie server to keep trying to reap a znode even after that znode has been cleaned up (see https://github.com/apache/curator/blob/master/curator-recipes/src/main/java/org/apache/curator/framework/recipes/locks/Reaper.java).
This adds memory pressure on the Oozie server. We need to change the mode to REAP_UNTIL_GONE or REAP_UNTIL_DELETE.
  
{code}
reaper = new ChildReaper(zk.getClient(), LOCKS_NODE, Reaper.Mode.REAP_INDEFINITELY, getExecutorService(), ConfigurationService.getInt(services.getConf(), REAPING_THRESHOLD) * 1000, REAPING_LEADER_PATH);
{code}

We hit one scenario where a ZK quorum slowed down for a short period, so many ZK locks were not released properly. ChildReaper (which runs every 5 minutes) ran right afterwards and has kept checking that list of znodes ever since. In the end, the Oozie server hit OOM.
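
To illustrate why the mode matters, here is a minimal in-memory sketch of how the three Reaper modes differ in what they keep tracking. It is a simplified model, not Curator's implementation: the class ReaperModeSketch, the method trackedAfter, the /locks/job-N paths, and the plain Java sets standing in for ZooKeeper and for the reaper's internal path list are all assumptions made for illustration. The empty-node threshold and the scheduling logic are left out.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Set;

/** Simplified in-memory model of the three reaper modes (illustration only). */
final class ReaperModeSketch {

    enum Mode { REAP_INDEFINITELY, REAP_UNTIL_DELETE, REAP_UNTIL_GONE }

    /**
     * Registers lockCount lock paths with the model reaper, of which the first
     * deletedElsewhere were already removed by another process, runs the given
     * number of reaping passes, and returns how many paths are still tracked.
     */
    static int trackedAfter(Mode mode, int lockCount, int deletedElsewhere, int passes) {
        Set<String> existing = new HashSet<>();      // stand-in for znodes in ZooKeeper
        Set<String> tracked = new LinkedHashSet<>(); // stand-in for the reaper's path list
        for (int i = 0; i < lockCount; i++) {
            String path = "/locks/job-" + i;         // hypothetical path names
            tracked.add(path);
            if (i >= deletedElsewhere) {
                existing.add(path);
            }
        }
        for (int pass = 0; pass < passes; pass++) {
            Iterator<String> it = tracked.iterator();
            while (it.hasNext()) {
                String path = it.next();
                // The model reaper deletes the node if it still exists.
                boolean deletedByReaper = existing.remove(path);
                boolean stopTracking;
                switch (mode) {
                    case REAP_UNTIL_DELETE:
                        // Stop only when this reaper performed the delete itself.
                        stopTracking = deletedByReaper;
                        break;
                    case REAP_UNTIL_GONE:
                        // Stop once the node is gone, whoever deleted it.
                        stopTracking = true;
                        break;
                    default:
                        // REAP_INDEFINITELY: never stop tracking the path.
                        stopTracking = false;
                }
                if (stopTracking) {
                    it.remove();
                }
            }
        }
        return tracked.size();
    }

    public static void main(String[] args) {
        for (Mode mode : Mode.values()) {
            System.out.println(mode + " still tracks " + trackedAfter(mode, 1000, 100, 3) + " paths");
        }
        // prints:
        // REAP_INDEFINITELY still tracks 1000 paths
        // REAP_UNTIL_DELETE still tracks 100 paths
        // REAP_UNTIL_GONE still tracks 0 paths
    }
}
```

In this model only REAP_UNTIL_GONE drops every path once its node no longer exists, while REAP_UNTIL_DELETE keeps the paths that something else deleted. Which of the two suits ZKLocksService is the open question of this issue.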


  was:
OOZIE-1906 added a znode cleanup thread.
It currently passes Reaper.Mode.REAP_INDEFINITELY, which forces the Oozie server to keep trying to reap a znode even after that znode has been cleaned up. This adds memory pressure on the Oozie server. We need to change the mode to REAP_UNTIL_GONE or REAP_UNTIL_DELETE.
  
{code}
reaper = new ChildReaper(zk.getClient(), LOCKS_NODE, Reaper.Mode.REAP_INDEFINITELY, getExecutorService(), ConfigurationService.getInt(services.getConf(), REAPING_THRESHOLD) * 1000, REAPING_LEADER_PATH);
{code}

We hit one scenario where a ZK quorum slowed down for a short period, so many ZK locks were not released properly. ChildReaper (which runs every 5 minutes) ran right afterwards and has kept checking that list of znodes ever since. In the end, the Oozie server hit OOM.



> Change Reaper mode on ChildReaper in ZKLocksService
> ---------------------------------------------------
>
>                 Key: OOZIE-2206
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2206
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: Ryota Egashira
>
> OOZIE-1906 added a znode cleanup thread.
> It currently passes Reaper.Mode.REAP_INDEFINITELY, which forces the Oozie server to keep trying to reap a znode even after that znode has been cleaned up (see https://github.com/apache/curator/blob/master/curator-recipes/src/main/java/org/apache/curator/framework/recipes/locks/Reaper.java).
> This adds memory pressure on the Oozie server. We need to change the mode to REAP_UNTIL_GONE or REAP_UNTIL_DELETE.
>   
> {code}
> reaper = new ChildReaper(zk.getClient(), LOCKS_NODE, Reaper.Mode.REAP_INDEFINITELY, getExecutorService(), ConfigurationService.getInt(services.getConf(), REAPING_THRESHOLD) * 1000, REAPING_LEADER_PATH);
> {code}
> We hit one scenario where a ZK quorum slowed down for a short period, so many ZK locks were not released properly. ChildReaper (which runs every 5 minutes) ran right afterwards and has kept checking that list of znodes ever since. In the end, the Oozie server hit OOM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)