Posted to commits@samza.apache.org by "Shanthoosh Venkataraman (JIRA)" <ji...@apache.org> on 2017/04/13 03:50:41 UTC

[jira] [Created] (SAMZA-1209) Improve error handling in LocalStoreMonitor

Shanthoosh Venkataraman created SAMZA-1209:
----------------------------------------------

             Summary: Improve error handling in LocalStoreMonitor 
                 Key: SAMZA-1209
                 URL: https://issues.apache.org/jira/browse/SAMZA-1209
             Project: Samza
          Issue Type: Improvement
    Affects Versions: 0.13.0
            Reporter: Shanthoosh Venkataraman
            Assignee: Shanthoosh Venkataraman
            Priority: Minor


Unused (and possibly stale) local state left behind by dead Samza containers in the YARN execution environment is garbage collected by LocalStoreMonitor.

If an error occurs while garbage collecting the state created by a job, the current implementation throws an exception and exits.

One failure scenario: in some cases LocalStoreMonitor cannot retrieve the job model (due to a network error, or because the job is invalid).

To improve this, add an opt-in configuration that, when enabled, continues cleaning up the local state of other dead containers when a garbage collection fails.
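
A rough sketch of the proposed behavior, for illustration only. The class name, method name, and the ignoreFailures flag are hypothetical and are not taken from the Samza code base; the per-job cleanup is passed in as a callback so the sketch stays self-contained. With the flag off, the first failure aborts the run (the current behavior described above); with it on, the failure is recorded and cleanup continues with the remaining jobs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration; not the actual LocalStoreMonitor implementation.
final class StoreCleanupSketch {

    /**
     * Runs the given cleanup callback once per job.
     *
     * @param jobs           identifiers of the jobs whose local state should be cleaned
     * @param cleaner        callback that garbage collects the state of one job;
     *                       it may throw, e.g. when the job model cannot be fetched
     * @param ignoreFailures hypothetical opt-in flag: when true, a failure for one
     *                       job does not stop cleanup of the others
     * @return the jobs whose cleanup failed (always empty when ignoreFailures is false,
     *         because the first failure is rethrown instead)
     */
    static List<String> cleanAll(List<String> jobs,
                                 Consumer<String> cleaner,
                                 boolean ignoreFailures) {
        List<String> failed = new ArrayList<>();
        for (String job : jobs) {
            try {
                cleaner.accept(job);
            } catch (RuntimeException e) {
                if (!ignoreFailures) {
                    // Current behavior: the first failure ends the whole run.
                    throw e;
                }
                // Opt-in behavior: remember the failure and move on.
                failed.add(job);
            }
        }
        return failed;
    }
}
```

Returning the list of failed jobs lets the caller log or retry them later, instead of silently dropping the errors when the opt-in flag is enabled.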



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)