Posted to dev@gobblin.apache.org by "Sudarshan Vasudevan (Jira)" <ji...@apache.org> on 2020/03/26 23:00:00 UTC
[jira] [Created] (GOBBLIN-1099) Handle orphaned Yarn containers in Gobblin-on-Yarn clusters
Sudarshan Vasudevan created GOBBLIN-1099:
--------------------------------------------
Summary: Handle orphaned Yarn containers in Gobblin-on-Yarn clusters
Key: GOBBLIN-1099
URL: https://issues.apache.org/jira/browse/GOBBLIN-1099
Project: Apache Gobblin
Issue Type: Improvement
Components: gobblin-yarn
Affects Versions: 0.15.0
Reporter: Sudarshan Vasudevan
Assignee: Abhishek Tiwari
Fix For: 0.15.0
A Yarn application may leave behind orphaned containers, for example when node managers are lost. These orphaned containers can continue to run (potentially forever) as participants in the Helix cluster. This can cause the following problems for a Gobblin-on-Yarn application:
# Double publish of data and commit of state
# Task failures and partition starvation during application restarts, as Helix may assign tasks to orphaned containers that hold stale state and configuration
# Container failures on application restarts due to Helix instance name collisions with orphaned containers
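One possible way to spot such containers, sketched here purely as an illustration and not as the fix adopted for this issue, is to compare the Helix participants that are currently live against the instance names the running application knows it launched. Any live participant the application does not recognize is a candidate orphan. The class name, method name, and instance names below are hypothetical, and the sketch uses plain Java sets rather than the Helix or Yarn APIs.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: not part of Gobblin, Helix, or Yarn.
public class OrphanedContainerCheck {

    // Returns the live Helix instance names that the current application
    // attempt did not launch. Both input sets are assumed to be collected
    // elsewhere (for example from the Helix cluster and from the
    // application's own bookkeeping).
    static Set<String> findOrphans(Set<String> liveHelixInstances,
                                   Set<String> knownContainerInstances) {
        // TreeSet gives a stable, sorted result for logging.
        Set<String> orphans = new TreeSet<>(liveHelixInstances);
        orphans.removeAll(knownContainerInstances);
        return orphans;
    }

    public static void main(String[] args) {
        // Made-up instance names, for illustration only.
        Set<String> live = new HashSet<>(Arrays.asList(
                "runner_container_01", "runner_container_02", "runner_container_03"));
        Set<String> known = new HashSet<>(Arrays.asList(
                "runner_container_01", "runner_container_03"));

        System.out.println("Candidate orphans: " + findOrphans(live, known));
    }
}
```

A check like this only flags candidates. Deciding whether to disable, drop, or kill a flagged participant is a separate question that this sketch does not address.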
--
This message was sent by Atlassian Jira
(v8.3.4#803005)