Posted to dev@tez.apache.org by GitBox <gi...@apache.org> on 2020/05/14 08:35:35 UTC

[GitHub] [tez] prasanthj opened a new pull request #66: TEZ-4179: [Kubernetes] Extend NodeId in tez to support unique worker identity

prasanthj opened a new pull request #66:
URL: https://github.com/apache/tez/pull/66


   In a Kubernetes environment, where pods can have the same hostname and port, node trackers can end up retaining an old instance of a pod in their cache. In Hive LLAP, the LLAP Tez task scheduler maintains node membership based on ZooKeeper registry events, so a NODE_ADDED event followed by a NODE_REMOVED event can remove the node/host from the node trackers because the hostname and service port are stable. The NODE_REMOVED event in this case is a stale event for the already-dead pod, which ZK sends only after the session timeout (in the case of a non-graceful shutdown). If this sequence of events happens, a node/host is completely lost from the scheduler's perspective.
   
   To support this scenario, Tez can extend YARN's NodeId to include a uniqueIdentifier. The LLAP task scheduler can then construct the container object with this new NodeId, so that stale events like the one above remove only the host/node that matches the old uniqueIdentifier.
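
   The idea can be illustrated with a minimal, self-contained sketch. It is not the code from this pull request: `WorkerNodeId`, `NodeTracker`, and their methods are hypothetical names, and the sketch deliberately avoids the real YARN `NodeId` class. It only assumes that a tracker keyed by host and port compares the unique identifier before honouring a removal.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for an extended NodeId: host and port plus a
// per-instance identifier (for example, something unique to each pod).
final class WorkerNodeId {
    final String host;
    final int port;
    final String uniqueIdentifier;

    WorkerNodeId(String host, int port, String uniqueIdentifier) {
        this.host = host;
        this.port = port;
        this.uniqueIdentifier = uniqueIdentifier;
    }

    // Key shared by every instance that reuses the same stable host and port.
    String hostPort() {
        return host + ":" + port;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof WorkerNodeId)) return false;
        WorkerNodeId other = (WorkerNodeId) o;
        return port == other.port
            && host.equals(other.host)
            && uniqueIdentifier.equals(other.uniqueIdentifier);
    }

    @Override
    public int hashCode() {
        return Objects.hash(host, port, uniqueIdentifier);
    }
}

// Hypothetical node tracker keyed by host:port, as a scheduler might keep.
final class NodeTracker {
    private final Map<String, WorkerNodeId> nodes = new HashMap<>();

    // NODE_ADDED: the newest instance replaces any older one on the same host:port.
    void nodeAdded(WorkerNodeId id) {
        nodes.put(id.hostPort(), id);
    }

    // NODE_REMOVED: Map.remove(key, value) removes the entry only if the tracked
    // instance equals the one named in the event, so a stale event carrying an
    // old uniqueIdentifier leaves the live node in place.
    boolean nodeRemoved(WorkerNodeId id) {
        return nodes.remove(id.hostPort(), id);
    }

    boolean isTracked(String host, int port) {
        return nodes.containsKey(host + ":" + port);
    }
}
```

   With this shape, a late NODE_REMOVED for a dead pod is ignored once a replacement pod with the same host and port has registered, while a removal that names the live instance still takes effect.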


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tez] jteagles closed pull request #66: TEZ-4179: [Kubernetes] Extend NodeId in tez to support unique worker identity

Posted by GitBox <gi...@apache.org>.
jteagles closed pull request #66:
URL: https://github.com/apache/tez/pull/66


   





[GitHub] [tez] jteagles commented on pull request #66: TEZ-4179: [Kubernetes] Extend NodeId in tez to support unique worker identity

Posted by GitBox <gi...@apache.org>.
jteagles commented on pull request #66:
URL: https://github.com/apache/tez/pull/66#issuecomment-745490694


   TEZ-4179 was merged via apache gitbox. Closing.

