Posted to commits@samza.apache.org by GitBox <gi...@apache.org> on 2020/12/07 15:50:08 UTC

[GitHub] [samza] mynameborat commented on pull request #1452: SAMZA-2611: [AM-HA] heartbeat reestablish causes container's heartbeat thread to die

mynameborat commented on pull request #1452:
URL: https://github.com/apache/samza/pull/1452#issuecomment-740003171


   Symptom: When the new AM takes a long time to start up, the heartbeat thread of an already running container silently dies and makes no heartbeat requests to the new AM.
   
   Cause: The AM URL key-value pair (yarn.am.tracking.url) is removed from the coordinator stream while the new AM is starting up, because this config is present in the old config (i.e. the coordinator stream) but not in the config generated by the new AM. The running container, which is constantly fetching the value for this key, therefore gets a null and throws an NPE.
   
   Changes: When AM-HA is enabled, do not remove this config.
   
   Tests: Works with hello-samza. I am trying to write a unit test, but CoordinatorStreamUtil is tricky to mock and to inject dependencies into.
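
For illustration, here is a minimal sketch of the behavior described under Cause and Changes. It is not the code in this PR: the class `StaleConfigPruner`, the method `keysToDelete`, and the `amHaEnabled` flag are hypothetical names, and the real logic in CoordinatorStreamUtil may be structured differently. The sketch only assumes that stale keys are computed as "present in the old config but absent from the new one", as the comment above states.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Samza: models which keys would be
// deleted from the coordinator stream when a new AM writes its config.
final class StaleConfigPruner {
    // Config key named in the PR comment.
    static final String AM_TRACKING_URL_KEY = "yarn.am.tracking.url";

    private StaleConfigPruner() { }

    // Returns the keys that exist in oldConfig but not in newConfig.
    // When amHaEnabled is true, the AM tracking URL is never treated as
    // stale, so running containers keep reading a non-null value until
    // the new AM publishes its own URL.
    static Set<String> keysToDelete(Map<String, String> oldConfig,
                                    Map<String, String> newConfig,
                                    boolean amHaEnabled) {
        Set<String> stale = new HashSet<>(oldConfig.keySet());
        stale.removeAll(newConfig.keySet());
        if (amHaEnabled) {
            stale.remove(AM_TRACKING_URL_KEY);
        }
        return stale;
    }

    public static void main(String[] args) {
        Map<String, String> oldConfig = new HashMap<>();
        oldConfig.put(AM_TRACKING_URL_KEY, "http://old-am.example:1234"); // made-up URL
        oldConfig.put("obsolete.key", "x");                               // made-up key
        Map<String, String> newConfig = new HashMap<>();

        System.out.println(keysToDelete(oldConfig, newConfig, true));
        System.out.println(keysToDelete(oldConfig, newConfig, false));
    }
}
```

With the flag off, the tracking URL lands in the delete set, which matches the failure described under Cause. With the flag on, only the genuinely obsolete key is removed.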


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org