Posted to issues@bookkeeper.apache.org by GitBox <gi...@apache.org> on 2018/08/22 00:21:48 UTC

[GitHub] reddycharan edited a comment on issue #1608: Issue 1578: Fixed deadlock in auditor blocking ZK thread

URL: https://github.com/apache/bookkeeper/pull/1608#issuecomment-414863834
 
 
   @sijie, from @merlimat's description:
   
   > After getting the ZK callback from the ZK event thread, we need to jump to a background thread before doing the synchronous call to admin.openLedgerNoRecovery(ledgerId); which will try to make a ZK request and wait for a response (which would come through the same ZK event thread that is currently blocked).
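
A minimal sketch of the pattern described in the quote above, using only `java.util.concurrent` and no ZooKeeper or BookKeeper classes. The single-threaded executor stands in for the ZK event thread, and `blockingOpen` is a hypothetical stand-in for a synchronous call such as `admin.openLedgerNoRecovery(ledgerId)`; neither is the real API. A timeout is used so the demo terminates; without it, the first case would hang indefinitely.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class EventThreadDeadlockSketch {

    // Hypothetical stand-in for a synchronous call: it submits a "request"
    // whose response is delivered on the event thread, then blocks until the
    // response arrives or the timeout expires.
    static String blockingOpen(ExecutorService eventThread, long timeoutMs) throws Exception {
        CompletableFuture<String> response = new CompletableFuture<>();
        eventThread.execute(() -> response.complete("ledger-handle"));
        return response.get(timeoutMs, TimeUnit.MILLISECONDS);
    }

    // Calls blockingOpen directly from the event thread. The response is
    // queued behind the caller, so the only thread that could deliver it is
    // the one waiting for it.
    static boolean callFromEventThreadTimesOut() throws Exception {
        ExecutorService eventThread = Executors.newSingleThreadExecutor();
        try {
            CompletableFuture<Boolean> timedOut = new CompletableFuture<>();
            eventThread.execute(() -> {
                try {
                    blockingOpen(eventThread, 200);
                    timedOut.complete(false);
                } catch (TimeoutException e) {
                    timedOut.complete(true);
                } catch (Exception e) {
                    timedOut.completeExceptionally(e);
                }
            });
            return timedOut.get(5, TimeUnit.SECONDS);
        } finally {
            eventThread.shutdownNow();
        }
    }

    // The callback hands the blocking work to a background thread and returns
    // immediately, leaving the event thread free to deliver the response.
    static String callFromBackgroundThread() throws Exception {
        ExecutorService eventThread = Executors.newSingleThreadExecutor();
        ExecutorService background = Executors.newSingleThreadExecutor();
        try {
            CompletableFuture<String> result = new CompletableFuture<>();
            eventThread.execute(() ->
                background.execute(() -> {
                    try {
                        result.complete(blockingOpen(eventThread, 2000));
                    } catch (Exception e) {
                        result.completeExceptionally(e);
                    }
                }));
            return result.get(5, TimeUnit.SECONDS);
        } finally {
            background.shutdownNow();
            eventThread.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("blocked on event thread: " + callFromEventThreadTimesOut());
        System.out.println("via background thread: " + callFromBackgroundThread());
    }
}
```

In this sketch the only difference between the two cases is which thread performs the blocking wait, which is the change the quoted description asks for.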
   
   I understood this to mean that "admin.openLedgerNoRecovery" https://github.com/apache/bookkeeper/commit/f782a9d818a12479d08c580a68b2566715da4c89#diff-7525f06ad3a1ad0a00a462df4deb4698L645 will block consistently. That's why I was wondering how we have been OK so far (it has been 5 years since https://github.com/apache/bookkeeper/commit/005b62cc60093dd5b32d4abecd06c2e441bc62ae was introduced), since the ZK thread deadlock will eventually lead to the Auditor being non-functional.
   
   If you are saying that we run into this issue only because of a race condition in the ZK library, then it makes some sense why the issue has not been fully identified so far. That being said, I'm just wondering, at a very high level, how likely it is to get into this ZK thread deadlock. Since it effectively makes the Auditor non-functional, I would like to ascertain how vulnerable we have been so far.
   
   > The race condition can happen at any "checkAllLedgers" run, not necessarily the first one. If you look into the code, a new ZooKeeper client is established for each Auditor checkAllLedgers, so the race condition can happen at any checkAllLedgers run. But once it is blocked, no future checkAllLedgers will be run.
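
To illustrate the last sentence of the quote, here is a rough sketch of a periodic task on a single-threaded scheduler where one run blocks forever. It assumes, purely for illustration, that the periodic check is driven by a single-threaded `ScheduledExecutorService`; the Auditor's actual scheduling code is not shown in this thread, and `runsStarted` and its parameters are hypothetical names.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BlockedPeriodicTaskSketch {

    // Schedules a periodic task and lets the run numbered blockOnRun wait on a
    // latch that is never released (a stand-in for a deadlocked check).
    // Returns how many runs had started after observeMs milliseconds.
    static int runsStarted(int blockOnRun, long periodMs, long observeMs)
            throws InterruptedException {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger started = new AtomicInteger();
        CountDownLatch neverReleased = new CountDownLatch(1);
        try {
            scheduler.scheduleAtFixedRate(() -> {
                int run = started.incrementAndGet();
                if (run == blockOnRun) {
                    try {
                        neverReleased.await(); // this run never finishes on its own
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            }, 0, periodMs, TimeUnit.MILLISECONDS);
            Thread.sleep(observeMs);
            return started.get();
        } finally {
            scheduler.shutdownNow(); // interrupts the blocked run so the demo exits
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // The third run blocks, so no fourth run is ever started.
        System.out.println("runs started: " + runsStarted(3, 10, 500));
    }
}
```

Under that assumption, the run that blocks does not have to be the first one, but every run scheduled after it is starved, which matches the behaviour described in the quote.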
