Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2022/10/26 15:15:21 UTC

[GitHub] [flink] XComp commented on pull request #21137: [FLINK-29234][runtime] JobMasterServiceLeadershipRunner handle leader event in a separate executor to avoid dead lock

XComp commented on PR #21137:
URL: https://github.com/apache/flink/pull/21137#issuecomment-1292206053

   Thanks @reswqa for this PR. I'm wondering how handling the leadership grant/revocation from within another thread would help fix the issue. The locks might still be acquired concurrently in opposite orders, leading to the same deadlock.
   
   The use case described in FLINK-29234 essentially happens because the Dispatcher is stopped (which, as a consequence, stops the `JobMasterServiceLeadershipRunner`) while the `JobMasterServiceLeadershipRunner` is being granted leadership, causing the locks to be acquired in opposite orders.
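   
   To illustrate the concern, here is a minimal, self-contained sketch of two paths taking two locks in opposite orders. `LockOrderSketch`, `lockA`, `lockB`, `lockInOrder` and `wouldDeadlock` are hypothetical names for this illustration only, not the actual Flink fields or methods. The grant path runs on its own thread, the way a separate executor would run it, and the second acquisition uses `tryLock` with a timeout so the sketch reports the cycle instead of hanging:

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration; lockA/lockB stand in for the two locks involved.
class LockOrderSketch {

    // Takes `first`, then tries `second` with a timeout instead of blocking forever.
    // Returns true only if `second` could be acquired.
    static boolean lockInOrder(ReentrantLock first, ReentrantLock second, CyclicBarrier barrier) {
        first.lock();
        try {
            barrier.await(); // both paths now hold their first lock
            boolean gotSecond = second.tryLock(100, TimeUnit.MILLISECONDS);
            if (gotSecond) {
                second.unlock();
            }
            barrier.await(); // keep `first` held until both attempts have finished
            return gotSecond;
        } catch (InterruptedException | BrokenBarrierException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            first.unlock();
        }
    }

    // Returns true if neither path could get its second lock,
    // i.e. plain lock() calls would have blocked forever.
    static boolean wouldDeadlock() {
        ReentrantLock lockA = new ReentrantLock();
        ReentrantLock lockB = new ReentrantLock();
        CyclicBarrier barrier = new CyclicBarrier(2);
        AtomicBoolean stopGotSecond = new AtomicBoolean(true);
        AtomicBoolean grantGotSecond = new AtomicBoolean(true);

        // "Stop" path: A then B. "Grant" path on its own thread: B then A.
        Thread stopPath = new Thread(() -> stopGotSecond.set(lockInOrder(lockA, lockB, barrier)));
        Thread grantPath = new Thread(() -> grantGotSecond.set(lockInOrder(lockB, lockA, barrier)));
        stopPath.start();
        grantPath.start();
        try {
            stopPath.join();
            grantPath.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !stopGotSecond.get() && !grantGotSecond.get();
    }

    public static void main(String[] args) {
        System.out.println("would deadlock: " + wouldDeadlock());
    }
}
```

   In this sketch both `tryLock` attempts time out, so `wouldDeadlock()` returns `true`. Which thread runs the grant path makes no difference as long as the acquisition order stays inverted.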
   
   I think the problem is that we're still trying to acquire the lock in [JobMasterServiceLeadershipRunner#runIfStateRunning:453](https://github.com/apache/flink/blob/bfe4f9cc3d67d37a2258ab4226d70b6a7d24f22c/flink-runtime/src/main/java/org/apache/flink/runtime/jobmaster/JobMasterServiceLeadershipRunner.java#L453) even though the `JobMasterServiceLeadershipRunner` has already switched to the `STOPPED` state. I'm wondering whether we could make `JobMasterServiceLeadershipRunner#state` volatile and check that the instance is in the `RUNNING` state outside of the lock. But this wouldn't solve the issue entirely because there's still a slight chance that the state changes after the state check has passed but before the lock is entered... :thinking: 
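   
   A rough sketch of that idea and of the remaining window, under simplifying assumptions: `VolatileStateCheckSketch`, the `Outcome` enum and the `betweenCheckAndLock` hook are hypothetical and exist only to make the window reproducible on a single thread; the method signature here is not the real one.

```java
// Hypothetical illustration of a volatile state check outside the lock.
class VolatileStateCheckSketch {

    enum State { RUNNING, STOPPED }

    enum Outcome { SKIPPED_WITHOUT_LOCK, RAN_UNDER_LOCK, LOCKED_AFTER_STOP }

    private final Object lock = new Object();
    private volatile State state = State.RUNNING;

    void stop() {
        synchronized (lock) {
            state = State.STOPPED;
        }
    }

    // `betweenCheckAndLock` simulates whatever another thread does in the
    // window between the unlocked check and the lock acquisition.
    Outcome runIfStateRunning(Runnable action, Runnable betweenCheckAndLock) {
        if (state != State.RUNNING) {
            return Outcome.SKIPPED_WITHOUT_LOCK; // fast path: the lock is never touched
        }
        betweenCheckAndLock.run();
        synchronized (lock) {
            if (state != State.RUNNING) {
                // The lock was still acquired although the runner is stopped.
                return Outcome.LOCKED_AFTER_STOP;
            }
            action.run();
            return Outcome.RAN_UNDER_LOCK;
        }
    }

    public static void main(String[] args) {
        VolatileStateCheckSketch runner = new VolatileStateCheckSketch();
        // A stop that lands inside the window still leads to a lock acquisition.
        System.out.println(runner.runIfStateRunning(() -> {}, runner::stop));
        // Once stopped, later calls skip the lock entirely.
        System.out.println(runner.runIfStateRunning(() -> {}, () -> {}));
    }
}
```

   The fast path avoids the lock once the state is `STOPPED`, but the `LOCKED_AFTER_STOP` outcome shows that a stop landing inside the window still results in a lock acquisition, so the opposite-order scenario is narrowed rather than removed.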

