Posted to issues@camel.apache.org by "Claus Ibsen (Jira)" <ji...@apache.org> on 2021/02/01 14:11:00 UTC

[jira] [Updated] (CAMEL-15903) Master component do not retry endpoint startup on failure

     [ https://issues.apache.org/jira/browse/CAMEL-15903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Claus Ibsen updated CAMEL-15903:
--------------------------------
    Fix Version/s:     (was: 3.8.0)
                   3.9.0

> Master component do not retry endpoint startup on failure
> ---------------------------------------------------------
>
>                 Key: CAMEL-15903
>                 URL: https://issues.apache.org/jira/browse/CAMEL-15903
>             Project: Camel
>          Issue Type: Bug
>          Components: camel-master
>            Reporter: EDGAR CHERNICK
>            Priority: Major
>             Fix For: 3.9.0
>
>
> The cluster view implementations have a listener attribute where the master component hooks itself to receive leadership change events. 
> When the app instance becomes leader, the cluster view marks that instance as leader and then fires the leadership changed event. This triggers the master component's event handler, which starts the delegated consumer and endpoint.
> The issue happens when the delegated consumer or endpoint fails to start. The exception thrown by them propagates up the stack but does not affect the leadership, i.e., once the app instance becomes leader it stays leader even if the delegated components fail to start.
> Both KubernetesClusterView and FileLockClusterView have this issue.
> KubernetesClusterView uses KubernetesLeadershipController to run the leadership check at an interval. When it acquires the leadership, it updates the ConfigMap with that info and calls the TimedLeaderNotifier refreshLeadership method to check whether the leadership has changed. The issue here is that it marks itself as leader before firing the leadership changed event. Another issue is that the event is fired in a separate thread, so when the start of the delegated components fails, the exception "dies" together with the thread. When the next scheduled leadership check runs, the app instance is already the leader, so it does not fire the leadership changed event and the delegated component never starts.
> FileLockClusterView has a similar issue: it acquires the file lock before firing the event, and even if the event processing fails it does not roll back the leader selection.
> Other cluster view implementations might have the same issue.
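
The failure mode described above can be reproduced with a small, self-contained simulation. This is only a sketch under stated assumptions: LeadershipSketch, LeadershipListener, check, simulate and the rollbackOnFailure flag are hypothetical names invented for illustration, not Camel APIs. Rolling back leadership on failure is one possible remedy, not necessarily the fix applied for CAMEL-15903.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical simulation only: none of these types are Camel APIs.
class LeadershipSketch {

    /** Stands in for the master component starting its delegated consumer. */
    interface LeadershipListener {
        void onLeadershipAcquired() throws Exception;
    }

    private final AtomicBoolean leader = new AtomicBoolean(false);
    private final ExecutorService notifier = Executors.newSingleThreadExecutor();
    private final LeadershipListener listener;
    private final boolean rollbackOnFailure;

    LeadershipSketch(LeadershipListener listener, boolean rollbackOnFailure) {
        this.listener = listener;
        this.rollbackOnFailure = rollbackOnFailure;
    }

    /** One scheduled leadership check; returns null when no event is fired. */
    Future<?> check() {
        // The instance is marked as leader BEFORE the listener is notified.
        if (!leader.compareAndSet(false, true)) {
            return null; // already leader, so the event is not fired again
        }
        // The event is delivered on a separate thread.
        return notifier.submit(() -> {
            try {
                listener.onLeadershipAcquired();
            } catch (Exception e) {
                if (rollbackOnFailure) {
                    leader.set(false); // assumed remedy: give up leadership
                }
                // Without the rollback the failure is dropped on this thread,
                // much as an uncaught exception vanishes with the thread.
            }
        });
    }

    /** Runs the given number of checks and reports whether the delegate started. */
    static boolean simulate(boolean rollbackOnFailure, int checks) {
        AtomicInteger attempts = new AtomicInteger();
        AtomicBoolean started = new AtomicBoolean(false);
        // Delegate that fails on its first start attempt and succeeds afterwards.
        LeadershipListener flaky = () -> {
            if (attempts.incrementAndGet() == 1) {
                throw new IllegalStateException("endpoint failed to start");
            }
            started.set(true);
        };
        LeadershipSketch view = new LeadershipSketch(flaky, rollbackOnFailure);
        try {
            for (int i = 0; i < checks; i++) {
                Future<?> event = view.check();
                if (event != null) {
                    event.get(); // wait so that the checks run one after another
                }
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            view.notifier.shutdown();
        }
        return started.get();
    }

    public static void main(String[] args) {
        System.out.println("started without rollback: " + simulate(false, 5));
        System.out.println("started with rollback:    " + simulate(true, 5));
    }
}
```

Without the rollback, the first check marks the instance as leader, the start fails, and every later check returns early, so the delegate never starts. With the rollback, the second check fires the event again and the start succeeds.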



--
This message was sent by Atlassian Jira
(v8.3.4#803005)