Posted to issues@ignite.apache.org by "Aleksandr Polovtcev (Jira)" <ji...@apache.org> on 2023/04/04 10:16:00 UTC

[jira] [Assigned] (IGNITE-19095) Cyclic retry of ActionRequest in RaftGroupServiceImpl

     [ https://issues.apache.org/jira/browse/IGNITE-19095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksandr Polovtcev reassigned IGNITE-19095:
--------------------------------------------

    Assignee: Aleksandr Polovtcev

> Cyclic retry of ActionRequest in RaftGroupServiceImpl
> -----------------------------------------------------
>
>                 Key: IGNITE-19095
>                 URL: https://issues.apache.org/jira/browse/IGNITE-19095
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Konstantin Orlov
>            Assignee: Aleksandr Polovtcev
>            Priority: Critical
>              Labels: ignite-3
>         Attachments: log_pollution.txt
>
>
> Please take a look at the following snippet:
> {code:java}
> private void handleThrowable(
>         ...
> ) {
>     if (recoverable(err)) {
>         ...
>         scheduleRetry(() -> sendWithRetry(randomNode(peer), requestFactory, stopTime, fut));
>     } else {
>         fut.completeExceptionally(err);
>     }
> }
> {code}
> In case of a recoverable error, the request is sent again. But if 2 out of 3 nodes have already been stopped, this retry logic gets stuck in an infinite loop. The reason is that ConnectException is considered recoverable, and when choosing the next node we exclude only the node that failed during the current iteration, not the nodes that failed in earlier iterations.
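
One possible way to avoid the loop described above is to remember every peer that has failed for a given request, not just the last one. The sketch below is only an illustration of that idea and is not the fix applied in Ignite: RetryPeerSelector, markUnavailable and nextPeer are hypothetical names, and peers are modelled as plain strings rather than Ignite's peer type.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper (not Ignite code): tracks all peers that failed for one request. */
final class RetryPeerSelector {
    private final List<String> peers;

    // Every peer that has failed so far, not only the one from the current iteration.
    private final Set<String> unavailable = new HashSet<>();

    RetryPeerSelector(List<String> peers) {
        this.peers = List.copyOf(peers);
    }

    /** Records a peer that failed with a recoverable error, e.g. ConnectException. */
    void markUnavailable(String peer) {
        unavailable.add(peer);
    }

    /** Returns a random peer that has not failed yet, or empty if all peers have failed. */
    Optional<String> nextPeer() {
        List<String> candidates = new ArrayList<>(peers);
        candidates.removeAll(unavailable);

        if (candidates.isEmpty()) {
            return Optional.empty();
        }

        int idx = ThreadLocalRandom.current().nextInt(candidates.size());
        return Optional.of(candidates.get(idx));
    }
}
```

With such a selector, the retry branch could call markUnavailable for the failed peer and then nextPeer; an empty result would be the signal to stop retrying immediately (for example by completing the future exceptionally) or to clear the set and wait before starting over. Which of these is appropriate depends on the caller and is left open here.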



--
This message was sent by Atlassian Jira
(v8.20.10#820010)