Posted to issues@ignite.apache.org by "Vladimir Steshin (Jira)" <ji...@apache.org> on 2020/10/02 07:49:00 UTC

[jira] [Updated] (IGNITE-13206) Represent in the documentation the effect of several node addresses on failure detection.

     [ https://issues.apache.org/jira/browse/IGNITE-13206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vladimir Steshin updated IGNITE-13206:
--------------------------------------
    Priority: Major  (was: Minor)

> Represent in the documentation the effect of several node addresses on failure detection.
> -----------------------------------------------------------------------------------------
>
>                 Key: IGNITE-13206
>                 URL: https://issues.apache.org/jira/browse/IGNITE-13206
>             Project: Ignite
>          Issue Type: Improvement
>          Components: documentation
>            Reporter: Vladimir Steshin
>            Assignee: Denis A. Magda
>            Priority: Major
>              Labels: iep-45
>             Fix For: 2.9
>
>
> The current TcpDiscoverySpi can take longer to detect the failure of a node that has several IP addresses. This happens because most of the timeouts, such as failureDetectionTimeout, sockTimeout, and ackTimeout, apply per address, and the node's addresses are tried one after another. The actual failure detection delay can therefore reach failureDetectionTimeout * addressesNumber. This effect on failure detection should be noted in the documentation.
> The suggestion is to describe this behavior in https://apacheignite.readme.io/docs/tcpip-discovery. The text might be:
> "_You should assign multiple addresses to a node only if they represent real physical connections that provide additional reliability. Providing several addresses can prolong failure detection of the current node. The timeouts and settings for network operations (failureDetectionTimeout, sockTimeout, ackTimeout, maxAckTimeout, reconCnt) apply per connection/address. The exception is connRecoveryTimeout. Node addresses are tried sequentially._
>      _Example: if you use failureDetectionTimeout and have set 3 IP addresses for this node, the previous node in the ring can take up to 'failureDetectionTimeout * 3' to detect the failure of the current node._"
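The arithmetic in the example above can be sketched as a small, self-contained Java helper. This is only an illustration of the 'failureDetectionTimeout * addressesNumber' upper bound stated in the description. The class and method names (FailureDetectionEstimate, worstCaseDelay) are hypothetical and not part of the Ignite API, the 10-second timeout is an illustrative value, and the sketch assumes every address is tried in turn with the full timeout.

```java
import java.time.Duration;

/** Hypothetical helper for illustration only; not part of the Ignite API. */
final class FailureDetectionEstimate {
    private FailureDetectionEstimate() {
    }

    /**
     * Upper bound on the detection delay, assuming the timeout is applied
     * to each address in turn (failureDetectionTimeout * addressCount).
     */
    static Duration worstCaseDelay(Duration failureDetectionTimeout, int addressCount) {
        if (addressCount < 1) {
            throw new IllegalArgumentException("addressCount must be at least 1");
        }
        return failureDetectionTimeout.multipliedBy(addressCount);
    }

    public static void main(String[] args) {
        // Illustrative timeout value chosen for this sketch.
        Duration timeout = Duration.ofSeconds(10);
        for (int addresses = 1; addresses <= 3; addresses++) {
            long millis = worstCaseDelay(timeout, addresses).toMillis();
            System.out.println(addresses + " address(es): up to " + millis + " ms");
        }
    }
}
```

With these assumed values, a node with a single address is bounded by the timeout itself, while a node with three addresses is bounded by three times that value, which is the case the proposed documentation text warns about.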



--
This message was sent by Atlassian Jira
(v8.3.4#803005)