Posted to issues@ignite.apache.org by "Denis Chudov (Jira)" <ji...@apache.org> on 2022/10/11 12:53:00 UTC

[jira] [Created] (IGNITE-17872) Fetch commit index on non-primary replicas instead of waiting for safe time in case of RO tx on idle cluster

Denis Chudov created IGNITE-17872:
-------------------------------------

Summary: Fetch commit index on non-primary replicas instead of waiting for safe time in case of RO tx on idle cluster
Key: IGNITE-17872
URL: https://issues.apache.org/jira/browse/IGNITE-17872
Project: Ignite
Issue Type: Bug
Environment: Safe time for non-primary replicas (see IGNITE-17263) was conceived as an optimization to avoid unnecessary network hops. Safe time is propagated from the primary replica via Raft appendEntries messages. When the cluster is under constant load from RW transactions, these messages refresh safe time on the replicas frequently. On an idle cluster, or a cluster with read-only load, safe time is propagated only periodically, via heartbeats. This means that if an RO transaction with a read timestamp in the present or the future tries to read a value from a non-primary replica, it first waits for safe time, whose updates are bound to the frequency of heartbeat messages. Hence the duration of the read operation may be close to the heartbeat period. This is unexpected for a mechanism intended as an optimization, and it will cause performance issues.

Example:
Heartbeat period is 500 ms.
Current safe time on the replica is 1.
We are processing a read-only request with timestamp=2.
The next expected update of safe time, according to the heartbeat period, is 1 + 500 = 501.
This means that we have to wait about 501 - 2 = 499 ms (assuming that clock skew and network latency in the cluster are 0) before we can proceed with processing the RO request.
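
The arithmetic of this example can be reproduced with a small sketch. The class and method names are hypothetical and not part of Ignite. The model assumes that the request arrives at wall-clock time equal to its read timestamp, that heartbeats advance safe time exactly once per period, and that clock skew and network latency are 0, as in the example.

```java
// Hypothetical illustration, not Ignite code.
final class SafeTimeWaitExample {
    /**
     * Estimates how long a read-only request waits on a non-primary replica
     * when safe time is advanced only by heartbeats.
     * Assumes the request arrives at wall-clock time readTs and that clock
     * skew and network latency are 0.
     */
    static long expectedWaitMs(long safeTime, long readTs, long heartbeatPeriodMs) {
        if (readTs <= safeTime) {
            return 0; // Safe time already covers the read timestamp.
        }
        long gap = readTs - safeTime;
        // Number of heartbeats needed until safe time reaches readTs (ceiling division).
        long heartbeats = (gap + heartbeatPeriodMs - 1) / heartbeatPeriodMs;
        long nextSufficientUpdate = safeTime + heartbeats * heartbeatPeriodMs;
        return nextSufficientUpdate - readTs;
    }

    public static void main(String[] args) {
        // Values from the example: safe time 1, read timestamp 2, heartbeat period 500 ms.
        System.out.println(expectedWaitMs(1, 2, 500)); // prints 499
    }
}
```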

So, even though safe time is an optimization, we shouldn't use it when there are no RW transactions affecting the given replica and the timestamp of the current RO transaction is greater than safe time. Instead of waiting for the safe time update, we should fall back to reading the index from the leader to minimize the processing time of the current RO request.
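
The proposed decision can be sketched as follows. This is only an illustration of the rule described above, not the actual Ignite implementation: the class, the enum, and the rwLoadOnReplica flag are hypothetical, and how a replica would detect the absence of RW transactions is left open.

```java
// Hypothetical illustration, not Ignite code.
final class ReadPathChooser {
    enum ReadPath { READ_LOCALLY, WAIT_FOR_SAFE_TIME, FETCH_INDEX_FROM_LEADER }

    /**
     * Chooses how a non-primary replica serves an RO request.
     * rwLoadOnReplica is an assumed flag: true if RW transactions are
     * currently affecting this replica, so appendEntries messages keep
     * safe time fresh.
     */
    static ReadPath choose(long readTs, long safeTime, boolean rwLoadOnReplica) {
        if (readTs <= safeTime) {
            // Safe time already covers the read timestamp, no waiting needed.
            return ReadPath.READ_LOCALLY;
        }
        if (rwLoadOnReplica) {
            // Safe time is refreshed frequently, so the wait is expected to be short.
            return ReadPath.WAIT_FOR_SAFE_TIME;
        }
        // Idle or read-only cluster: avoid waiting up to a full heartbeat period.
        return ReadPath.FETCH_INDEX_FROM_LEADER;
    }

    public static void main(String[] args) {
        // The example above: safe time 1, read timestamp 2, no RW load.
        System.out.println(choose(2, 1, false)); // prints FETCH_INDEX_FROM_LEADER
    }
}
```

The fallback costs one extra network hop to the leader, which is the cost safe time was meant to avoid, so it is taken only when waiting would otherwise be bound to the heartbeat period.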
Reporter: Denis Chudov

--
This message was sent by Atlassian Jira
(v8.20.10#820010)