Posted to notifications@iotdb.apache.org by "Chao Wang (Jira)" <ji...@apache.org> on 2021/03/05 01:34:00 UTC

[jira] [Commented] (IOTDB-1182) [Distributed] improve query performance in follower node

    [ https://issues.apache.org/jira/browse/IOTDB-1182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17295685#comment-17295685 ] 

Chao Wang commented on IOTDB-1182:
----------------------------------

Solution 2: Forward query messages only to the leader.
1. Followers exist only for high availability and leader election; they do not serve reads or writes directly. As a result, a query no longer needs an extra request for the leader's commitid to check whether the data is consistent, which improves read performance.
2. Place the leaders of different data groups on different physical nodes to share the load.
3. The meta group records the leader of every data group in the cluster in real time. This requires adding heartbeats.
4. This leader table will be used together with the lookup table (to be considered in the next version).
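
As a rough illustration of points 3 and 4, the meta group could keep a small table that maps each data group to its current leader and is refreshed by heartbeats. The sketch below is only an assumption about how such a table might look: the class LeaderTable, its method names, and the use of the Raft term to discard stale heartbeats are hypothetical and are not taken from the IoTDB code base.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical leader table for the meta group; names are illustrative, not IoTDB's API. */
final class LeaderTable {
    /** Leader entry for one data group, tagged with the Raft term that reported it. */
    private static final class Entry {
        final String leaderNode;
        final long term;

        Entry(String leaderNode, long term) {
            this.leaderNode = leaderNode;
            this.term = term;
        }
    }

    // data group id -> most recent leader report (assumption: keyed by an int id)
    private final Map<Integer, Entry> leaders = new ConcurrentHashMap<>();

    /** Records a heartbeat report; returns false if the report is stale (older term). */
    boolean onHeartbeat(int dataGroupId, String leaderNode, long term) {
        Entry updated = leaders.merge(
            dataGroupId,
            new Entry(leaderNode, term),
            (old, fresh) -> fresh.term >= old.term ? fresh : old);
        return updated.term == term && updated.leaderNode.equals(leaderNode);
    }

    /** Returns the node a query for this data group should be forwarded to, if known. */
    Optional<String> leaderOf(int dataGroupId) {
        Entry e = leaders.get(dataGroupId);
        return e == null ? Optional.empty() : Optional.of(e.leaderNode);
    }
}
```

A node receiving a query would call leaderOf(groupId) and forward the query to the returned node; what to do when the leader is unknown is left open here.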


Solution 3: The follower proactively asks the leader to synchronize data as soon as it detects an inconsistency.
1. This solution mainly helps when RPCs are lost. It does not speed up catch-up in the normal case, and the probability of RPC loss in the basic performance test is low, so it does not address the low query performance.
2. It adds extra network traffic, which may cause network congestion and degrade performance.
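
For reference, the follower-side decision in Solution 3 could look roughly like the sketch below: if the follower's applyindex is behind the leader's commitindex, it computes the range of entries to request instead of waiting. This is a simplification under stated assumptions: the class ProactiveCatchUp and its Range type are hypothetical, and a real implementation would likely work from the follower's last log index rather than its apply index.

```java
import java.util.Optional;

/** Hypothetical helper for Solution 3; not part of the IoTDB code base. */
final class ProactiveCatchUp {
    /** Inclusive range of log indexes the follower would ask the leader for. */
    static final class Range {
        final long from;
        final long to;

        Range(long from, long to) {
            this.from = from;
            this.to = to;
        }
    }

    /** Returns the entries to pull, or empty if the follower is already caught up. */
    static Optional<Range> missingRange(long followerApplyIndex, long leaderCommitIndex) {
        if (followerApplyIndex >= leaderCommitIndex) {
            return Optional.empty();
        }
        return Optional.of(new Range(followerApplyIndex + 1, leaderCommitIndex));
    }
}
```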

> [Distributed] improve query performance in follower node
> --------------------------------------------------------
>
>                 Key: IOTDB-1182
>                 URL: https://issues.apache.org/jira/browse/IOTDB-1182
>             Project: Apache IoTDB
>          Issue Type: Improvement
>          Components: Core/Cluster
>            Reporter: Chao Wang
>            Assignee: Chao Wang
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.12.0
>
>
> In the Raft algorithm, the leader always has the latest data, so query requests sent by the client should be executed on the leader. However, this sends both read and write requests to the leader, which puts too much pressure on a single node and reduces the overall throughput of the system. To relieve this hot spot, we can use follower linear read to let followers serve reads.
>
> The cost is that before a follower executes a read request, it must ensure that its local applyindex is greater than or equal to the leader's commitindex. The follower therefore sends an RPC to the leader and decides based on the returned result. If the follower's applyindex is greater than or equal to the leader's commitindex, it can query directly. Otherwise, it waits passively until its local applyindex is greater than or equal to the leader's commitindex.
>
> Because the follower and the leader may be temporarily inconsistent under concurrent reads and writes, a query routed to a follower can easily enter this passive wait for catch-up. At that point, the batch of data on which the follower and the leader differ is likely still in flight on the network, so the best approach is for the follower to wait passively and take no action.
>
> In case of RPC loss or a network partition, the leader must actively detect the inconsistency of the follower's data and initiate catch-up. The leader checks followers once per heartbeat interval, which is 1 s by default, and, to limit traffic, initiates catch-up only after 5 detections. Once this stage is entered, the expected time to wait for the inconsistency to be detected is 2500 ms, which may greatly increase the latency of read requests. Fortunately, this extreme condition should be rare.
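
The follower read check described in the issue above can be sketched as follows. This is a minimal illustration, not the IoTDB implementation: the class FollowerReadGate and its methods are hypothetical, the leader's commitindex is assumed to have been fetched already by the RPC, and the timeout is an added assumption so that a read does not block forever.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical gate a follower could consult before serving a read. */
final class FollowerReadGate {
    private long applyIndex;

    FollowerReadGate(long initialApplyIndex) {
        this.applyIndex = initialApplyIndex;
    }

    /** Called by the log applier whenever a new entry has been applied locally. */
    synchronized void onApplied(long index) {
        if (index > applyIndex) {
            applyIndex = index;
            notifyAll(); // wake readers waiting for catch-up
        }
    }

    /**
     * Waits passively until the local apply index reaches the leader's commit index.
     * Returns true if the read may proceed, false on timeout or interruption.
     */
    synchronized boolean awaitCaughtUp(long leaderCommitIndex, long timeoutMs) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        try {
            while (applyIndex < leaderCommitIndex) {
                long remainingNanos = deadline - System.nanoTime();
                if (remainingNanos <= 0) {
                    return false;
                }
                TimeUnit.NANOSECONDS.timedWait(this, remainingNanos);
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A read handler on the follower would first ask the leader for its commitindex, then call awaitCaughtUp(commitIndex, timeoutMs) and run the query locally only if it returns true.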



--
This message was sent by Atlassian Jira
(v8.3.4#803005)