You are viewing a plain text version of this content. The canonical link for it is here.

Posted to reviews@iotdb.apache.org by GitBox <gi...@apache.org> on 2021/09/09 11:17:29 UTC

[GitHub] [iotdb] cigarl opened a new pull request #3939: init dummyIndex after restart cluster

cigarl opened a new pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939


   In current logic, if we restart cluster, and `applyIndex` is smaller than `commitIndex`,here are three problems occur.
   
   - First, we cannot append un-apply logs to `commitManager`.Like 103 and 104 in the picture,they can not be appended to `commitEntries` in `commitManager`.
   - Second, un-commitManager can not append a new log. Like 0 can not be appended to `unCommitEntries`  in `unCommitManager`.
   -  Third, un-applied logs can not be re-applied.That means 103 and 104 could be lost. 
   
   This is all because `dummyIndex` is initialized to the default value,it cannot return to its pre-reboot state.
   
   _ps.The information in the picture assumes none of these questions exist_
   <img width="612" alt="6211a9f5a25a465958c041d145f6dd5" src="https://user-images.githubusercontent.com/44458757/132673815-5a17451e-0434-459e-90d5-0587e724bb36.png">
   
   This PR try to resolve this problem,but we may also need a processor to avoid logIndex is too big, dunmmyIndex need to re-initialize if logIndex will be bigger than our expect.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] LebronAl commented on a change in pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

LebronAl commented on a change in pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939#discussion_r707112005



##########
File path: cluster/src/main/java/org/apache/iotdb/cluster/log/manage/RaftLogManager.java
##########
@@ -125,6 +127,8 @@ protected RaftLogManager(StableEntryManager stableEntryManager, LogApplier appli
     long first = getCommittedEntryManager().getDummyIndex();
     long last = getCommittedEntryManager().getLastIndex();
     this.setUnCommittedEntryManager(new UnCommittedEntryManager(last + 1));
+    this.getUnCommittedEntryManager()
+        .truncateAndAppend(stableEntryManager.getAllEntriesAfterCommittedIndex());

Review comment:
       OK, I see. This is really a matter of ensuring atomicity between apply and commit. We can talk about it later during refactoring, I have no problem with the current implementation




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] LebronAl commented on a change in pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

LebronAl commented on a change in pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939#discussion_r707033648



##########
File path: cluster/src/main/java/org/apache/iotdb/cluster/log/manage/RaftLogManager.java
##########
@@ -125,6 +127,8 @@ protected RaftLogManager(StableEntryManager stableEntryManager, LogApplier appli
     long first = getCommittedEntryManager().getDummyIndex();
     long last = getCommittedEntryManager().getLastIndex();
     this.setUnCommittedEntryManager(new UnCommittedEntryManager(last + 1));
+    this.getUnCommittedEntryManager()
+        .truncateAndAppend(stableEntryManager.getAllEntriesAfterCommittedIndex());

Review comment:
       Actually, these entries are committed at a high level. So we need to `commit` it before we expose them to the outside world, which means we need to revocer them in `commitEntryManager`, and `unCommitEntryManager` should be empty at current design.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] neuyilan commented on pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

neuyilan commented on pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939#issuecomment-918010901


   > is linearizability really necessary for OLAP databases like time-series databases?
   
   As I see the raft is one consensus algorithm, which may do not have many relations with the linearizability, just put the post[1] for reference.
   > we currently put data from multiple storage groups into one Raft group to be executed synchronously.
   
   As far as I know, the apply function is user-defined, we can still implement a parallel apply function according to different storage groups in the one raft log.
   
   What is certain is that using a raft library will definitely limit our optimization work compared with the current implementation (mixing the raft framework and business logic), but I think the availability and correctness are far greater than the performance for now.
   [1] https://zhuanlan.zhihu.com/p/47117804


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] cigarl commented on a change in pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

cigarl commented on a change in pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939#discussion_r707096181



##########
File path: cluster/src/main/java/org/apache/iotdb/cluster/log/manage/serializable/SyncLogDequeSerializer.java
##########
@@ -236,6 +237,30 @@ public LogManagerMeta getMeta() {
     return getLogs(meta.getMaxHaveAppliedCommitIndex(), meta.getCommitLogIndex());
   }
 
+  /**
+   * When raft log files flushed,meta would not be flushed synchronously.So data has flushed to disk
+   * is uncommitted for persistent LogManagerMeta(meta's info is stale).We need to recover these
+   * already persistent logs.
+   *
+   * <p>For example,commitIndex is 5 in persistent LogManagerMeta,But the log file has actually been
+   * flushed to 7,when we restart cluster,we need to recover 6 and 7.
+   *
+   * <p>Maybe,we can extract getAllEntriesAfterAppliedIndex and getAllEntriesAfterCommittedIndex
+   * into getAllEntriesByIndex,but now there are too many test cases using it.
+   */

Review comment:
       The problem here is that we don't know the actual `applyIndex`.
   Fortunately, in IOTDB, repeating _apply_ operation doesn't matter too much, but losing data can be unacceptable.
   That is only my views.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] chengjianyun commented on pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

chengjianyun commented on pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939#issuecomment-917978706


   > > Another my suggestion is we can gradually separate raft implementation and raft client(application) just like what `etcd` did. The goal of this is, finally, we can involve `Ratis` to help manage the Raft status. Maybe it's time to put `Ratis` on the agenda. I strongly suggest someone spend some time to investigate `Ratis` and evaluate the cost of integration. Instead of Raft correctness, We could focus on improving the core engine of IoTDB and ecosystem after that.
   > 
   > Actually, I have the same idea as you, I think we can integrate `Apache Ratis` with our project, so at least we can guarantee the correctness of the raft, and our energy can be used for some functions of `IoTDB`. And at present, I think the correctness is far greater than the performance.
   > But it's really a time-consuming thing. We need to work together to finish it.
   
   Very glad to work together. As I will work on prometheus integration in the later of this month, I only have small bandwidth for this. But I'd like to propose some design sooner or later to discuss together after investigation.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] neuyilan commented on pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

neuyilan commented on pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939#issuecomment-917961701


   > Another my suggestion is we can gradually separate raft implementation and raft client(application) just like what `etcd` did. The goal of this is, finally, we can involve `Ratis` to help manage the Raft status. Maybe it's time to put `Ratis` on the agenda. I strongly suggest someone spend some time to investigate `Ratis` and evaluate the cost of integration. Instead of Raft correctness, We could focus on improving the core engine of IoTDB and ecosystem after that.
   
   Actually, I have the same idea as you, I think we can integrate `Apache Ratis` with our project,  so at least we can guarantee the correctness of the raft, and at present, and our energy can be used for some functions of `IoTDB`.  And at present, I think the correctness is far greater than the performance.
   But it's really a time-consuming thing. We need to work together to finish it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] coveralls commented on pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

coveralls commented on pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939#issuecomment-916028782


   
   [![Coverage Status](https://coveralls.io/builds/42756141/badge)](https://coveralls.io/builds/42756141)
   
   Coverage increased (+0.08%) to 67.499% when pulling **198100f4dc0cd880a4dc05b992b9b1ba770d4cfa on cigarl:di_fix** into **601df56b76df7dbda720361a536ab66e492f1b25 on apache:master**.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] cigarl commented on pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

cigarl commented on pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939#issuecomment-920520745


   > but we may also need a processor to avoid logIndex is too big, dunmmyIndex need to re-initialize if logIndex will be bigger than our expect.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] LebronAl commented on a change in pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

LebronAl commented on a change in pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939#discussion_r707090359



##########
File path: cluster/src/main/java/org/apache/iotdb/cluster/log/manage/RaftLogManager.java
##########
@@ -125,6 +127,8 @@ protected RaftLogManager(StableEntryManager stableEntryManager, LogApplier appli
     long first = getCommittedEntryManager().getDummyIndex();
     long last = getCommittedEntryManager().getLastIndex();
     this.setUnCommittedEntryManager(new UnCommittedEntryManager(last + 1));
+    this.getUnCommittedEntryManager()
+        .truncateAndAppend(stableEntryManager.getAllEntriesAfterCommittedIndex());

Review comment:
       If we update the log and meta information of all nodes to the status of all logs persisted and applied after restart, will there be any inconsistency between the leader and follower?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] neuyilan commented on a change in pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

neuyilan commented on a change in pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939#discussion_r707085916



##########
File path: cluster/src/main/java/org/apache/iotdb/cluster/log/manage/serializable/SyncLogDequeSerializer.java
##########
@@ -236,6 +237,30 @@ public LogManagerMeta getMeta() {
     return getLogs(meta.getMaxHaveAppliedCommitIndex(), meta.getCommitLogIndex());
   }
 
+  /**
+   * When raft log files flushed,meta would not be flushed synchronously.So data has flushed to disk
+   * is uncommitted for persistent LogManagerMeta(meta's info is stale).We need to recover these
+   * already persistent logs.
+   *
+   * <p>For example,commitIndex is 5 in persistent LogManagerMeta,But the log file has actually been
+   * flushed to 7,when we restart cluster,we need to recover 6 and 7.
+   *
+   * <p>Maybe,we can extract getAllEntriesAfterAppliedIndex and getAllEntriesAfterCommittedIndex
+   * into getAllEntriesByIndex,but now there are too many test cases using it.
+   */

Review comment:
       In this comment, it seems due to that the log's meta file does not flush synchronously with the raft log, however, the persistent commit raft log maybe has been applied.
   We should not commit the entry again when restart.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] cigarl commented on pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

cigarl commented on pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939#issuecomment-917804418


   > To solve these problems completely, I think `raftLogManager` needs a complete refactoring, including but not limited to better concurrency control, better persistence strategy, etc.
   > 
   > When I first joined the community in my senior year, the first thing I did was to implement `raftLogManager` with reference to `storage` and `unstable` interfaces in `etcd`. Because I didn't have a very deep understanding of `raft` and `etcd` at that time, after a long period, I finally realized that this was probably an awful implementation and I apologize for that. There are two main reasons:
   > 
   > * For `raft`, uncommitted logs also need to be persisted (i.e., logs on disk may also need to be truncated) in order to ensure correctness after reboot.
   > * `etcd` is an event-driven architecture, so its interior is all unlocked. However, in our architecture, `raftLogManager` can be accessed concurrently by multiple threads. So we've added a lot of patches for concurrency control right now, but this is actually an area we can think about implementing better.
   > 
   > Thank you very much for your great contributions. This week I will focus on the PR you raised about `cluster-` branch and a bug at hand. After that, I would like to discuss the refactoring of `raftLogManager` with you guys and community. And of course you're welcome to start to design the new `raftLogManager` with us right now if you are free.
   
   IMO，we need the following two approaches in parallel:
   
   - First,we should solve this problem in the current code structure and submit to `master` and `rel/0.12.x` branch. Because when a user has a problem like mine, it's hard to solve immediately, this means that some data could be loss, and cluster may enter an unstable state.This is a disaster for the production environment.We need to provide users with a patch to avoid this problem.
   - Second,we can refactor this part on `cluster-` branch.Once we have solved the first problem, we can spend more time refactoring this part better.  And we don't have to push this part of the content to the user immediately.It can be put into the next release as a better implementation.
   
   I'd be happy to do it together   : )


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] chengjianyun commented on pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

chengjianyun commented on pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939#issuecomment-918017017


   > > is linearizability really necessary for OLAP databases like time-series databases?
   > 
   > As I see the raft is one consensus algorithm, which may do not have many relations with the linearizability, just put the post[1] for reference.
   > 
   > > we currently put data from multiple storage groups into one Raft group to be executed synchronously.
   > 
   > As far as I know, the apply function is user-defined, we can still implement a parallel apply function according to different storage groups in the one raft log.
   > 
   > What is certain is that using a raft library will definitely limit our optimization work compared with the current implementation (mixing the raft framework and business logic), but I think the availability and correctness are far greater than the performance for now.
   > [1] https://zhuanlan.zhihu.com/p/47117804
   
   Let's move Ratis related discussion to https://github.com/apache/iotdb/issues/3954 so that more people could join.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] neuyilan commented on pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

neuyilan commented on pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939#issuecomment-939772032


   Any problems with this PR? if not, I think we can merge this PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] chengjianyun commented on a change in pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

chengjianyun commented on a change in pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939#discussion_r707122448



##########
File path: cluster/src/main/java/org/apache/iotdb/cluster/log/manage/RaftLogManager.java
##########
@@ -125,6 +127,8 @@ protected RaftLogManager(StableEntryManager stableEntryManager, LogApplier appli
     long first = getCommittedEntryManager().getDummyIndex();
     long last = getCommittedEntryManager().getLastIndex();
     this.setUnCommittedEntryManager(new UnCommittedEntryManager(last + 1));
+    this.getUnCommittedEntryManager()
+        .truncateAndAppend(stableEntryManager.getAllEntriesAfterCommittedIndex());

Review comment:
       In my opinion, it doesn't matter where the entries to be recovered. The recover progress won't affect the consistency of raft group. `CommitIndex` to be determined until the next new log appending the follower. If my understanding right, the `committed`  logs committed again in the same sequence won't break the correctness of IoTDB. We have no choice even though this is not a good one.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] coveralls edited a comment on pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

coveralls edited a comment on pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939#issuecomment-916028782


   
   [![Coverage Status](https://coveralls.io/builds/42776582/badge)](https://coveralls.io/builds/42776582)
   
   Coverage decreased (-0.003%) to 67.418% when pulling **e571533ab9c2ffa389840d7caa8757102e0ef3cb on cigarl:di_fix** into **601df56b76df7dbda720361a536ab66e492f1b25 on apache:master**.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] coveralls edited a comment on pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

coveralls edited a comment on pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939#issuecomment-916028782


   
   [![Coverage Status](https://coveralls.io/builds/43403646/badge)](https://coveralls.io/builds/43403646)
   
   Coverage increased (+0.3%) to 67.748% when pulling **9449a6b4f7ec241ddd451aca70cd4c162b3313bf on cigarl:di_fix** into **601df56b76df7dbda720361a536ab66e492f1b25 on apache:master**.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] neuyilan commented on a change in pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

neuyilan commented on a change in pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939#discussion_r725863951



##########
File path: cluster/src/main/java/org/apache/iotdb/cluster/log/manage/serializable/SyncLogDequeSerializer.java
##########
@@ -236,6 +237,30 @@ public LogManagerMeta getMeta() {
     return getLogs(meta.getMaxHaveAppliedCommitIndex(), meta.getCommitLogIndex());
   }
 
+  /**
+   * When raft log files flushed,meta would not be flushed synchronously.So data has flushed to disk
+   * is uncommitted for persistent LogManagerMeta(meta's info is stale).We need to recover these
+   * already persistent logs.
+   *
+   * <p>For example,commitIndex is 5 in persistent LogManagerMeta,But the log file has actually been
+   * flushed to 7,when we restart cluster,we need to recover 6 and 7.
+   *
+   * <p>Maybe,we can extract getAllEntriesAfterAppliedIndex and getAllEntriesAfterCommittedIndex
+   * into getAllEntriesByIndex,but now there are too many test cases using it.
+   */
+  @Override
+  public List<Log> getAllEntriesAfterCommittedIndex() {
+    long lastIndex = firstLogIndex + logIndexOffsetList.size() - 1;
+    logger.debug(
+        "getAllEntriesBeforeAppliedIndex, firstUnCommitIndex={}, lastIndexBeforeStart={}",

Review comment:
       ```suggestion
           "getAllEntriesAfterCommittedIndex, firstUnCommitIndex={}, lastIndexBeforeStart={}",
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] chengjianyun commented on a change in pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

chengjianyun commented on a change in pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939#discussion_r707129038



##########
File path: cluster/src/main/java/org/apache/iotdb/cluster/log/manage/serializable/SyncLogDequeSerializer.java
##########
@@ -236,6 +237,30 @@ public LogManagerMeta getMeta() {
     return getLogs(meta.getMaxHaveAppliedCommitIndex(), meta.getCommitLogIndex());
   }
 
+  /**
+   * When raft log files flushed,meta would not be flushed synchronously.So data has flushed to disk
+   * is uncommitted for persistent LogManagerMeta(meta's info is stale).We need to recover these
+   * already persistent logs.
+   *
+   * <p>For example,commitIndex is 5 in persistent LogManagerMeta,But the log file has actually been
+   * flushed to 7,when we restart cluster,we need to recover 6 and 7.
+   *
+   * <p>Maybe,we can extract getAllEntriesAfterAppliedIndex and getAllEntriesAfterCommittedIndex
+   * into getAllEntriesByIndex,but now there are too many test cases using it.
+   */

Review comment:
       According to my understanding, re-apply the same sequence(guaranteed by log index) of the plans won't break the correctness of IoTDB. And looks like we don't have a better choice. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] chengjianyun commented on pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

chengjianyun commented on pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939#issuecomment-917955815


   > To solve these problems completely, I think `raftLogManager` needs a complete refactoring, including but not limited to better concurrency control, better persistence strategy, etc.
   > 
   > When I first joined the community in my senior year, the first thing I did was to implement `raftLogManager` with reference to `storage` and `unstable` interfaces in `etcd`. Because I didn't have a very deep understanding of `raft` and `etcd` at that time, after a long period, I finally realized that this was probably an awful implementation and I apologize for that. There are two main reasons:
   > 
   > * For `raft`, uncommitted logs also need to be persisted (i.e., logs on disk may also need to be truncated) in order to ensure correctness after reboot.
   > * `etcd` is an event-driven architecture, so its interior is all unlocked. However, in our architecture, `raftLogManager` can be accessed concurrently by multiple threads. So we've added a lot of patches for concurrency control right now, but this is actually an area we can think about implementing better.
   > 
   > Thank you very much for your great contributions. This week I will focus on the PR you raised about `cluster-` branch and a bug at hand. After that, I would like to discuss the refactoring of `raftLogManager` with you guys and community. And of course you're welcome to start to design the new `raftLogManager` with us right now if you are free.
   
   Thanks for your implementation so that we can enjoy the first cluster version of IoTDB  :). Raft algorithm is really hard to implementation, there are too many corner cases to consideration. We could make the algorithm work correct in most of cases in a short time which is terrific.
   
   In my opinion, we need to spend too much effort to guarantee the correctness of Raft if we want to keep working on the current cluster implementation. Cluster module lack a whole system design according to my feeling. 
   
   Another my suggestion is we can gradually separate raft implementation and raft client(application) just like what `etcd` did. The goal of this is, finally, we can involve `Ratis` to help manage the Raft status. Maybe it's time to put `Ratis` on the agenda. I strongly suggest someone spend some time to investigate `Ratis` and evaluate the cost of integration. Instead of Raft correctness, We could focus on improving the core engine of IoTDB and ecosystem after that.
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] cigarl commented on a change in pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

cigarl commented on a change in pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939#discussion_r707099973



##########
File path: cluster/src/main/java/org/apache/iotdb/cluster/log/manage/RaftLogManager.java
##########
@@ -125,6 +127,8 @@ protected RaftLogManager(StableEntryManager stableEntryManager, LogApplier appli
     long first = getCommittedEntryManager().getDummyIndex();
     long last = getCommittedEntryManager().getLastIndex();
     this.setUnCommittedEntryManager(new UnCommittedEntryManager(last + 1));
+    this.getUnCommittedEntryManager()
+        .truncateAndAppend(stableEntryManager.getAllEntriesAfterCommittedIndex());

Review comment:
       This could not cause consistency problems, but some data may be lost. 
   For example, `applyIndex` in persistent meta file is 5, and `commitIndex` is 7.But the actual index flushed to 10,and apply to 7,we may lost 8~10.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] LebronAl commented on a change in pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

LebronAl commented on a change in pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939#discussion_r706962382



##########
File path: cluster/src/main/java/org/apache/iotdb/cluster/log/manage/RaftLogManager.java
##########
@@ -125,6 +127,8 @@ protected RaftLogManager(StableEntryManager stableEntryManager, LogApplier appli
     long first = getCommittedEntryManager().getDummyIndex();
     long last = getCommittedEntryManager().getLastIndex();
     this.setUnCommittedEntryManager(new UnCommittedEntryManager(last + 1));
+    this.getUnCommittedEntryManager()

Review comment:
       same as above




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] chengjianyun commented on a change in pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

chengjianyun commented on a change in pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939#discussion_r707129038



##########
File path: cluster/src/main/java/org/apache/iotdb/cluster/log/manage/serializable/SyncLogDequeSerializer.java
##########
@@ -236,6 +237,30 @@ public LogManagerMeta getMeta() {
     return getLogs(meta.getMaxHaveAppliedCommitIndex(), meta.getCommitLogIndex());
   }
 
+  /**
+   * When raft log files flushed,meta would not be flushed synchronously.So data has flushed to disk
+   * is uncommitted for persistent LogManagerMeta(meta's info is stale).We need to recover these
+   * already persistent logs.
+   *
+   * <p>For example,commitIndex is 5 in persistent LogManagerMeta,But the log file has actually been
+   * flushed to 7,when we restart cluster,we need to recover 6 and 7.
+   *
+   * <p>Maybe,we can extract getAllEntriesAfterAppliedIndex and getAllEntriesAfterCommittedIndex
+   * into getAllEntriesByIndex,but now there are too many test cases using it.
+   */

Review comment:
       According to my understanding, re-apply the same sequence(guaranteed by log index) of the plan won't break the correctness of IoTDB. And looks like we don't have a better choice. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] cigarl removed a comment on pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

cigarl removed a comment on pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939#issuecomment-920520745


   > but we may also need a processor to avoid logIndex is too big, dunmmyIndex need to re-initialize if logIndex will be bigger than our expect.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] chengjianyun edited a comment on pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

chengjianyun edited a comment on pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939#issuecomment-917955815


   > To solve these problems completely, I think `raftLogManager` needs a complete refactoring, including but not limited to better concurrency control, better persistence strategy, etc.
   > 
   > When I first joined the community in my senior year, the first thing I did was to implement `raftLogManager` with reference to `storage` and `unstable` interfaces in `etcd`. Because I didn't have a very deep understanding of `raft` and `etcd` at that time, after a long period, I finally realized that this was probably an awful implementation and I apologize for that. There are two main reasons:
   > 
   > * For `raft`, uncommitted logs also need to be persisted (i.e., logs on disk may also need to be truncated) in order to ensure correctness after reboot.
   > * `etcd` is an event-driven architecture, so its interior is all unlocked. However, in our architecture, `raftLogManager` can be accessed concurrently by multiple threads. So we've added a lot of patches for concurrency control right now, but this is actually an area we can think about implementing better.
   > 
   > Thank you very much for your great contributions. This week I will focus on the PR you raised about `cluster-` branch and a bug at hand. After that, I would like to discuss the refactoring of `raftLogManager` with you guys and community. And of course you're welcome to start to design the new `raftLogManager` with us right now if you are free.
   
   Thanks for your implementation so that we can enjoy the first cluster version of IoTDB  :). Raft algorithm is really hard to implementation, there are too many corner cases to consideration. We could make the algorithm work correct in most of cases in a short time which is terrific.
   
   In my opinion, we need to spend too much effort to guarantee the correctness of Raft if we want to keep working on the current cluster implementation. Cluster module lack a whole system design according to my feeling. 
   
   Another my suggestion is we can gradually separate raft implementation and raft client(application) just like what `etcd` did. The goal of this is, finally, we can involve `Ratis` to help manage the Raft status. Maybe it's time to put `Ratis` on the agenda. I strongly suggest someone spend some time to investigate `Ratis` and evaluate the cost of integration. Instead of Raft correctness, we can focus on improving the core engine of IoTDB and ecosystem after that.
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] cigarl closed pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

cigarl closed pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] neuyilan merged pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

neuyilan merged pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] cigarl commented on a change in pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

cigarl commented on a change in pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939#discussion_r707046453



##########
File path: cluster/src/main/java/org/apache/iotdb/cluster/log/manage/RaftLogManager.java
##########
@@ -125,6 +127,8 @@ protected RaftLogManager(StableEntryManager stableEntryManager, LogApplier appli
     long first = getCommittedEntryManager().getDummyIndex();
     long last = getCommittedEntryManager().getLastIndex();
     this.setUnCommittedEntryManager(new UnCommittedEntryManager(last + 1));
+    this.getUnCommittedEntryManager()
+        .truncateAndAppend(stableEntryManager.getAllEntriesAfterCommittedIndex());

Review comment:
       > which means we need to revocer them in `commitEntryManager`,
   
   That means,we need to update meta's info when recover logs file in this case(meta's info is stale)? that may be difficult _(because we need to consider the meta recovery process in all situations to ensure that leaders and followers are completely consistent)_.
   
   Of course,there is another way to fix it. Every time the log file is flushed, the meta is flushed with it. Except in special 
    situations _(log file flushed successfully but meta failed when we restart cluster)_, it doesn't have any problems.
   
   How do you think?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] neuyilan edited a comment on pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

neuyilan edited a comment on pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939#issuecomment-917961701


   > Another my suggestion is we can gradually separate raft implementation and raft client(application) just like what `etcd` did. The goal of this is, finally, we can involve `Ratis` to help manage the Raft status. Maybe it's time to put `Ratis` on the agenda. I strongly suggest someone spend some time to investigate `Ratis` and evaluate the cost of integration. Instead of Raft correctness, We could focus on improving the core engine of IoTDB and ecosystem after that.
   
   Actually, I have the same idea as you, I think we can integrate `Apache Ratis` with our project,  so at least we can guarantee the correctness of the raft, and our energy can be used for some functions of `IoTDB`.  And at present, I think the correctness is far greater than the performance.
   But it's really a time-consuming thing. We need to work together to finish it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] cigarl commented on a change in pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

cigarl commented on a change in pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939#discussion_r706983694



##########
File path: cluster/src/main/java/org/apache/iotdb/cluster/log/manage/serializable/SyncLogDequeSerializer.java
##########
@@ -236,6 +237,24 @@ public LogManagerMeta getMeta() {
     return getLogs(meta.getMaxHaveAppliedCommitIndex(), meta.getCommitLogIndex());
   }
 
+  /**
+   * Recover un-committed logs but already persistent. Maybe,we can extract

Review comment:
       That can be adjusted.  But when I name it, I base it on the meta's information, because these logs are uncommitted for the meta(although its info is staled).Maybe,i can add some comments to illustrate the point.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] coveralls edited a comment on pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

coveralls edited a comment on pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939#issuecomment-916028782


   
   [![Coverage Status](https://coveralls.io/builds/42911263/badge)](https://coveralls.io/builds/42911263)
   
   Coverage decreased (-0.08%) to 67.345% when pulling **9257baf9d3e528a74230a592934413ebb1cd640f on cigarl:di_fix** into **601df56b76df7dbda720361a536ab66e492f1b25 on apache:master**.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] cigarl commented on pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

cigarl commented on pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939#issuecomment-917808675


   > > I'd be happy to do it together : )
   > 
   > OK, then that's my reviews~At current, it seems that the `getAllEntriesAfterCommittedIndex()` will always return empty list in current design? What's your opinion?
   
   see [#3930](https://github.com/apache/iotdb/pull/3930)
   In my environment, that is not empty.  It is caused by meta information not being flushed synchronously.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] coveralls edited a comment on pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

coveralls edited a comment on pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939#issuecomment-916028782


   
   [![Coverage Status](https://coveralls.io/builds/42776859/badge)](https://coveralls.io/builds/42776859)
   
   Coverage decreased (-0.008%) to 67.413% when pulling **e571533ab9c2ffa389840d7caa8757102e0ef3cb on cigarl:di_fix** into **601df56b76df7dbda720361a536ab66e492f1b25 on apache:master**.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] LebronAl commented on a change in pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

LebronAl commented on a change in pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939#discussion_r706982153



##########
File path: cluster/src/main/java/org/apache/iotdb/cluster/log/manage/serializable/SyncLogDequeSerializer.java
##########
@@ -236,6 +237,24 @@ public LogManagerMeta getMeta() {
     return getLogs(meta.getMaxHaveAppliedCommitIndex(), meta.getCommitLogIndex());
   }
 
+  /**
+   * Recover un-committed logs but already persistent. Maybe,we can extract

Review comment:
       If so, I think it would be better to adjust the function name and comment?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] cigarl commented on a change in pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

cigarl commented on a change in pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939#discussion_r706980430



##########
File path: cluster/src/main/java/org/apache/iotdb/cluster/log/manage/serializable/SyncLogDequeSerializer.java
##########
@@ -236,6 +237,24 @@ public LogManagerMeta getMeta() {
     return getLogs(meta.getMaxHaveAppliedCommitIndex(), meta.getCommitLogIndex());
   }
 
+  /**
+   * Recover un-committed logs but already persistent. Maybe,we can extract

Review comment:
       > This persistence strategy needs to be improved, such as persisting uncommitted logs to ensure that some of raft's corner cases are correct. But in currrent logic, this function doesn't seem to be working?
   
   In fact, this part of the change is due to the meta information was not flushed in time. When we flush the log file, we don't flush the `LogManagerMeta` at the same time(see [#3930](https://github.com/apache/iotdb/pull/3930)).And we can't be sure that log and meta files are in perfect sync.Therefore, we need to start up with meta information as a baseline for recovery  
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] neuyilan commented on a change in pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

neuyilan commented on a change in pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939#discussion_r707116648



##########
File path: cluster/src/main/java/org/apache/iotdb/cluster/log/manage/serializable/SyncLogDequeSerializer.java
##########
@@ -236,6 +237,30 @@ public LogManagerMeta getMeta() {
     return getLogs(meta.getMaxHaveAppliedCommitIndex(), meta.getCommitLogIndex());
   }
 
+  /**
+   * When raft log files flushed,meta would not be flushed synchronously.So data has flushed to disk
+   * is uncommitted for persistent LogManagerMeta(meta's info is stale).We need to recover these
+   * already persistent logs.
+   *
+   * <p>For example,commitIndex is 5 in persistent LogManagerMeta,But the log file has actually been
+   * flushed to 7,when we restart cluster,we need to recover 6 and 7.
+   *
+   * <p>Maybe,we can extract getAllEntriesAfterAppliedIndex and getAllEntriesAfterCommittedIndex
+   * into getAllEntriesByIndex,but now there are too many test cases using it.
+   */

Review comment:
       As we have so many plans, I am not sure all the plan is idempotent.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] coveralls edited a comment on pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

coveralls edited a comment on pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939#issuecomment-916028782


   
   [![Coverage Status](https://coveralls.io/builds/42816811/badge)](https://coveralls.io/builds/42816811)
   
   Coverage increased (+0.07%) to 67.492% when pulling **9257baf9d3e528a74230a592934413ebb1cd640f on cigarl:di_fix** into **601df56b76df7dbda720361a536ab66e492f1b25 on apache:master**.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] LebronAl commented on a change in pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

LebronAl commented on a change in pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939#discussion_r706962210



##########
File path: cluster/src/main/java/org/apache/iotdb/cluster/log/manage/serializable/SyncLogDequeSerializer.java
##########
@@ -236,6 +237,24 @@ public LogManagerMeta getMeta() {
     return getLogs(meta.getMaxHaveAppliedCommitIndex(), meta.getCommitLogIndex());
   }
 
+  /**
+   * Recover un-committed logs but already persistent. Maybe,we can extract

Review comment:
       This persistence strategy needs to be improved, such as persisting uncommitted logs to ensure that some of raft's corner cases are correct. But in currrent logic, this function doesn't seem to be working?

##########
File path: cluster/src/main/java/org/apache/iotdb/cluster/log/manage/RaftLogManager.java
##########
@@ -125,6 +127,8 @@ protected RaftLogManager(StableEntryManager stableEntryManager, LogApplier appli
     long first = getCommittedEntryManager().getDummyIndex();
     long last = getCommittedEntryManager().getLastIndex();
     this.setUnCommittedEntryManager(new UnCommittedEntryManager(last + 1));
+    this.getUnCommittedEntryManager()

Review comment:
       same as below




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] LebronAl commented on pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

LebronAl commented on pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939#issuecomment-917806947


    
   > I'd be happy to do it together : )
   
   OK, then that's my reviews~At current, it seems that the `getAllEntriesAfterCommittedIndex()` will always return empty list in current design? What's your opinion?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] LebronAl commented on pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

LebronAl commented on pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939#issuecomment-917983405

> Another my suggestion is we can gradually separate raft implementation and raft client(application) just like what `etcd` did. The goal of this is, finally, we can involve `Ratis` to help manage the Raft status. Maybe it's time to put `Ratis` on the agenda. I strongly suggest someone spend some time to investigate `Ratis` and evaluate the cost of integration. Instead of Raft correctness, we can focus on improving the core engine of IoTDB and ecosystem after that.

In fact, In the last few months I have studied Raft algorithm carefully and completed [6.824 lab](https://github.com/LebronAl/MIT6.824-2021). This is just an instructional Raft implementation, which makes me deeply appreciate how difficult it is to implement a correct consensus algorithm, while Raft algorithm at production level requires much more. So I think maybe it's time for us to weigh in and make a decision. Choosing a mature implementation of the raft algorithm liberates our productivity, which may take a long time at first, but can be hugely beneficial once implemented.

But I also want to list two concerns:

- Production-level raft algorithms generally guarantee linearizability, which is the strictest consistency in distributed systems. Our current Raft algorithm may not be complete, so strictly speaking it does not guarantee linearizability, but at the same time it's performance may be better. What would we say if we migrated raft implementation and performance dropped? Furthermore, is linearizability really necessary for OLAP databases like time-series databases?

- At the moment we have Raft implementation and business logic mixed together, but there is a certain amount of performance improvement. For example, we currently put data from multiple storage groups into one Raft group to be executed synchronously. In fact, at the bottom, they can be executed in parallel. So we implemented parallel asynchronous apply optimization so that plans from the same Raft group but different storage groups can be applied in parallel. The performance gains from this optimization are significant. How should we handle this case after we migrate raft implementation? Add parallel apply feature to raft library Or do we change our partitioning pattern so that a Raft group serves only one storage group? Or just let them execute inefficiently?

Any discussions are welcomed~

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [iotdb] cigarl commented on pull request #3939: init dummyIndex after restart cluster

Posted by GitBox <gi...@apache.org>.

cigarl commented on pull request #3939:
URL: https://github.com/apache/iotdb/pull/3939#issuecomment-920522689


   > but we may also need a processor to avoid logIndex is too big, dunmmyIndex need to re-initialize if logIndex will be bigger than our expect.
   
   This may not seem like a problem.I tried to calculate that it would handle 100 million requests per second and the cluster would run for 2924 years.
   
   Sorry, I closed this PR due to my misoperation. I have reopened it.  


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org