You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2020/12/20 07:15:44 UTC

[GitHub] [ozone] amaliujia opened a new pull request #1725: HDDS-3208. Implement Ratis Snapshots on SCM

amaliujia opened a new pull request #1725:
URL: https://github.com/apache/ozone/pull/1725


   ## What changes were proposed in this pull request?
   
   Design doc: https://docs.google.com/document/d/1uy4_ER2V6nNQJ7_5455Wz8NmI142JHPnif6Y1OdPi8E/edit?usp=sharing
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-3208
   
   ## How was this patch tested?
   
   UT


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r559800666



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMDBTransactionBuffer.java
##########
@@ -0,0 +1,70 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p/>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p/>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import org.apache.hadoop.hdds.scm.metadata.SCMMetadataStore;
+import org.apache.hadoop.hdds.utils.db.BatchOperation;
+import org.apache.hadoop.hdds.utils.db.Table;
+
+import java.io.IOException;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_KEY;
+
+public class SCMDBTransactionBuffer {
+  private final SCMMetadataStore metadataStore;
+  private BatchOperation currentBatchOperation;
+  private SCMTransactionInfo latestTrxInfo;
+
+  public SCMDBTransactionBuffer(SCMMetadataStore store) {
+    this.metadataStore = store;
+
+    // initialize a batch operation during construction time
+    currentBatchOperation = this.metadataStore.getStore().initBatchOperation();
+    latestTrxInfo =
+        SCMTransactionInfo
+            .builder()
+            .setTransactionIndex(-1)
+            .setCurrentTerm(-1)
+            .build();
+  }
+
+  public BatchOperation getCurrentBatchOperation() {
+    return currentBatchOperation;
+  }
+
+  public void updateLatestTrxInfo(SCMTransactionInfo info) {
+    this.latestTrxInfo = info;

Review comment:
       Such case should be considered as a bug and be fixed. 
   
   I created a JIRA to further discuss how to deal with such case: https://issues.apache.org/jira/browse/HDDS-4723




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] GlenGeng commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
GlenGeng commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r559362036



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerManagerImpl.java
##########
@@ -397,4 +396,8 @@ public void close() throws IOException {
   protected ContainerStateManagerV2 getContainerStateManager() {
     return containerStateManager;
   }
+
+  public SCMHAManager getSCMHAManager() {

Review comment:
       Please add an annotation of `@VisibleForTesting`.

##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMRatisSnapshotInfo.java
##########
@@ -0,0 +1,76 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership. The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p>
+ * <p>http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * <p>Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import org.apache.ratis.server.protocol.TermIndex;
+import org.apache.ratis.server.storage.FileInfo;
+import org.apache.ratis.statemachine.SnapshotInfo;
+
+import java.util.List;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_SPLIT_KEY;
+
+/**
+ * This class captures the snapshotIndex and term of the latest snapshot in
+ * the SCM
+ * Ratis server loads the snapshotInfo during startup and updates the
+ * lastApplied index to this snapshotIndex. SCM SnapshotInfo does not contain
+ * any files. It is used only to store/ update the last applied index and term.
+ */
+public class SCMRatisSnapshotInfo implements SnapshotInfo {
+  private volatile long term;
+  private volatile long snapshotIndex;
+
+  public SCMRatisSnapshotInfo(long term, long index) {
+    this.term = term;
+    this.snapshotIndex = index;
+  }
+
+  public void updateTermIndex(long newTerm, long newIndex) {
+    this.term = newTerm;
+    this.snapshotIndex = newIndex;
+  }

Review comment:
       The `SCMRatisSnapshotInfo` is an immutable object after being created. How about make `term` and `snapshotIndex` be `final` and remove `updateTermIndex()` ?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bshashikant commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
bshashikant commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r559326145



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMStateMachine.java
##########
@@ -89,4 +119,30 @@ private Message process(final SCMRatisRequest request)
     }
   }
 
+  @Override
+  public long takeSnapshot() throws IOException {
+    long startTime = Time.monotonicNow();
+    TermIndex lastTermIndex = getLastAppliedTermIndex();
+    long lastAppliedIndex = lastTermIndex.getIndex();
+    SCMTransactionInfo lastAppliedTrxInfo =
+        SCMTransactionInfo.fromTermIndex(lastTermIndex);
+    if (transactionBuffer.getLatestTrxInfo()
+        .compareTo(lastAppliedTrxInfo) < 0) {
+      transactionBuffer.updateLatestTrxInfo(
+          SCMTransactionInfo.builder()
+              .setCurrentTerm(lastTermIndex.getTerm())
+              .setTransactionIndex(lastTermIndex.getIndex())
+              .build());
+      transactionBuffer.setLatestSnapshot(
+          transactionBuffer.getLatestTrxInfo().toSnapshotInfo());
+    } else {
+      lastAppliedIndex =
+          transactionBuffer.getLatestTrxInfo().getTransactionIndex();
+    }
+
+    transactionBuffer.flush();
+    LOG.info("Current Snapshot Index {}, takeSnapshot took {} ms",
+        getLastAppliedTermIndex(), Time.monotonicNow() - startTime);
+    return lastAppliedIndex;

Review comment:
       I see the buffer getting updated with applied index only after applyTransaction is called. ApplyTransaction is not invoked for every log transaction in Ratis. So, its better to rely on getLastAppliedTermIndex() rather than the buffer for the latest applied index.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bshashikant commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
bshashikant commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r559326758



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMStateMachine.java
##########
@@ -26,27 +26,55 @@
 import java.util.concurrent.CompletableFuture;
 
 import com.google.protobuf.InvalidProtocolBufferException;
+import org.apache.hadoop.hdds.scm.exceptions.SCMException;
+import org.apache.hadoop.util.Time;
 import org.apache.ratis.protocol.Message;
+import org.apache.ratis.server.protocol.TermIndex;
+import org.apache.ratis.statemachine.SnapshotInfo;
 import org.apache.ratis.statemachine.TransactionContext;
 import org.apache.ratis.statemachine.impl.BaseStateMachine;
 
 import org.apache.hadoop.hdds.protocol.proto.SCMRatisProtocol.RequestType;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import static org.apache.hadoop.hdds.scm.exceptions.SCMException.ResultCodes.SCM_NOT_INITIALIZED;
 
 /**
  * TODO.
  */
 public class SCMStateMachine extends BaseStateMachine {
+  private static final Logger LOG =
+      LoggerFactory.getLogger(BaseStateMachine.class);
 
   private final Map<RequestType, Object> handlers;
+  private final SCMDBTransactionBuffer transactionBuffer;
 
-  public SCMStateMachine() {
+  public SCMStateMachine(SCMDBTransactionBuffer buffer) throws SCMException {
     this.handlers = new EnumMap<>(RequestType.class);
+    this.transactionBuffer = buffer;
+    SCMTransactionInfo latestTrxInfo = buffer.getLatestTrxInfo();
+    if (!latestTrxInfo.isInitialized()) {
+      if (!updateLastAppliedTermIndex(latestTrxInfo.getTerm(),
+          latestTrxInfo.getTransactionIndex())) {
+        throw new SCMException(
+            String.format("Failed to update LastAppliedTermIndex " +
+                    "in StateMachine to term:{} index:{}",
+                latestTrxInfo.getTerm(), latestTrxInfo.getTransactionIndex()
+            ), SCM_NOT_INITIALIZED);
+      }
+    }
   }
 
   public void registerHandler(RequestType type, Object handler) {
     handlers.put(type, handler);
   }
 
+  @Override
+  public SnapshotInfo getLatestSnapshot() {
+    return transactionBuffer.getLatestSnapshot();
+  }
+
   @Override
   public CompletableFuture<Message> applyTransaction(

Review comment:
       Can we rename applyTransaction to applyTansactionSerial as this is a serialized operation anyways?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r557147410



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMStateMachine.java
##########
@@ -55,15 +84,25 @@ public void registerHandler(RequestType type, Object handler) {
     try {
       final SCMRatisRequest request = SCMRatisRequest.decode(
           Message.valueOf(trx.getStateMachineLogEntry().getLogData()));
-      applyTransactionFuture.complete(process(request));
+      applyTransactionFuture.complete(
+          process(
+              request,
+              SCMTransactionInfo.builder()
+                  .setCurrentTerm(trx.getLogEntry().getTerm())
+                  .setTransactionIndex(trx.getLogEntry().getIndex())
+                  .build()));
     } catch (Exception ex) {
       applyTransactionFuture.completeExceptionally(ex);
     }
     return applyTransactionFuture;
   }
 
-  private Message process(final SCMRatisRequest request)
-      throws Exception {
+  private boolean shouldUpdate(SCMTransactionInfo info) {
+    return !(info.getTransactionIndex() == -1 && info.getTerm() == 0);

Review comment:
       Makes sense




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r557759502



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMTransactionInfo.java
##########
@@ -0,0 +1,145 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership. The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p>
+ * <p>http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * <p>Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import com.google.common.base.Preconditions;
+import org.apache.hadoop.hdds.StringUtils;
+import org.apache.ratis.server.protocol.TermIndex;
+import org.apache.ratis.statemachine.SnapshotInfo;
+
+import java.util.Objects;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_SPLIT_KEY;
+
+/**
+ * SCMTransactionInfo saves two fields for a transaction:
+ *  1. term, of which the transaction belongs to
+ *  2. transactionIndex, which is a monotonic increasing index
+ *     (e.g. Raft Log index)
+ */
+final public class SCMTransactionInfo {
+  private long term;
+  private long transactionIndex;
+
+  private SCMTransactionInfo(String transactionInfo) {
+    String[] tInfo =
+        transactionInfo.split(TRANSACTION_INFO_SPLIT_KEY);
+    Preconditions.checkState(tInfo.length == 2,
+        "Incorrect TransactionInfo value");
+
+    term = Long.parseLong(tInfo[0]);
+    transactionIndex = Long.parseLong(tInfo[1]);
+  }
+
+  private SCMTransactionInfo(long currentTerm, long transactionIndex) {
+    this.term = currentTerm;
+    this.transactionIndex = transactionIndex;
+  }
+
+  public boolean shouldUpdate() {

Review comment:
       `isInitialized` is a good naming.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bshashikant commented on pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
bshashikant commented on pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#issuecomment-763824491


   Thanks @amaliujia for the contribution.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on a change in pull request #1725: HDDS-3208. Implement Ratis takeSnapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r553552267



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMStateMachine.java
##########
@@ -89,4 +99,15 @@ private Message process(final SCMRatisRequest request)
     }
   }
 
+  @Override
+  public long takeSnapshot() throws IOException {
+    LOG.info("Current Snapshot Index {}", getLastAppliedTermIndex());
+    long startTime = Time.monotonicNow();
+    TermIndex lastTermIndex = getLastAppliedTermIndex();
+    long lastAppliedIndex = lastTermIndex.getIndex();
+    transactionBuffer.flush();
+    LOG.debug("SCM takeSnapshot: {} took {} ms",
+        Time.monotonicNow() - startTime);
+    return lastAppliedIndex;

Review comment:
       Code changes a bit. Maybe take another look?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ChenSammi commented on a change in pull request #1725: HDDS-3208. SCM Snapshot Phase 1: Implement Ratis takeSnapshot on SCM

Posted by GitBox <gi...@apache.org>.
ChenSammi commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r549927194



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/metadata/SCMMetadataStore.java
##########
@@ -110,6 +111,11 @@
    */
   Table<PipelineID, Pipeline> getPipelineTable();
 
+  /**
+   *
+   */
+  Table <String, SCMTransactionInfo> getTransactionInfoTable();

Review comment:
       Are we missing some comments here? 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on a change in pull request #1725: HDDS-3208. SCM Snapshot Phase 1: Implement Ratis takeSnapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r548129231



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMTransactionInfo.java
##########
@@ -0,0 +1,111 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership. The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p>
+ * <p>http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * <p>Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import com.google.common.base.Preconditions;
+import org.apache.hadoop.hdds.StringUtils;
+
+import java.util.Objects;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_SPLIT_KEY;
+
+final public class SCMTransactionInfo {

Review comment:
       To reuse `OMTransactionInfo`, my current answer is probably no.
   
   Though both `OMTransactionInfo` and `SCMTransactionInfo` seems are doing the same thing for now, decoupling is good when there is a need that either `OMTransactionInfo` and `SCMTransactionInfo` need to add new/drop existing fields while the other does not need. Especially in this time that SCM snapshot is still under development. 
   
   I think we can consider this after SCM snapshot is stable. I can create a JIRA though to track the merge of two transaction info.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on pull request #1725: HDDS-3208. SCM Snapshot Phase 1: Implement Ratis takeSnapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#issuecomment-752855991


   @ChenSammi  thank you! I will start to address comments since there are a bunch of actionable ones existing.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on a change in pull request #1725: HDDS-3208. SCM Snapshot Phase 1: Implement Ratis takeSnapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r549943738



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMStateMachine.java
##########
@@ -89,4 +98,12 @@ private Message process(final SCMRatisRequest request)
     }
   }
 
+  @Override
+  public long takeSnapshot() throws IOException {
+    LOG.info("Current Snapshot Index {}", getLastAppliedTermIndex());
+    TermIndex lastTermIndex = getLastAppliedTermIndex();

Review comment:
       I believe we don't need to. 
   
   ```
       public TermIndex getLastAppliedTermIndex() {
           return (TermIndex)this.lastAppliedTermIndex.get();
       }
   ```
   
   ```
       protected boolean updateLastAppliedTermIndex(long term, long index) {
           TermIndex newTI = TermIndex.newTermIndex(term, index);
           TermIndex oldTI = (TermIndex)this.lastAppliedTermIndex.getAndSet(newTI);
            ....
   ```
   
   Both function access `this.lastAppliedTermIndex`. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on a change in pull request #1725: HDDS-3208. SCM Snapshot Phase 1: Implement Ratis takeSnapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r550405829



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/metadata/SCMDBDefinition.java
##########
@@ -93,6 +104,6 @@ public String getLocationConfigKey() {
   @Override
   public DBColumnFamilyDefinition[] getColumnFamilies() {
     return new DBColumnFamilyDefinition[] {DELETED_BLOCKS, VALID_CERTS,
-        REVOKED_CERTS, PIPELINES, CONTAINERS};

Review comment:
       https://issues.apache.org/jira/browse/HDDS-4635




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] GlenGeng commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
GlenGeng commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r556313457



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerManagerImpl.java
##########
@@ -101,6 +102,16 @@ public ContainerManagerImpl(
       final PipelineManager pipelineManager,
       final Table<ContainerID, ContainerInfo> containerStore)
       throws IOException {
+    this(conf, scmHaManager, pipelineManager, containerStore, null);

Review comment:
       Could we implement the mock buffer in this PR ? Since missing this change will pollute the production code, and will also give a burden to the on-going PR of deleted block log.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] linyiqun commented on a change in pull request #1725: HDDS-3208. SCM Snapshot Phase 1: Implement Ratis takeSnapshot on SCM

Posted by GitBox <gi...@apache.org>.
linyiqun commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r550893488



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMDBTransactionBuffer.java
##########
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p/>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p/>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import org.apache.hadoop.hdds.scm.metadata.SCMMetadataStore;
+import org.apache.hadoop.hdds.utils.db.BatchOperation;
+import org.apache.hadoop.hdds.utils.db.Table;
+
+import java.io.IOException;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_KEY;
+
+public class SCMDBTransactionBuffer {

Review comment:
       Could we document javadoc for this? That will be easier understood by others.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] GlenGeng commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
GlenGeng commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r557829940



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMStateMachine.java
##########
@@ -54,7 +54,7 @@ public SCMStateMachine(SCMDBTransactionBuffer buffer) throws SCMException {
     this.handlers = new EnumMap<>(RequestType.class);
     this.transactionBuffer = buffer;
     SCMTransactionInfo latestTrxInfo = buffer.getLatestTrxInfo();
-    if (latestTrxInfo.shouldUpdate()) {
+    if (!latestTrxInfo.isInitialized()) {

Review comment:
       It should be 
   ```
   if (latestTrxInfo.isInitialized()) {
     updateLastAppliedTermIndex(...)
   }
   ``` 
   ?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on a change in pull request #1725: HDDS-3208. SCM Snapshot Phase 1: Implement Ratis takeSnapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r550404996



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerManagerImpl.java
##########
@@ -101,6 +102,16 @@ public ContainerManagerImpl(
       final PipelineManager pipelineManager,
       final Table<ContainerID, ContainerInfo> containerStore)
       throws IOException {
+    this(conf, scmHaManager, pipelineManager, containerStore, null);

Review comment:
       Same https://issues.apache.org/jira/browse/HDDS-4634.
   
   This is because MockHAManager does not have a mock buffer implementation.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on pull request #1725: HDDS-3208. Implement Ratis takeSnapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#issuecomment-756342912


   @ChenSammi @runzhiwang @GlenGeng @linyiqun 
   
   I have rebased and updated this PR. Now this PR includes the takeSnapshot and loadSnapshot with a test.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ChenSammi commented on a change in pull request #1725: HDDS-3208. SCM Snapshot Phase 1: Implement Ratis takeSnapshot on SCM

Posted by GitBox <gi...@apache.org>.
ChenSammi commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r549916449



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMDBTransactionBuffer.java
##########
@@ -0,0 +1,70 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p/>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p/>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import org.apache.hadoop.hdds.scm.metadata.SCMMetadataStore;
+import org.apache.hadoop.hdds.utils.db.BatchOperation;
+import org.apache.hadoop.hdds.utils.db.Table;
+
+import java.io.IOException;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_KEY;
+
+public class SCMDBTransactionBuffer {
+  private final SCMMetadataStore metadataStore;
+  private BatchOperation currentBatchOperation;
+  private SCMTransactionInfo latestTrxInfo;
+
+  public SCMDBTransactionBuffer(SCMMetadataStore store) {
+    this.metadataStore = store;
+
+    // initialize a batch operation during construction time
+    currentBatchOperation = this.metadataStore.getStore().initBatchOperation();
+    latestTrxInfo =
+        SCMTransactionInfo
+            .builder()
+            .setTransactionIndex(-1)
+            .setCurrentTerm(-1)
+            .build();
+  }
+
+  public BatchOperation getCurrentBatchOperation() {
+    return currentBatchOperation;
+  }
+
+  public void updateLatestTrxInfo(SCMTransactionInfo info) {
+    this.latestTrxInfo = info;

Review comment:
       Shall we add some checks here to guarantee new info is greater than current lastTrxInfo? 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ChenSammi commented on a change in pull request #1725: HDDS-3208. SCM Snapshot Phase 1: Implement Ratis takeSnapshot on SCM

Posted by GitBox <gi...@apache.org>.
ChenSammi commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r549927001



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/metadata/SCMDBDefinition.java
##########
@@ -93,6 +104,6 @@ public String getLocationConfigKey() {
   @Override
   public DBColumnFamilyDefinition[] getColumnFamilies() {
     return new DBColumnFamilyDefinition[] {DELETED_BLOCKS, VALID_CERTS,
-        REVOKED_CERTS, PIPELINES, CONTAINERS};

Review comment:
       Since we add a new table in SCM DB,  we need a JIRA to track how to handle back compatibility during upgrade non HA SCM to HA SCM. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on a change in pull request #1725: HDDS-3208. SCM Snapshot Phase 1: Implement Ratis takeSnapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r550406229



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMStateMachine.java
##########
@@ -89,4 +98,12 @@ private Message process(final SCMRatisRequest request)
     }
   }
 
+  @Override
+  public long takeSnapshot() throws IOException {
+    LOG.info("Current Snapshot Index {}", getLastAppliedTermIndex());

Review comment:
       Done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r553765849



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMTransactionInfo.java
##########
@@ -0,0 +1,111 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership. The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p>
+ * <p>http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * <p>Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import com.google.common.base.Preconditions;
+import org.apache.hadoop.hdds.StringUtils;
+
+import java.util.Objects;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_SPLIT_KEY;
+
+final public class SCMTransactionInfo {

Review comment:
       I think we are better to reuse `OMTransactionInfo` after merge HDDS-2823 back to master.
   
   Not reusing `OMTransactionInfo` is to try to reduce the risk that have conflicts after syncing master int HDDS-2823. Also, to reuse `OMTransactionInfo`, I think we'd better to update the name to `TransactionInfo`, which will be a source of conflicts , unless we can change it in the master branch first.
   
   To conclude a bit, we can reuse it after merging SCM HA back to master, the goal is try to reduce potential conflicts if there could be any.
   
   SCM HA is in developing branch anyway and I am sure we will need to clean up lots of stuff eventually. As long as we have JIRAs to not lose track of such work. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r557145827



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMStateMachine.java
##########
@@ -89,4 +128,33 @@ private Message process(final SCMRatisRequest request)
     }
   }
 
+  @Override
+  public long takeSnapshot() throws IOException {
+    LOG.info("Current Snapshot Index {}", getLastAppliedTermIndex());
+    long startTime = Time.monotonicNow();
+    TermIndex lastTermIndex = getLastAppliedTermIndex();
+    long lastAppliedIndex = lastTermIndex.getIndex();
+    SCMTransactionInfo lastAppliedTrxInfo =
+        SCMTransactionInfo.fromTermIndex(lastTermIndex);
+    if (transactionBuffer.getLatestTrxInfo()
+        .compareTo(lastAppliedTrxInfo) < 0) {
+      transactionBuffer.updateLatestTrxInfo(
+          SCMTransactionInfo.builder()
+              .setCurrentTerm(lastTermIndex.getTerm())
+              .setTransactionIndex(lastTermIndex.getIndex())
+              .build());
+      transactionBuffer.setLatestSnapshot(
+          transactionBuffer.getLatestTrxInfo().toSnapshotInfo());
+    } else {
+      lastAppliedIndex =
+          transactionBuffer.getLatestTrxInfo().getTransactionIndex();
+    }
+
+    transactionBuffer.flush();
+    transactionBuffer.setLatestSnapshot(
+        transactionBuffer.getLatestTrxInfo().toSnapshotInfo());
+    LOG.debug("SCM takeSnapshot: {} took {} ms",

Review comment:
       Done

##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMStateMachine.java
##########
@@ -55,15 +84,25 @@ public void registerHandler(RequestType type, Object handler) {
     try {
       final SCMRatisRequest request = SCMRatisRequest.decode(
           Message.valueOf(trx.getStateMachineLogEntry().getLogData()));
-      applyTransactionFuture.complete(process(request));
+      applyTransactionFuture.complete(
+          process(

Review comment:
       +1




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] GlenGeng commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
GlenGeng commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r556331845



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMDBTransactionBuffer.java
##########
@@ -0,0 +1,101 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p/>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p/>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import org.apache.hadoop.hdds.scm.metadata.SCMMetadataStore;
+import org.apache.hadoop.hdds.utils.db.BatchOperation;
+import org.apache.hadoop.hdds.utils.db.Table;
+import org.apache.ratis.statemachine.SnapshotInfo;
+
+import java.io.IOException;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_KEY;
+
+/**
+ * This is a transaction buffer that buffers SCM DB operations for Pipeline and
+ * Container. When flush this buffer to DB, a transaction info will also be
+ * written into DB to indicate the term and transaction index for the latest
+ * operation in DB.
+ */
+public class SCMDBTransactionBuffer {
+  private final SCMMetadataStore metadataStore;
+  private BatchOperation currentBatchOperation;
+  private SCMTransactionInfo latestTrxInfo;
+  private SnapshotInfo latestSnapshot;
+
+  public SCMDBTransactionBuffer(SCMMetadataStore store) throws IOException {
+    this.metadataStore = store;
+
+    // initialize a batch operation during construction time
+    currentBatchOperation = this.metadataStore.getStore().initBatchOperation();
+    latestTrxInfo = store.getTransactionInfoTable().get(TRANSACTION_INFO_KEY);
+    if (latestTrxInfo == null) {
+      // transaction table is empty
+      latestTrxInfo =
+          SCMTransactionInfo
+              .builder()
+              .setTransactionIndex(-1)
+              .setCurrentTerm(0)
+              .build();
+    }
+    latestSnapshot = latestTrxInfo.toSnapshotInfo();
+  }
+
+  public BatchOperation getCurrentBatchOperation() {
+    return currentBatchOperation;
+  }
+
+  public void updateLatestTrxInfo(SCMTransactionInfo info) {
+    if (info.compareTo(this.latestTrxInfo) <= 0) {
+      throw new IllegalArgumentException(
+          "Updating DB buffer transaction info by an older transaction info, "
+          + "current: " + this.latestTrxInfo + ", updating to: " + info);
+    }
+    this.latestTrxInfo = info;
+  }
+
+  public SCMTransactionInfo getLatestTrxInfo() {
+    return this.latestTrxInfo;
+  }
+
+  public SnapshotInfo getLatestSnapshot() {
+    return latestSnapshot;
+  }
+
+  public void setLatestSnapshot(SnapshotInfo latestSnapshot) {
+    this.latestSnapshot = latestSnapshot;
+  }
+
+  public void flush() throws IOException {
+    // write latest trx info into trx table in the same batch
+    Table<String, SCMTransactionInfo> transactionInfoTable
+        = metadataStore.getTransactionInfoTable();
+    transactionInfoTable.putWithBatch(currentBatchOperation,
+        TRANSACTION_INFO_KEY, latestTrxInfo);
+
+    metadataStore.getStore().commitBatchOperation(currentBatchOperation);
+    currentBatchOperation.close();
+

Review comment:
       Will it be better to call `latestSnapshot = latestTrxInfo.toSnapshotInfo();` here, and remove the `setLatestSnapshot ()` method ? 
   If caller always has to call `flush()` and `setLatestSnapshot()` together, better to merge them to avoid  human mistake.

##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMStateMachine.java
##########
@@ -55,15 +84,25 @@ public void registerHandler(RequestType type, Object handler) {
     try {
       final SCMRatisRequest request = SCMRatisRequest.decode(
           Message.valueOf(trx.getStateMachineLogEntry().getLogData()));
-      applyTransactionFuture.complete(process(request));
+      applyTransactionFuture.complete(
+          process(

Review comment:
       How about revert the change to `process()` and change like this ?
   ```
   applyTransactionFuture.complete(process(request));
   transactionBuffer.updateLatestTrxInfo(SCMTransactionInfo.builder()
                      .setCurrentTerm(trx.getLogEntry().getTerm())
                      .setTransactionIndex(trx.getLogEntry().getIndex())
                      .build()));
   ```
   the `process()` does not needs to know about the `trxInfo `

##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMStateMachine.java
##########
@@ -55,15 +84,25 @@ public void registerHandler(RequestType type, Object handler) {
     try {
       final SCMRatisRequest request = SCMRatisRequest.decode(
           Message.valueOf(trx.getStateMachineLogEntry().getLogData()));
-      applyTransactionFuture.complete(process(request));
+      applyTransactionFuture.complete(
+          process(
+              request,
+              SCMTransactionInfo.builder()
+                  .setCurrentTerm(trx.getLogEntry().getTerm())
+                  .setTransactionIndex(trx.getLogEntry().getIndex())
+                  .build()));
     } catch (Exception ex) {
       applyTransactionFuture.completeExceptionally(ex);
     }
     return applyTransactionFuture;
   }
 
-  private Message process(final SCMRatisRequest request)
-      throws Exception {
+  private boolean shouldUpdate(SCMTransactionInfo info) {
+    return !(info.getTransactionIndex() == -1 && info.getTerm() == 0);

Review comment:
       How about move this `shouldUpdate()` in to `SCMTransactionInfo`, as a method `isEmpty()`? We'd better encapulate  the magic number `0` and `-1` into `SCMTransactionInfo `.

##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMStateMachine.java
##########
@@ -89,4 +128,33 @@ private Message process(final SCMRatisRequest request)
     }
   }
 
+  @Override
+  public long takeSnapshot() throws IOException {
+    LOG.info("Current Snapshot Index {}", getLastAppliedTermIndex());
+    long startTime = Time.monotonicNow();
+    TermIndex lastTermIndex = getLastAppliedTermIndex();
+    long lastAppliedIndex = lastTermIndex.getIndex();
+    SCMTransactionInfo lastAppliedTrxInfo =
+        SCMTransactionInfo.fromTermIndex(lastTermIndex);
+    if (transactionBuffer.getLatestTrxInfo()
+        .compareTo(lastAppliedTrxInfo) < 0) {
+      transactionBuffer.updateLatestTrxInfo(
+          SCMTransactionInfo.builder()
+              .setCurrentTerm(lastTermIndex.getTerm())
+              .setTransactionIndex(lastTermIndex.getIndex())
+              .build());
+      transactionBuffer.setLatestSnapshot(
+          transactionBuffer.getLatestTrxInfo().toSnapshotInfo());
+    } else {
+      lastAppliedIndex =
+          transactionBuffer.getLatestTrxInfo().getTransactionIndex();
+    }
+
+    transactionBuffer.flush();
+    transactionBuffer.setLatestSnapshot(
+        transactionBuffer.getLatestTrxInfo().toSnapshotInfo());
+    LOG.debug("SCM takeSnapshot: {} took {} ms",

Review comment:
       Better remove the info in line 133 and replace 156 as 
   
   ```
   LOG.info("Current Snapshot Index {}, takeSnapshot took {} ms", 
       getLastAppliedTermIndex(), Time.monotonicNow() - startTime);
   ```

##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMStateMachine.java
##########
@@ -89,4 +98,12 @@ private Message process(final SCMRatisRequest request)
     }
   }
 
+  @Override
+  public long takeSnapshot() throws IOException {
+    LOG.info("Current Snapshot Index {}", getLastAppliedTermIndex());
+    TermIndex lastTermIndex = getLastAppliedTermIndex();

Review comment:
       You need call `updateLastAppliedTermIndex ` in `applyTransaction()`, the `lastAppliedTermIndex` in `BaseStateMachine` should be updated manually when apply a entry.
   
   Please double check `applyTransaction` in `BaseStateMachine`
   ```
     @Override
     public CompletableFuture<Message> applyTransaction(TransactionContext trx) {
       // return the same message contained in the entry
       RaftProtos.LogEntryProto entry = Objects.requireNonNull(trx.getLogEntry());
       updateLastAppliedTermIndex(entry.getTerm(), entry.getIndex());
       return CompletableFuture.completedFuture(
           Message.valueOf(trx.getLogEntry().getStateMachineLogEntry().getLogData()));
     }
   ```
   and  `applyTransaction` in `ArithmeticStateMachine`.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on a change in pull request #1725: HDDS-3208. SCM Snapshot Phase 1: Implement Ratis takeSnapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r548131536



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMStateMachine.java
##########
@@ -89,4 +98,12 @@ private Message process(final SCMRatisRequest request)
     }
   }
 
+  @Override
+  public long takeSnapshot() throws IOException {
+    LOG.info("Current Snapshot Index {}", getLastAppliedTermIndex());

Review comment:
       makes sense.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r557147075



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMDBTransactionBuffer.java
##########
@@ -0,0 +1,101 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p/>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p/>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import org.apache.hadoop.hdds.scm.metadata.SCMMetadataStore;
+import org.apache.hadoop.hdds.utils.db.BatchOperation;
+import org.apache.hadoop.hdds.utils.db.Table;
+import org.apache.ratis.statemachine.SnapshotInfo;
+
+import java.io.IOException;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_KEY;
+
+/**
+ * This is a transaction buffer that buffers SCM DB operations for Pipeline and
+ * Container. When flush this buffer to DB, a transaction info will also be
+ * written into DB to indicate the term and transaction index for the latest
+ * operation in DB.
+ */
+public class SCMDBTransactionBuffer {
+  private final SCMMetadataStore metadataStore;
+  private BatchOperation currentBatchOperation;
+  private SCMTransactionInfo latestTrxInfo;
+  private SnapshotInfo latestSnapshot;
+
+  public SCMDBTransactionBuffer(SCMMetadataStore store) throws IOException {
+    this.metadataStore = store;
+
+    // initialize a batch operation during construction time
+    currentBatchOperation = this.metadataStore.getStore().initBatchOperation();
+    latestTrxInfo = store.getTransactionInfoTable().get(TRANSACTION_INFO_KEY);
+    if (latestTrxInfo == null) {
+      // transaction table is empty
+      latestTrxInfo =
+          SCMTransactionInfo
+              .builder()
+              .setTransactionIndex(-1)
+              .setCurrentTerm(0)
+              .build();
+    }
+    latestSnapshot = latestTrxInfo.toSnapshotInfo();
+  }
+
+  public BatchOperation getCurrentBatchOperation() {
+    return currentBatchOperation;
+  }
+
+  public void updateLatestTrxInfo(SCMTransactionInfo info) {
+    if (info.compareTo(this.latestTrxInfo) <= 0) {
+      throw new IllegalArgumentException(
+          "Updating DB buffer transaction info by an older transaction info, "
+          + "current: " + this.latestTrxInfo + ", updating to: " + info);
+    }
+    this.latestTrxInfo = info;
+  }
+
+  public SCMTransactionInfo getLatestTrxInfo() {
+    return this.latestTrxInfo;
+  }
+
+  public SnapshotInfo getLatestSnapshot() {
+    return latestSnapshot;
+  }
+
+  public void setLatestSnapshot(SnapshotInfo latestSnapshot) {
+    this.latestSnapshot = latestSnapshot;
+  }
+
+  public void flush() throws IOException {
+    // write latest trx info into trx table in the same batch
+    Table<String, SCMTransactionInfo> transactionInfoTable
+        = metadataStore.getTransactionInfoTable();
+    transactionInfoTable.putWithBatch(currentBatchOperation,
+        TRANSACTION_INFO_KEY, latestTrxInfo);
+
+    metadataStore.getStore().commitBatchOperation(currentBatchOperation);
+    currentBatchOperation.close();
+

Review comment:
       The snapshot info is set also during the initialization. So not only `flush()` will need to update that information.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ChenSammi commented on a change in pull request #1725: HDDS-3208. SCM Snapshot Phase 1: Implement Ratis takeSnapshot on SCM

Posted by GitBox <gi...@apache.org>.
ChenSammi commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r550383329



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMStateMachine.java
##########
@@ -89,4 +98,12 @@ private Message process(final SCMRatisRequest request)
     }
   }
 
+  @Override
+  public long takeSnapshot() throws IOException {
+    LOG.info("Current Snapshot Index {}", getLastAppliedTermIndex());
+    TermIndex lastTermIndex = getLastAppliedTermIndex();

Review comment:
       Since all rocksdb writes are batched to write to disk in takesnapshot,  we need to change takesnapshot default frequency of ratis during scm ratis server creation.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r554169986



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMTransactionInfo.java
##########
@@ -0,0 +1,111 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership. The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p>
+ * <p>http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * <p>Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import com.google.common.base.Preconditions;
+import org.apache.hadoop.hdds.StringUtils;
+
+import java.util.Objects;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_SPLIT_KEY;
+
+final public class SCMTransactionInfo {

Review comment:
       Sure we can do that. I will need add some changes in main branch, and then sync that to HDDS-2823 and then reuse in SCM.
   
   I think we should do that separately as that will take time. I have created https://issues.apache.org/jira/browse/HDDS-4660 to track this effort.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] GlenGeng commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
GlenGeng commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r553759439



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMDBTransactionBuffer.java
##########
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p/>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p/>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import org.apache.hadoop.hdds.scm.metadata.SCMMetadataStore;
+import org.apache.hadoop.hdds.utils.db.BatchOperation;
+import org.apache.hadoop.hdds.utils.db.Table;
+
+import java.io.IOException;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_KEY;
+
+public class SCMDBTransactionBuffer {
+  private final SCMMetadataStore metadataStore;
+  private BatchOperation currentBatchOperation;
+  private SCMTransactionInfo latestTrxInfo;
+
+  public SCMDBTransactionBuffer(SCMMetadataStore store) {
+    this.metadataStore = store;
+
+    // initialize a batch operation during construction time
+    currentBatchOperation = this.metadataStore.getStore().initBatchOperation();
+    latestTrxInfo =
+        SCMTransactionInfo
+            .builder()
+            .setTransactionIndex(-1)
+            .setCurrentTerm(0)
+            .build();
+  }
+
+  public BatchOperation getCurrentBatchOperation() {

Review comment:
       How about return an anonymous subclass of `RDBBatchOperation` whose 'commit()' will throw `RuntimeException`, which means you can only write to the batch, but there is no way to commit this batch ?
   
   `RDBBatchOperation#commit()` is called by `RDBStore#commitBatchOperation()`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r555393761



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMRatisSnapshotInfo.java
##########
@@ -0,0 +1,76 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership. The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p>
+ * <p>http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * <p>Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import org.apache.ratis.server.protocol.TermIndex;
+import org.apache.ratis.server.storage.FileInfo;
+import org.apache.ratis.statemachine.SnapshotInfo;
+
+import java.util.List;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_SPLIT_KEY;
+
+/**
+ * This class captures the snapshotIndex and term of the latest snapshot in
+ * the SCM
+ * Ratis server loads the snapshotInfo during startup and updates the
+ * lastApplied index to this snapshotIndex. SCM SnapshotInfo does not contain
+ * any files. It is used only to store/ update the last applied index and term.
+ */
+public class SCMRatisSnapshotInfo implements SnapshotInfo {
+  private volatile long term;
+  private volatile long snapshotIndex;
+
+  public SCMRatisSnapshotInfo(long term, long index) {
+    this.term = term;
+    this.snapshotIndex = index;
+  }
+
+  public void updateTermIndex(long newTerm, long newIndex) {
+    this.term = newTerm;
+    this.snapshotIndex = newIndex;
+  }

Review comment:
       got it. Will remove this method.
   
   The `volatile` is a good point. I am also looking for comments to point me out whether there are potential concurrent operations.  I will remove `volatile` if there is no comment about this situation.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] linyiqun commented on a change in pull request #1725: HDDS-3208. SCM Snapshot Phase 1: Implement Ratis takeSnapshot on SCM

Posted by GitBox <gi...@apache.org>.
linyiqun commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r550893488



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMDBTransactionBuffer.java
##########
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p/>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p/>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import org.apache.hadoop.hdds.scm.metadata.SCMMetadataStore;
+import org.apache.hadoop.hdds.utils.db.BatchOperation;
+import org.apache.hadoop.hdds.utils.db.Table;
+
+import java.io.IOException;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_KEY;
+
+public class SCMDBTransactionBuffer {

Review comment:
       Could we document javadoc for this? That will be easier understood by others. At lease, we should document the purpose for this transaction buffer.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bshashikant commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
bshashikant commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r558145432



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMDBTransactionBuffer.java
##########
@@ -0,0 +1,70 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p/>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p/>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import org.apache.hadoop.hdds.scm.metadata.SCMMetadataStore;
+import org.apache.hadoop.hdds.utils.db.BatchOperation;
+import org.apache.hadoop.hdds.utils.db.Table;
+
+import java.io.IOException;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_KEY;
+
+public class SCMDBTransactionBuffer {
+  private final SCMMetadataStore metadataStore;
+  private BatchOperation currentBatchOperation;
+  private SCMTransactionInfo latestTrxInfo;
+
+  public SCMDBTransactionBuffer(SCMMetadataStore store) {
+    this.metadataStore = store;
+
+    // initialize a batch operation during construction time
+    currentBatchOperation = this.metadataStore.getStore().initBatchOperation();
+    latestTrxInfo =
+        SCMTransactionInfo
+            .builder()
+            .setTransactionIndex(-1)
+            .setCurrentTerm(-1)
+            .build();
+  }
+
+  public BatchOperation getCurrentBatchOperation() {
+    return currentBatchOperation;
+  }
+
+  public void updateLatestTrxInfo(SCMTransactionInfo info) {
+    this.latestTrxInfo = info;

Review comment:
       In my opinion, we should stop the ratis server here of such a case arise. It is fatal case where we cannot move ahead from here.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r559798815



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMStateMachine.java
##########
@@ -26,27 +26,55 @@
 import java.util.concurrent.CompletableFuture;
 
 import com.google.protobuf.InvalidProtocolBufferException;
+import org.apache.hadoop.hdds.scm.exceptions.SCMException;
+import org.apache.hadoop.util.Time;
 import org.apache.ratis.protocol.Message;
+import org.apache.ratis.server.protocol.TermIndex;
+import org.apache.ratis.statemachine.SnapshotInfo;
 import org.apache.ratis.statemachine.TransactionContext;
 import org.apache.ratis.statemachine.impl.BaseStateMachine;
 
 import org.apache.hadoop.hdds.protocol.proto.SCMRatisProtocol.RequestType;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import static org.apache.hadoop.hdds.scm.exceptions.SCMException.ResultCodes.SCM_NOT_INITIALIZED;
 
 /**
  * TODO.
  */
 public class SCMStateMachine extends BaseStateMachine {
+  private static final Logger LOG =
+      LoggerFactory.getLogger(BaseStateMachine.class);
 
   private final Map<RequestType, Object> handlers;
+  private final SCMDBTransactionBuffer transactionBuffer;
 
-  public SCMStateMachine() {
+  public SCMStateMachine(SCMDBTransactionBuffer buffer) throws SCMException {
     this.handlers = new EnumMap<>(RequestType.class);
+    this.transactionBuffer = buffer;
+    SCMTransactionInfo latestTrxInfo = buffer.getLatestTrxInfo();
+    if (!latestTrxInfo.isInitialized()) {
+      if (!updateLastAppliedTermIndex(latestTrxInfo.getTerm(),
+          latestTrxInfo.getTransactionIndex())) {
+        throw new SCMException(
+            String.format("Failed to update LastAppliedTermIndex " +
+                    "in StateMachine to term:{} index:{}",
+                latestTrxInfo.getTerm(), latestTrxInfo.getTransactionIndex()
+            ), SCM_NOT_INITIALIZED);
+      }
+    }
   }
 
   public void registerHandler(RequestType type, Object handler) {
     handlers.put(type, handler);
   }
 
+  @Override
+  public SnapshotInfo getLatestSnapshot() {
+    return transactionBuffer.getLatestSnapshot();
+  }
+
   @Override
   public CompletableFuture<Message> applyTransaction(

Review comment:
       I created https://issues.apache.org/jira/browse/HDDS-4684.
   
   After check the Ratis state machine interface, I can see two functions:
   ```
     /**
      * Called for transactions that have been committed to the RAFT log. This step is called
      * sequentially in strict serial order that the transactions have been committed in the log.
      * The SM is expected to do only necessary work, and leave the actual apply operation to the
      * applyTransaction calls that can happen concurrently.
      * @param trx the transaction state including the log entry that has been committed to a quorum
      *            of the raft peers
      * @return The Transaction context.
      */
     TransactionContext applyTransactionSerial(TransactionContext trx);
   
     /**
      * Apply a committed log entry to the state machine. This method can be called concurrently with
      * the other calls, and there is no guarantee that the calls will be ordered according to the
      * log commit order.
      * @param trx the transaction state including the log entry that has been committed to a quorum
      *            of the raft peers
      */
     CompletableFuture<Message> applyTransaction(TransactionContext trx);
   ```
   
   So there is no API as the following 
   ```
     Message applyTransactionSerial(TransactionContext trx);
   ```
   
   Basically a function to apply transaction in strict serial order where it returns a Message.
   
   I am planning to send an email to Ratis community to discuss the intention of only having one applyTransaction that returns Message (CompletableFuture<Message>) but can be called concurrently.
   
   We can address this API change in HDDS-4684.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on a change in pull request #1725: HDDS-3208. SCM Snapshot Phase 1: Implement Ratis takeSnapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r548129231



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMTransactionInfo.java
##########
@@ -0,0 +1,111 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership. The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p>
+ * <p>http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * <p>Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import com.google.common.base.Preconditions;
+import org.apache.hadoop.hdds.StringUtils;
+
+import java.util.Objects;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_SPLIT_KEY;
+
+final public class SCMTransactionInfo {

Review comment:
       To reuse `OMTransactionInfo`, my current answer is probably no.
   
   Though both `OMTransactionInfo` and `SCMTransactionInfo` seems are doing the same thing for now, decoupling is good when there is a need that either `OMTransactionInfo` and `SCMTransactionInfo` need to add new/drop existing fields while the other does not need. Especially in this time that SCM snapshot is still under development. 
   
   I can create a JIRA though to track the merge of two transaction info.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] GlenGeng commented on a change in pull request #1725: HDDS-3208. Implement Ratis Snapshots on SCM

Posted by GitBox <gi...@apache.org>.
GlenGeng commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r547708056



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMStateMachine.java
##########
@@ -89,4 +98,12 @@ private Message process(final SCMRatisRequest request)
     }
   }
 
+  @Override
+  public long takeSnapshot() throws IOException {
+    LOG.info("Current Snapshot Index {}", getLastAppliedTermIndex());
+    TermIndex lastTermIndex = getLastAppliedTermIndex();

Review comment:
       You have to call `BaseStateMachine#updateLastAppliedTermIndex()` to make `BaseStateMachine#getLastAppliedTermIndex ()` to take effect.

##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMDBTransactionBuffer.java
##########
@@ -0,0 +1,70 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p/>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p/>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import org.apache.hadoop.hdds.scm.metadata.SCMMetadataStore;
+import org.apache.hadoop.hdds.utils.db.BatchOperation;
+import org.apache.hadoop.hdds.utils.db.Table;
+
+import java.io.IOException;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_KEY;
+
+public class SCMDBTransactionBuffer {
+  private final SCMMetadataStore metadataStore;
+  private BatchOperation currentBatchOperation;
+  private SCMTransactionInfo latestTrxInfo;
+
+  public SCMDBTransactionBuffer(SCMMetadataStore store) {
+    this.metadataStore = store;
+
+    // initialize a batch operation during construction time
+    currentBatchOperation = this.metadataStore.getStore().initBatchOperation();
+    latestTrxInfo =
+        SCMTransactionInfo
+            .builder()
+            .setTransactionIndex(-1)
+            .setCurrentTerm(-1)
+            .build();
+  }
+
+  public BatchOperation getCurrentBatchOperation() {
+    return currentBatchOperation;
+  }
+
+  public void updateLatestTrxInfo(SCMTransactionInfo info) {
+    this.latestTrxInfo = info;
+  }
+
+  public SCMTransactionInfo getLatestTrxInfo() {
+    return this.latestTrxInfo;
+  }
+
+  public void flush() throws IOException {
+    // write latest trx info into trx table in the same batch
+    Table<String, SCMTransactionInfo> transactionInfoTable
+        = metadataStore.getTransactionInfoTable();
+    transactionInfoTable.putWithBatch(currentBatchOperation,
+        TRANSACTION_INFO_KEY, latestTrxInfo);
+
+    metadataStore.getStore().commitBatchOperation(currentBatchOperation);
+    currentBatchOperation.close();

Review comment:
       Good Job!

##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/metadata/SCMTransactionInfoCodec.java
##########
@@ -0,0 +1,48 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hdds.scm.metadata;
+
+import org.apache.hadoop.hdds.scm.ha.SCMTransactionInfo;
+import org.apache.hadoop.hdds.utils.db.Codec;
+
+import java.io.IOException;
+
+import static com.google.common.base.Preconditions.checkNotNull;
+
+public class SCMTransactionInfoCodec implements Codec<SCMTransactionInfo> {
+
+  @Override
+  public byte[] toPersistedFormat(SCMTransactionInfo object)
+      throws IOException {
+    checkNotNull(object, "Null object can't be converted to byte array.");

Review comment:
       Ditto,  could we reuse `OMTransactionInfoCodec` ?

##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMDBTransactionBuffer.java
##########
@@ -0,0 +1,70 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p/>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p/>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import org.apache.hadoop.hdds.scm.metadata.SCMMetadataStore;
+import org.apache.hadoop.hdds.utils.db.BatchOperation;
+import org.apache.hadoop.hdds.utils.db.Table;
+
+import java.io.IOException;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_KEY;
+
+public class SCMDBTransactionBuffer {
+  private final SCMMetadataStore metadataStore;
+  private BatchOperation currentBatchOperation;
+  private SCMTransactionInfo latestTrxInfo;
+
+  public SCMDBTransactionBuffer(SCMMetadataStore store) {
+    this.metadataStore = store;
+
+    // initialize a batch operation during construction time
+    currentBatchOperation = this.metadataStore.getStore().initBatchOperation();
+    latestTrxInfo =
+        SCMTransactionInfo
+            .builder()
+            .setTransactionIndex(-1)
+            .setCurrentTerm(-1)

Review comment:
       `.setCurrentTerm(0)`. The first leader will be elected from term 1.

##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMStateMachine.java
##########
@@ -89,4 +98,12 @@ private Message process(final SCMRatisRequest request)
     }
   }
 
+  @Override
+  public long takeSnapshot() throws IOException {
+    LOG.info("Current Snapshot Index {}", getLastAppliedTermIndex());

Review comment:
       Better calculate the time cost during `takeSnapshot()`.

##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMTransactionInfo.java
##########
@@ -0,0 +1,111 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership. The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p>
+ * <p>http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * <p>Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import com.google.common.base.Preconditions;
+import org.apache.hadoop.hdds.StringUtils;
+
+import java.util.Objects;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_SPLIT_KEY;
+
+final public class SCMTransactionInfo {

Review comment:
       - Will it be better to put `SCMTransactionInfo` near `OMTransactionInfo`, under `interface-storage` ?
   
   - Can we reuse the `OMTransactionInfo`? For example, rename `OMTransactionInfo` to `TransactionInfo`, and use them in both OM and SCM ?

##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerStateManagerImpl.java
##########
@@ -275,12 +278,24 @@ public void addContainer(final ContainerInfoProto containerInfo)
 
     if (!containers.contains(containerID)) {
       ExecutionUtil.create(() -> {
-        containerStore.put(containerID, container);
+        if (transactionBuffer != null) {
+          containerStore.putWithBatch(
+              transactionBuffer.getCurrentBatchOperation(),
+              containerID, container);
+        } else {
+          containerStore.put(containerID, container);

Review comment:
       Can we remove the 'else {}' branch ? When we will migrate to batch operations, there will be no other way to do db writes.

##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/pipeline/PipelineManagerV2Impl.java
##########
@@ -114,11 +115,21 @@ public static PipelineManagerV2Impl newPipelineManager(
       ConfigurationSource conf, SCMHAManager scmhaManager,
       NodeManager nodeManager, Table<PipelineID, Pipeline> pipelineStore,
       EventPublisher eventPublisher) throws IOException {
+    return newPipelineManager(conf, scmhaManager, nodeManager, pipelineStore,

Review comment:
       Ditto, may we remove this Ctor ?

##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerManagerImpl.java
##########
@@ -101,6 +102,16 @@ public ContainerManagerImpl(
       final PipelineManager pipelineManager,
       final Table<ContainerID, ContainerInfo> containerStore)
       throws IOException {
+    this(conf, scmHaManager, pipelineManager, containerStore, null);

Review comment:
       Could we avoid using this Ctor ? We may need to change the related test cases.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on pull request #1725: HDDS-3208. SCM Snapshot Phase 1: Implement Ratis takeSnapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#issuecomment-751327815


   With https://github.com/apache/ozone/pull/1733 we will be able to write tests for snapshot based on on single Ratis server SCM setup in MiniOzoneCluster.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bshashikant merged pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
bshashikant merged pull request #1725:
URL: https://github.com/apache/ozone/pull/1725


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on pull request #1725: HDDS-3208. Implement Ratis Snapshots on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#issuecomment-748852723


   I found one of the failed test is `TestMiniOzoneCluster.testDNstartAfterSCM`, which failed on loading RocksDB when restarting SCM. Will try investigate the root cause.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] GlenGeng commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
GlenGeng commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r553781521



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMTransactionInfo.java
##########
@@ -0,0 +1,111 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership. The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p>
+ * <p>http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * <p>Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import com.google.common.base.Preconditions;
+import org.apache.hadoop.hdds.StringUtils;
+
+import java.util.Objects;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_SPLIT_KEY;
+
+final public class SCMTransactionInfo {

Review comment:
       `OMTransactionInfo` is written by Bharat, since SCM and OM has a quite similar requirement, we may have a talk with him, to determine whether OM and SCM can share the same `TransactionInfo` and the `Codec`. @amaliujia  What do you think ?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r553743247



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMDBTransactionBuffer.java
##########
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p/>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p/>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import org.apache.hadoop.hdds.scm.metadata.SCMMetadataStore;
+import org.apache.hadoop.hdds.utils.db.BatchOperation;
+import org.apache.hadoop.hdds.utils.db.Table;
+
+import java.io.IOException;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_KEY;
+
+public class SCMDBTransactionBuffer {
+  private final SCMMetadataStore metadataStore;
+  private BatchOperation currentBatchOperation;
+  private SCMTransactionInfo latestTrxInfo;
+
+  public SCMDBTransactionBuffer(SCMMetadataStore store) {
+    this.metadataStore = store;
+
+    // initialize a batch operation during construction time
+    currentBatchOperation = this.metadataStore.getStore().initBatchOperation();
+    latestTrxInfo =
+        SCMTransactionInfo
+            .builder()
+            .setTransactionIndex(-1)
+            .setCurrentTerm(0)
+            .build();
+  }
+
+  public BatchOperation getCurrentBatchOperation() {

Review comment:
       Do you know is there a better mechanism to achieve the buffering? 
   
   I studied what OM does and that is different from SCM. OM just buffer entry and then apply entries in a batch, but SCM need to router such entry to handlers and then different handler will apply the entry.
   
   So if we change to the way that OM is doing, there will be a good amount of refactoring needs happen, which might be not appropriate in this PR. E.g. We will need to move handlers to the buffer class and insert entries into buffer. Buffer will be the place to trouter entries to right state managers.
   
   What do you think? I can create a PR to track such refactoring thus we won't need to expose  `currentBatchOperation`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r553765849



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMTransactionInfo.java
##########
@@ -0,0 +1,111 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership. The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p>
+ * <p>http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * <p>Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import com.google.common.base.Preconditions;
+import org.apache.hadoop.hdds.StringUtils;
+
+import java.util.Objects;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_SPLIT_KEY;
+
+final public class SCMTransactionInfo {

Review comment:
       I think we are better to reuse `OMTransactionInfo` after merge HDDS-2823 back to master.
   
   Not reusing `OMTransactionInfo` is to try to reduce the risk that have conflicts after syncing master int HDDS-2823. Also, to reuse `OMTransactionInfo`, I think we'd better to update the name to `TransactionInfo`, which will be a source of conflicts , unless we can change it in the master branch first. And I also add some more functions in `SCMTransactionInfo`, if we reuse `OMTransactionInfo`, those new functions will also become source of conflicts. 
   
   To conclude a bit, we can reuse it after merging SCM HA back to master, the goal is try to reduce potential conflicts if there could be any.
   
   SCM HA is in developing branch anyway and I am sure we will need to clean up lots of stuff eventually. As long as we have JIRAs to not lose track of such work. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bshashikant commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
bshashikant commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r559327463



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMDBTransactionBuffer.java
##########
@@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p/>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p/>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import org.apache.hadoop.hdds.scm.metadata.SCMMetadataStore;
+import org.apache.hadoop.hdds.utils.db.BatchOperation;
+import org.apache.hadoop.hdds.utils.db.Table;
+import org.apache.ratis.statemachine.SnapshotInfo;
+
+import java.io.IOException;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_KEY;
+
+/**
+ * This is a transaction buffer that buffers SCM DB operations for Pipeline and
+ * Container. When flush this buffer to DB, a transaction info will also be
+ * written into DB to indicate the term and transaction index for the latest
+ * operation in DB.
+ */
+public class SCMDBTransactionBuffer implements DBTransactionBuffer {
+  private final SCMMetadataStore metadataStore;
+  private BatchOperation currentBatchOperation;
+  private SCMTransactionInfo latestTrxInfo;
+  private SnapshotInfo latestSnapshot;
+
+  public SCMDBTransactionBuffer(SCMMetadataStore store) throws IOException {
+    this.metadataStore = store;
+
+    // initialize a batch operation during construction time
+    currentBatchOperation = this.metadataStore.getStore().initBatchOperation();
+    latestTrxInfo = store.getTransactionInfoTable().get(TRANSACTION_INFO_KEY);
+    if (latestTrxInfo == null) {
+      // transaction table is empty
+      latestTrxInfo =
+          SCMTransactionInfo
+              .builder()
+              .setTransactionIndex(-1)
+              .setCurrentTerm(0)
+              .build();
+    }
+    latestSnapshot = latestTrxInfo.toSnapshotInfo();
+  }
+
+  @Override
+  public BatchOperation getCurrentBatchOperation() {
+    return currentBatchOperation;
+  }
+
+  @Override
+  public void updateLatestTrxInfo(SCMTransactionInfo info) {
+    if (info.compareTo(this.latestTrxInfo) <= 0) {
+      throw new IllegalArgumentException(
+          "Updating DB buffer transaction info by an older transaction info, "
+          + "current: " + this.latestTrxInfo + ", updating to: " + info);
+    }
+    this.latestTrxInfo = info;
+  }
+
+  @Override
+  public SCMTransactionInfo getLatestTrxInfo() {
+    return this.latestTrxInfo;
+  }
+
+  @Override
+  public SnapshotInfo getLatestSnapshot() {
+    return latestSnapshot;
+  }
+
+  @Override
+  public void setLatestSnapshot(SnapshotInfo latestSnapshot) {
+    this.latestSnapshot = latestSnapshot;
+  }
+
+  @Override
+  public void flush() throws IOException {
+    // write latest trx info into trx table in the same batch
+    Table<String, SCMTransactionInfo> transactionInfoTable
+        = metadataStore.getTransactionInfoTable();
+    transactionInfoTable.putWithBatch(currentBatchOperation,
+        TRANSACTION_INFO_KEY, latestTrxInfo);
+
+    metadataStore.getStore().commitBatchOperation(currentBatchOperation);
+    currentBatchOperation.close();
+    this.latestSnapshot = latestTrxInfo.toSnapshotInfo();

Review comment:
       I am ok with addressing in a different PR.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r558609512



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMDBTransactionBuffer.java
##########
@@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p/>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p/>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import org.apache.hadoop.hdds.scm.metadata.SCMMetadataStore;
+import org.apache.hadoop.hdds.utils.db.BatchOperation;
+import org.apache.hadoop.hdds.utils.db.Table;
+import org.apache.ratis.statemachine.SnapshotInfo;
+
+import java.io.IOException;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_KEY;
+
+/**
+ * This is a transaction buffer that buffers SCM DB operations for Pipeline and
+ * Container. When flush this buffer to DB, a transaction info will also be
+ * written into DB to indicate the term and transaction index for the latest
+ * operation in DB.
+ */
+public class SCMDBTransactionBuffer implements DBTransactionBuffer {
+  private final SCMMetadataStore metadataStore;
+  private BatchOperation currentBatchOperation;
+  private SCMTransactionInfo latestTrxInfo;
+  private SnapshotInfo latestSnapshot;
+
+  public SCMDBTransactionBuffer(SCMMetadataStore store) throws IOException {
+    this.metadataStore = store;
+
+    // initialize a batch operation during construction time
+    currentBatchOperation = this.metadataStore.getStore().initBatchOperation();
+    latestTrxInfo = store.getTransactionInfoTable().get(TRANSACTION_INFO_KEY);
+    if (latestTrxInfo == null) {
+      // transaction table is empty
+      latestTrxInfo =
+          SCMTransactionInfo
+              .builder()
+              .setTransactionIndex(-1)
+              .setCurrentTerm(0)
+              .build();
+    }
+    latestSnapshot = latestTrxInfo.toSnapshotInfo();
+  }
+
+  @Override
+  public BatchOperation getCurrentBatchOperation() {
+    return currentBatchOperation;
+  }
+
+  @Override
+  public void updateLatestTrxInfo(SCMTransactionInfo info) {
+    if (info.compareTo(this.latestTrxInfo) <= 0) {
+      throw new IllegalArgumentException(
+          "Updating DB buffer transaction info by an older transaction info, "
+          + "current: " + this.latestTrxInfo + ", updating to: " + info);
+    }
+    this.latestTrxInfo = info;
+  }
+
+  @Override
+  public SCMTransactionInfo getLatestTrxInfo() {
+    return this.latestTrxInfo;
+  }
+
+  @Override
+  public SnapshotInfo getLatestSnapshot() {
+    return latestSnapshot;
+  }
+
+  @Override
+  public void setLatestSnapshot(SnapshotInfo latestSnapshot) {
+    this.latestSnapshot = latestSnapshot;
+  }
+
+  @Override
+  public void flush() throws IOException {
+    // write latest trx info into trx table in the same batch
+    Table<String, SCMTransactionInfo> transactionInfoTable
+        = metadataStore.getTransactionInfoTable();
+    transactionInfoTable.putWithBatch(currentBatchOperation,
+        TRANSACTION_INFO_KEY, latestTrxInfo);
+
+    metadataStore.getStore().commitBatchOperation(currentBatchOperation);
+    currentBatchOperation.close();
+    this.latestSnapshot = latestTrxInfo.toSnapshotInfo();

Review comment:
       I agree with it :-)
   
   In fact we have discussed not to expose batch operation above and I created  https://issues.apache.org/jira/browse/HDDS-4661.
   
   As this PR becomes larger and larger so I am planning to address this batch operation related comments in HDDS-4661, including when to init and close it.
   
   How do you feel? Do you agree with my idea?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bshashikant commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
bshashikant commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r558182242



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/pipeline/PipelineManagerV2Impl.java
##########
@@ -114,12 +114,14 @@ private PipelineManagerV2Impl(ConfigurationSource conf,
   public static PipelineManagerV2Impl newPipelineManager(
       ConfigurationSource conf, SCMHAManager scmhaManager,
       NodeManager nodeManager, Table<PipelineID, Pipeline> pipelineStore,
-      EventPublisher eventPublisher) throws IOException {
+      EventPublisher eventPublisher)

Review comment:
       unintended change??

##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMStateMachine.java
##########
@@ -89,4 +119,30 @@ private Message process(final SCMRatisRequest request)
     }
   }
 
+  @Override
+  public long takeSnapshot() throws IOException {
+    long startTime = Time.monotonicNow();
+    TermIndex lastTermIndex = getLastAppliedTermIndex();
+    long lastAppliedIndex = lastTermIndex.getIndex();
+    SCMTransactionInfo lastAppliedTrxInfo =
+        SCMTransactionInfo.fromTermIndex(lastTermIndex);
+    if (transactionBuffer.getLatestTrxInfo()
+        .compareTo(lastAppliedTrxInfo) < 0) {
+      transactionBuffer.updateLatestTrxInfo(
+          SCMTransactionInfo.builder()
+              .setCurrentTerm(lastTermIndex.getTerm())
+              .setTransactionIndex(lastTermIndex.getIndex())
+              .build());
+      transactionBuffer.setLatestSnapshot(
+          transactionBuffer.getLatestTrxInfo().toSnapshotInfo());
+    } else {
+      lastAppliedIndex =
+          transactionBuffer.getLatestTrxInfo().getTransactionIndex();
+    }
+
+    transactionBuffer.flush();
+    LOG.info("Current Snapshot Index {}, takeSnapshot took {} ms",
+        getLastAppliedTermIndex(), Time.monotonicNow() - startTime);
+    return lastAppliedIndex;

Review comment:
       why the lastApplied index returned here is the one derived from transaction buffer? Once the snapshot is taken, the last applied index should be given by getLastAppliedTermIndex() not transaction buffer which may be behind actual last applied index in ratis?

##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMDBTransactionBuffer.java
##########
@@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p/>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p/>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import org.apache.hadoop.hdds.scm.metadata.SCMMetadataStore;
+import org.apache.hadoop.hdds.utils.db.BatchOperation;
+import org.apache.hadoop.hdds.utils.db.Table;
+import org.apache.ratis.statemachine.SnapshotInfo;
+
+import java.io.IOException;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_KEY;
+
+/**
+ * This is a transaction buffer that buffers SCM DB operations for Pipeline and
+ * Container. When flush this buffer to DB, a transaction info will also be
+ * written into DB to indicate the term and transaction index for the latest
+ * operation in DB.
+ */
+public class SCMDBTransactionBuffer implements DBTransactionBuffer {
+  private final SCMMetadataStore metadataStore;
+  private BatchOperation currentBatchOperation;
+  private SCMTransactionInfo latestTrxInfo;
+  private SnapshotInfo latestSnapshot;
+
+  public SCMDBTransactionBuffer(SCMMetadataStore store) throws IOException {
+    this.metadataStore = store;
+
+    // initialize a batch operation during construction time
+    currentBatchOperation = this.metadataStore.getStore().initBatchOperation();
+    latestTrxInfo = store.getTransactionInfoTable().get(TRANSACTION_INFO_KEY);
+    if (latestTrxInfo == null) {
+      // transaction table is empty
+      latestTrxInfo =
+          SCMTransactionInfo
+              .builder()
+              .setTransactionIndex(-1)
+              .setCurrentTerm(0)
+              .build();
+    }
+    latestSnapshot = latestTrxInfo.toSnapshotInfo();
+  }
+
+  @Override
+  public BatchOperation getCurrentBatchOperation() {
+    return currentBatchOperation;
+  }
+
+  @Override
+  public void updateLatestTrxInfo(SCMTransactionInfo info) {
+    if (info.compareTo(this.latestTrxInfo) <= 0) {
+      throw new IllegalArgumentException(
+          "Updating DB buffer transaction info by an older transaction info, "
+          + "current: " + this.latestTrxInfo + ", updating to: " + info);
+    }
+    this.latestTrxInfo = info;
+  }
+
+  @Override
+  public SCMTransactionInfo getLatestTrxInfo() {
+    return this.latestTrxInfo;
+  }
+
+  @Override
+  public SnapshotInfo getLatestSnapshot() {
+    return latestSnapshot;
+  }
+
+  @Override
+  public void setLatestSnapshot(SnapshotInfo latestSnapshot) {
+    this.latestSnapshot = latestSnapshot;
+  }
+
+  @Override
+  public void flush() throws IOException {
+    // write latest trx info into trx table in the same batch
+    Table<String, SCMTransactionInfo> transactionInfoTable
+        = metadataStore.getTransactionInfoTable();
+    transactionInfoTable.putWithBatch(currentBatchOperation,
+        TRANSACTION_INFO_KEY, latestTrxInfo);
+
+    metadataStore.getStore().commitBatchOperation(currentBatchOperation);
+    currentBatchOperation.close();
+    this.latestSnapshot = latestTrxInfo.toSnapshotInfo();

Review comment:
       i think, reset of the rocks db batch operation should be made independent of flush. We may/may not require to reinitialise every time we call flush. For example, shutting down the raft server instance may initiate the last snapshot but will not require the batch reinitialisation.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r558613616



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMStateMachine.java
##########
@@ -89,4 +119,30 @@ private Message process(final SCMRatisRequest request)
     }
   }
 
+  @Override
+  public long takeSnapshot() throws IOException {
+    long startTime = Time.monotonicNow();
+    TermIndex lastTermIndex = getLastAppliedTermIndex();
+    long lastAppliedIndex = lastTermIndex.getIndex();
+    SCMTransactionInfo lastAppliedTrxInfo =
+        SCMTransactionInfo.fromTermIndex(lastTermIndex);
+    if (transactionBuffer.getLatestTrxInfo()
+        .compareTo(lastAppliedTrxInfo) < 0) {
+      transactionBuffer.updateLatestTrxInfo(
+          SCMTransactionInfo.builder()
+              .setCurrentTerm(lastTermIndex.getTerm())
+              .setTransactionIndex(lastTermIndex.getIndex())
+              .build());
+      transactionBuffer.setLatestSnapshot(
+          transactionBuffer.getLatestTrxInfo().toSnapshotInfo());
+    } else {
+      lastAppliedIndex =
+          transactionBuffer.getLatestTrxInfo().getTransactionIndex();
+    }
+
+    transactionBuffer.flush();
+    LOG.info("Current Snapshot Index {}, takeSnapshot took {} ms",
+        getLastAppliedTermIndex(), Time.monotonicNow() - startTime);
+    return lastAppliedIndex;

Review comment:
       O the code here is keeping the lastAppliedIndex in buffer to equal to the getLastAppliedTermIndex() actually. 
   
   ```
       LOG.info("Current Snapshot Index {}, takeSnapshot took {} ms",
           getLastAppliedTermIndex(), Time.monotonicNow() - startTime);
   ```
   is wrong.
   
   I should replace `getLastAppliedTermIndex()` with `lastAppliedIndex` to make it clear.
   
   
   We need the `lastAppliedIndex` in buffer keep updated so that will be written into DB.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] GlenGeng commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
GlenGeng commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r557260195



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMTransactionInfo.java
##########
@@ -0,0 +1,145 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership. The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p>
+ * <p>http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * <p>Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import com.google.common.base.Preconditions;
+import org.apache.hadoop.hdds.StringUtils;
+import org.apache.ratis.server.protocol.TermIndex;
+import org.apache.ratis.statemachine.SnapshotInfo;
+
+import java.util.Objects;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_SPLIT_KEY;
+
+/**
+ * SCMTransactionInfo saves two fields for a transaction:
+ *  1. term, of which the transaction belongs to
+ *  2. transactionIndex, which is a monotonic increasing index
+ *     (e.g. Raft Log index)
+ */
+final public class SCMTransactionInfo {
+  private long term;
+  private long transactionIndex;
+
+  private SCMTransactionInfo(String transactionInfo) {
+    String[] tInfo =
+        transactionInfo.split(TRANSACTION_INFO_SPLIT_KEY);
+    Preconditions.checkState(tInfo.length == 2,
+        "Incorrect TransactionInfo value");
+
+    term = Long.parseLong(tInfo[0]);
+    transactionIndex = Long.parseLong(tInfo[1]);
+  }
+
+  private SCMTransactionInfo(long currentTerm, long transactionIndex) {
+    this.term = currentTerm;
+    this.transactionIndex = transactionIndex;
+  }
+
+  public boolean shouldUpdate() {

Review comment:
       `shouldUpdate()` is not very intuitive in the context of `SCMTransactionInfo`, how about `isInitialized()` or something like this ?

##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/pipeline/PipelineManagerV2Impl.java
##########
@@ -111,15 +113,25 @@ private PipelineManagerV2Impl(ConfigurationSource conf,
     this.pipelineCreationAllowed = new AtomicBoolean(!this.isInSafeMode.get());
   }
 
-  public static PipelineManagerV2Impl newPipelineManager(
+  public static PipelineManagerV2Impl newPipelineManagerWithMockBuffer(

Review comment:
       `DBTransactionBuffer` can be fetched by calling `SCMHAManager#getDBTransactionBuffer`, so this function `newPipelineManagerWithMockBuffer` is not needed.
   ```
   // Create PipelineStateManager
   StateManager stateManager = PipelineStateManagerV2Impl
       .newBuilder().setPipelineStore(pipelineStore)
       .setRatisServer(scmhaManager.getRatisServer())
       .setNodeManager(nodeManager)
       .setSCMDBTransactionBuffer(scmhaManager.getDBTransactionBuffer())
       .build()
   ```
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] linyiqun commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
linyiqun commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r554511795



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMRatisSnapshotInfo.java
##########
@@ -0,0 +1,76 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership. The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p>
+ * <p>http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * <p>Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import org.apache.ratis.server.protocol.TermIndex;
+import org.apache.ratis.server.storage.FileInfo;
+import org.apache.ratis.statemachine.SnapshotInfo;
+
+import java.util.List;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_SPLIT_KEY;
+
+/**
+ * This class captures the snapshotIndex and term of the latest snapshot in
+ * the SCM
+ * Ratis server loads the snapshotInfo during startup and updates the
+ * lastApplied index to this snapshotIndex. SCM SnapshotInfo does not contain
+ * any files. It is used only to store/ update the last applied index and term.
+ */
+public class SCMRatisSnapshotInfo implements SnapshotInfo {
+  private volatile long term;
+  private volatile long snapshotIndex;
+
+  public SCMRatisSnapshotInfo(long term, long index) {
+    this.term = term;
+    this.snapshotIndex = index;
+  }
+
+  public void updateTermIndex(long newTerm, long newIndex) {
+    this.term = newTerm;
+    this.snapshotIndex = newIndex;
+  }

Review comment:
       I don't see this method is used in this PR. Will this be used in follow-up task? If not we could remove this and remove volatile keyword for term/snapshotIndex as well. Because I didn't find the concurrent update for this two variables.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] GlenGeng commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
GlenGeng commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r553759439



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMDBTransactionBuffer.java
##########
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p/>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p/>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import org.apache.hadoop.hdds.scm.metadata.SCMMetadataStore;
+import org.apache.hadoop.hdds.utils.db.BatchOperation;
+import org.apache.hadoop.hdds.utils.db.Table;
+
+import java.io.IOException;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_KEY;
+
+public class SCMDBTransactionBuffer {
+  private final SCMMetadataStore metadataStore;
+  private BatchOperation currentBatchOperation;
+  private SCMTransactionInfo latestTrxInfo;
+
+  public SCMDBTransactionBuffer(SCMMetadataStore store) {
+    this.metadataStore = store;
+
+    // initialize a batch operation during construction time
+    currentBatchOperation = this.metadataStore.getStore().initBatchOperation();
+    latestTrxInfo =
+        SCMTransactionInfo
+            .builder()
+            .setTransactionIndex(-1)
+            .setCurrentTerm(0)
+            .build();
+  }
+
+  public BatchOperation getCurrentBatchOperation() {

Review comment:
       How about return an anonymous subclass of `RDBBatchOperation` whose `commit()` will throw `RuntimeException`, which means you can only write to the batch, but there is no way to commit this batch ?
   
   `RDBBatchOperation#commit()` is called by `RDBStore#commitBatchOperation()`.
   
   Also add javadoc to this `getCurrentBatchOperation()`, notifying that this returned batch can not be commited.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ChenSammi commented on pull request #1725: HDDS-3208. SCM Snapshot Phase 1: Implement Ratis takeSnapshot on SCM

Posted by GitBox <gi...@apache.org>.
ChenSammi commented on pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#issuecomment-752826119


   @amaliujia, thanks for working on this,  I left a few comments. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r554171187



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMDBTransactionBuffer.java
##########
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p/>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p/>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import org.apache.hadoop.hdds.scm.metadata.SCMMetadataStore;
+import org.apache.hadoop.hdds.utils.db.BatchOperation;
+import org.apache.hadoop.hdds.utils.db.Table;
+
+import java.io.IOException;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_KEY;
+
+public class SCMDBTransactionBuffer {
+  private final SCMMetadataStore metadataStore;
+  private BatchOperation currentBatchOperation;
+  private SCMTransactionInfo latestTrxInfo;
+
+  public SCMDBTransactionBuffer(SCMMetadataStore store) {
+    this.metadataStore = store;
+
+    // initialize a batch operation during construction time
+    currentBatchOperation = this.metadataStore.getStore().initBatchOperation();
+    latestTrxInfo =
+        SCMTransactionInfo
+            .builder()
+            .setTransactionIndex(-1)
+            .setCurrentTerm(0)
+            .build();
+  }
+
+  public BatchOperation getCurrentBatchOperation() {

Review comment:
       Glen's idea might be much easier. That's a good option.
   
   I think I will separate it to:
   1. add javadoc to `getCurrentBatchOperation()` to remind that do not use this batch to commit.
   2. Apply a solution to fix this issue. Created https://issues.apache.org/jira/browse/HDDS-4661 to track it.
   
   Because `getCurrentBatchOperation()` is not a user-facing API, so only Ozone developer could make such mistakes, and in short term we have code review process to catch those. But I agree we should apply a fix for the longer term. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r554171187



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMDBTransactionBuffer.java
##########
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p/>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p/>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import org.apache.hadoop.hdds.scm.metadata.SCMMetadataStore;
+import org.apache.hadoop.hdds.utils.db.BatchOperation;
+import org.apache.hadoop.hdds.utils.db.Table;
+
+import java.io.IOException;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_KEY;
+
+public class SCMDBTransactionBuffer {
+  private final SCMMetadataStore metadataStore;
+  private BatchOperation currentBatchOperation;
+  private SCMTransactionInfo latestTrxInfo;
+
+  public SCMDBTransactionBuffer(SCMMetadataStore store) {
+    this.metadataStore = store;
+
+    // initialize a batch operation during construction time
+    currentBatchOperation = this.metadataStore.getStore().initBatchOperation();
+    latestTrxInfo =
+        SCMTransactionInfo
+            .builder()
+            .setTransactionIndex(-1)
+            .setCurrentTerm(0)
+            .build();
+  }
+
+  public BatchOperation getCurrentBatchOperation() {

Review comment:
       Glen's idea might be much easier. That's a good option.
   
   I think I will separate it to:
   1. add javadoc to `getCurrentBatchOperation()` to remind that do not use this batch to commit.
   2. Apply a solution to fix this issue. Created https://issues.apache.org/jira/browse/HDDS-4661 to track it.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] runzhiwang commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
runzhiwang commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r553683757



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMTransactionInfo.java
##########
@@ -0,0 +1,111 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership. The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p>
+ * <p>http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * <p>Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import com.google.common.base.Preconditions;
+import org.apache.hadoop.hdds.StringUtils;
+
+import java.util.Objects;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_SPLIT_KEY;
+
+final public class SCMTransactionInfo {

Review comment:
       Sorry, I'm not clear why we can not reuse OMTransactionInfo. They has the same fields.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#issuecomment-761265536


   Failed UT is the decommission one, which is known to be flaky.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] GlenGeng commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
GlenGeng commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r553759439



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMDBTransactionBuffer.java
##########
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p/>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p/>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import org.apache.hadoop.hdds.scm.metadata.SCMMetadataStore;
+import org.apache.hadoop.hdds.utils.db.BatchOperation;
+import org.apache.hadoop.hdds.utils.db.Table;
+
+import java.io.IOException;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_KEY;
+
+public class SCMDBTransactionBuffer {
+  private final SCMMetadataStore metadataStore;
+  private BatchOperation currentBatchOperation;
+  private SCMTransactionInfo latestTrxInfo;
+
+  public SCMDBTransactionBuffer(SCMMetadataStore store) {
+    this.metadataStore = store;
+
+    // initialize a batch operation during construction time
+    currentBatchOperation = this.metadataStore.getStore().initBatchOperation();
+    latestTrxInfo =
+        SCMTransactionInfo
+            .builder()
+            .setTransactionIndex(-1)
+            .setCurrentTerm(0)
+            .build();
+  }
+
+  public BatchOperation getCurrentBatchOperation() {

Review comment:
       How about return an anonymous subclass of `RDBBatchOperation` whose `commit()` will throw `RuntimeException`, which means you can only write to the batch, but there is no way to commit this batch ?
   
   `RDBBatchOperation#commit()` is called by `RDBStore#commitBatchOperation()`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ChenSammi commented on a change in pull request #1725: HDDS-3208. SCM Snapshot Phase 1: Implement Ratis takeSnapshot on SCM

Posted by GitBox <gi...@apache.org>.
ChenSammi commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r550165642



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/metadata/SCMDBDefinition.java
##########
@@ -80,6 +82,15 @@
           ContainerInfo.class,
           new ContainerInfoCodec());
 
+  public static final DBColumnFamilyDefinition<String, SCMTransactionInfo>
+      TRANSACTIONINFO =
+      new DBColumnFamilyDefinition<>(
+          "scmTransactionInfoTable",

Review comment:
       scmTransactionInfoTable -> scmTransactionInfos




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r558610128



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/pipeline/PipelineManagerV2Impl.java
##########
@@ -114,12 +114,14 @@ private PipelineManagerV2Impl(ConfigurationSource conf,
   public static PipelineManagerV2Impl newPipelineManager(
       ConfigurationSource conf, SCMHAManager scmhaManager,
       NodeManager nodeManager, Table<PipelineID, Pipeline> pipelineStore,
-      EventPublisher eventPublisher) throws IOException {
+      EventPublisher eventPublisher)

Review comment:
       Yes :) will undo it.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] runzhiwang commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
runzhiwang commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r553675463



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMDBTransactionBuffer.java
##########
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p/>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p/>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import org.apache.hadoop.hdds.scm.metadata.SCMMetadataStore;
+import org.apache.hadoop.hdds.utils.db.BatchOperation;
+import org.apache.hadoop.hdds.utils.db.Table;
+
+import java.io.IOException;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_KEY;
+
+public class SCMDBTransactionBuffer {
+  private final SCMMetadataStore metadataStore;
+  private BatchOperation currentBatchOperation;
+  private SCMTransactionInfo latestTrxInfo;
+
+  public SCMDBTransactionBuffer(SCMMetadataStore store) {
+    this.metadataStore = store;
+
+    // initialize a batch operation during construction time
+    currentBatchOperation = this.metadataStore.getStore().initBatchOperation();
+    latestTrxInfo =
+        SCMTransactionInfo
+            .builder()
+            .setTransactionIndex(-1)
+            .setCurrentTerm(0)
+            .build();
+  }
+
+  public BatchOperation getCurrentBatchOperation() {

Review comment:
       I'm afraid if someone do not know must call flush() to sync, but sync `currentBatchOperation `directly, then inconsistent will happen.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on a change in pull request #1725: HDDS-3208. SCM Snapshot Phase 1: Implement Ratis takeSnapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r550406185



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMDBTransactionBuffer.java
##########
@@ -0,0 +1,70 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p/>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p/>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import org.apache.hadoop.hdds.scm.metadata.SCMMetadataStore;
+import org.apache.hadoop.hdds.utils.db.BatchOperation;
+import org.apache.hadoop.hdds.utils.db.Table;
+
+import java.io.IOException;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_KEY;
+
+public class SCMDBTransactionBuffer {
+  private final SCMMetadataStore metadataStore;
+  private BatchOperation currentBatchOperation;
+  private SCMTransactionInfo latestTrxInfo;
+
+  public SCMDBTransactionBuffer(SCMMetadataStore store) {
+    this.metadataStore = store;
+
+    // initialize a batch operation during construction time
+    currentBatchOperation = this.metadataStore.getStore().initBatchOperation();
+    latestTrxInfo =
+        SCMTransactionInfo
+            .builder()
+            .setTransactionIndex(-1)
+            .setCurrentTerm(-1)
+            .build();
+  }
+
+  public BatchOperation getCurrentBatchOperation() {
+    return currentBatchOperation;
+  }
+
+  public void updateLatestTrxInfo(SCMTransactionInfo info) {
+    this.latestTrxInfo = info;

Review comment:
       Done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#issuecomment-759931810


   > Thanks for the job!
   > 
   > Better solve following task in this PR, it should be part of the job of load snapshot.
   > https://issues.apache.org/jira/browse/HDDS-4533
   
   @GlenGeng  I prefer to fix such issue separately so create a PR for it: https://github.com/apache/ozone/pull/1796


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on a change in pull request #1725: HDDS-3208. SCM Snapshot Phase 1: Implement Ratis takeSnapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r548129231



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMTransactionInfo.java
##########
@@ -0,0 +1,111 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership. The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p>
+ * <p>http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * <p>Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import com.google.common.base.Preconditions;
+import org.apache.hadoop.hdds.StringUtils;
+
+import java.util.Objects;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_SPLIT_KEY;
+
+final public class SCMTransactionInfo {

Review comment:
       To reuse `OMTransactionInfo`, my current answer is probably no.
   
   Though both `OMTransactionInfo` and `SCMTransactionInfo` seems are doing the same thing for now, decoupling is good when there is a need that either `OMTransactionInfo` or `SCMTransactionInfo` needs to add(drop) new(existing) fields while the other does not need. Especially at this moment that SCM snapshot is still under development. 
   
   I think we can consider this after SCM snapshot is stable. I can create a JIRA though to track the merge of two transaction info.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on a change in pull request #1725: HDDS-3208. Implement Ratis takeSnapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r553551267



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMDBTransactionBuffer.java
##########
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p/>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p/>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import org.apache.hadoop.hdds.scm.metadata.SCMMetadataStore;
+import org.apache.hadoop.hdds.utils.db.BatchOperation;
+import org.apache.hadoop.hdds.utils.db.Table;
+
+import java.io.IOException;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_KEY;
+
+public class SCMDBTransactionBuffer {
+  private final SCMMetadataStore metadataStore;
+  private BatchOperation currentBatchOperation;
+  private SCMTransactionInfo latestTrxInfo;
+
+  public SCMDBTransactionBuffer(SCMMetadataStore store) {
+    this.metadataStore = store;
+
+    // initialize a batch operation during construction time
+    currentBatchOperation = this.metadataStore.getStore().initBatchOperation();
+    latestTrxInfo =
+        SCMTransactionInfo
+            .builder()
+            .setTransactionIndex(-1)
+            .setCurrentTerm(0)
+            .build();
+  }
+
+  public BatchOperation getCurrentBatchOperation() {

Review comment:
       hmm I think the goal here is commiting batch operations won't need to flush DB. The flushing is controlled by takeSnapshot().
   
   Is there a case that committing will have a following flush?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on a change in pull request #1725: HDDS-3208. SCM Snapshot Phase 1: Implement Ratis takeSnapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r548131093



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerStateManagerImpl.java
##########
@@ -275,12 +278,24 @@ public void addContainer(final ContainerInfoProto containerInfo)
 
     if (!containers.contains(containerID)) {
       ExecutionUtil.create(() -> {
-        containerStore.put(containerID, container);
+        if (transactionBuffer != null) {
+          containerStore.putWithBatch(
+              transactionBuffer.getCurrentBatchOperation(),
+              containerID, container);
+        } else {
+          containerStore.put(containerID, container);

Review comment:
       This is leave for MockHAManager code path in which the buffer is NULL :) 
   
   If we choose to update MockHAManager to have a sort of MockBuffer implementation, then we can move this `else branch` because at that time I think all existing code will access this buffer.
   
   I can create a JIRA to track that add buffer to MockHAManager and remove this else :-)

##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerStateManagerImpl.java
##########
@@ -275,12 +278,24 @@ public void addContainer(final ContainerInfoProto containerInfo)
 
     if (!containers.contains(containerID)) {
       ExecutionUtil.create(() -> {
-        containerStore.put(containerID, container);
+        if (transactionBuffer != null) {
+          containerStore.putWithBatch(
+              transactionBuffer.getCurrentBatchOperation(),
+              containerID, container);
+        } else {
+          containerStore.put(containerID, container);

Review comment:
       This is left for MockHAManager code path in which the buffer is NULL :) 
   
   If we choose to update MockHAManager to have a sort of MockBuffer implementation, then we can move this `else branch` because at that time I think all existing code will access this buffer.
   
   I can create a JIRA to track that add buffer to MockHAManager and remove this else :-)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bshashikant commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
bshashikant commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r558145432



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMDBTransactionBuffer.java
##########
@@ -0,0 +1,70 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p/>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p/>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import org.apache.hadoop.hdds.scm.metadata.SCMMetadataStore;
+import org.apache.hadoop.hdds.utils.db.BatchOperation;
+import org.apache.hadoop.hdds.utils.db.Table;
+
+import java.io.IOException;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_KEY;
+
+public class SCMDBTransactionBuffer {
+  private final SCMMetadataStore metadataStore;
+  private BatchOperation currentBatchOperation;
+  private SCMTransactionInfo latestTrxInfo;
+
+  public SCMDBTransactionBuffer(SCMMetadataStore store) {
+    this.metadataStore = store;
+
+    // initialize a batch operation during construction time
+    currentBatchOperation = this.metadataStore.getStore().initBatchOperation();
+    latestTrxInfo =
+        SCMTransactionInfo
+            .builder()
+            .setTransactionIndex(-1)
+            .setCurrentTerm(-1)
+            .build();
+  }
+
+  public BatchOperation getCurrentBatchOperation() {
+    return currentBatchOperation;
+  }
+
+  public void updateLatestTrxInfo(SCMTransactionInfo info) {
+    this.latestTrxInfo = info;

Review comment:
       In my opinion, we should stop the ratis server here if such a case arise. It is fatal case where we cannot move ahead from here.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on pull request #1725: HDDS-3208. Implement Ratis Snapshots on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#issuecomment-749421866


   cc @nandakumar131 @ChenSammi @GlenGeng 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] runzhiwang commented on a change in pull request #1725: HDDS-3208. SCM Snapshot Phase 1: Implement Ratis takeSnapshot on SCM

Posted by GitBox <gi...@apache.org>.
runzhiwang commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r553318732



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMDBTransactionBuffer.java
##########
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p/>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p/>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import org.apache.hadoop.hdds.scm.metadata.SCMMetadataStore;
+import org.apache.hadoop.hdds.utils.db.BatchOperation;
+import org.apache.hadoop.hdds.utils.db.Table;
+
+import java.io.IOException;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_KEY;
+
+public class SCMDBTransactionBuffer {
+  private final SCMMetadataStore metadataStore;
+  private BatchOperation currentBatchOperation;
+  private SCMTransactionInfo latestTrxInfo;
+
+  public SCMDBTransactionBuffer(SCMMetadataStore store) {
+    this.metadataStore = store;
+
+    // initialize a batch operation during construction time
+    currentBatchOperation = this.metadataStore.getStore().initBatchOperation();
+    latestTrxInfo =
+        SCMTransactionInfo
+            .builder()
+            .setTransactionIndex(-1)
+            .setCurrentTerm(0)
+            .build();
+  }
+
+  public BatchOperation getCurrentBatchOperation() {

Review comment:
       Expose currentBatchOperation has a risk, if someone commit this batch operation not by calling flush(), then applyIndex did not wrote into RocksDB.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r559799875



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMRatisSnapshotInfo.java
##########
@@ -0,0 +1,76 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership. The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p>
+ * <p>http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * <p>Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import org.apache.ratis.server.protocol.TermIndex;
+import org.apache.ratis.server.storage.FileInfo;
+import org.apache.ratis.statemachine.SnapshotInfo;
+
+import java.util.List;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_SPLIT_KEY;
+
+/**
+ * This class captures the snapshotIndex and term of the latest snapshot in
+ * the SCM
+ * Ratis server loads the snapshotInfo during startup and updates the
+ * lastApplied index to this snapshotIndex. SCM SnapshotInfo does not contain
+ * any files. It is used only to store/ update the last applied index and term.
+ */
+public class SCMRatisSnapshotInfo implements SnapshotInfo {
+  private volatile long term;
+  private volatile long snapshotIndex;
+
+  public SCMRatisSnapshotInfo(long term, long index) {
+    this.term = term;
+    this.snapshotIndex = index;
+  }
+
+  public void updateTermIndex(long newTerm, long newIndex) {
+    this.term = newTerm;
+    this.snapshotIndex = newIndex;
+  }

Review comment:
       That makes sense. Done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on a change in pull request #1725: HDDS-3208. SCM Snapshot Phase 1: Implement Ratis takeSnapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r549917323



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMDBTransactionBuffer.java
##########
@@ -0,0 +1,70 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p/>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p/>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import org.apache.hadoop.hdds.scm.metadata.SCMMetadataStore;
+import org.apache.hadoop.hdds.utils.db.BatchOperation;
+import org.apache.hadoop.hdds.utils.db.Table;
+
+import java.io.IOException;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_KEY;
+
+public class SCMDBTransactionBuffer {
+  private final SCMMetadataStore metadataStore;
+  private BatchOperation currentBatchOperation;
+  private SCMTransactionInfo latestTrxInfo;
+
+  public SCMDBTransactionBuffer(SCMMetadataStore store) {
+    this.metadataStore = store;
+
+    // initialize a batch operation during construction time
+    currentBatchOperation = this.metadataStore.getStore().initBatchOperation();
+    latestTrxInfo =
+        SCMTransactionInfo
+            .builder()
+            .setTransactionIndex(-1)
+            .setCurrentTerm(-1)
+            .build();
+  }
+
+  public BatchOperation getCurrentBatchOperation() {
+    return currentBatchOperation;
+  }
+
+  public void updateLatestTrxInfo(SCMTransactionInfo info) {
+    this.latestTrxInfo = info;

Review comment:
       Good point. I will add a check or a LOG.error for that case.
   
   Theoretically, the Ratis log will be applied with index monotonically increasing thus the new info is supposed to greater than current  lastTrxInfo. But have a check or a LOG will better handle rare cases or bugs.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r554169986



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMTransactionInfo.java
##########
@@ -0,0 +1,111 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership. The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p>
+ * <p>http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * <p>Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import com.google.common.base.Preconditions;
+import org.apache.hadoop.hdds.StringUtils;
+
+import java.util.Objects;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_SPLIT_KEY;
+
+final public class SCMTransactionInfo {

Review comment:
       Sure we can do that. I will need add some changes in main branch, and then sync that to HDDS-2823 and then reuse in SCM.
   
   I think we should do that separately as that will take time. I have created https://issues.apache.org/jira/browse/HDDS-4660 to track this effort.

##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMDBTransactionBuffer.java
##########
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p/>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p/>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import org.apache.hadoop.hdds.scm.metadata.SCMMetadataStore;
+import org.apache.hadoop.hdds.utils.db.BatchOperation;
+import org.apache.hadoop.hdds.utils.db.Table;
+
+import java.io.IOException;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_KEY;
+
+public class SCMDBTransactionBuffer {
+  private final SCMMetadataStore metadataStore;
+  private BatchOperation currentBatchOperation;
+  private SCMTransactionInfo latestTrxInfo;
+
+  public SCMDBTransactionBuffer(SCMMetadataStore store) {
+    this.metadataStore = store;
+
+    // initialize a batch operation during construction time
+    currentBatchOperation = this.metadataStore.getStore().initBatchOperation();
+    latestTrxInfo =
+        SCMTransactionInfo
+            .builder()
+            .setTransactionIndex(-1)
+            .setCurrentTerm(0)
+            .build();
+  }
+
+  public BatchOperation getCurrentBatchOperation() {

Review comment:
       Glen's idea might be much easier. That's a good option.
   
   I think I will separate it to:
   1. add javadoc to `getCurrentBatchOperation()` to remind that do not use this batch to commit.
   2. Apply a solution to fix this issue. Created https://issues.apache.org/jira/browse/HDDS-4661 to track it.

##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMDBTransactionBuffer.java
##########
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p/>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p/>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import org.apache.hadoop.hdds.scm.metadata.SCMMetadataStore;
+import org.apache.hadoop.hdds.utils.db.BatchOperation;
+import org.apache.hadoop.hdds.utils.db.Table;
+
+import java.io.IOException;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_KEY;
+
+public class SCMDBTransactionBuffer {
+  private final SCMMetadataStore metadataStore;
+  private BatchOperation currentBatchOperation;
+  private SCMTransactionInfo latestTrxInfo;
+
+  public SCMDBTransactionBuffer(SCMMetadataStore store) {
+    this.metadataStore = store;
+
+    // initialize a batch operation during construction time
+    currentBatchOperation = this.metadataStore.getStore().initBatchOperation();
+    latestTrxInfo =
+        SCMTransactionInfo
+            .builder()
+            .setTransactionIndex(-1)
+            .setCurrentTerm(0)
+            .build();
+  }
+
+  public BatchOperation getCurrentBatchOperation() {

Review comment:
       Glen's idea might be much easier. That's a good option.
   
   I think I will separate it to:
   1. add javadoc to `getCurrentBatchOperation()` to remind that do not use this batch to commit.
   2. Apply a solution to fix this issue. Created https://issues.apache.org/jira/browse/HDDS-4661 to track it.
   
   Because `getCurrentBatchOperation()` is not a user-facing API, so only Ozone developer could make such mistakes, and in short term we have code review process to catch those. But I agree we should apply a fix for the longer term. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on pull request #1725: HDDS-3208. Implement Ratis takeSnapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#issuecomment-756437020


   Will fix failing tests.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on a change in pull request #1725: HDDS-3208. SCM Snapshot Phase 1: Implement Ratis takeSnapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r550404763



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerStateManagerImpl.java
##########
@@ -275,12 +278,24 @@ public void addContainer(final ContainerInfoProto containerInfo)
 
     if (!containers.contains(containerID)) {
       ExecutionUtil.create(() -> {
-        containerStore.put(containerID, container);
+        if (transactionBuffer != null) {
+          containerStore.putWithBatch(
+              transactionBuffer.getCurrentBatchOperation(),
+              containerID, container);
+        } else {
+          containerStore.put(containerID, container);

Review comment:
       https://issues.apache.org/jira/browse/HDDS-4634
   
   Create JIRA to add buffer in MockHAManager thus we can remove this `null` check.

##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerStateManagerImpl.java
##########
@@ -275,12 +278,24 @@ public void addContainer(final ContainerInfoProto containerInfo)
 
     if (!containers.contains(containerID)) {
       ExecutionUtil.create(() -> {
-        containerStore.put(containerID, container);
+        if (transactionBuffer != null) {
+          containerStore.putWithBatch(
+              transactionBuffer.getCurrentBatchOperation(),
+              containerID, container);
+        } else {
+          containerStore.put(containerID, container);

Review comment:
       https://issues.apache.org/jira/browse/HDDS-4634
   
   Create JIRA to add buffer in MockHAManager thus we can remove `else`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r557866528



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMStateMachine.java
##########
@@ -54,7 +54,7 @@ public SCMStateMachine(SCMDBTransactionBuffer buffer) throws SCMException {
     this.handlers = new EnumMap<>(RequestType.class);
     this.transactionBuffer = buffer;
     SCMTransactionInfo latestTrxInfo = buffer.getLatestTrxInfo();
-    if (latestTrxInfo.shouldUpdate()) {
+    if (!latestTrxInfo.isInitialized()) {

Review comment:
       Actually it is `!latestTrxInfo.isInitialized()`.
   
   `SCMTransactionInfo latestTrxInfo = buffer.getLatestTrxInfo();` comes from the buffer. The buffer will load the SCMTransactionInfo from the DB. If there was no snapshot taken before (e.g. brand new SCM), there isn't a SCMTransactionInfo from DB thus buffer will create a default one. 
   
   So we only need to updateLastAppliedTermIndex when latestTrxInfo is not initialized (thus it is loaded from DB and it's meaningful).
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r557759340



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/pipeline/PipelineManagerV2Impl.java
##########
@@ -111,15 +113,25 @@ private PipelineManagerV2Impl(ConfigurationSource conf,
     this.pipelineCreationAllowed = new AtomicBoolean(!this.isInSafeMode.get());
   }
 
-  public static PipelineManagerV2Impl newPipelineManager(
+  public static PipelineManagerV2Impl newPipelineManagerWithMockBuffer(

Review comment:
       This is a very good point.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r559799579



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerManagerImpl.java
##########
@@ -397,4 +396,8 @@ public void close() throws IOException {
   protected ContainerStateManagerV2 getContainerStateManager() {
     return containerStateManager;
   }
+
+  public SCMHAManager getSCMHAManager() {

Review comment:
       Done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] runzhiwang commented on a change in pull request #1725: HDDS-3208. SCM Snapshot Phase 1: Implement Ratis takeSnapshot on SCM

Posted by GitBox <gi...@apache.org>.
runzhiwang commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r553317733



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMStateMachine.java
##########
@@ -89,4 +99,15 @@ private Message process(final SCMRatisRequest request)
     }
   }
 
+  @Override
+  public long takeSnapshot() throws IOException {
+    LOG.info("Current Snapshot Index {}", getLastAppliedTermIndex());
+    long startTime = Time.monotonicNow();
+    TermIndex lastTermIndex = getLastAppliedTermIndex();
+    long lastAppliedIndex = lastTermIndex.getIndex();
+    transactionBuffer.flush();
+    LOG.debug("SCM takeSnapshot: {} took {} ms",
+        Time.monotonicNow() - startTime);
+    return lastAppliedIndex;

Review comment:
       Maybe flush should return SCMTransactionInfo, and this return SCMTransactionInfo#transactionIndex ?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] linyiqun commented on a change in pull request #1725: HDDS-3208. SCM Snapshot Phase 1: Implement Ratis takeSnapshot on SCM

Posted by GitBox <gi...@apache.org>.
linyiqun commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r550893634



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMTransactionInfo.java
##########
@@ -0,0 +1,119 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership. The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p>
+ * <p>http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * <p>Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import com.google.common.base.Preconditions;
+import org.apache.hadoop.hdds.StringUtils;
+
+import java.util.Objects;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_SPLIT_KEY;
+
+final public class SCMTransactionInfo {

Review comment:
       Also would be great to add javadoc for this class.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on a change in pull request #1725: HDDS-3208. SCM Snapshot Phase 1: Implement Ratis takeSnapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r550406429



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMStateMachine.java
##########
@@ -89,4 +98,12 @@ private Message process(final SCMRatisRequest request)
     }
   }
 
+  @Override
+  public long takeSnapshot() throws IOException {
+    LOG.info("Current Snapshot Index {}", getLastAppliedTermIndex());
+    TermIndex lastTermIndex = getLastAppliedTermIndex();

Review comment:
       > Since all rocksdb writes are batched to write to disk in takesnapshot, we need to change takesnapshot default frequency of ratis during scm ratis server creation.
   
   https://issues.apache.org/jira/browse/HDDS-4636 I will prefer to handle separately.  




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] amaliujia commented on a change in pull request #1725: HDDS-3208. Implement Ratis snapshot on SCM

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1725:
URL: https://github.com/apache/ozone/pull/1725#discussion_r553765849



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMTransactionInfo.java
##########
@@ -0,0 +1,111 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership. The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p>
+ * <p>http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * <p>Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.hdds.scm.ha;
+
+import com.google.common.base.Preconditions;
+import org.apache.hadoop.hdds.StringUtils;
+
+import java.util.Objects;
+
+import static org.apache.hadoop.ozone.OzoneConsts.TRANSACTION_INFO_SPLIT_KEY;
+
+final public class SCMTransactionInfo {

Review comment:
       I think we are better to reuse `OMTransactionInfo` after merge HDDS-2823 back to master.
   
   Not reusing `OMTransactionInfo` is to try to reduce the risk that have conflicts after syncing master int HDDS-2823. Also, to reuse `OMTransactionInfo`, I think we'd better to update the name to `TransactionInfo`, which will be a source of complicates, unless we can change it in the master branch first.
   
   To conclude a bit, we can reuse it after merging SCM HA back to master, the goal is try to reduce potential conflicts if there could be any.
   
   SCM HA is in developing branch anyway and I am sure we will need to clean up lots of stuff eventually. As long as we have JIRAs to not lose track of such work. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org