You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2020/06/07 17:01:45 UTC

[GitHub] [hadoop-ozone] leosunli opened a new pull request #1033: HDDS-3667. If we gracefully stop datanode it would be better to notify scm and r…

leosunli opened a new pull request #1033:
URL: https://github.com/apache/hadoop-ozone/pull/1033


   …econ to unregister
   
   Signed-off-by: sunlisheng <li...@gmail.com>
   
   ## What changes were proposed in this pull request?
   
   (Please fill in changes proposed in this fix)
   
   ## What is the link to the Apache JIRA
   
   (Please create an issue in ASF JIRA before opening a pull request,
   and you need to set the title of the pull request which starts with
   the corresponding JIRA issue number. (e.g. HDDS-XXXX. Fix a typo in YYY.)
   
   Please replace this section with the link to the Apache JIRA)
   
   ## How was this patch tested?
   
   (Please explain how this patch was tested. Ex: unit tests, manual tests)
   (If this patch involves UI changes, please attach a screen-shot; otherwise, remove this)
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] maobaolong commented on a change in pull request #1033: HDDS-3667. If we gracefully stop datanode it would be better to notify scm and r…

Posted by GitBox <gi...@apache.org>.
maobaolong commented on a change in pull request #1033:
URL: https://github.com/apache/hadoop-ozone/pull/1033#discussion_r458794752



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/HddsDatanodeService.java
##########
@@ -164,9 +171,29 @@ public Void call() throws Exception {
           HddsDatanodeService.class, args, LOG);
     }
     start(createOzoneConfiguration());
+    stopDataNodeShutdownHook = () ->
+            stopDataNode();
+    ShutdownHookManager.get().addShutdownHook(stopDataNodeShutdownHook,
+            SHUTDOWN_HOOK_PRIORITY);
     join();
     return null;
   }
+  
+  private void stopDataNode() {
+    Collection<EndpointStateMachine> endpointStateMachines =
+        datanodeStateMachine.getConnectionManager().getValues();
+    for (EndpointStateMachine endpointStateMachine : endpointStateMachines) {
+      try {
+        // Make sure the heartbeat is not sent

Review comment:
       is not sent -> won't send

##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/NodeStateManager.java
##########
@@ -708,6 +707,21 @@ private void updateNodeState(DatanodeInfo node, Predicate<Long> condition,
     }
   }
 
+  /**
+   *
+   * @param datanodeDetails
+   */
+  public void stopNode(DatanodeDetails datanodeDetails)
+      throws NodeNotFoundException {
+    UUID uuid = datanodeDetails.getUuid();
+    NodeState state = nodeStateMap.getNodeState(uuid);
+    DatanodeInfo node = nodeStateMap.getNodeInfo(uuid);
+    nodeStateMap.updateNodeState(datanodeDetails.getUuid(), state,
+        NodeState.DEAD);
+    node.updateLastHeartbeatTime(0L);

Review comment:
       Can you please explain why you set LastHeartbeatTime to 0?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] codecov-commenter commented on pull request #1033: HDDS-3667. If we gracefully stop datanode it would be better to notify scm and r…

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #1033:
URL: https://github.com/apache/hadoop-ozone/pull/1033#issuecomment-643618997


   # [Codecov](https://codecov.io/gh/apache/hadoop-ozone/pull/1033?src=pr&el=h1) Report
   > Merging [#1033](https://codecov.io/gh/apache/hadoop-ozone/pull/1033?src=pr&el=desc) into [master](https://codecov.io/gh/apache/hadoop-ozone/commit/96f07d5c27ba8f2ab9d77df25e10d4cb94489e5f&el=desc) will **decrease** coverage by `0.06%`.
   > The diff coverage is `28.33%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hadoop-ozone/pull/1033/graphs/tree.svg?width=650&height=150&src=pr&token=5YeeptJMby)](https://codecov.io/gh/apache/hadoop-ozone/pull/1033?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #1033      +/-   ##
   ============================================
   - Coverage     69.36%   69.29%   -0.07%     
   - Complexity     9103     9112       +9     
   ============================================
     Files           961      963       +2     
     Lines         48127    48244     +117     
     Branches       4676     4687      +11     
   ============================================
   + Hits          33381    33430      +49     
   - Misses        12530    12594      +64     
   - Partials       2216     2220       +4     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/hadoop-ozone/pull/1033?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...ne/container/common/statemachine/StateContext.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1033/diff?src=pr&el=tree#diff-aGFkb29wLWhkZHMvY29udGFpbmVyLXNlcnZpY2Uvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2hhZG9vcC9vem9uZS9jb250YWluZXIvY29tbW9uL3N0YXRlbWFjaGluZS9TdGF0ZUNvbnRleHQuamF2YQ==) | `85.02% <ø> (-0.09%)` | `52.00 <0.00> (-1.00)` | |
   | [...p/ozone/protocol/commands/UnRegisteredCommand.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1033/diff?src=pr&el=tree#diff-aGFkb29wLWhkZHMvY29udGFpbmVyLXNlcnZpY2Uvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2hhZG9vcC9vem9uZS9wcm90b2NvbC9jb21tYW5kcy9VblJlZ2lzdGVyZWRDb21tYW5kLmphdmE=) | `0.00% <0.00%> (ø)` | `0.00 <0.00> (?)` | |
   | [...ntainerDatanodeProtocolClientSideTranslatorPB.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1033/diff?src=pr&el=tree#diff-aGFkb29wLWhkZHMvY29udGFpbmVyLXNlcnZpY2Uvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2hhZG9vcC9vem9uZS9wcm90b2NvbFBCL1N0b3JhZ2VDb250YWluZXJEYXRhbm9kZVByb3RvY29sQ2xpZW50U2lkZVRyYW5zbGF0b3JQQi5qYXZh) | `85.71% <0.00%> (-11.06%)` | `10.00 <0.00> (ø)` | |
   | [...rg/apache/hadoop/hdds/scm/node/SCMNodeManager.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1033/diff?src=pr&el=tree#diff-aGFkb29wLWhkZHMvc2VydmVyLXNjbS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaGFkb29wL2hkZHMvc2NtL25vZGUvU0NNTm9kZU1hbmFnZXIuamF2YQ==) | `84.29% <0.00%> (-2.89%)` | `55.00 <0.00> (ø)` | |
   | [...oop/hdds/scm/server/SCMDatanodeProtocolServer.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1033/diff?src=pr&el=tree#diff-aGFkb29wLWhkZHMvc2VydmVyLXNjbS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaGFkb29wL2hkZHMvc2NtL3NlcnZlci9TQ01EYXRhbm9kZVByb3RvY29sU2VydmVyLmphdmE=) | `67.97% <0.00%> (-7.18%)` | `24.00 <0.00> (ø)` | |
   | [.../apache/hadoop/hdds/scm/node/NodeStateManager.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1033/diff?src=pr&el=tree#diff-aGFkb29wLWhkZHMvc2VydmVyLXNjbS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaGFkb29wL2hkZHMvc2NtL25vZGUvTm9kZVN0YXRlTWFuYWdlci5qYXZh) | `73.37% <33.33%> (-1.63%)` | `42.00 <0.00> (ø)` | |
   | [...common/states/endpoint/UnRegisterEndpointTask.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1033/diff?src=pr&el=tree#diff-aGFkb29wLWhkZHMvY29udGFpbmVyLXNlcnZpY2Uvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2hhZG9vcC9vem9uZS9jb250YWluZXIvY29tbW9uL3N0YXRlcy9lbmRwb2ludC9VblJlZ2lzdGVyRW5kcG9pbnRUYXNrLmphdmE=) | `44.23% <44.23%> (ø)` | `4.00 <4.00> (?)` | |
   | [.../java/org/apache/hadoop/ozone/audit/SCMAction.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1033/diff?src=pr&el=tree#diff-aGFkb29wLWhkZHMvY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9oYWRvb3Avb3pvbmUvYXVkaXQvU0NNQWN0aW9uLmphdmE=) | `100.00% <100.00%> (ø)` | `2.00 <0.00> (ø)` | |
   | [...r/common/states/datanode/RunningDatanodeState.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1033/diff?src=pr&el=tree#diff-aGFkb29wLWhkZHMvY29udGFpbmVyLXNlcnZpY2Uvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2hhZG9vcC9vem9uZS9jb250YWluZXIvY29tbW9uL3N0YXRlcy9kYXRhbm9kZS9SdW5uaW5nRGF0YW5vZGVTdGF0ZS5qYXZh) | `78.57% <100.00%> (-3.25%)` | `17.00 <0.00> (-2.00)` | |
   | [...otocol/commands/RetriableDatanodeEventWatcher.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1033/diff?src=pr&el=tree#diff-aGFkb29wLWhkZHMvc2VydmVyLXNjbS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaGFkb29wL296b25lL3Byb3RvY29sL2NvbW1hbmRzL1JldHJpYWJsZURhdGFub2RlRXZlbnRXYXRjaGVyLmphdmE=) | `55.55% <0.00%> (-44.45%)` | `3.00% <0.00%> (-1.00%)` | |
   | ... and [17 more](https://codecov.io/gh/apache/hadoop-ozone/pull/1033/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hadoop-ozone/pull/1033?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hadoop-ozone/pull/1033?src=pr&el=footer). Last update [96f07d5...8b89fad](https://codecov.io/gh/apache/hadoop-ozone/pull/1033?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] adoroszlai commented on pull request #1033: HDDS-3667. If we gracefully stop datanode it would be better to notify scm and r…

Posted by GitBox <gi...@apache.org>.
adoroszlai commented on pull request #1033:
URL: https://github.com/apache/hadoop-ozone/pull/1033#issuecomment-689204311


   /pending @nandakumar131: "State change is not properly handled"


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] maobaolong commented on pull request #1033: HDDS-3667. If we gracefully stop datanode it would be better to notify scm and r…

Posted by GitBox <gi...@apache.org>.
maobaolong commented on pull request #1033:
URL: https://github.com/apache/hadoop-ozone/pull/1033#issuecomment-669961893


   @elek Would you like to take a look at this PR, that's a interesting feature, and it would be a useful feature.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] leosunli commented on pull request #1033: HDDS-3667. If we gracefully stop datanode it would be better to notify scm and r…

Posted by GitBox <gi...@apache.org>.
leosunli commented on pull request #1033:
URL: https://github.com/apache/hadoop-ozone/pull/1033#issuecomment-643611712


   the failed pre-check is unrelated to this patch.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] maobaolong commented on pull request #1033: HDDS-3667. If we gracefully stop datanode it would be better to notify scm and r…

Posted by GitBox <gi...@apache.org>.
maobaolong commented on pull request #1033:
URL: https://github.com/apache/hadoop-ozone/pull/1033#issuecomment-659808668


   @leosunli After you are ready for review, please use `/ready` to restore this PR.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] leosunli commented on a change in pull request #1033: HDDS-3667. If we gracefully stop datanode it would be better to notify scm and r…

Posted by GitBox <gi...@apache.org>.
leosunli commented on a change in pull request #1033:
URL: https://github.com/apache/hadoop-ozone/pull/1033#discussion_r438540923



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/states/endpoint/UnRegisterEndpointTask.java
##########
@@ -0,0 +1,262 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.ozone.container.common.states.endpoint;
+
+import java.io.IOException;
+import java.util.UUID;
+import java.util.concurrent.Callable;
+import java.util.concurrent.Future;
+
+import org.apache.commons.lang3.StringUtils;
+import org.apache.hadoop.hdds.conf.ConfigurationSource;
+import org.apache.hadoop.hdds.protocol.DatanodeDetails;
+import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.ContainerReportsProto;
+import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.NodeReportProto;
+import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.PipelineReportsProto;
+import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.SCMUNRegisteredResponseProto;
+import org.apache.hadoop.ozone.container.common.statemachine.EndpointStateMachine;
+import org.apache.hadoop.ozone.container.common.statemachine.EndpointStateMachine.EndPointStates;
+import org.apache.hadoop.ozone.container.common.statemachine.StateContext;
+import org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.base.Preconditions;
+
+/**
+ * UnRegister a datanode with SCM.
+ */
+public final class UnRegisterEndpointTask implements
+    Callable<EndpointStateMachine.EndPointStates> {
+  static final Logger LOG =
+          LoggerFactory.getLogger(UnRegisterEndpointTask.class);
+
+  private final EndpointStateMachine rpcEndPoint;
+  private final ConfigurationSource conf;
+  private Future<EndpointStateMachine.EndPointStates> result;
+  private DatanodeDetails datanodeDetails;
+  private final OzoneContainer datanodeContainerManager;
+  private StateContext stateContext;
+
+  /**
+   * Creates a register endpoint task.
+   *
+   * @param rpcEndPoint - endpoint
+   * @param conf - conf
+   * @param ozoneContainer - container
+   */
+  @VisibleForTesting
+  public UnRegisterEndpointTask(EndpointStateMachine rpcEndPoint,
+      ConfigurationSource conf, OzoneContainer ozoneContainer,
+      StateContext context) {
+    this.rpcEndPoint = rpcEndPoint;
+    this.conf = conf;
+    this.datanodeContainerManager = ozoneContainer;
+    this.stateContext = context;
+
+  }
+
+  /**
+   * Get the DatanodeDetails.
+   *
+   * @return DatanodeDetailsProto
+   */
+  public DatanodeDetails getDatanodeDetails() {
+    return datanodeDetails;
+  }
+
+  /**
+   * Set the contiainerNodeID Proto.
+   *
+   * @param datanodeDetails - Container Node ID.
+   */
+  public void setDatanodeDetails(
+      DatanodeDetails datanodeDetails) {
+    this.datanodeDetails = datanodeDetails;
+  }
+
+  /**
+   * Computes a result, or throws an exception if unable to do so.
+   *
+   * @return computed result
+   * @throws Exception if unable to compute a result
+   */
+  @Override
+  public EndpointStateMachine.EndPointStates call() throws Exception {
+
+    if (getDatanodeDetails() == null) {
+      LOG.error("DatanodeDetails cannot be null in RegisterEndpoint task, "
+          + "shutting down the endpoint.");
+      return rpcEndPoint.setState(EndpointStateMachine.EndPointStates.SHUTDOWN);
+    }
+
+    rpcEndPoint.lock();
+    try {
+
+      if (rpcEndPoint.getState()
+          .equals(EndPointStates.SHUTDOWN)) {
+        ContainerReportsProto containerReport =
+            datanodeContainerManager.getController().getContainerReport();
+        NodeReportProto nodeReport = datanodeContainerManager.getNodeReport();
+        PipelineReportsProto pipelineReportsProto =
+            datanodeContainerManager.getPipelineReport();
+        // TODO : Add responses to the command Queue.
+        SCMUNRegisteredResponseProto response = rpcEndPoint.getEndPoint()
+            .unregister(datanodeDetails.getProtoBufMessage(), nodeReport,
+                containerReport, pipelineReportsProto);
+        Preconditions.checkState(UUID.fromString(response.getDatanodeUUID())
+                .equals(datanodeDetails.getUuid()),
+            "Unexpected datanode ID in the response.");
+        Preconditions.checkState(!StringUtils.isBlank(response.getClusterID()),
+            "Invalid cluster ID in the response.");
+        if (response.hasHostname() && response.hasIpAddress()) {

Review comment:
       done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] codecov-commenter edited a comment on pull request #1033: HDDS-3667. If we gracefully stop datanode it would be better to notify scm and r…

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #1033:
URL: https://github.com/apache/hadoop-ozone/pull/1033#issuecomment-643618997


   # [Codecov](https://codecov.io/gh/apache/hadoop-ozone/pull/1033?src=pr&el=h1) Report
   > Merging [#1033](https://codecov.io/gh/apache/hadoop-ozone/pull/1033?src=pr&el=desc) into [master](https://codecov.io/gh/apache/hadoop-ozone/commit/715aed2d158d2c6708af8b6a9b8270766103ee52&el=desc) will **decrease** coverage by `0.06%`.
   > The diff coverage is `68.49%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hadoop-ozone/pull/1033/graphs/tree.svg?width=650&height=150&src=pr&token=5YeeptJMby)](https://codecov.io/gh/apache/hadoop-ozone/pull/1033?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #1033      +/-   ##
   ============================================
   - Coverage     73.54%   73.47%   -0.07%     
   + Complexity    10043    10041       -2     
   ============================================
     Files           978      979       +1     
     Lines         49894    49966      +72     
     Branches       4845     4847       +2     
   ============================================
   + Hits          36696    36714      +18     
   - Misses        10887    10930      +43     
   - Partials       2311     2322      +11     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/hadoop-ozone/pull/1033?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...ntainerDatanodeProtocolClientSideTranslatorPB.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1033/diff?src=pr&el=tree#diff-aGFkb29wLWhkZHMvY29udGFpbmVyLXNlcnZpY2Uvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2hhZG9vcC9vem9uZS9wcm90b2NvbFBCL1N0b3JhZ2VDb250YWluZXJEYXRhbm9kZVByb3RvY29sQ2xpZW50U2lkZVRyYW5zbGF0b3JQQi5qYXZh) | `83.33% <0.00%> (-13.45%)` | `10.00 <0.00> (ø)` | |
   | [...a/org/apache/hadoop/ozone/HddsDatanodeService.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1033/diff?src=pr&el=tree#diff-aGFkb29wLWhkZHMvY29udGFpbmVyLXNlcnZpY2Uvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2hhZG9vcC9vem9uZS9IZGRzRGF0YW5vZGVTZXJ2aWNlLmphdmE=) | `67.75% <42.85%> (-1.51%)` | `39.00 <1.00> (+1.00)` | :arrow_down: |
   | [...rg/apache/hadoop/hdds/scm/node/SCMNodeManager.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1033/diff?src=pr&el=tree#diff-aGFkb29wLWhkZHMvc2VydmVyLXNjbS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaGFkb29wL2hkZHMvc2NtL25vZGUvU0NNTm9kZU1hbmFnZXIuamF2YQ==) | `84.73% <57.14%> (-0.80%)` | `57.00 <1.00> (+1.00)` | :arrow_down: |
   | [...oop/hdds/scm/server/SCMDatanodeProtocolServer.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1033/diff?src=pr&el=tree#diff-aGFkb29wLWhkZHMvc2VydmVyLXNjbS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaGFkb29wL2hkZHMvc2NtL3NlcnZlci9TQ01EYXRhbm9kZVByb3RvY29sU2VydmVyLmphdmE=) | `75.84% <64.70%> (-1.18%)` | `28.00 <2.00> (+2.00)` | :arrow_down: |
   | [...p/ozone/protocol/commands/StopDataNodeCommand.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1033/diff?src=pr&el=tree#diff-aGFkb29wLWhkZHMvY29udGFpbmVyLXNlcnZpY2Uvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2hhZG9vcC9vem9uZS9wcm90b2NvbC9jb21tYW5kcy9TdG9wRGF0YU5vZGVDb21tYW5kLmphdmE=) | `90.90% <90.90%> (ø)` | `3.00 <3.00> (?)` | |
   | [.../java/org/apache/hadoop/ozone/audit/SCMAction.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1033/diff?src=pr&el=tree#diff-aGFkb29wLWhkZHMvY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9oYWRvb3Avb3pvbmUvYXVkaXQvU0NNQWN0aW9uLmphdmE=) | `100.00% <100.00%> (ø)` | `2.00 <0.00> (ø)` | |
   | [...ntainerDatanodeProtocolServerSideTranslatorPB.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1033/diff?src=pr&el=tree#diff-aGFkb29wLWhkZHMvY29udGFpbmVyLXNlcnZpY2Uvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2hhZG9vcC9vem9uZS9wcm90b2NvbFBCL1N0b3JhZ2VDb250YWluZXJEYXRhbm9kZVByb3RvY29sU2VydmVyU2lkZVRyYW5zbGF0b3JQQi5qYXZh) | `90.24% <100.00%> (+1.67%)` | `8.00 <0.00> (+1.00)` | |
   | [.../org/apache/hadoop/hdds/scm/node/DatanodeInfo.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1033/diff?src=pr&el=tree#diff-aGFkb29wLWhkZHMvc2VydmVyLXNjbS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaGFkb29wL2hkZHMvc2NtL25vZGUvRGF0YW5vZGVJbmZvLmphdmE=) | `96.29% <100.00%> (+0.64%)` | `8.00 <1.00> (+1.00)` | |
   | [.../apache/hadoop/hdds/scm/node/NodeStateManager.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1033/diff?src=pr&el=tree#diff-aGFkb29wLWhkZHMvc2VydmVyLXNjbS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaGFkb29wL2hkZHMvc2NtL25vZGUvTm9kZVN0YXRlTWFuYWdlci5qYXZh) | `76.02% <100.00%> (+1.02%)` | `43.00 <1.00> (+1.00)` | |
   | [...hdds/scm/container/common/helpers/ExcludeList.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1033/diff?src=pr&el=tree#diff-aGFkb29wLWhkZHMvY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9oYWRvb3AvaGRkcy9zY20vY29udGFpbmVyL2NvbW1vbi9oZWxwZXJzL0V4Y2x1ZGVMaXN0LmphdmE=) | `78.26% <0.00%> (-21.74%)` | `17.00% <0.00%> (-5.00%)` | |
   | ... and [23 more](https://codecov.io/gh/apache/hadoop-ozone/pull/1033/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hadoop-ozone/pull/1033?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hadoop-ozone/pull/1033?src=pr&el=footer). Last update [715aed2...faa9d59](https://codecov.io/gh/apache/hadoop-ozone/pull/1033?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] leosunli closed pull request #1033: HDDS-3667. If we gracefully stop datanode it would be better to notify scm and r…

Posted by GitBox <gi...@apache.org>.
leosunli closed pull request #1033:
URL: https://github.com/apache/hadoop-ozone/pull/1033


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] leosunli commented on a change in pull request #1033: HDDS-3667. If we gracefully stop datanode it would be better to notify scm and r…

Posted by GitBox <gi...@apache.org>.
leosunli commented on a change in pull request #1033:
URL: https://github.com/apache/hadoop-ozone/pull/1033#discussion_r459203399



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/NodeStateManager.java
##########
@@ -708,6 +707,21 @@ private void updateNodeState(DatanodeInfo node, Predicate<Long> condition,
     }
   }
 
+  /**
+   *
+   * @param datanodeDetails
+   */
+  public void stopNode(DatanodeDetails datanodeDetails)
+      throws NodeNotFoundException {
+    UUID uuid = datanodeDetails.getUuid();
+    NodeState state = nodeStateMap.getNodeState(uuid);
+    DatanodeInfo node = nodeStateMap.getNodeInfo(uuid);
+    nodeStateMap.updateNodeState(datanodeDetails.getUuid(), state,
+        NodeState.DEAD);
+    node.updateLastHeartbeatTime(0L);

Review comment:
       According to LastHeartbeatTime, NodeStateManager#checkNodesHealth detemine the status of node periodically.
   If LastHeartbeatTime is not set to 0, NodeStateManager#checkNodesHealth could update the healty of node after we set the status of the node to dead.
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] maobaolong commented on a change in pull request #1033: HDDS-3667. If we gracefully stop datanode it would be better to notify scm and r…

Posted by GitBox <gi...@apache.org>.
maobaolong commented on a change in pull request #1033:
URL: https://github.com/apache/hadoop-ozone/pull/1033#discussion_r463452504



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/NodeStateManager.java
##########
@@ -708,6 +707,21 @@ private void updateNodeState(DatanodeInfo node, Predicate<Long> condition,
     }
   }
 
+  /**
+   *
+   * @param datanodeDetails
+   */
+  public void stopNode(DatanodeDetails datanodeDetails)
+      throws NodeNotFoundException {
+    UUID uuid = datanodeDetails.getUuid();
+    NodeState state = nodeStateMap.getNodeState(uuid);
+    DatanodeInfo node = nodeStateMap.getNodeInfo(uuid);
+    nodeStateMap.updateNodeState(datanodeDetails.getUuid(), state,
+        NodeState.DEAD);
+    node.updateLastHeartbeatTime(0L);

Review comment:
       Ok, so would you like to create a ticket an paste it as a TODO comment here.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] maobaolong commented on a change in pull request #1033: HDDS-3667. If we gracefully stop datanode it would be better to notify scm and r…

Posted by GitBox <gi...@apache.org>.
maobaolong commented on a change in pull request #1033:
URL: https://github.com/apache/hadoop-ozone/pull/1033#discussion_r437106724



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/statemachine/StateContext.java
##########
@@ -384,10 +384,9 @@ public void addPipelineActionIfAbsent(PipelineAction pipelineAction) {
       return new InitDatanodeState(this.conf, parent.getConnectionManager(),
           this);
     case RUNNING:
-      return new RunningDatanodeState(this.conf, parent.getConnectionManager(),
-          this);
     case SHUTDOWN:
-      return null;
+      return new RunningDatanodeState(this.conf, parent.getConnectionManager(),
+              this);

Review comment:
       this line is not consistent with other lines in this switch block, although checkstyle pass, we should still keep code style consistent to the exist style.

##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/states/endpoint/UnRegisterEndpointTask.java
##########
@@ -0,0 +1,262 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.ozone.container.common.states.endpoint;
+
+import java.io.IOException;
+import java.util.UUID;
+import java.util.concurrent.Callable;
+import java.util.concurrent.Future;
+
+import org.apache.commons.lang3.StringUtils;
+import org.apache.hadoop.hdds.conf.ConfigurationSource;
+import org.apache.hadoop.hdds.protocol.DatanodeDetails;
+import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.ContainerReportsProto;
+import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.NodeReportProto;
+import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.PipelineReportsProto;
+import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.SCMUNRegisteredResponseProto;
+import org.apache.hadoop.ozone.container.common.statemachine.EndpointStateMachine;
+import org.apache.hadoop.ozone.container.common.statemachine.EndpointStateMachine.EndPointStates;
+import org.apache.hadoop.ozone.container.common.statemachine.StateContext;
+import org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.base.Preconditions;
+
+/**
+ * UnRegister a datanode with SCM.
+ */
+public final class UnRegisterEndpointTask implements
+    Callable<EndpointStateMachine.EndPointStates> {
+  static final Logger LOG =
+          LoggerFactory.getLogger(UnRegisterEndpointTask.class);
+
+  private final EndpointStateMachine rpcEndPoint;
+  private final ConfigurationSource conf;
+  private Future<EndpointStateMachine.EndPointStates> result;
+  private DatanodeDetails datanodeDetails;
+  private final OzoneContainer datanodeContainerManager;
+  private StateContext stateContext;
+
+  /**
+   * Creates a register endpoint task.
+   *
+   * @param rpcEndPoint - endpoint
+   * @param conf - conf
+   * @param ozoneContainer - container
+   */
+  @VisibleForTesting
+  public UnRegisterEndpointTask(EndpointStateMachine rpcEndPoint,
+      ConfigurationSource conf, OzoneContainer ozoneContainer,
+      StateContext context) {
+    this.rpcEndPoint = rpcEndPoint;
+    this.conf = conf;
+    this.datanodeContainerManager = ozoneContainer;
+    this.stateContext = context;
+
+  }
+
+  /**
+   * Get the DatanodeDetails.
+   *
+   * @return DatanodeDetailsProto
+   */
+  public DatanodeDetails getDatanodeDetails() {
+    return datanodeDetails;
+  }
+
+  /**
+   * Set the contiainerNodeID Proto.
+   *
+   * @param datanodeDetails - Container Node ID.
+   */
+  public void setDatanodeDetails(
+      DatanodeDetails datanodeDetails) {
+    this.datanodeDetails = datanodeDetails;
+  }
+
+  /**
+   * Computes a result, or throws an exception if unable to do so.
+   *
+   * @return computed result
+   * @throws Exception if unable to compute a result
+   */
+  @Override
+  public EndpointStateMachine.EndPointStates call() throws Exception {
+
+    if (getDatanodeDetails() == null) {
+      LOG.error("DatanodeDetails cannot be null in RegisterEndpoint task, "
+          + "shutting down the endpoint.");
+      return rpcEndPoint.setState(EndpointStateMachine.EndPointStates.SHUTDOWN);
+    }
+
+    rpcEndPoint.lock();
+    try {
+
+      if (rpcEndPoint.getState()
+          .equals(EndPointStates.SHUTDOWN)) {
+        ContainerReportsProto containerReport =
+            datanodeContainerManager.getController().getContainerReport();
+        NodeReportProto nodeReport = datanodeContainerManager.getNodeReport();
+        PipelineReportsProto pipelineReportsProto =
+            datanodeContainerManager.getPipelineReport();
+        // TODO : Add responses to the command Queue.
+        SCMUNRegisteredResponseProto response = rpcEndPoint.getEndPoint()
+            .unregister(datanodeDetails.getProtoBufMessage(), nodeReport,
+                containerReport, pipelineReportsProto);
+        Preconditions.checkState(UUID.fromString(response.getDatanodeUUID())
+                .equals(datanodeDetails.getUuid()),
+            "Unexpected datanode ID in the response.");
+        Preconditions.checkState(!StringUtils.isBlank(response.getClusterID()),
+            "Invalid cluster ID in the response.");
+        if (response.hasHostname() && response.hasIpAddress()) {

Review comment:
       Not sure if the datanodeDetails updating is necessary for this command is a unregister command which is a last connection to SCM.

##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/states/endpoint/UnRegisterEndpointTask.java
##########
@@ -0,0 +1,262 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.ozone.container.common.states.endpoint;
+
+import java.io.IOException;
+import java.util.UUID;
+import java.util.concurrent.Callable;
+import java.util.concurrent.Future;
+
+import org.apache.commons.lang3.StringUtils;
+import org.apache.hadoop.hdds.conf.ConfigurationSource;
+import org.apache.hadoop.hdds.protocol.DatanodeDetails;
+import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.ContainerReportsProto;
+import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.NodeReportProto;
+import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.PipelineReportsProto;
+import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.SCMUNRegisteredResponseProto;
+import org.apache.hadoop.ozone.container.common.statemachine.EndpointStateMachine;
+import org.apache.hadoop.ozone.container.common.statemachine.EndpointStateMachine.EndPointStates;
+import org.apache.hadoop.ozone.container.common.statemachine.StateContext;
+import org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.base.Preconditions;
+
+/**
+ * UnRegister a datanode with SCM.
+ */
+public final class UnRegisterEndpointTask implements
+    Callable<EndpointStateMachine.EndPointStates> {
+  static final Logger LOG =
+          LoggerFactory.getLogger(UnRegisterEndpointTask.class);
+
+  private final EndpointStateMachine rpcEndPoint;
+  private final ConfigurationSource conf;
+  private Future<EndpointStateMachine.EndPointStates> result;
+  private DatanodeDetails datanodeDetails;
+  private final OzoneContainer datanodeContainerManager;
+  private StateContext stateContext;
+
+  /**
+   * Creates a register endpoint task.

Review comment:
       `register ` -> `unregister`

##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/states/endpoint/UnRegisterEndpointTask.java
##########
@@ -0,0 +1,262 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.ozone.container.common.states.endpoint;
+
+import java.io.IOException;
+import java.util.UUID;
+import java.util.concurrent.Callable;
+import java.util.concurrent.Future;
+
+import org.apache.commons.lang3.StringUtils;
+import org.apache.hadoop.hdds.conf.ConfigurationSource;
+import org.apache.hadoop.hdds.protocol.DatanodeDetails;
+import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.ContainerReportsProto;
+import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.NodeReportProto;
+import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.PipelineReportsProto;
+import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.SCMUNRegisteredResponseProto;
+import org.apache.hadoop.ozone.container.common.statemachine.EndpointStateMachine;
+import org.apache.hadoop.ozone.container.common.statemachine.EndpointStateMachine.EndPointStates;
+import org.apache.hadoop.ozone.container.common.statemachine.StateContext;
+import org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.base.Preconditions;
+
+/**
+ * UnRegister a datanode with SCM.
+ */
+public final class UnRegisterEndpointTask implements
+    Callable<EndpointStateMachine.EndPointStates> {
+  static final Logger LOG =
+          LoggerFactory.getLogger(UnRegisterEndpointTask.class);
+
+  private final EndpointStateMachine rpcEndPoint;
+  private final ConfigurationSource conf;
+  private Future<EndpointStateMachine.EndPointStates> result;
+  private DatanodeDetails datanodeDetails;
+  private final OzoneContainer datanodeContainerManager;
+  private StateContext stateContext;
+
+  /**
+   * Creates a register endpoint task.
+   *
+   * @param rpcEndPoint - endpoint
+   * @param conf - conf
+   * @param ozoneContainer - container
+   */
+  @VisibleForTesting
+  public UnRegisterEndpointTask(EndpointStateMachine rpcEndPoint,
+      ConfigurationSource conf, OzoneContainer ozoneContainer,
+      StateContext context) {
+    this.rpcEndPoint = rpcEndPoint;
+    this.conf = conf;
+    this.datanodeContainerManager = ozoneContainer;
+    this.stateContext = context;
+
+  }
+
+  /**
+   * Get the DatanodeDetails.
+   *
+   * @return DatanodeDetailsProto
+   */
+  public DatanodeDetails getDatanodeDetails() {
+    return datanodeDetails;
+  }
+
+  /**
+   * Set the contiainerNodeID Proto.
+   *
+   * @param datanodeDetails - Container Node ID.
+   */
+  public void setDatanodeDetails(
+      DatanodeDetails datanodeDetails) {
+    this.datanodeDetails = datanodeDetails;
+  }
+
+  /**
+   * Computes a result, or throws an exception if unable to do so.
+   *
+   * @return computed result
+   * @throws Exception if unable to compute a result
+   */
+  @Override
+  public EndpointStateMachine.EndPointStates call() throws Exception {
+
+    if (getDatanodeDetails() == null) {
+      LOG.error("DatanodeDetails cannot be null in RegisterEndpoint task, "
+          + "shutting down the endpoint.");
+      return rpcEndPoint.setState(EndpointStateMachine.EndPointStates.SHUTDOWN);
+    }
+
+    rpcEndPoint.lock();
+    try {
+
+      if (rpcEndPoint.getState()
+          .equals(EndPointStates.SHUTDOWN)) {
+        ContainerReportsProto containerReport =
+            datanodeContainerManager.getController().getContainerReport();
+        NodeReportProto nodeReport = datanodeContainerManager.getNodeReport();
+        PipelineReportsProto pipelineReportsProto =
+            datanodeContainerManager.getPipelineReport();
+        // TODO : Add responses to the command Queue.
+        SCMUNRegisteredResponseProto response = rpcEndPoint.getEndPoint()
+            .unregister(datanodeDetails.getProtoBufMessage(), nodeReport,

Review comment:
       Do we still need container report when unregister? If not, we an remove the report logic and container member from this class

##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/states/endpoint/UnRegisterEndpointTask.java
##########
@@ -0,0 +1,262 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.ozone.container.common.states.endpoint;
+
+import java.io.IOException;
+import java.util.UUID;
+import java.util.concurrent.Callable;
+import java.util.concurrent.Future;
+
+import org.apache.commons.lang3.StringUtils;
+import org.apache.hadoop.hdds.conf.ConfigurationSource;
+import org.apache.hadoop.hdds.protocol.DatanodeDetails;
+import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.ContainerReportsProto;
+import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.NodeReportProto;
+import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.PipelineReportsProto;
+import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.SCMUNRegisteredResponseProto;
+import org.apache.hadoop.ozone.container.common.statemachine.EndpointStateMachine;
+import org.apache.hadoop.ozone.container.common.statemachine.EndpointStateMachine.EndPointStates;
+import org.apache.hadoop.ozone.container.common.statemachine.StateContext;
+import org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.base.Preconditions;
+
+/**
+ * UnRegister a datanode with SCM.
+ */
+public final class UnRegisterEndpointTask implements
+    Callable<EndpointStateMachine.EndPointStates> {
+  static final Logger LOG =
+          LoggerFactory.getLogger(UnRegisterEndpointTask.class);
+
+  private final EndpointStateMachine rpcEndPoint;
+  private final ConfigurationSource conf;
+  private Future<EndpointStateMachine.EndPointStates> result;
+  private DatanodeDetails datanodeDetails;
+  private final OzoneContainer datanodeContainerManager;
+  private StateContext stateContext;
+
+  /**
+   * Creates a register endpoint task.
+   *
+   * @param rpcEndPoint - endpoint
+   * @param conf - conf
+   * @param ozoneContainer - container
+   */
+  @VisibleForTesting
+  public UnRegisterEndpointTask(EndpointStateMachine rpcEndPoint,
+      ConfigurationSource conf, OzoneContainer ozoneContainer,
+      StateContext context) {
+    this.rpcEndPoint = rpcEndPoint;
+    this.conf = conf;
+    this.datanodeContainerManager = ozoneContainer;
+    this.stateContext = context;
+
+  }
+
+  /**
+   * Get the DatanodeDetails.
+   *
+   * @return DatanodeDetailsProto
+   */
+  public DatanodeDetails getDatanodeDetails() {
+    return datanodeDetails;
+  }
+
+  /**
+   * Set the contiainerNodeID Proto.
+   *
+   * @param datanodeDetails - Container Node ID.
+   */
+  public void setDatanodeDetails(
+      DatanodeDetails datanodeDetails) {
+    this.datanodeDetails = datanodeDetails;
+  }
+
+  /**
+   * Computes a result, or throws an exception if unable to do so.
+   *
+   * @return computed result
+   * @throws Exception if unable to compute a result
+   */
+  @Override
+  public EndpointStateMachine.EndPointStates call() throws Exception {
+
+    if (getDatanodeDetails() == null) {
+      LOG.error("DatanodeDetails cannot be null in RegisterEndpoint task, "
+          + "shutting down the endpoint.");
+      return rpcEndPoint.setState(EndpointStateMachine.EndPointStates.SHUTDOWN);
+    }
+
+    rpcEndPoint.lock();
+    try {
+
+      if (rpcEndPoint.getState()
+          .equals(EndPointStates.SHUTDOWN)) {
+        ContainerReportsProto containerReport =
+            datanodeContainerManager.getController().getContainerReport();
+        NodeReportProto nodeReport = datanodeContainerManager.getNodeReport();
+        PipelineReportsProto pipelineReportsProto =
+            datanodeContainerManager.getPipelineReport();
+        // TODO : Add responses to the command Queue.
+        SCMUNRegisteredResponseProto response = rpcEndPoint.getEndPoint()

Review comment:
       In fact, if there is a new message SCMUnRegisteredResponseProto, it would be better.

##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/server/SCMDatanodeProtocolServer.java
##########
@@ -244,12 +247,61 @@ public SCMRegisteredResponseProto register(
     }
   }
 
+  @Override
+  public SCMUNRegisteredResponseProto unregister(

Review comment:
       I think you just copy some blocks here, you can make some modification.

##########
File path: hadoop-hdds/interface-server/src/main/proto/ScmServerDatanodeHeartbeatProtocol.proto
##########
@@ -113,6 +116,32 @@ message SCMRegisteredResponseProto {
   optional string networkLocation = 8;
 }
 
+message SCMUNRegisterRequestProto {
+  required DatanodeDetailsProto datanodeDetails = 1;
+  required NodeReportProto nodeReport = 2;

Review comment:
       Are all these memebers fo this message necessary?

##########
File path: hadoop-hdds/interface-server/src/main/proto/ScmServerDatanodeHeartbeatProtocol.proto
##########
@@ -113,6 +116,32 @@ message SCMRegisteredResponseProto {
   optional string networkLocation = 8;
 }
 
+message SCMUNRegisterRequestProto {
+  required DatanodeDetailsProto datanodeDetails = 1;
+  required NodeReportProto nodeReport = 2;
+  required ContainerReportsProto containerReport = 3;
+  required PipelineReportsProto pipelineReports = 4;
+}
+
+/**
+ * Datanode ID returned by the SCM. This is similar to name node
+ * unregisteration of a datanode.
+ */
+message SCMUNRegisteredResponseProto {
+  enum ErrorCode {
+    success = 1;
+    errorNodeNotPermitted = 2;
+  }
+  required ErrorCode errorCode = 1;
+  required string datanodeUUID = 2;

Review comment:
       Nit, clear the unnecessary fields




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] elek commented on pull request #1033: HDDS-3667. If we gracefully stop datanode it would be better to notify scm and r…

Posted by GitBox <gi...@apache.org>.
elek commented on pull request #1033:
URL: https://github.com/apache/hadoop-ozone/pull/1033#issuecomment-651667340


   /pending See questions from @maobaolong 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] leosunli edited a comment on pull request #1033: HDDS-3667. If we gracefully stop datanode it would be better to notify scm and r…

Posted by GitBox <gi...@apache.org>.
leosunli edited a comment on pull request #1033:
URL: https://github.com/apache/hadoop-ozone/pull/1033#issuecomment-660770625


   ![image](https://user-images.githubusercontent.com/3837921/87893676-250ed400-ca73-11ea-8f9c-e7b57714603a.png)
   
   this failed ut should be unrelated to this patch.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] leosunli commented on a change in pull request #1033: HDDS-3667. If we gracefully stop datanode it would be better to notify scm and r…

Posted by GitBox <gi...@apache.org>.
leosunli commented on a change in pull request #1033:
URL: https://github.com/apache/hadoop-ozone/pull/1033#discussion_r459203434



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/HddsDatanodeService.java
##########
@@ -164,9 +171,29 @@ public Void call() throws Exception {
           HddsDatanodeService.class, args, LOG);
     }
     start(createOzoneConfiguration());
+    stopDataNodeShutdownHook = () ->
+            stopDataNode();
+    ShutdownHookManager.get().addShutdownHook(stopDataNodeShutdownHook,
+            SHUTDOWN_HOOK_PRIORITY);
     join();
     return null;
   }
+  
+  private void stopDataNode() {
+    Collection<EndpointStateMachine> endpointStateMachines =
+        datanodeStateMachine.getConnectionManager().getValues();
+    for (EndpointStateMachine endpointStateMachine : endpointStateMachines) {
+      try {
+        // Make sure the heartbeat is not sent

Review comment:
       done
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] leosunli commented on a change in pull request #1033: HDDS-3667. If we gracefully stop datanode it would be better to notify scm and r…

Posted by GitBox <gi...@apache.org>.
leosunli commented on a change in pull request #1033:
URL: https://github.com/apache/hadoop-ozone/pull/1033#discussion_r437524947



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/statemachine/StateContext.java
##########
@@ -384,10 +384,9 @@ public void addPipelineActionIfAbsent(PipelineAction pipelineAction) {
       return new InitDatanodeState(this.conf, parent.getConnectionManager(),
           this);
     case RUNNING:
-      return new RunningDatanodeState(this.conf, parent.getConnectionManager(),
-          this);
     case SHUTDOWN:
-      return null;
+      return new RunningDatanodeState(this.conf, parent.getConnectionManager(),
+              this);

Review comment:
       done

##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/states/endpoint/UnRegisterEndpointTask.java
##########
@@ -0,0 +1,262 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.ozone.container.common.states.endpoint;
+
+import java.io.IOException;
+import java.util.UUID;
+import java.util.concurrent.Callable;
+import java.util.concurrent.Future;
+
+import org.apache.commons.lang3.StringUtils;
+import org.apache.hadoop.hdds.conf.ConfigurationSource;
+import org.apache.hadoop.hdds.protocol.DatanodeDetails;
+import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.ContainerReportsProto;
+import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.NodeReportProto;
+import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.PipelineReportsProto;
+import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.SCMUNRegisteredResponseProto;
+import org.apache.hadoop.ozone.container.common.statemachine.EndpointStateMachine;
+import org.apache.hadoop.ozone.container.common.statemachine.EndpointStateMachine.EndPointStates;
+import org.apache.hadoop.ozone.container.common.statemachine.StateContext;
+import org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.base.Preconditions;
+
+/**
+ * UnRegister a datanode with SCM.
+ */
+public final class UnRegisterEndpointTask implements
+    Callable<EndpointStateMachine.EndPointStates> {
+  static final Logger LOG =
+          LoggerFactory.getLogger(UnRegisterEndpointTask.class);
+
+  private final EndpointStateMachine rpcEndPoint;
+  private final ConfigurationSource conf;
+  private Future<EndpointStateMachine.EndPointStates> result;
+  private DatanodeDetails datanodeDetails;
+  private final OzoneContainer datanodeContainerManager;
+  private StateContext stateContext;
+
+  /**
+   * Creates a register endpoint task.

Review comment:
       done

##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/states/endpoint/UnRegisterEndpointTask.java
##########
@@ -0,0 +1,262 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.ozone.container.common.states.endpoint;
+
+import java.io.IOException;
+import java.util.UUID;
+import java.util.concurrent.Callable;
+import java.util.concurrent.Future;
+
+import org.apache.commons.lang3.StringUtils;
+import org.apache.hadoop.hdds.conf.ConfigurationSource;
+import org.apache.hadoop.hdds.protocol.DatanodeDetails;
+import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.ContainerReportsProto;
+import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.NodeReportProto;
+import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.PipelineReportsProto;
+import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.SCMUNRegisteredResponseProto;
+import org.apache.hadoop.ozone.container.common.statemachine.EndpointStateMachine;
+import org.apache.hadoop.ozone.container.common.statemachine.EndpointStateMachine.EndPointStates;
+import org.apache.hadoop.ozone.container.common.statemachine.StateContext;
+import org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.base.Preconditions;
+
+/**
+ * UnRegister a datanode with SCM.
+ */
+public final class UnRegisterEndpointTask implements
+    Callable<EndpointStateMachine.EndPointStates> {
+  static final Logger LOG =
+          LoggerFactory.getLogger(UnRegisterEndpointTask.class);
+
+  private final EndpointStateMachine rpcEndPoint;
+  private final ConfigurationSource conf;
+  private Future<EndpointStateMachine.EndPointStates> result;
+  private DatanodeDetails datanodeDetails;
+  private final OzoneContainer datanodeContainerManager;
+  private StateContext stateContext;
+
+  /**
+   * Creates a register endpoint task.
+   *
+   * @param rpcEndPoint - endpoint
+   * @param conf - conf
+   * @param ozoneContainer - container
+   */
+  @VisibleForTesting
+  public UnRegisterEndpointTask(EndpointStateMachine rpcEndPoint,
+      ConfigurationSource conf, OzoneContainer ozoneContainer,
+      StateContext context) {
+    this.rpcEndPoint = rpcEndPoint;
+    this.conf = conf;
+    this.datanodeContainerManager = ozoneContainer;
+    this.stateContext = context;
+
+  }
+
+  /**
+   * Get the DatanodeDetails.
+   *
+   * @return DatanodeDetailsProto
+   */
+  public DatanodeDetails getDatanodeDetails() {
+    return datanodeDetails;
+  }
+
+  /**
+   * Set the contiainerNodeID Proto.
+   *
+   * @param datanodeDetails - Container Node ID.
+   */
+  public void setDatanodeDetails(
+      DatanodeDetails datanodeDetails) {
+    this.datanodeDetails = datanodeDetails;
+  }
+
+  /**
+   * Computes a result, or throws an exception if unable to do so.
+   *
+   * @return computed result
+   * @throws Exception if unable to compute a result
+   */
+  @Override
+  public EndpointStateMachine.EndPointStates call() throws Exception {
+
+    if (getDatanodeDetails() == null) {
+      LOG.error("DatanodeDetails cannot be null in RegisterEndpoint task, "
+          + "shutting down the endpoint.");
+      return rpcEndPoint.setState(EndpointStateMachine.EndPointStates.SHUTDOWN);
+    }
+
+    rpcEndPoint.lock();
+    try {
+
+      if (rpcEndPoint.getState()
+          .equals(EndPointStates.SHUTDOWN)) {
+        ContainerReportsProto containerReport =
+            datanodeContainerManager.getController().getContainerReport();
+        NodeReportProto nodeReport = datanodeContainerManager.getNodeReport();
+        PipelineReportsProto pipelineReportsProto =
+            datanodeContainerManager.getPipelineReport();
+        // TODO : Add responses to the command Queue.
+        SCMUNRegisteredResponseProto response = rpcEndPoint.getEndPoint()
+            .unregister(datanodeDetails.getProtoBufMessage(), nodeReport,

Review comment:
       done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] leosunli commented on a change in pull request #1033: HDDS-3667. If we gracefully stop datanode it would be better to notify scm and r…

Posted by GitBox <gi...@apache.org>.
leosunli commented on a change in pull request #1033:
URL: https://github.com/apache/hadoop-ozone/pull/1033#discussion_r459992764



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/NodeStateManager.java
##########
@@ -708,6 +707,21 @@ private void updateNodeState(DatanodeInfo node, Predicate<Long> condition,
     }
   }
 
+  /**
+   *
+   * @param datanodeDetails
+   */
+  public void stopNode(DatanodeDetails datanodeDetails)
+      throws NodeNotFoundException {
+    UUID uuid = datanodeDetails.getUuid();
+    NodeState state = nodeStateMap.getNodeState(uuid);
+    DatanodeInfo node = nodeStateMap.getNodeInfo(uuid);
+    nodeStateMap.updateNodeState(datanodeDetails.getUuid(), state,
+        NodeState.DEAD);
+    node.updateLastHeartbeatTime(0L);

Review comment:
       Yeah, this problem really exists.
   i will create a new jira to solve this prolem.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] leosunli commented on pull request #1033: HDDS-3667. If we gracefully stop datanode it would be better to notify scm and r…

Posted by GitBox <gi...@apache.org>.
leosunli commented on pull request #1033:
URL: https://github.com/apache/hadoop-ozone/pull/1033#issuecomment-660771536


   hi @elek Could you help review this patch?Thank you.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] nandakumar131 commented on pull request #1033: HDDS-3667. If we gracefully stop datanode it would be better to notify scm and r…

Posted by GitBox <gi...@apache.org>.
nandakumar131 commented on pull request #1033:
URL: https://github.com/apache/hadoop-ozone/pull/1033#issuecomment-689082204


   /pending 
   What advantage do we get by sending a stop signal from the datanode to SCM? This will require additional logic in Node Management code of SCM. If there is not much of value addition, we don't have to add additional complexity in SCM Node Management.
   
   This change will introduce a new path from HEALTHY to DEAD directly (skipping STALE) adding complexity and maintenance overhead.
   
   FYI: @leosunli, State change is not properly handled.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] leosunli commented on pull request #1033: HDDS-3667. If we gracefully stop datanode it would be better to notify scm and r…

Posted by GitBox <gi...@apache.org>.
leosunli commented on pull request #1033:
URL: https://github.com/apache/hadoop-ozone/pull/1033#issuecomment-660770625


   ![image](https://user-images.githubusercontent.com/3837921/87893676-250ed400-ca73-11ea-8f9c-e7b57714603a.png)
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] maobaolong commented on a change in pull request #1033: HDDS-3667. If we gracefully stop datanode it would be better to notify scm and r…

Posted by GitBox <gi...@apache.org>.
maobaolong commented on a change in pull request #1033:
URL: https://github.com/apache/hadoop-ozone/pull/1033#discussion_r439781866



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/NodeStateManager.java
##########
@@ -265,7 +265,9 @@ private void initializeStateMachine() {
     stateMachine.addTransition(
         NodeState.DECOMMISSIONING, NodeState.DECOMMISSIONED,
         NodeLifeCycleEvent.DECOMMISSIONED);
-
+    stateMachine.addTransition(

Review comment:
       How about the STALE, DECOMMISSIONING, DECOMMISSIONED to DEAD? Do nothing or do something?

##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/NodeStateManager.java
##########
@@ -265,7 +265,9 @@ private void initializeStateMachine() {
     stateMachine.addTransition(
         NodeState.DECOMMISSIONING, NodeState.DECOMMISSIONED,
         NodeLifeCycleEvent.DECOMMISSIONED);
-
+    stateMachine.addTransition(

Review comment:
       If we received unregister request, do you think it shouldn't change state from XXX to DEAD only? Or we can remove the nodes at all? How about your idea?

##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/SCMNodeManager.java
##########
@@ -279,6 +281,24 @@ public RegisteredCommand register(
         .build();
   }
 
+  @Override
+  public UnRegisteredCommand unRegister(DatanodeDetails datanodeDetails) {
+    try {
+      nodeStateManager.removeNode(datanodeDetails);

Review comment:
       Not sure we should remove the node from nodeStateManager, or only change the state from STATEX to DEAD 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] leosunli commented on pull request #1033: HDDS-3667. If we gracefully stop datanode it would be better to notify scm and r…

Posted by GitBox <gi...@apache.org>.
leosunli commented on pull request #1033:
URL: https://github.com/apache/hadoop-ozone/pull/1033#issuecomment-660645820


   /ready


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] github-actions[bot] commented on pull request #1033: HDDS-3667. If we gracefully stop datanode it would be better to notify scm and r…

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #1033:
URL: https://github.com/apache/hadoop-ozone/pull/1033#issuecomment-701719760


   Thank you very much for the patch. I am closing this PR __temporarily__ as there was no activity recently and it is waiting for response from its author.
   
   It doesn't mean that this PR is not important or ignored: feel free to reopen the PR at any time.
   
   It only means that attention of committers is not required. We prefer to keep the review queue clean. This ensures PRs in need of review are more visible, which results in faster feedback for all PRs.
   
   If you need ANY help to finish this PR, please [contact the community](https://github.com/apache/hadoop-ozone#contact) on the mailing list or the slack channel."


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] github-actions[bot] closed pull request #1033: HDDS-3667. If we gracefully stop datanode it would be better to notify scm and r…

Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed pull request #1033:
URL: https://github.com/apache/hadoop-ozone/pull/1033


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] maobaolong commented on a change in pull request #1033: HDDS-3667. If we gracefully stop datanode it would be better to notify scm and r…

Posted by GitBox <gi...@apache.org>.
maobaolong commented on a change in pull request #1033:
URL: https://github.com/apache/hadoop-ozone/pull/1033#discussion_r459897552



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/NodeStateManager.java
##########
@@ -708,6 +707,21 @@ private void updateNodeState(DatanodeInfo node, Predicate<Long> condition,
     }
   }
 
+  /**
+   *
+   * @param datanodeDetails
+   */
+  public void stopNode(DatanodeDetails datanodeDetails)
+      throws NodeNotFoundException {
+    UUID uuid = datanodeDetails.getUuid();
+    NodeState state = nodeStateMap.getNodeState(uuid);
+    DatanodeInfo node = nodeStateMap.getNodeInfo(uuid);
+    nodeStateMap.updateNodeState(datanodeDetails.getUuid(), state,
+        NodeState.DEAD);
+    node.updateLastHeartbeatTime(0L);

Review comment:
       Yeah, i got it, i agree that it can works, but the downside, we lost the real lastHeartBeatTime.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] leosunli edited a comment on pull request #1033: HDDS-3667. If we gracefully stop datanode it would be better to notify scm and r…

Posted by GitBox <gi...@apache.org>.
leosunli edited a comment on pull request #1033:
URL: https://github.com/apache/hadoop-ozone/pull/1033#issuecomment-660770625


   ![image](https://user-images.githubusercontent.com/3837921/87893676-250ed400-ca73-11ea-8f9c-e7b57714603a.png)
   
   this failed ut should be unrelated to this patch.
   this ut failed  on matser branch in my local


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org