Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2022/04/25 18:02:51 UTC

[GitHub] [ozone] umamaheswararao opened a new pull request, #3345: HDDS-6586: EC: Implement the EC Reconstruction Command with necessary information

umamaheswararao opened a new pull request, #3345:
URL: https://github.com/apache/ozone/pull/3345

   ## What changes were proposed in this pull request?
   
   This change adds the information that SCM needs to send to the datanode (DN) so that the DN can reconstruct EC containers.
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-6586
   
   ## How was this patch tested?
   
   I will be adding a few tests.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] umamaheswararao commented on a diff in pull request #3345: HDDS-6586: EC: Implement the EC Reconstruction Command with necessary information

Posted by GitBox <gi...@apache.org>.
umamaheswararao commented on code in PR #3345:
URL: https://github.com/apache/ozone/pull/3345#discussion_r860789266


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/protocol/commands/ReconstructECContainersCommand.java:
##########
@@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+package org.apache.hadoop.ozone.protocol.commands;
+
+import java.util.List;
+import java.util.stream.Collectors;
+
+import com.google.protobuf.ByteString;
+import org.apache.hadoop.hdds.client.ECReplicationConfig;
+import org.apache.hadoop.hdds.protocol.DatanodeDetails;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.ReconstructECContainersCommandProto;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.ReconstructECContainersCommandProto
+    .Builder;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.SCMCommandProto.Type;
+
+import com.google.common.base.Preconditions;
+
+/**
+ * SCM command to request reconstruction of EC containers.
+ */
+public class ReconstructECContainersCommand
+    extends SCMCommand<ReconstructECContainersCommandProto> {
+
+  private final long containerID;
+  private final List<DatanodeDetails> sourceDatanodes;
+  private final List<DatanodeDetails> targetDatanodes;
+  private final byte[] missingContainerIndexes;

Review Comment:
   @guihecheng Thank you for the review.
   I have added the replica indexes now. As for the pipelineID, I am not sure it has any significance for closed containers, so I am holding off on adding it. We can add it later if it is really needed.
   
   Coming to the missing indexes:
   Target nodes need not be tied to any particular index. Since we try to place one node per rack, there is no strict requirement that a given index go only to a specific node.
   But the partial stripe case is a very good point. It seems we don't have a way for SCM to determine this at the stripe level; SCM works purely at the container level. A missing index means the container that is missing at that index position. Whether that index actually needs to be recovered for a given stripe should be decided by the recovery logic, based on the block group length.
   
   I was thinking of giving some additional nodes as targets, so that if one of the target nodes is down when recovery starts, we could simply use one of the additional target nodes. We don't need this in the first version, but that is one thought. So I feel we don't need to mix the missing indexes with the target node indexes.
   Missing indexes are nothing but the missing container index positions.
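The stripe-level decision described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual recovery logic: it assumes a striped layout in which data cells fill in index order, 1-based replica indexes with data indexes first and parity indexes after them, and a fixed chunk size. `StripeRecoverySketch`, `indexHasDataInStripe`, and all parameter names are hypothetical.

```java
/** Illustration only: hypothetical helper, not part of Ozone. */
final class StripeRecoverySketch {

  /**
   * Returns true if the given replica index holds any bytes in the given
   * stripe, i.e. whether a missing container at that index would need to be
   * recovered for that stripe.
   */
  static boolean indexHasDataInStripe(int replicaIndex, long stripeNumber,
      long blockGroupLength, int dataCount, int parityCount, int chunkSize) {
    if (replicaIndex < 1 || replicaIndex > dataCount + parityCount) {
      throw new IllegalArgumentException("bad replica index: " + replicaIndex);
    }
    long stripeSize = (long) dataCount * chunkSize;
    long remaining = blockGroupLength - stripeNumber * stripeSize;
    // Bytes of user data that fall into this stripe (0 if past the end).
    long bytesInStripe = Math.max(0L, Math.min(remaining, stripeSize));
    if (bytesInStripe == 0) {
      return false; // nothing was written to this stripe at all
    }
    if (replicaIndex > dataCount) {
      return true; // assumed: parity cells exist for every non-empty stripe
    }
    // Number of data cells touched, rounding a partial cell up.
    long cellsUsed = (bytesInStripe + chunkSize - 1) / chunkSize;
    return replicaIndex <= cellsUsed;
  }

  public static void main(String[] args) {
    // EC 3+2 with 1024-byte chunks: one full stripe plus 1500 bytes.
    long len = 3 * 1024 + 1500;
    for (int idx = 1; idx <= 5; idx++) {
      System.out.println("index " + idx + " in stripe 1: "
          + indexHasDataInStripe(idx, 1, len, 3, 2, 1024));
    }
  }
}
```

In this sketch SCM would still report index 3 as missing at the container level; it is the per-stripe check that lets the recovery side skip it for the partial stripe.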
   





[GitHub] [ozone] sodonnel commented on a diff in pull request #3345: HDDS-6586: EC: Implement the EC Reconstruction Command with necessary information

Posted by GitBox <gi...@apache.org>.
sodonnel commented on code in PR #3345:
URL: https://github.com/apache/ozone/pull/3345#discussion_r861678220


##########
hadoop-hdds/interface-server/src/main/proto/ScmServerDatanodeHeartbeatProtocol.proto:
##########
@@ -407,6 +409,20 @@ message ReplicateContainerCommandProto {
   required int64 cmdId = 3;
 }
 
+/**
+* This command asks the datanode to reconstruct the missing EC containers by
+* using remaining containers from sources.
+*/
+message ReconstructECContainersCommandProto {
+  required int64 containerID = 1;
+  repeated DatanodeDetailsProto sources = 2;

Review Comment:
   I think it could be worth creating a simple type to pass datanodeDetails and replicaIndex together, eg:
   
   ```
   message DatanodeDetailsAndReplicaIndexProto {
       required DatanodeDetailsProto datanodeDetails = 1;
       required int32 replicaIndex = 2;
   }
   ```
   
   Then have `repeated DatanodeDetailsAndReplicaIndexProto sources = 2`.
   
   That way, we always send the two as a pair, and it's less likely that a mistake is made when serializing or deserializing, such as wrong ordering or a missing element.
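On the Java side, the same pairing idea might look like the sketch below. It is a minimal illustration, not the class that was eventually added in the PR: `PairedSourcesSketch`, `NodeAndIndex`, and `pair` are hypothetical names, and a plain `String` stands in for `DatanodeDetails` so the example has no Ozone dependency.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustration only: hypothetical stand-ins, not the actual Ozone classes. */
final class PairedSourcesSketch {

  /** Keeps a datanode and the replica index it holds in one value. */
  static final class NodeAndIndex {
    private final String datanodeId; // stand-in for DatanodeDetails
    private final int replicaIndex;  // int, mirroring the int32 replicaIndex

    NodeAndIndex(String datanodeId, int replicaIndex) {
      if (datanodeId == null) {
        throw new IllegalArgumentException("datanodeId must not be null");
      }
      this.datanodeId = datanodeId;
      this.replicaIndex = replicaIndex;
    }

    String getDatanodeId() {
      return datanodeId;
    }

    int getReplicaIndex() {
      return replicaIndex;
    }
  }

  /** Zips two parallel lists; a size mismatch fails here, at build time. */
  static List<NodeAndIndex> pair(List<String> nodes, List<Integer> indexes) {
    if (nodes.size() != indexes.size()) {
      throw new IllegalArgumentException("nodes and indexes differ in size: "
          + nodes.size() + " vs " + indexes.size());
    }
    List<NodeAndIndex> out = new ArrayList<>();
    for (int i = 0; i < nodes.size(); i++) {
      out.add(new NodeAndIndex(nodes.get(i), indexes.get(i)));
    }
    return out;
  }

  public static void main(String[] args) {
    List<NodeAndIndex> sources =
        pair(Arrays.asList("dn-a", "dn-b"), Arrays.asList(1, 3));
    for (NodeAndIndex s : sources) {
      System.out.println(
          s.getDatanodeId() + " holds replica index " + s.getReplicaIndex());
    }
  }
}
```

With two parallel lists, a dropped or reordered element silently shifts every later index onto the wrong node; with a paired value that class of error cannot be expressed once the pair is built.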





[GitHub] [ozone] umamaheswararao commented on a diff in pull request #3345: HDDS-6586: EC: Implement the EC Reconstruction Command with necessary information

Posted by GitBox <gi...@apache.org>.
umamaheswararao commented on code in PR #3345:
URL: https://github.com/apache/ozone/pull/3345#discussion_r861973549


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/protocol/commands/ReconstructECContainersCommand.java:
##########
@@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+package org.apache.hadoop.ozone.protocol.commands;
+
+import java.util.List;
+import java.util.stream.Collectors;
+
+import com.google.protobuf.ByteString;
+import org.apache.hadoop.hdds.client.ECReplicationConfig;
+import org.apache.hadoop.hdds.protocol.DatanodeDetails;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.ReconstructECContainersCommandProto;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.ReconstructECContainersCommandProto
+    .Builder;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.SCMCommandProto.Type;
+
+import com.google.common.base.Preconditions;
+
+/**
+ * SCM command to request reconstruction of EC containers.
+ */
+public class ReconstructECContainersCommand
+    extends SCMCommand<ReconstructECContainersCommandProto> {
+
+  private final long containerID;
+  private final List<DatanodeDetails> sourceDatanodes;
+  private final List<DatanodeDetails> targetDatanodes;
+  private final byte[] missingContainerIndexes;

Review Comment:
   That's a good point, Stephen. My bad, why did I use Long? Sigh :-).
   Offline, @guihecheng also had a similar question. I think I did not use the existing ContainerReplica because it carries more than we need (e.g. bytes used, key count, seqID, placeOfBirth, state, etc.). Are you OK with including them? I am not sure we would really use all of them. I am also OK with creating a small proto structure that includes only DNDetails and Index.
   What's your opinion?





[GitHub] [ozone] umamaheswararao commented on a diff in pull request #3345: HDDS-6586: EC: Implement the EC Reconstruction Command with necessary information

Posted by GitBox <gi...@apache.org>.
umamaheswararao commented on code in PR #3345:
URL: https://github.com/apache/ozone/pull/3345#discussion_r860815910


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/protocol/commands/ReconstructECContainersCommand.java:
##########
@@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+package org.apache.hadoop.ozone.protocol.commands;
+
+import java.util.List;
+import java.util.stream.Collectors;
+
+import com.google.protobuf.ByteString;
+import org.apache.hadoop.hdds.client.ECReplicationConfig;
+import org.apache.hadoop.hdds.protocol.DatanodeDetails;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.ReconstructECContainersCommandProto;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.ReconstructECContainersCommandProto
+    .Builder;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.SCMCommandProto.Type;
+
+import com.google.common.base.Preconditions;
+
+/**
+ * SCM command to request reconstruction of EC containers.
+ */
+public class ReconstructECContainersCommand
+    extends SCMCommand<ReconstructECContainersCommandProto> {
+
+  private final long containerID;
+  private final List<DatanodeDetails> sourceDatanodes;
+  private final List<DatanodeDetails> targetDatanodes;
+  private final byte[] missingContainerIndexes;

Review Comment:
   The srcNodes indexes are the index positions at which source nodes are available; they are nothing but replica indexes. We could also have a map from srcNode to srcNode index, but it looks like that would need another custom proto class to represent it. I was just avoiding another class for this purpose.





[GitHub] [ozone] umamaheswararao commented on a diff in pull request #3345: HDDS-6586: EC: Implement the EC Reconstruction Command with necessary information

Posted by GitBox <gi...@apache.org>.
umamaheswararao commented on code in PR #3345:
URL: https://github.com/apache/ozone/pull/3345#discussion_r861976432


##########
hadoop-hdds/interface-server/src/main/proto/ScmServerDatanodeHeartbeatProtocol.proto:
##########
@@ -407,6 +409,20 @@ message ReplicateContainerCommandProto {
   required int64 cmdId = 3;
 }
 
+/**
+* This command asks the datanode to reconstruct the missing EC containers by
+* using remaining containers from sources.
+*/
+message ReconstructECContainersCommandProto {
+  required int64 containerID = 1;
+  repeated DatanodeDetailsProto sources = 2;

Review Comment:
   Ah, I got the answer to my question above. Let me create one structure now.





[GitHub] [ozone] guihecheng commented on a diff in pull request #3345: HDDS-6586: EC: Implement the EC Reconstruction Command with necessary information

Posted by GitBox <gi...@apache.org>.
guihecheng commented on code in PR #3345:
URL: https://github.com/apache/ozone/pull/3345#discussion_r858469466


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/protocol/commands/ReconstructECContainersCommand.java:
##########
@@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+package org.apache.hadoop.ozone.protocol.commands;
+
+import java.util.List;
+import java.util.stream.Collectors;
+
+import com.google.protobuf.ByteString;
+import org.apache.hadoop.hdds.client.ECReplicationConfig;
+import org.apache.hadoop.hdds.protocol.DatanodeDetails;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.ReconstructECContainersCommandProto;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.ReconstructECContainersCommandProto
+    .Builder;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.SCMCommandProto.Type;
+
+import com.google.common.base.Preconditions;
+
+/**
+ * SCM command to request reconstruction of EC containers.
+ */
+public class ReconstructECContainersCommand
+    extends SCMCommand<ReconstructECContainersCommandProto> {
+
+  private final long containerID;
+  private final List<DatanodeDetails> sourceDatanodes;
+  private final List<DatanodeDetails> targetDatanodes;
+  private final byte[] missingContainerIndexes;

Review Comment:
   Hi Uma, the info looks good. I wonder whether any additional info is needed, for example:
   - ReplicaIndexes for sourceDatanodes. Maybe we could learn this from the HealthyDNs in the group, but the `readContainer` gRPC call doesn't provide this index for now, so if SCM carries it, then we don't have to worry about it.
   - The pipelineID may be needed to reconstruct the container, see `ContainerData#originPipelineId`. SCM manages the pipelineID, so it is reasonable for it to deliver this.





[GitHub] [ozone] umamaheswararao commented on a diff in pull request #3345: HDDS-6586: EC: Implement the EC Reconstruction Command with necessary information

Posted by GitBox <gi...@apache.org>.
umamaheswararao commented on code in PR #3345:
URL: https://github.com/apache/ozone/pull/3345#discussion_r863821001


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/protocol/commands/ReconstructECContainersCommand.java:
##########
@@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+package org.apache.hadoop.ozone.protocol.commands;
+
+import java.util.List;
+import java.util.stream.Collectors;
+
+import com.google.protobuf.ByteString;
+import org.apache.hadoop.hdds.client.ECReplicationConfig;
+import org.apache.hadoop.hdds.protocol.DatanodeDetails;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.ReconstructECContainersCommandProto;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.ReconstructECContainersCommandProto
+    .Builder;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.SCMCommandProto.Type;
+
+import com.google.common.base.Preconditions;
+
+/**
+ * SCM command to request reconstruction of EC containers.
+ */
+public class ReconstructECContainersCommand
+    extends SCMCommand<ReconstructECContainersCommandProto> {
+
+  private final long containerID;
+  private final List<DatanodeDetails> sourceDatanodes;
+  private final List<DatanodeDetails> targetDatanodes;
+  private final byte[] missingContainerIndexes;

Review Comment:
   When we say containerIndex, we mean one of the index positions at which a container is missing; it does not refer only to "MissingContainers". For example, we don't say ReplicateReplica; we say ReplicateContainer. Most of the time, a replica means an identical copy, but in EC each replica is not an identical copy. However, we tried to reuse parts like ContainerReplica, so they were named ReplicaIndexes there. Ideally they are missing container indexes, so I don't see the two as very different, IMO. That is why I named it missingContainerIndex, since the class name says "ReconstructECContainer". Let me know if that is OK.





[GitHub] [ozone] sodonnel commented on a diff in pull request #3345: HDDS-6586: EC: Implement the EC Reconstruction Command with necessary information

Posted by GitBox <gi...@apache.org>.
sodonnel commented on code in PR #3345:
URL: https://github.com/apache/ozone/pull/3345#discussion_r861676265


##########
hadoop-hdds/interface-server/src/main/proto/ScmServerDatanodeHeartbeatProtocol.proto:
##########
@@ -407,6 +409,20 @@ message ReplicateContainerCommandProto {
   required int64 cmdId = 3;
 }
 
+/**
+* This command asks the datanode to reconstruct the missing EC containers by
+* using remaining containers from sources.
+*/
+message ReconstructECContainersCommandProto {
+  required int64 containerID = 1;
+  repeated DatanodeDetailsProto sources = 2;
+  repeated DatanodeDetailsProto targets = 3;
+  repeated int64 srcNodesIndexes = 4;

Review Comment:
   In the ContainerReplicaProto, the replicaIndex is an int32:
   
   >   optional int32 replicaIndex = 14;





[GitHub] [ozone] guihecheng commented on a diff in pull request #3345: HDDS-6586: EC: Implement the EC Reconstruction Command with necessary information

Posted by GitBox <gi...@apache.org>.
guihecheng commented on code in PR #3345:
URL: https://github.com/apache/ozone/pull/3345#discussion_r860808885


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/protocol/commands/ReconstructECContainersCommand.java:
##########
@@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+package org.apache.hadoop.ozone.protocol.commands;
+
+import java.util.List;
+import java.util.stream.Collectors;
+
+import com.google.protobuf.ByteString;
+import org.apache.hadoop.hdds.client.ECReplicationConfig;
+import org.apache.hadoop.hdds.protocol.DatanodeDetails;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.ReconstructECContainersCommandProto;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.ReconstructECContainersCommandProto
+    .Builder;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.SCMCommandProto.Type;
+
+import com.google.common.base.Preconditions;
+
+/**
+ * SCM command to request reconstruction of EC containers.
+ */
+public class ReconstructECContainersCommand
+    extends SCMCommand<ReconstructECContainersCommandProto> {
+
+  private final long containerID;
+  private final List<DatanodeDetails> sourceDatanodes;
+  private final List<DatanodeDetails> targetDatanodes;
+  private final byte[] missingContainerIndexes;

Review Comment:
   Sure, let's keep it as missing container indexes.
   And another small question, any special reason to have `List<Long>` for `srcNodesIndexes`, and `byte[]` for `missingContainerIndexes`?
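
As a rough illustration of the type question, the sketch below keeps a `List<Integer>` in the command's Java API and converts to a compact `byte[]` only at the wire boundary. The class and method names are hypothetical, and the 1-based index range is an assumption; the conversion itself uses only the JDK.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; names are hypothetical. */
final class ReplicaIndexCodecSketch {

  /** Packs indexes into bytes, rejecting anything outside 1..maxIndex. */
  static byte[] toBytes(List<Integer> indexes, int maxIndex) {
    if (maxIndex < 1 || maxIndex > Byte.MAX_VALUE) {
      throw new IllegalArgumentException("maxIndex out of range: " + maxIndex);
    }
    byte[] out = new byte[indexes.size()];
    for (int i = 0; i < out.length; i++) {
      int idx = indexes.get(i);
      if (idx < 1 || idx > maxIndex) {
        throw new IllegalArgumentException("index out of range: " + idx);
      }
      out[i] = (byte) idx; // safe: idx fits in a signed byte
    }
    return out;
  }

  /** Unpacks bytes back into a list of indexes, preserving order. */
  static List<Integer> toList(byte[] raw) {
    List<Integer> out = new ArrayList<>(raw.length);
    for (byte b : raw) {
      out.add((int) b);
    }
    return out;
  }
}
```

With this split, callers work with one list type for both source and missing indexes, and the byte packing stays an encoding detail.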





[GitHub] [ozone] guihecheng commented on a diff in pull request #3345: HDDS-6586: EC: Implement the EC Reconstruction Command with necessary information

Posted by GitBox <gi...@apache.org>.
guihecheng commented on code in PR #3345:
URL: https://github.com/apache/ozone/pull/3345#discussion_r859617085


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/protocol/commands/ReconstructECContainersCommand.java:
##########
@@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+package org.apache.hadoop.ozone.protocol.commands;
+
+import java.util.List;
+import java.util.stream.Collectors;
+
+import com.google.protobuf.ByteString;
+import org.apache.hadoop.hdds.client.ECReplicationConfig;
+import org.apache.hadoop.hdds.protocol.DatanodeDetails;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.ReconstructECContainersCommandProto;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.ReconstructECContainersCommandProto
+    .Builder;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.SCMCommandProto.Type;
+
+import com.google.common.base.Preconditions;
+
+/**
+ * SCM command to request reconstruction of EC containers.
+ */
+public class ReconstructECContainersCommand
+    extends SCMCommand<ReconstructECContainersCommandProto> {
+
+  private final long containerID;
+  private final List<DatanodeDetails> sourceDatanodes;
+  private final List<DatanodeDetails> targetDatanodes;
+  private final byte[] missingContainerIndexes;

Review Comment:
   Hi Uma, I'd like to add one more nit: `missingContainerIndexes` might be better named `targetIndexes`, because when we recover each stripe we also have the concept of `missingIndexes`. But `missingContainerIndexes` and the stripe-level `missingIndexes` are different for partial stripes, so maybe `targetIndexes` is better, since it just means the indexes assigned to the `targetDatanodes`.





[GitHub] [ozone] umamaheswararao commented on a diff in pull request #3345: HDDS-6586: EC: Implement the EC Reconstruction Command with necessary information

Posted by GitBox <gi...@apache.org>.
umamaheswararao commented on code in PR #3345:
URL: https://github.com/apache/ozone/pull/3345#discussion_r863821001


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/protocol/commands/ReconstructECContainersCommand.java:
##########
@@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+package org.apache.hadoop.ozone.protocol.commands;
+
+import java.util.List;
+import java.util.stream.Collectors;
+
+import com.google.protobuf.ByteString;
+import org.apache.hadoop.hdds.client.ECReplicationConfig;
+import org.apache.hadoop.hdds.protocol.DatanodeDetails;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.ReconstructECContainersCommandProto;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.ReconstructECContainersCommandProto
+    .Builder;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.SCMCommandProto.Type;
+
+import com.google.common.base.Preconditions;
+
+/**
+ * SCM command to request reconstruction of EC containers.
+ */
+public class ReconstructECContainersCommand
+    extends SCMCommand<ReconstructECContainersCommandProto> {
+
+  private final long containerID;
+  private final List<DatanodeDetails> sourceDatanodes;
+  private final List<DatanodeDetails> targetDatanodes;
+  private final byte[] missingContainerIndexes;

Review Comment:
   When you say containerIndex, it means the container at one of the index positions is missing. It is not only saying "MissingContainer". For example, we don't call it ReplicateReplica; instead we call it ReplicateContainer. Most of the time, a replica represents an identical copy. In EC, each replica is not an identical copy. However, we tried to reuse parts like ContainerReplica etc., so it was named ReplicaIndexes. That is the reason I just used missingContainerIndex, as the class name says "ReconstructECContainer". Let me know if that is OK.





[GitHub] [ozone] sodonnel commented on a diff in pull request #3345: HDDS-6586: EC: Implement the EC Reconstruction Command with necessary information

Posted by GitBox <gi...@apache.org>.
sodonnel commented on code in PR #3345:
URL: https://github.com/apache/ozone/pull/3345#discussion_r862098861


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/protocol/commands/ReconstructECContainersCommand.java:
##########
@@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+package org.apache.hadoop.ozone.protocol.commands;
+
+import java.util.List;
+import java.util.stream.Collectors;
+
+import com.google.protobuf.ByteString;
+import org.apache.hadoop.hdds.client.ECReplicationConfig;
+import org.apache.hadoop.hdds.protocol.DatanodeDetails;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.ReconstructECContainersCommandProto;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.ReconstructECContainersCommandProto
+    .Builder;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.SCMCommandProto.Type;
+
+import com.google.common.base.Preconditions;
+
+/**
+ * SCM command to request reconstruction of EC containers.
+ */
+public class ReconstructECContainersCommand
+    extends SCMCommand<ReconstructECContainersCommandProto> {
+
+  private final long containerID;
+  private final List<DatanodeDetails> sourceDatanodes;
+  private final List<DatanodeDetails> targetDatanodes;
+  private final byte[] missingContainerIndexes;

Review Comment:
   I think this command can receive the ContainerReplica, but we don't need to send it all over the RPC. Just pull the index and datanodeDetails out of it and send it over. The command on the DN side receives the proto message I think, so its parameters can be different.
   
   In replicationManager, where this command will be triggered, we will have the list of existing ContainerReplica, so they can be passed to this command without any transformation at the replicationManager side.





[GitHub] [ozone] sodonnel commented on a diff in pull request #3345: HDDS-6586: EC: Implement the EC Reconstruction Command with necessary information

Posted by GitBox <gi...@apache.org>.
sodonnel commented on code in PR #3345:
URL: https://github.com/apache/ozone/pull/3345#discussion_r861672953


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/protocol/commands/ReconstructECContainersCommand.java:
##########
@@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+package org.apache.hadoop.ozone.protocol.commands;
+
+import java.util.List;
+import java.util.stream.Collectors;
+
+import com.google.protobuf.ByteString;
+import org.apache.hadoop.hdds.client.ECReplicationConfig;
+import org.apache.hadoop.hdds.protocol.DatanodeDetails;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.ReconstructECContainersCommandProto;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.ReconstructECContainersCommandProto
+    .Builder;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.SCMCommandProto.Type;
+
+import com.google.common.base.Preconditions;
+
+/**
+ * SCM command to request reconstruction of EC containers.
+ */
+public class ReconstructECContainersCommand
+    extends SCMCommand<ReconstructECContainersCommandProto> {
+
+  private final long containerID;
+  private final List<DatanodeDetails> sourceDatanodes;
+  private final List<DatanodeDetails> targetDatanodes;
+  private final byte[] missingContainerIndexes;

Review Comment:
   It's a bit strange to have a `List<Long>` for `srcIndexes` and `byte[]` for missing. Would it be better to have a `List<Integer>` for both to be consistent? I don't think they need to be long, as the index will never be more than 14 for 10-4.
   
   I also worry about some bug resulting in a different size of `srcNodesIndexes` and `srcDatanodes`, or the order being wrong. What about this command receiving `List<ContainerReplica>`? That object contains both the index and datanodeDetails, so we avoid a map and just pass one parameter instead of two.
   
   When forming the protobuf, we don't need to send ContainerReplica - we can still send replicaIndex and DatanodeDetails as they are now.
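
The pairing idea can be sketched as follows. This is not the PR's code: `SourceReplica` and `ReconstructCommandSketch` are hypothetical stand-ins, and a plain `String` id replaces `DatanodeDetails` so the example stays self-contained. The point is that the node and its index travel together in the Java API, and the two parallel lists are derived in one place only when the wire message is formed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical pair of a datanode (by id) and its EC replica index. */
final class SourceReplica {
  private final String datanodeId;
  private final int replicaIndex;

  SourceReplica(String datanodeId, int replicaIndex) {
    this.datanodeId = datanodeId;
    this.replicaIndex = replicaIndex;
  }

  String getDatanodeId() {
    return datanodeId;
  }

  int getReplicaIndex() {
    return replicaIndex;
  }
}

/** Hypothetical command taking one list instead of two parallel lists. */
final class ReconstructCommandSketch {
  private final long containerID;
  private final List<SourceReplica> sources;

  ReconstructCommandSketch(long containerID, List<SourceReplica> sources) {
    this.containerID = containerID;
    this.sources = Collections.unmodifiableList(new ArrayList<>(sources));
  }

  long getContainerID() {
    return containerID;
  }

  /** Node ids in source order, as they might be written to the wire. */
  List<String> sourceNodeIds() {
    List<String> ids = new ArrayList<>(sources.size());
    for (SourceReplica s : sources) {
      ids.add(s.getDatanodeId());
    }
    return ids;
  }

  /** Replica indexes in the same order as sourceNodeIds(). */
  List<Integer> sourceIndexes() {
    List<Integer> indexes = new ArrayList<>(sources.size());
    for (SourceReplica s : sources) {
      indexes.add(s.getReplicaIndex());
    }
    return indexes;
  }
}
```

Because both derived lists come from the same backing list, they cannot differ in size or order, which is the class of bug the comment above is concerned about.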







[GitHub] [ozone] umamaheswararao commented on a diff in pull request #3345: HDDS-6586: EC: Implement the EC Reconstruction Command with necessary information

Posted by GitBox <gi...@apache.org>.
umamaheswararao commented on code in PR #3345:
URL: https://github.com/apache/ozone/pull/3345#discussion_r861973549


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/protocol/commands/ReconstructECContainersCommand.java:
##########
@@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+package org.apache.hadoop.ozone.protocol.commands;
+
+import java.util.List;
+import java.util.stream.Collectors;
+
+import com.google.protobuf.ByteString;
+import org.apache.hadoop.hdds.client.ECReplicationConfig;
+import org.apache.hadoop.hdds.protocol.DatanodeDetails;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.ReconstructECContainersCommandProto;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.ReconstructECContainersCommandProto
+    .Builder;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.SCMCommandProto.Type;
+
+import com.google.common.base.Preconditions;
+
+/**
+ * SCM command to request reconstruction of EC containers.
+ */
+public class ReconstructECContainersCommand
+    extends SCMCommand<ReconstructECContainersCommandProto> {
+
+  private final long containerID;
+  private final List<DatanodeDetails> sourceDatanodes;
+  private final List<DatanodeDetails> targetDatanodes;
+  private final byte[] missingContainerIndexes;

Review Comment:
   That's a good point, Stephen. My bad, why did I use Long? Sigh :-). 
   Offline, @guihecheng also had a similar question. I think I did not use `ContainerReplica` because it has more than what we need (e.g. bytes used, keycount, seqID, placeOfBirth, state etc.). Are you OK with including them? I'm not sure we really use all of them. I am also OK with creating a small proto structure including only DNDetails and Index. 
   What's your opinion?





[GitHub] [ozone] JacksonYao287 commented on a diff in pull request #3345: HDDS-6586: EC: Implement the EC Reconstruction Command with necessary information

Posted by GitBox <gi...@apache.org>.
JacksonYao287 commented on code in PR #3345:
URL: https://github.com/apache/ozone/pull/3345#discussion_r863659308


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/protocol/commands/ReconstructECContainersCommand.java:
##########
@@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+package org.apache.hadoop.ozone.protocol.commands;
+
+import java.util.List;
+import java.util.stream.Collectors;
+
+import com.google.protobuf.ByteString;
+import org.apache.hadoop.hdds.client.ECReplicationConfig;
+import org.apache.hadoop.hdds.protocol.DatanodeDetails;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.ReconstructECContainersCommandProto;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.ReconstructECContainersCommandProto
+    .Builder;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.SCMCommandProto.Type;
+
+import com.google.common.base.Preconditions;
+
+/**
+ * SCM command to request reconstruction of EC containers.
+ */
+public class ReconstructECContainersCommand
+    extends SCMCommand<ReconstructECContainersCommandProto> {
+
+  private final long containerID;
+  private final List<DatanodeDetails> sourceDatanodes;
+  private final List<DatanodeDetails> targetDatanodes;
+  private final byte[] missingContainerIndexes;

Review Comment:
   @umamaheswararao thanks for the work! 
   I am not sure why `byte[]` is used for the missing container indexes. It seems `int[]` or something similar may be easier to understand. Or is there any special reason for `byte[]`? 
   NIT: I prefer `missingReplicaIndexes` instead of `missingContainerIndexes`, since it is actually a specific replica of the container that is missing, not the container itself. 
   
   In the current implementation of Ozone EC, we don't shuffle the blocks among the EC container group, which means some containers will only have blocks of data chunks and others will only have blocks of parity chunks. Shuffling can scatter the reads of data blocks across different datanodes, so I believe it has some significance for the performance of the whole Ozone cluster, and we need to support it in the future. At that time, the replica index will not be equal to the chunk index in a stripe, and we will need some other mechanism to identify the chunk index in a stripe.
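
Purely as an illustration of why the two indexes would diverge, the sketch below contrasts the current fixed mapping with one possible rotation scheme. The rotation rule is an assumption made up for this example, not a proposal from the PR or an existing Ozone mechanism, and all names are hypothetical.

```java
/** Illustrative sketch only; the rotation rule is a made-up example. */
final class ShuffleIndexSketch {

  /** Without shuffling: chunk position is fixed by the 1-based replica index. */
  static int fixedChunkIndex(int replicaIndex) {
    return replicaIndex - 1;
  }

  /**
   * With a hypothetical per-stripe rotation: the 0-based chunk position
   * held by a replica shifts by one for each successive stripe.
   */
  static int chunkIndex(int replicaIndex, long stripeIndex, int width) {
    if (replicaIndex < 1 || replicaIndex > width || stripeIndex < 0) {
      throw new IllegalArgumentException("arguments out of range");
    }
    return (int) ((replicaIndex - 1 + stripeIndex) % width);
  }
}
```

Under such a scheme, replica 1 in a 3-2 group would hold chunk 0 in stripe 0 but chunk 1 in stripe 1, so the recovery logic could no longer treat the replica index as the chunk index.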





[GitHub] [ozone] guihecheng commented on a diff in pull request #3345: HDDS-6586: EC: Implement the EC Reconstruction Command with necessary information

Posted by GitBox <gi...@apache.org>.
guihecheng commented on code in PR #3345:
URL: https://github.com/apache/ozone/pull/3345#discussion_r858469465


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/protocol/commands/ReconstructECContainersCommand.java:
##########
@@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+package org.apache.hadoop.ozone.protocol.commands;
+
+import java.util.List;
+import java.util.stream.Collectors;
+
+import com.google.protobuf.ByteString;
+import org.apache.hadoop.hdds.client.ECReplicationConfig;
+import org.apache.hadoop.hdds.protocol.DatanodeDetails;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.ReconstructECContainersCommandProto;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.ReconstructECContainersCommandProto
+    .Builder;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.SCMCommandProto.Type;
+
+import com.google.common.base.Preconditions;
+
+/**
+ * SCM command to request reconstruction of EC containers.
+ */
+public class ReconstructECContainersCommand
+    extends SCMCommand<ReconstructECContainersCommandProto> {
+
+  private final long containerID;
+  private final List<DatanodeDetails> sourceDatanodes;
+  private final List<DatanodeDetails> targetDatanodes;
+  private final byte[] missingContainerIndexes;

Review Comment:
   Hi Uma, the info looks good. I wonder whether additional info is needed, for example:
   - ReplicaIndexes for sourceDatanodes. Maybe we could learn this from the HealthyDNs in the group, but the `readContainer` gRPC call doesn't provide this index for now, so if SCM carries it, then we don't have to worry about it.
   - The pipelineID may be needed to reconstruct the container, see `ContainerData#originPipelineId`. SCM manages the pipelineID, so it is reasonable for it to deliver this.








[GitHub] [ozone] umamaheswararao commented on a diff in pull request #3345: HDDS-6586: EC: Implement the EC Reconstruction Command with necessary information

Posted by GitBox <gi...@apache.org>.
umamaheswararao commented on code in PR #3345:
URL: https://github.com/apache/ozone/pull/3345#discussion_r861976432


##########
hadoop-hdds/interface-server/src/main/proto/ScmServerDatanodeHeartbeatProtocol.proto:
##########
@@ -407,6 +409,20 @@ message ReplicateContainerCommandProto {
   required int64 cmdId = 3;
 }
 
+/**
+* This command asks the datanode to reconstruct the missing EC containers by
+* using remaining containers from sources.
+*/
+message ReconstructECContainersCommandProto {
+  required int64 containerID = 1;
+  repeated DatanodeDetailsProto sources = 2;

Review Comment:
   Ah, I got the answer to my question above. Let me create one structure now. Keeping them together is better; I agree this avoids some inherent issues. I think I was just blinded by the current pipeline object, which is rebuilt from separate nodeOrder and Nodedetails.
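
   As a rough illustration of the "one structure" idea, here is a minimal sketch that keeps each source node and its replica index in a single object, so the two can never drift apart the way two parallel lists can. All names here (`SourceReplicaSketch`, `NodeAndReplicaIndex`, `replicaIndexOf`) are hypothetical and not taken from the patch; a plain `String` stands in for `DatanodeDetails`, and 1-based replica indexes are an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** Illustrative only: none of these names come from the Ozone code base. */
final class SourceReplicaSketch {

  /** Pairs a datanode identifier with the EC replica index it holds. */
  static final class NodeAndReplicaIndex {
    private final String datanodeId; // stand-in for DatanodeDetails
    private final int replicaIndex;  // assumed to be 1-based

    NodeAndReplicaIndex(String datanodeId, int replicaIndex) {
      this.datanodeId = Objects.requireNonNull(datanodeId, "datanodeId");
      if (replicaIndex < 1) {
        throw new IllegalArgumentException(
            "replicaIndex must be >= 1: " + replicaIndex);
      }
      this.replicaIndex = replicaIndex;
    }

    String getDatanodeId() {
      return datanodeId;
    }

    int getReplicaIndex() {
      return replicaIndex;
    }
  }

  /** Returns the replica index held by the node, or -1 if it is not a source. */
  static int replicaIndexOf(List<NodeAndReplicaIndex> sources,
      String datanodeId) {
    for (NodeAndReplicaIndex source : sources) {
      if (source.getDatanodeId().equals(datanodeId)) {
        return source.getReplicaIndex();
      }
    }
    return -1;
  }

  /** Builds a small sample list: three surviving replicas of one container. */
  static List<NodeAndReplicaIndex> sampleSources() {
    List<NodeAndReplicaIndex> sources = new ArrayList<>();
    sources.add(new NodeAndReplicaIndex("dn-a", 1));
    sources.add(new NodeAndReplicaIndex("dn-b", 2));
    sources.add(new NodeAndReplicaIndex("dn-c", 4));
    return sources;
  }

  public static void main(String[] args) {
    for (NodeAndReplicaIndex source : sampleSources()) {
      System.out.println(
          source.getDatanodeId() + " holds index " + source.getReplicaIndex());
    }
  }
}
```

   With separate lists, removing or reordering an entry in one list silently mismatches the other; with the pair, each entry is self-describing.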





[GitHub] [ozone] umamaheswararao commented on a diff in pull request #3345: HDDS-6586: EC: Implement the EC Reconstruction Command with necessary information

Posted by GitBox <gi...@apache.org>.
umamaheswararao commented on code in PR #3345:
URL: https://github.com/apache/ozone/pull/3345#discussion_r862156263


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/protocol/commands/ReconstructECContainersCommand.java:
##########
@@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+package org.apache.hadoop.ozone.protocol.commands;
+
+import java.util.List;
+import java.util.stream.Collectors;
+
+import com.google.protobuf.ByteString;
+import org.apache.hadoop.hdds.client.ECReplicationConfig;
+import org.apache.hadoop.hdds.protocol.DatanodeDetails;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.ReconstructECContainersCommandProto;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.ReconstructECContainersCommandProto
+    .Builder;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.SCMCommandProto.Type;
+
+import com.google.common.base.Preconditions;
+
+/**
+ * SCM command to request reconstruction of EC containers.
+ */
+public class ReconstructECContainersCommand
+    extends SCMCommand<ReconstructECContainersCommandProto> {
+
+  private final long containerID;
+  private final List<DatanodeDetails> sourceDatanodes;
+  private final List<DatanodeDetails> targetDatanodes;
+  private final byte[] missingContainerIndexes;

Review Comment:
   It looks like we would need to move ContainerReplica to the container-service module. Currently it is part of scm-server only, while the commands are part of the container-service module.
   I am not sure it's worth doing that. It may be simpler to create the command with just its required info. Let me know if that's OK with you.





[GitHub] [ozone] JacksonYao287 commented on a diff in pull request #3345: HDDS-6586: EC: Implement the EC Reconstruction Command with necessary information

Posted by GitBox <gi...@apache.org>.
JacksonYao287 commented on code in PR #3345:
URL: https://github.com/apache/ozone/pull/3345#discussion_r863884747


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/protocol/commands/ReconstructECContainersCommand.java:
##########
@@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+package org.apache.hadoop.ozone.protocol.commands;
+
+import java.util.List;
+import java.util.stream.Collectors;
+
+import com.google.protobuf.ByteString;
+import org.apache.hadoop.hdds.client.ECReplicationConfig;
+import org.apache.hadoop.hdds.protocol.DatanodeDetails;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.ReconstructECContainersCommandProto;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.ReconstructECContainersCommandProto
+    .Builder;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.SCMCommandProto.Type;
+
+import com.google.common.base.Preconditions;
+
+/**
+ * SCM command to request reconstruction of EC containers.
+ */
+public class ReconstructECContainersCommand
+    extends SCMCommand<ReconstructECContainersCommandProto> {
+
+  private final long containerID;
+  private final List<DatanodeDetails> sourceDatanodes;
+  private final List<DatanodeDetails> targetDatanodes;
+  private final byte[] missingContainerIndexes;

Review Comment:
   >so each time a new container is allocated, it gets new nodes in a random order. So over many containers we will see the data fairly evenly spread without having to worry about shuffling writes inside one container group.
   
   @sodonnel thanks for the explanation. Yes, at this level there is no need to worry about shuffling writes inside one container group. It does make sense.
   
   >Replica represents, the same copy.
   
   @umamaheswararao thanks for the explanation. I get the point, and I am OK with this.





[GitHub] [ozone] sodonnel commented on a diff in pull request #3345: HDDS-6586: EC: Implement the EC Reconstruction Command with necessary information

Posted by GitBox <gi...@apache.org>.
sodonnel commented on code in PR #3345:
URL: https://github.com/apache/ozone/pull/3345#discussion_r863633552


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/protocol/commands/ReconstructECContainersCommand.java:
##########
@@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+package org.apache.hadoop.ozone.protocol.commands;
+
+import java.util.List;
+import java.util.stream.Collectors;
+
+import com.google.protobuf.ByteString;
+import org.apache.hadoop.hdds.client.ECReplicationConfig;
+import org.apache.hadoop.hdds.protocol.DatanodeDetails;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.ReconstructECContainersCommandProto;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.ReconstructECContainersCommandProto
+    .Builder;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.SCMCommandProto.Type;
+
+import com.google.common.base.Preconditions;
+
+/**
+ * SCM command to request reconstruction of EC containers.
+ */
+public class ReconstructECContainersCommand
+    extends SCMCommand<ReconstructECContainersCommandProto> {
+
+  private final long containerID;
+  private final List<DatanodeDetails> sourceDatanodes;
+  private final List<DatanodeDetails> targetDatanodes;
+  private final byte[] missingContainerIndexes;

Review Comment:
   OK - probably not worth the effort to start moving classes around to just save a parameter in the method spec.





[GitHub] [ozone] umamaheswararao merged pull request #3345: HDDS-6586: EC: Implement the EC Reconstruction Command with necessary information

Posted by GitBox <gi...@apache.org>.
umamaheswararao merged PR #3345:
URL: https://github.com/apache/ozone/pull/3345




[GitHub] [ozone] sodonnel commented on a diff in pull request #3345: HDDS-6586: EC: Implement the EC Reconstruction Command with necessary information

Posted by GitBox <gi...@apache.org>.
sodonnel commented on code in PR #3345:
URL: https://github.com/apache/ozone/pull/3345#discussion_r863668395


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/protocol/commands/ReconstructECContainersCommand.java:
##########
@@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+package org.apache.hadoop.ozone.protocol.commands;
+
+import java.util.List;
+import java.util.stream.Collectors;
+
+import com.google.protobuf.ByteString;
+import org.apache.hadoop.hdds.client.ECReplicationConfig;
+import org.apache.hadoop.hdds.protocol.DatanodeDetails;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.ReconstructECContainersCommandProto;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.ReconstructECContainersCommandProto
+    .Builder;
+import org.apache.hadoop.hdds.protocol.proto
+    .StorageContainerDatanodeProtocolProtos.SCMCommandProto.Type;
+
+import com.google.common.base.Preconditions;
+
+/**
+ * SCM command to request reconstruction of EC containers.
+ */
+public class ReconstructECContainersCommand
+    extends SCMCommand<ReconstructECContainersCommandProto> {
+
+  private final long containerID;
+  private final List<DatanodeDetails> sourceDatanodes;
+  private final List<DatanodeDetails> targetDatanodes;
+  private final byte[] missingContainerIndexes;

Review Comment:
   > shuffle can scatter the read of data blocks to different datanodes, so I believe it has some significance for the performance of the whole Ozone cluster, and we need to support it in the future.
   
   There is no plan to try to support this. It makes things too complex for negligible gains. We only have one container per pipeline, so each time a new container is allocated, it gets new nodes in a random order. So over many containers we will see the data fairly evenly spread without having to worry about shuffling writes inside one container group.
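
   To make the "evenly spread over many containers" argument concrete, here is a small simulation sketch. It assumes a uniform random shuffle as a stand-in for how nodes are chosen and ordered per container (the real placement policy is more involved), and the class and method names are hypothetical. For each simulated container it picks a group of nodes in random order and counts which nodes hold the data indexes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Illustrative simulation only; not part of Ozone. */
final class PlacementSpreadSketch {

  /**
   * Counts, per node, how many containers placed one of their data
   * replicas on that node when every container gets a fresh random
   * ordering of nodes.
   *
   * @param nodes      number of datanodes in the cluster
   * @param groupSize  data + parity replicas per container
   * @param dataCount  number of data replicas (the first positions)
   * @param containers number of containers to simulate
   * @param seed       seed for reproducible runs
   */
  static int[] countDataIndexHolders(int nodes, int groupSize,
      int dataCount, int containers, long seed) {
    if (dataCount > groupSize || groupSize > nodes) {
      throw new IllegalArgumentException("need dataCount <= groupSize <= nodes");
    }
    Random random = new Random(seed);
    List<Integer> nodeIds = new ArrayList<>();
    for (int i = 0; i < nodes; i++) {
      nodeIds.add(i);
    }
    int[] counts = new int[nodes];
    for (int c = 0; c < containers; c++) {
      // A fresh shuffle models "new nodes in a random order" per container.
      Collections.shuffle(nodeIds, random);
      // The first groupSize nodes form the group; of those, the first
      // dataCount positions hold the data replicas.
      for (int position = 0; position < dataCount; position++) {
        counts[nodeIds.get(position)]++;
      }
    }
    return counts;
  }

  public static void main(String[] args) {
    int[] counts = countDataIndexHolders(20, 5, 3, 100_000, 42L);
    int min = Integer.MAX_VALUE;
    int max = Integer.MIN_VALUE;
    for (int count : counts) {
      min = Math.min(min, count);
      max = Math.max(max, count);
    }
    System.out.println("data replicas per node: min=" + min + " max=" + max);
  }
}
```

   With 20 nodes, a 3+2 layout and 100,000 containers, each node is expected to hold about 15,000 data replicas, and the observed minimum and maximum should sit close to that value without any shuffling inside a single container group.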





