You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2022/01/05 00:05:27 UTC

[GitHub] [ozone] kerneltime opened a new pull request #2963: HDDS-5267 Merge ICR and FCR for containers if part of the same HB

kerneltime opened a new pull request #2963:
URL: https://github.com/apache/ozone/pull/2963


   ## What changes were proposed in this pull request?
   
   This is the DB side of changes. It includes
   1. Serialization for ICR and FCR generation.
   2. Collapsing ICR and FCR if part of the same HB. Such that FCR has the most recent changes.
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-5267
   
   ## How was this patch tested?
   
   ToDo: 
   
   - [ ] add relevant tests 
   - [ ] manually test with load


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sodonnel commented on pull request #2963: HDDS-5267 Merge ICR and FCR for containers if part of the same HB

Posted by GitBox <gi...@apache.org>.
sodonnel commented on pull request #2963:
URL: https://github.com/apache/ozone/pull/2963#issuecomment-1047961461


   Looking at the current code, are things still as they were when I last reviewed? Ie, we check for the presence of any ICRs when sending the full report, and if there are any ICRs, we have to generate the FCR again? I still believe this is going to cause the full report to often be generated twice, as it is quite likely there will be an ICR after the FCR gets generated, but before it gets sent. In fact, as we have many endpoints, it could be regenerated "endPoint" times I think.
   
   In OzoneContainer, we can see that when a new container is constructed, it triggers a heartbeat immediately:
   
   ```
       IncrementalReportSender<Container> icrSender = container -> {
         synchronized (containerSet) {
           ContainerReplicaProto containerReport = container.getContainerReport();
   
           IncrementalContainerReportProto icr = IncrementalContainerReportProto
               .newBuilder()
               .addReport(containerReport)
               .build();
   // Heartbeat is triggered here - sending a full report if needed or just pending ICRs.
           context.addIncrementalReport(icr);  
           context.getParent().triggerHeartbeat();
         }
       };
   ```
   This means that there is up to a 30 second window (heartbeart interval) from when a FCR gets generated until it gets sent, but the arrival of any ICR in that window will cause it to be sent immediately but also refreshed as there will be a new ICR in the queue.
   
   If we change things, so that when we are generating a FCR, we purge all pending ICRs and block new ones getting generated, we avoid this double refresh problem. The heartbeat trigger will first send the FCR, and then any newer ICRs, which must have been generated after it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sodonnel commented on a change in pull request #2963: HDDS-5267 Merge ICR and FCR for containers if part of the same HB

Posted by GitBox <gi...@apache.org>.
sodonnel commented on a change in pull request #2963:
URL: https://github.com/apache/ozone/pull/2963#discussion_r812337062



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/impl/ContainerSet.java
##########
@@ -235,18 +235,20 @@ public void listContainer(long startContainerId, long count,
   public ContainerReportsProto getContainerReport() throws IOException {
     LOG.debug("Starting container report iteration.");
 
+    ContainerReportsProto.Builder crBuilder =
+        ContainerReportsProto.newBuilder();
     // No need for locking since containerMap is a ConcurrentSkipListMap
     // And we can never get the exact state since close might happen
     // after we iterate a point.
     List<Container<?>> containers = new ArrayList<>(containerMap.values());
-
-    ContainerReportsProto.Builder crBuilder =
-        ContainerReportsProto.newBuilder();
-
-    for (Container<?> container: containers) {
-      crBuilder.addReports(container.getContainerReport());
+    // Incremental Container reports can read stale container information

Review comment:
       > but could we get a snapshot of the containers here and meanwhile a new container is added and ICR queued that is not in our snapshot and hence never gets added to the FCR?
   
   Seeing how this is synchronized in StateContext, I don't think this could happen.
   
   I do think we could end up with a scenario where the new container is added to the set, but the ICR is blocked from being queued due to the synchronization. Then we generate teh FCR that contains it, and then we also end up with a duplicate ICR. However I doubt that matters - on SCM it will just process it again, so it should be fine.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] kerneltime commented on pull request #2963: HDDS-5267 Merge ICR and FCR for containers if part of the same HB

Posted by GitBox <gi...@apache.org>.
kerneltime commented on pull request #2963:
URL: https://github.com/apache/ozone/pull/2963#issuecomment-1022689526


   Updated with SCM changes, need to add more tests.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] kerneltime commented on pull request #2963: HDDS-5267 Merge ICR and FCR for containers if part of the same HB

Posted by GitBox <gi...@apache.org>.
kerneltime commented on pull request #2963:
URL: https://github.com/apache/ozone/pull/2963#issuecomment-1018075771


   @sodonnel moving it to time of generation seems fine. Will move to the SCM side of changes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] kerneltime edited a comment on pull request #2963: HDDS-5267 Merge ICR and FCR for containers if part of the same HB

Posted by GitBox <gi...@apache.org>.
kerneltime edited a comment on pull request #2963:
URL: https://github.com/apache/ozone/pull/2963#issuecomment-1048148901


   @sodonnel this PR was updated based on the following logic
   1. When FCR is generated, it clears all pending ICRs
   2. During the generation, there are no new ICRs that are generated.
   
   Thus, FCR will be generated only once for a given heartbeat.
   The FCR and ICR generation do serialize. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sodonnel merged pull request #2963: HDDS-5267 Merge ICR and FCR for containers if part of the same HB

Posted by GitBox <gi...@apache.org>.
sodonnel merged pull request #2963:
URL: https://github.com/apache/ozone/pull/2963


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] kerneltime commented on pull request #2963: HDDS-5267 Merge ICR and FCR for containers if part of the same HB

Posted by GitBox <gi...@apache.org>.
kerneltime commented on pull request #2963:
URL: https://github.com/apache/ozone/pull/2963#issuecomment-1038696090


   @sodonnel please take a look, the changes have been updated. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sodonnel commented on a change in pull request #2963: HDDS-5267 Merge ICR and FCR for containers if part of the same HB

Posted by GitBox <gi...@apache.org>.
sodonnel commented on a change in pull request #2963:
URL: https://github.com/apache/ozone/pull/2963#discussion_r786754897



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/impl/ContainerSet.java
##########
@@ -235,18 +235,39 @@ public void listContainer(long startContainerId, long count,
   public ContainerReportsProto getContainerReport() throws IOException {
     LOG.debug("Starting container report iteration.");
 
+    ContainerReportsProto.Builder crBuilder =
+        ContainerReportsProto.newBuilder();
     // No need for locking since containerMap is a ConcurrentSkipListMap
     // And we can never get the exact state since close might happen
     // after we iterate a point.
     List<Container<?>> containers = new ArrayList<>(containerMap.values());
+    // Incremental Container reports can read stale container information
+    // This is to make sure FCR and ICR can be linearized and processed by
+    // consumers such as SCM.
+    synchronized (this) {
+      for (Container<?> container : containers) {
+        crBuilder.addReports(container.getContainerReport());
+      }
+    }
+    return crBuilder.build();
+  }
 
+  /**
+   * Get container report without lock.
+   *
+   * @return The container report.
+   */
+  public ContainerReportsProto getContainerReportUnlocked() throws IOException {

Review comment:
       I don't think we need this Unlocked method, and someone may mistakenly use it for something in the futher if we leave it. It seems to be used only in StateContext.java where we already synchronize on the ContainerSet, but its fine to call another method which synchronizes on the same object, as the synchronization is reentrant.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] kerneltime edited a comment on pull request #2963: HDDS-5267 Merge ICR and FCR for containers if part of the same HB

Posted by GitBox <gi...@apache.org>.
kerneltime edited a comment on pull request #2963:
URL: https://github.com/apache/ozone/pull/2963#issuecomment-1018075771


   @sodonnel moving it to time of generation seems fine. Will work on the SCM side of changes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sodonnel commented on a change in pull request #2963: HDDS-5267 Merge ICR and FCR for containers if part of the same HB

Posted by GitBox <gi...@apache.org>.
sodonnel commented on a change in pull request #2963:
URL: https://github.com/apache/ozone/pull/2963#discussion_r812129713



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/impl/ContainerSet.java
##########
@@ -235,18 +235,20 @@ public void listContainer(long startContainerId, long count,
   public ContainerReportsProto getContainerReport() throws IOException {
     LOG.debug("Starting container report iteration.");
 
+    ContainerReportsProto.Builder crBuilder =
+        ContainerReportsProto.newBuilder();
     // No need for locking since containerMap is a ConcurrentSkipListMap
     // And we can never get the exact state since close might happen
     // after we iterate a point.
     List<Container<?>> containers = new ArrayList<>(containerMap.values());
-
-    ContainerReportsProto.Builder crBuilder =
-        ContainerReportsProto.newBuilder();
-
-    for (Container<?> container: containers) {
-      crBuilder.addReports(container.getContainerReport());
+    // Incremental Container reports can read stale container information

Review comment:
       Is there a potential race condition on line 243 where the containers array list is formed? The map is concurrent, so we probably cannot get a concurrentModification error, but could we get a snapshot of the containers here and meanwhile a new container is added and ICR queued that is not in our snapshot and hence never gets added to the FCR?
   
   I think we need to synchronize around the forming of this list.
   
   I also wonder why we need to form the new list? I guess we can release any lock faster if we are not forming the protobuf under the lock, but then if we synchronize here to block ICRs, I think we are effectively blocks new container creation anyway. Maybe we can avoid the new array list?
   
   Need to think about this carefully.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sodonnel commented on pull request #2963: HDDS-5267 Merge ICR and FCR for containers if part of the same HB

Posted by GitBox <gi...@apache.org>.
sodonnel commented on pull request #2963:
URL: https://github.com/apache/ozone/pull/2963#issuecomment-1048182797


   I was looking at this PR using the Visual Code editor, and I think it somehow did not show me the most recent commits. As now I have refreshed it, I can see it has changed. I will take another look!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] kerneltime edited a comment on pull request #2963: HDDS-5267 Merge ICR and FCR for containers if part of the same HB

Posted by GitBox <gi...@apache.org>.
kerneltime edited a comment on pull request #2963:
URL: https://github.com/apache/ozone/pull/2963#issuecomment-1016994616






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sodonnel commented on pull request #2963: HDDS-5267 Merge ICR and FCR for containers if part of the same HB

Posted by GitBox <gi...@apache.org>.
sodonnel commented on pull request #2963:
URL: https://github.com/apache/ozone/pull/2963#issuecomment-1048208439


   I think this change looks good. I will give it another check tomorrow to see if I have missed anything.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] kerneltime commented on pull request #2963: HDDS-5267 Merge ICR and FCR for containers if part of the same HB

Posted by GitBox <gi...@apache.org>.
kerneltime commented on pull request #2963:
URL: https://github.com/apache/ozone/pull/2963#issuecomment-1048148901


   @sodonnel this PR was updated based on the following logic
   1. When FCR is generated, it clears all pending ICRs
   2. During the generation, there are no new ICRs that are generated.
   Thus, FCR will be generated only once for a given heartbeat.
   The FCR and ICR generation do serialize. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sodonnel commented on pull request #2963: HDDS-5267 Merge ICR and FCR for containers if part of the same HB

Posted by GitBox <gi...@apache.org>.
sodonnel commented on pull request #2963:
URL: https://github.com/apache/ozone/pull/2963#issuecomment-1015548867


   Prior to looking at this PR, I was not too sure how ICR and FCRs get generated and sent. Let me summarise my understanding and please correct me if I am wrong anywhere.
   
   ICRs get generated "on demand" as containers are added, deleted and change state on the datanode. They get added to the ICR queue on the datanode for sending later.
   
   FBRs get generated by the ReportPublisher based on a timer (60 minutes by default). Each time the timer expires, a new FCR is generated and stored into the "type2Reports" map inside StateContext.
   
   There is another map inside StateContext that has an entry for each endpoint receiving reports (multiple SCMs, Recon, etc) and effectively stores a boolean for each report type, indicating if it is ready to send or not.
   
   When the heartbeat is generated, every 30 seconds by default it will do the following for each endpoint:
   
   1. Add the full report if it is available to send, and then toggle the ready-to-send boolean to false.
   
   2. Add any ICRs queued and remove them from the queue.
   
   The issue we are trying to solve here, is that a single heartbeat can have a report which is slightly old, and contains the latest container id of 1000, but there is an ICR with container id 1001. When the report hits SCM, there are two threads processing. One handles ICRs and one FCRs, so the ICR goes first, adds 1001 to SCM, but then the FCR is processed and removes it again.
   
   The solution in this PR, is to synchronize both ICR and FCR generation on the same object (container set). Then when we go to send a report, we check to see if the FCR is going to be sent too. If it is, we check for ICRs, remove them and then "refresh" the FCR, which looks to me like we just generate it again. While it does this, we block any new ICRs and hence new containers being created. The blocking time should not be too long as the number of containers is not too high generally.
   
   I also don't think we care if an ICR contains a duplicate to the FCR, provided it is always processed after the ICR. Eg we send a FCR with container 1000 and then also send a ICR with 1000 too.
   
   My initial thought on the solution here, is that we are almost always going to have to regenerate the FCR - its very likely there will be at least 1 ICR in the heartbeat interval, and that means we are doubling the FCR generation.
   
   On the SCM side, if we can guarantee that FCRs are processed before the ICRs in the same heartbeat and also in order of heartbeats, we don't really care if there are ICRs in the heartbeat, that were generated after the FCR.
   
   Therefore, could we synchronized on generating the FCR (the ReportPublisher task), and generate it and then remove the pending ICRs at the same time under the same lock. Then when we come to send it, we know that the FCR has everything that came before it, and any later ICRs are ok, provided we process them after the FCR on SCM.
   
   If we changed this to work as I suggested above, as things stand, the synchronization is not tight enough to rule out duplicates as we mutate the container object before we lock, and then synchronize on generating the ICR. For example, in KeyValueContainer.java:
   
   ```
     @Override
     public void markContainerForClose(Container container)
         throws IOException {
       container.writeLock();
       try {
         // Move the container to CLOSING state only if it's OPEN
         if (container.getContainerState() == State.OPEN) {
           container.markContainerForClose();  // -> FCR gets lock here.
           sendICR(container);  // -> We only lock here, so FCR *might* see the above change
         }
       } finally {
         container.writeUnlock();
       }
     }
   ```
   
   Its possible we change the container object, then the FCR gets the lock and generates a FCR with the changed state. Then sendICR gets the lock and sends the ICR. If we wanted to avoid duplicated, we would need synchronize before modifying the container. However I am not sure we would need to do that - does it matter if we send later duplicate ICRs?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] kerneltime edited a comment on pull request #2963: HDDS-5267 Merge ICR and FCR for containers if part of the same HB

Posted by GitBox <gi...@apache.org>.
kerneltime edited a comment on pull request #2963:
URL: https://github.com/apache/ozone/pull/2963#issuecomment-1022689526


   Updated with SCM changes, need to add more end to end tests.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] kerneltime commented on pull request #2963: HDDS-5267 Merge ICR and FCR for containers if part of the same HB

Posted by GitBox <gi...@apache.org>.
kerneltime commented on pull request #2963:
URL: https://github.com/apache/ozone/pull/2963#issuecomment-1016994616


   1. We do need to serialize FCR and ICR on a per DN basis at the SCM. Since the HB threads on the DN sends one at a time, serializing them on a per DN basis on SCM should be sufficient.
   2. Let me think about what you said about serializing and discarding ICR at the time of generating FCR vs. at the time of sending the HB (this PR). SCM would need to know if should process ICR first or FCR when a HB contains both. If we discard at the time or reporting, the HB will only contain FCR. 
   
   Overall the difference between FCR and ICR is the knowledge that missing containers can be computed on the SCM and to insure SCM has all the containers. Having 2 completely different code paths for reporting on the same entity and then trying to synchronize them for correctly might not be the best approach. We essentially need a stream of containers for which we send reports to SCM. Periodically that stream should include all containers and one of the reports to SCM should have a flag to indicate that all containers have been reported.. this should simplify the report generation, publishing and consuming.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org