You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by "ashishkumar50 (via GitHub)" <gi...@apache.org> on 2023/06/13 16:41:36 UTC

[GitHub] [ozone] ashishkumar50 opened a new pull request, #4655: HDDS-8115. Do not use block count to determine if containers are empty.

ashishkumar50 opened a new pull request, #4655:
URL: https://github.com/apache/ozone/pull/4655

   ## What changes were proposed in this pull request?
   
   Ozone currently uses a block count metadata field for each container replica to determine whether it is empty. This counter can go wrong sometime and also difficult to maintain.
   
   In this PR we simplify check for container flow, which will now make use of isEmpty boolean flag to determine whether container replica is empty or not. When we create container, isEmpty flag starts as false and is set to true when a delete block is processed and the block table is empty. This will ease in making decision for checking empty container and later we can delete those empty container. 
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-8115
   
   ## How was this patch tested?
   
   Added new test case as well as verified using existing test case.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sumitagrawl commented on a diff in pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "sumitagrawl (via GitHub)" <gi...@apache.org>.
sumitagrawl commented on code in PR #4655:
URL: https://github.com/apache/ozone/pull/4655#discussion_r1221451914


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java:
##########
@@ -1298,64 +1298,21 @@ private void deleteInternal(Container container, boolean force)
         // If the container is not empty, it should not be deleted unless the
         // container is being forcefully deleted (which happens when
         // container is unhealthy or over-replicated).
-        if (container.getContainerData().getBlockCount() != 0) {
-          metrics.incContainerDeleteFailedBlockCountNotZero();
+        if (container.hasBlocks()) {
+          metrics.incContainerDeleteFailedNonEmpty();
           LOG.error("Received container deletion command for container {} but" +
                   " the container is not empty with blockCount {}",
               container.getContainerData().getContainerID(),
               container.getContainerData().getBlockCount());
+          // blocks table for future debugging.
+          // List blocks
+          logBlocksIfNonZero(container);
+          // Log chunks
+          logBlocksFoundOnDisk(container);
           throw new StorageContainerException("Non-force deletion of " +
               "non-empty container is not allowed.",
               DELETE_ON_NON_EMPTY_CONTAINER);
         }
-
-        // This is a defensive check to make sure there is no data loss if
-        // 1. There are one or more blocks on the filesystem
-        // 2. There are one or more blocks in the block table
-        // This can lead to false positives as
-        // 1. Chunks written to disk that did not get recorded in RocksDB can
-        //    occur due to failures during write
-        // 2. Blocks that were deleted from blocks table but the deletion of
-        //    the underlying file could not be completed
-        // 3. Failures between files being deleted from disk but not being
-        //    cleaned up.
-        // 4. Bugs in the code.
-        // Blocks stored on disk represent data written by a client and should
-        // be treated with care at the expense of creating artifacts on disk
-        // that  might be unreferenced.
-        // https://issues.apache.org/jira/browse/HDDS-8138 will move the
-        // implementation to only depend on consistency of the chunks folder
-
-        // First check if any files are in the chunks folder. If there are
-        // to help with debugging also dump the blocks table data.
-        if (checkIfNoBlockFiles) {

Review Comment:
   checkIfNoBlockFiles is unused, so can remove member variable and initialization



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] adoroszlai commented on pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "adoroszlai (via GitHub)" <gi...@apache.org>.
adoroszlai commented on PR #4655:
URL: https://github.com/apache/ozone/pull/4655#issuecomment-1554580608

   @errose28 if you are happy with the most recent changes, please hit the "Approve and run" button, too.  [CI passed](https://github.com/ashishkumar50/ozone/actions/runs/5009952551) in @ashishkumar50's fork.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sodonnel commented on a diff in pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "sodonnel (via GitHub)" <gi...@apache.org>.
sodonnel commented on code in PR #4655:
URL: https://github.com/apache/ozone/pull/4655#discussion_r1231167551


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/AbstractContainerReportHandler.java:
##########
@@ -370,6 +384,26 @@ private void updateContainerReplica(final DatanodeDetails datanodeDetails,
     }
   }
 
+  /**
+   * Determines whether container replica is empty or not.
+   *
+   * @param replicaProto container replica details.
+   * @return true if container replica is empty else false
+   */
+  private boolean fillEmpty(final ContainerReplicaProto replicaProto) {
+    if (replicaProto.hasIsEmpty()) {
+      return replicaProto.getIsEmpty();
+    } else {
+      // Handled when DN version is old then there will not be isEmpty field.
+      // In this case judge container empty based on keyCount
+      // and bytesUsed field.
+      if (replicaProto.getKeyCount() == 0 && replicaProto.getUsed() == 0) {
+        return true;
+      }

Review Comment:
   OK - it is fine to remove the compatibility part. I am sure I have seen (and had review comments against my own code) code in the past that does this compatibility thing, which I why I raised it, but if we really don't need to worry about it for now, then that is good to know for the future.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sodonnel commented on pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "sodonnel (via GitHub)" <gi...@apache.org>.
sodonnel commented on PR #4655:
URL: https://github.com/apache/ozone/pull/4655#issuecomment-1589666709

   Where are we with getting this committed? Its been open a long time now.
   
   Also, I am concerned the checks on the DN to remove a container are too conservative. If RocksDB has no entry for a block, I am not sure we should by default check for chunks on disk - with no RocksDB data we cannot access them anyway and the checksum data will be lost (it is stored in RocksDB).
   
   See my comment earlier today and also more detail in https://issues.apache.org/jira/browse/HDDS-8838. This is an example of a container that should be deleted, but it will not delete (even before this change, but I don't think this change will alter that behavior).
   
   I also wonder if the default state for a containerData object should be "empty" and then it gets toggled to "non empty" on the first successful put block to it - that may be one way to catch this scenario, but also it is possible for a container to have an orphan block like this, and other blocks. The other blocks could get removed over time, leaving the container empty except for this. orphan block.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] siddhantsangwan commented on a diff in pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "siddhantsangwan (via GitHub)" <gi...@apache.org>.
siddhantsangwan commented on code in PR #4655:
URL: https://github.com/apache/ozone/pull/4655#discussion_r1192089704


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/replication/health/EmptyContainerHandler.java:
##########
@@ -88,9 +88,9 @@ public boolean handle(ContainerCheckRequest request) {
   private boolean isContainerEmptyAndClosed(final ContainerInfo container,
       final Set<ContainerReplica> replicas) {
     return container.getState() == HddsProtos.LifeCycleState.CLOSED &&
-        container.getNumberOfKeys() == 0 && replicas.stream()

Review Comment:
   That makes sense but then we're not checking whether SCM expects a container to be empty on the basis of `ContainerInfo` (0 keys currently) against what the replicas are actually reporting. For example, consider an edge case where some replicas are empty and some are non empty because of arbitrary reasons. If the non empty replicas are lost, then `EmptyContainerHandler` will now mark this container as `EMPTY` on the basis of reports from other replicas. Whereas we probably want to call the container `MISSING`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ashishkumar50 commented on a diff in pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "ashishkumar50 (via GitHub)" <gi...@apache.org>.
ashishkumar50 commented on code in PR #4655:
URL: https://github.com/apache/ozone/pull/4655#discussion_r1199259694


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/AbstractContainerReportHandler.java:
##########
@@ -370,6 +384,26 @@ private void updateContainerReplica(final DatanodeDetails datanodeDetails,
     }
   }
 
+  /**
+   * Determines whether container replica is empty or not.
+   *
+   * @param replicaProto container replica details.
+   * @return true if container replica is empty else false
+   */
+  private boolean fillEmpty(final ContainerReplicaProto replicaProto) {
+    if (replicaProto.hasIsEmpty()) {
+      return replicaProto.getIsEmpty();
+    } else {
+      // Handled when DN version is old then there will not be isEmpty field.
+      // In this case judge container empty based on keyCount
+      // and bytesUsed field.
+      if (replicaProto.getKeyCount() == 0 && replicaProto.getUsed() == 0) {
+        return true;
+      }

Review Comment:
   @errose28 @sodonnel, I have kept this change for now. So can we go ahead we this?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ashishkumar50 commented on pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "ashishkumar50 (via GitHub)" <gi...@apache.org>.
ashishkumar50 commented on PR #4655:
URL: https://github.com/apache/ozone/pull/4655#issuecomment-1545204367

   > Somehow github has only run the title check, and not all the others. Could you push an empty commit to this PR trigger the jobs again please?
   
   Pushed empty commit.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] errose28 commented on pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "errose28 (via GitHub)" <gi...@apache.org>.
errose28 commented on PR #4655:
URL: https://github.com/apache/ozone/pull/4655#issuecomment-1568829956

   Hi @sodonnel I will try to review this PR soon, sorry to hold it up for so long. If I do not get to it by EOD June 1 then you can merge without me, I don't want to block it longer.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ashishkumar50 commented on a diff in pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "ashishkumar50 (via GitHub)" <gi...@apache.org>.
ashishkumar50 commented on code in PR #4655:
URL: https://github.com/apache/ozone/pull/4655#discussion_r1212758245


##########
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/container/common/statemachine/commandhandler/TestBlockDeletion.java:
##########
@@ -435,6 +441,257 @@ public void testContainerStatisticsAfterDelete() throws Exception {
     LOG.info(metrics.toString());
   }
 
+  @Test
+  public void testContainerStateAfterDNRestart() throws Exception {
+    ReplicationManager replicationManager = scm.getReplicationManager();
+
+    String volumeName = UUID.randomUUID().toString();
+    String bucketName = UUID.randomUUID().toString();
+
+    String value = RandomStringUtils.random(10 * 10);
+    store.createVolume(volumeName);
+    OzoneVolume volume = store.getVolume(volumeName);
+    volume.createBucket(bucketName);
+    OzoneBucket bucket = volume.getBucket(bucketName);
+
+    String keyName = UUID.randomUUID().toString();
+    OzoneOutputStream out = bucket.createKey(keyName,
+        value.getBytes(UTF_8).length, ReplicationType.RATIS,
+        ReplicationFactor.THREE, new HashMap<>());
+    out.write(value.getBytes(UTF_8));
+    out.close();
+
+    OmKeyArgs keyArgs = new OmKeyArgs.Builder().setVolumeName(volumeName)
+        .setBucketName(bucketName).setKeyName(keyName).setDataSize(0)
+        .setReplicationConfig(
+            RatisReplicationConfig
+                .getInstance(HddsProtos.ReplicationFactor.THREE))
+        .build();
+    List<OmKeyLocationInfoGroup> omKeyLocationInfoGroupList =
+        om.lookupKey(keyArgs).getKeyLocationVersions();
+    Thread.sleep(5000);
+    List<ContainerInfo> containerInfos =
+        scm.getContainerManager().getContainers();
+    final int valueSize = value.getBytes(UTF_8).length;
+    final int keyCount = 1;
+    containerInfos.stream().forEach(container -> {
+      Assertions.assertEquals(valueSize, container.getUsedBytes());
+      Assertions.assertEquals(keyCount, container.getNumberOfKeys());
+    });
+
+    OzoneTestUtils.closeAllContainers(scm.getEventQueue(), scm);
+    // Wait for container to close
+    Thread.sleep(2000);

Review Comment:
   Updated to use `TestHelper#waitForContainerClose`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sodonnel merged pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "sodonnel (via GitHub)" <gi...@apache.org>.
sodonnel merged PR #4655:
URL: https://github.com/apache/ozone/pull/4655


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ashishkumar50 commented on pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "ashishkumar50 (via GitHub)" <gi...@apache.org>.
ashishkumar50 commented on PR #4655:
URL: https://github.com/apache/ozone/pull/4655#issuecomment-1549494227

   > 
   
   This will be used by container report, which will read eventually in same time or in the next interval which doesn't have any impact. Now also blockcount becomes 0 and it goes in container report eventually by other thread.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] errose28 commented on a diff in pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "errose28 (via GitHub)" <gi...@apache.org>.
errose28 commented on code in PR #4655:
URL: https://github.com/apache/ozone/pull/4655#discussion_r1195815737


##########
hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerInfo.java:
##########
@@ -60,6 +60,7 @@ public class ContainerInfo implements Comparator<ContainerInfo>,
   */
   private volatile long usedBytes;
   private long numberOfKeys;
+  private boolean isAllReplicaEmpty;

Review Comment:
   This information can be learned from the replicas. I don't think we should persist it with the main container object. What happens if a non-empty replica shows up after this flag has been set?



##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/impl/ContainerData.java:
##########
@@ -537,6 +540,18 @@ public long getBlockCount() {
     return this.blockCount.get();
   }
 
+  public boolean isEmpty() {
+    return isEmpty;
+  }

Review Comment:
   HDDS-8142 also added an `isEmpty` method to the Container class itself. Can we reconcile these two methods to check for the flag being set and the presence of any blocks?



##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/AbstractContainerReportHandler.java:
##########
@@ -370,6 +384,26 @@ private void updateContainerReplica(final DatanodeDetails datanodeDetails,
     }
   }
 
+  /**
+   * Determines whether container replica is empty or not.
+   *
+   * @param replicaProto container replica details.
+   * @return true if container replica is empty else false
+   */
+  private boolean fillEmpty(final ContainerReplicaProto replicaProto) {
+    if (replicaProto.hasIsEmpty()) {
+      return replicaProto.getIsEmpty();
+    } else {
+      // Handled when DN version is old then there will not be isEmpty field.
+      // In this case judge container empty based on keyCount
+      // and bytesUsed field.
+      if (replicaProto.getKeyCount() == 0 && replicaProto.getUsed() == 0) {
+        return true;
+      }

Review Comment:
   This value should be populated on DN startup when it is loading the containers into memory. There should not be a case where this value is unset since we do not support rolling upgrades with mixed versions yet.



##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/KeyValueContainerUtil.java:
##########
@@ -392,6 +397,24 @@ private static void initializeUsedBytesAndBlockCount(DatanodeStore store,
     kvData.setBlockCount(blockCount);
   }
 
+  /**
+   * A container is empty if:
+   * - The container is closed
+   * - There are no blocks in its block table.
+   *
+   * Empty containers are eligible for deletion.
+   */
+  public static boolean isEmpty(DatanodeStore store,

Review Comment:
   This is yet a third `isEmpty` method. We don't want any errors when checking if a container is empty, so there should be one clear method to call that handles all the checks.



##########
hadoop-hdds/interface-client/src/main/proto/hdds.proto:
##########
@@ -237,6 +237,7 @@ message ContainerInfoProto {
     optional ReplicationFactor replicationFactor  = 10;
     required ReplicationType replicationType  = 11;
     optional ECReplicationConfig ecReplicationConfig = 12;
+    optional bool isAllReplicaEmpty = 13 [default = true];

Review Comment:
   Similar comment as above, I don't think we should persist this value as part of the main container state in SCM. And it *definitely* should not default to true.



##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/replication/health/EmptyContainerHandler.java:
##########
@@ -88,9 +88,9 @@ public boolean handle(ContainerCheckRequest request) {
   private boolean isContainerEmptyAndClosed(final ContainerInfo container,
       final Set<ContainerReplica> replicas) {
     return container.getState() == HddsProtos.LifeCycleState.CLOSED &&
-        container.getNumberOfKeys() == 0 && replicas.stream()

Review Comment:
   > Do we need to have an isEmpty flag on the containerInfo. This would be set to true when all the replicas report as empty.
   
   I don't think we should have this. We should only track each piece of information in one place. We track if all the replicas are empty based on the `isEmpty` flag of each replica. Now there would be a duplicate value that may not match the state of all the current replicas. The safest thing is to check the isEmpty flag on all replicas before deleting, making this extra flag redundant.
   
   > For example, consider an edge case where some replicas are empty and some are non empty because of arbitrary reasons. If the non empty replicas are lost, then EmptyContainerHandler will now mark this container as EMPTY on the basis of reports from other replicas.
   
   Having another `areAllEmpty` flag on the `ContainerInfo` does not change this case, because that flag is still set based on all the replicas that SCM sees. It is no different from checking all replicas at the time of container delete.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sodonnel commented on a diff in pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "sodonnel (via GitHub)" <gi...@apache.org>.
sodonnel commented on code in PR #4655:
URL: https://github.com/apache/ozone/pull/4655#discussion_r1199101781


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/AbstractContainerReportHandler.java:
##########
@@ -370,6 +384,26 @@ private void updateContainerReplica(final DatanodeDetails datanodeDetails,
     }
   }
 
+  /**
+   * Determines whether container replica is empty or not.
+   *
+   * @param replicaProto container replica details.
+   * @return true if container replica is empty else false
+   */
+  private boolean fillEmpty(final ContainerReplicaProto replicaProto) {
+    if (replicaProto.hasIsEmpty()) {
+      return replicaProto.getIsEmpty();
+    } else {
+      // Handled when DN version is old then there will not be isEmpty field.
+      // In this case judge container empty based on keyCount
+      // and bytesUsed field.
+      if (replicaProto.getKeyCount() == 0 && replicaProto.getUsed() == 0) {
+        return true;
+      }

Review Comment:
   What about when we do support rolling upgrades? Someone may be on a version where the DNs don't have the new field and then jump to a future version that does support rolling upgrades. IMO it is safer to try to be compatible, rather than saying "we don't support that" as that support may change. There have been quite a few places in the code where I have seen this sort of compatibility code, and I think it starts a dangerous precedent if we are not saying we don't need to do it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] devmadhuu commented on a diff in pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "devmadhuu (via GitHub)" <gi...@apache.org>.
devmadhuu commented on code in PR #4655:
URL: https://github.com/apache/ozone/pull/4655#discussion_r1185824708


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/KeyValueContainerUtil.java:
##########
@@ -392,6 +397,24 @@ private static void initializeUsedBytesAndBlockCount(DatanodeStore store,
     kvData.setBlockCount(blockCount);
   }
 
+  /**
+   * A container is empty if:
+   * - The container is closed
+   * - There are no blocks in its block table.
+   *
+   * Empty containers are eligible for deletion.
+   */
+  public static boolean isEmpty(DatanodeStore store,
+      KeyValueContainerData container) throws IOException {
+    if (container.isOpen()) {
+      return false;
+    }
+    try (BlockIterator<BlockData> blockIterator =

Review Comment:
   This PR is being done and added a flag due to inconsistency in block count data, so what brings inconsistency and why that inconsistency will not impact above logic ?



##########
hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/replication/ReplicationTestUtil.java:
##########
@@ -160,6 +160,7 @@ public static ContainerReplica createContainerReplica(ContainerID containerID,
     builder.setDatanodeDetails(datanodeDetails);
     builder.setSequenceId(0);
     builder.setOriginNodeId(originNodeId);
+    builder.setEmpty(keyCount == 0);

Review Comment:
   Should this be keyCount or blockCount ?



##########
hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/HddsTestUtils.java:
##########
@@ -736,7 +736,8 @@ public static ContainerReplica.ContainerReplicaBuilder getReplicaBuilder(
             .setDatanodeDetails(datanodeDetails)
             .setOriginNodeId(originNodeId).setSequenceId(sequenceId)
             .setBytesUsed(usedBytes)
-            .setKeyCount(keyCount);
+            .setKeyCount(keyCount)

Review Comment:
   Should this be keyCount or blockCount ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] kerneltime commented on pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "kerneltime (via GitHub)" <gi...@apache.org>.
kerneltime commented on PR #4655:
URL: https://github.com/apache/ozone/pull/4655#issuecomment-1535366245

   Please resolve the conflict


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] siddhantsangwan commented on pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "siddhantsangwan (via GitHub)" <gi...@apache.org>.
siddhantsangwan commented on PR #4655:
URL: https://github.com/apache/ozone/pull/4655#issuecomment-1545222179

   I triggered CI, it was asking for approval.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sodonnel commented on a diff in pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "sodonnel (via GitHub)" <gi...@apache.org>.
sodonnel commented on code in PR #4655:
URL: https://github.com/apache/ozone/pull/4655#discussion_r1192100457


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/replication/health/EmptyContainerHandler.java:
##########
@@ -88,9 +88,9 @@ public boolean handle(ContainerCheckRequest request) {
   private boolean isContainerEmptyAndClosed(final ContainerInfo container,
       final Set<ContainerReplica> replicas) {
     return container.getState() == HddsProtos.LifeCycleState.CLOSED &&
-        container.getNumberOfKeys() == 0 && replicas.stream()

Review Comment:
   This is a good point. Do we need to have an isEmpty flag on the containerInfo. This would be set to true when all the replicas report as empty. Right now, we are depending on the keyCount reported in the replicas, and when the count drops to zero in all replicas, then we update the keycount in containerInfo to zero. Perhaps we should update an empty flag in a similar way, when non of the replicas are reporting "isEmpty=false".



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] siddhantsangwan commented on a diff in pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "siddhantsangwan (via GitHub)" <gi...@apache.org>.
siddhantsangwan commented on code in PR #4655:
URL: https://github.com/apache/ozone/pull/4655#discussion_r1192200847


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/replication/health/EmptyContainerHandler.java:
##########
@@ -88,9 +88,9 @@ public boolean handle(ContainerCheckRequest request) {
   private boolean isContainerEmptyAndClosed(final ContainerInfo container,
       final Set<ContainerReplica> replicas) {
     return container.getState() == HddsProtos.LifeCycleState.CLOSED &&
-        container.getNumberOfKeys() == 0 && replicas.stream()

Review Comment:
   > Do we need to have an isEmpty flag on the containerInfo. This would be set to true when all the replicas report as empty.
   
   I agree with this suggestion.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sodonnel commented on a diff in pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "sodonnel (via GitHub)" <gi...@apache.org>.
sodonnel commented on code in PR #4655:
URL: https://github.com/apache/ozone/pull/4655#discussion_r1199103576


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/AbstractContainerReportHandler.java:
##########
@@ -361,6 +361,7 @@ private void updateContainerReplica(final DatanodeDetails datanodeDetails,
         .setKeyCount(replicaProto.getKeyCount())
         .setReplicaIndex(replicaProto.getReplicaIndex())
         .setBytesUsed(replicaProto.getUsed())
+        .setEmpty(replicaProto.getIsEmpty())

Review Comment:
   I'm not sure we should have removed this change, as I think we should try to be compatible between mixed versions, as we don't know when rolling upgrades may happen, or what someone will try for upgrading.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ashishkumar50 commented on a diff in pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "ashishkumar50 (via GitHub)" <gi...@apache.org>.
ashishkumar50 commented on code in PR #4655:
URL: https://github.com/apache/ozone/pull/4655#discussion_r1187259369


##########
hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/HddsTestUtils.java:
##########
@@ -736,7 +736,8 @@ public static ContainerReplica.ContainerReplicaBuilder getReplicaBuilder(
             .setDatanodeDetails(datanodeDetails)
             .setOriginNodeId(originNodeId).setSequenceId(sequenceId)
             .setBytesUsed(usedBytes)
-            .setKeyCount(keyCount);
+            .setKeyCount(keyCount)

Review Comment:
   This is in server-scm which doesn't use block count. Here existing test case is just to determine whether replica has some keys or not. If some keys are present then setEmpty as false.



##########
hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/replication/ReplicationTestUtil.java:
##########
@@ -160,6 +160,7 @@ public static ContainerReplica createContainerReplica(ContainerID containerID,
     builder.setDatanodeDetails(datanodeDetails);
     builder.setSequenceId(0);
     builder.setOriginNodeId(originNodeId);
+    builder.setEmpty(keyCount == 0);

Review Comment:
   This is in server-scm which doesn't use block count. Here existing test case is just to determine whether replica has some keys or not. If some keys are present then setEmpty as false.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ashishkumar50 commented on pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "ashishkumar50 (via GitHub)" <gi...@apache.org>.
ashishkumar50 commented on PR #4655:
URL: https://github.com/apache/ozone/pull/4655#issuecomment-1535688417

   > Please resolve the conflict
   
   Resolved conflict.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sodonnel commented on pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "sodonnel (via GitHub)" <gi...@apache.org>.
sodonnel commented on PR #4655:
URL: https://github.com/apache/ozone/pull/4655#issuecomment-1561923923

   @errose28 Do you want to take another look at this? Most of the last comments were from yourself, so I'd like to give you a chance to check them before we commit. This PR has been open a long time so it would be good to get it done.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] errose28 commented on a diff in pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "errose28 (via GitHub)" <gi...@apache.org>.
errose28 commented on code in PR #4655:
URL: https://github.com/apache/ozone/pull/4655#discussion_r1212191418


##########
hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/replication/TestLegacyReplicationManager.java:
##########
@@ -544,8 +544,7 @@ public void testContainerWithMissingReplicas()
             throws IOException, TimeoutException {
       createContainer(LifeCycleState.CLOSED);
       assertReplicaScheduled(0);
-      assertUnderReplicatedCount(1);
-      assertMissingCount(1);
+      assertEmptyReplicatedCount(1);

Review Comment:
   It looks like the functionality of this test was changed to test something different. Can we restore the old behavior to test missing containers and make a change elsewhere if the test is failing?



##########
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/container/common/statemachine/commandhandler/TestDeleteContainerHandler.java:
##########
@@ -216,8 +236,6 @@ public void testDeleteNonEmptyContainerDir()
         5 * 2000);
     Assert.assertTrue(!isContainerDeleted(hddsDatanodeService,
         containerId.getId()));
-    Assert.assertEquals(1,
-        metrics.getContainerDeleteFailedNonEmptyDir());

Review Comment:
   We should still check the updated metric here right?



##########
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/container/common/statemachine/commandhandler/TestDeleteContainerHandler.java:
##########
@@ -301,26 +319,13 @@ public void testDeleteNonEmptyContainerBlockTable()
     Assert.assertFalse(isContainerDeleted(hddsDatanodeService,
         containerId.getId()));
 
-    // Set container blockCount to 0 to mock that it is empty as per RocksDB
-    getContainerfromDN(hddsDatanodeService, containerId.getId())
-        .getContainerData().setBlockCount(0);
-    // Write entries to the block Table.
-    try (DBHandle dbHandle
-             = BlockUtils.getDB(
-                 (KeyValueContainerData)getContainerfromDN(hddsDatanodeService,
-                     containerId.getId()).getContainerData(),
-        conf)) {
-      BlockData blockData = new BlockData(new BlockID(1, 1));
-      dbHandle.getStore().getBlockDataTable().put("block1", blockData);
-    }
-
-    long containerDeleteFailedNonEmptyBefore =
-        metrics.getContainerDeleteFailedNonEmptyDir();

Review Comment:
   This test is `testDeleteNonEmptyContainerBlockTable` but this part that actually modifies the block table has been deleted. It looks like the test no longer does what it says and overlaps with `testDeleteNonEmptyContainerDir`.



##########
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/container/common/statemachine/commandhandler/TestBlockDeletion.java:
##########
@@ -435,6 +441,257 @@ public void testContainerStatisticsAfterDelete() throws Exception {
     LOG.info(metrics.toString());
   }
 
+  @Test
+  public void testContainerStateAfterDNRestart() throws Exception {
+    ReplicationManager replicationManager = scm.getReplicationManager();
+
+    String volumeName = UUID.randomUUID().toString();
+    String bucketName = UUID.randomUUID().toString();
+
+    String value = RandomStringUtils.random(10 * 10);
+    store.createVolume(volumeName);
+    OzoneVolume volume = store.getVolume(volumeName);
+    volume.createBucket(bucketName);
+    OzoneBucket bucket = volume.getBucket(bucketName);
+
+    String keyName = UUID.randomUUID().toString();
+    OzoneOutputStream out = bucket.createKey(keyName,
+        value.getBytes(UTF_8).length, ReplicationType.RATIS,
+        ReplicationFactor.THREE, new HashMap<>());
+    out.write(value.getBytes(UTF_8));
+    out.close();
+
+    OmKeyArgs keyArgs = new OmKeyArgs.Builder().setVolumeName(volumeName)
+        .setBucketName(bucketName).setKeyName(keyName).setDataSize(0)
+        .setReplicationConfig(
+            RatisReplicationConfig
+                .getInstance(HddsProtos.ReplicationFactor.THREE))
+        .build();
+    List<OmKeyLocationInfoGroup> omKeyLocationInfoGroupList =
+        om.lookupKey(keyArgs).getKeyLocationVersions();
+    Thread.sleep(5000);
+    List<ContainerInfo> containerInfos =
+        scm.getContainerManager().getContainers();
+    final int valueSize = value.getBytes(UTF_8).length;
+    final int keyCount = 1;
+    containerInfos.stream().forEach(container -> {
+      Assertions.assertEquals(valueSize, container.getUsedBytes());
+      Assertions.assertEquals(keyCount, container.getNumberOfKeys());
+    });
+
+    OzoneTestUtils.closeAllContainers(scm.getEventQueue(), scm);
+    // Wait for container to close
+    Thread.sleep(2000);
+    // make sure the containers are closed on the dn
+    omKeyLocationInfoGroupList.forEach((group) -> {
+      List<OmKeyLocationInfo> locationInfo = group.getLocationList();
+      locationInfo.forEach(
+          (info) -> cluster.getHddsDatanodes().get(0).getDatanodeStateMachine()
+              .getContainer().getContainerSet()
+              .getContainer(info.getContainerID()).getContainerData()
+              .setState(ContainerProtos.ContainerDataProto.State.CLOSED));
+    });
+
+    ContainerID containerId = ContainerID.valueOf(
+        containerInfos.get(0).getContainerID());
+    // Before restart container state is non-empty
+    Assertions.assertFalse(getContainerFromDN(
+        cluster.getHddsDatanodes().get(0), containerId.getId())
+        .getContainerData().isEmpty());
+    // Restart DataNode
+    cluster.restartHddsDatanode(0, true);
+
+    // After restart also container state remains non-empty.
+    Assertions.assertFalse(getContainerFromDN(
+        cluster.getHddsDatanodes().get(0), containerId.getId())
+        .getContainerData().isEmpty());
+
+    // Delete key
+    writeClient.deleteKey(keyArgs);
+    Thread.sleep(10000);
+
+    GenericTestUtils.waitFor(() -> {
+      try {
+        return scm.getContainerManager().getContainerReplicas(
+            containerId).stream().
+            allMatch(replica -> replica.isEmpty());
+      } catch (ContainerNotFoundException e) {
+        throw new RuntimeException(e);
+      }
+    },
+        100, 10 * 1000);
+
+    // Container state should be empty now as key got deleted
+    Assertions.assertTrue(getContainerFromDN(
+        cluster.getHddsDatanodes().get(0), containerId.getId())
+        .getContainerData().isEmpty());
+
+    // Restart DataNode
+    cluster.restartHddsDatanode(0, true);
+    // Container state should be empty even after restart
+    Assertions.assertTrue(getContainerFromDN(
+        cluster.getHddsDatanodes().get(0), containerId.getId())
+        .getContainerData().isEmpty());
+
+    GenericTestUtils.waitFor(() -> {
+      replicationManager.processAll();
+      ((EventQueue)scm.getEventQueue()).processAll(1000);
+      List<ContainerInfo> infos = scm.getContainerManager().getContainers();
+      try {
+        infos.stream().forEach(container -> {
+          Assertions.assertEquals(HddsProtos.LifeCycleState.DELETED,
+              container.getState());
+          try {
+            Assertions.assertEquals(HddsProtos.LifeCycleState.DELETED,
+                scm.getScmMetadataStore().getContainerTable()
+                    .get(container.containerID()).getState());
+          } catch (IOException e) {
+            Assertions.fail(
+                "Container from SCM DB should be marked as DELETED");
+          }
+        });
+      } catch (Throwable e) {
+        LOG.info(e.getMessage());
+        return false;
+      }
+      return true;
+    }, 500, 30000);
+    LOG.info(metrics.toString());
+  }
+
+  /**
+   * Return the container for the given containerID from the given DN.
+   */
+  private Container getContainerFromDN(HddsDatanodeService hddsDatanodeService,
+                                       long containerID) {
+    return hddsDatanodeService.getDatanodeStateMachine().getContainer()
+        .getContainerSet().getContainer(containerID);
+  }
+
+  @Test
+  public void testContainerDeleteWithInvalidKeyCount()
+      throws Exception {
+    ReplicationManager replicationManager = scm.getReplicationManager();
+    String volumeName = UUID.randomUUID().toString();
+    String bucketName = UUID.randomUUID().toString();
+
+    String value = RandomStringUtils.random(1024 * 1024);
+    store.createVolume(volumeName);
+    OzoneVolume volume = store.getVolume(volumeName);
+    volume.createBucket(bucketName);
+    OzoneBucket bucket = volume.getBucket(bucketName);
+
+    String keyName = UUID.randomUUID().toString();
+    OzoneOutputStream out = bucket.createKey(keyName,
+        value.getBytes(UTF_8).length, ReplicationType.RATIS,
+        ReplicationFactor.THREE, new HashMap<>());
+    out.write(value.getBytes(UTF_8));
+    out.close();
+
+    OmKeyArgs keyArgs = new OmKeyArgs.Builder().setVolumeName(volumeName)
+        .setBucketName(bucketName).setKeyName(keyName).setDataSize(0)
+        .setReplicationConfig(
+            RatisReplicationConfig
+                .getInstance(HddsProtos.ReplicationFactor.THREE))
+        .build();
+    List<OmKeyLocationInfoGroup> omKeyLocationInfoGroupList =
+        om.lookupKey(keyArgs).getKeyLocationVersions();
+    Thread.sleep(5000);
+    List<ContainerInfo> containerInfos =
+        scm.getContainerManager().getContainers();
+    final int valueSize = value.getBytes(UTF_8).length;
+    final int keyCount = 1;
+    containerInfos.stream().forEach(container -> {
+      Assertions.assertEquals(valueSize, container.getUsedBytes());
+      Assertions.assertEquals(keyCount, container.getNumberOfKeys());
+    });
+
+    OzoneTestUtils.closeAllContainers(scm.getEventQueue(), scm);
+    // Wait for container to close
+    Thread.sleep(2000);
+    // make sure the containers are closed on the dn
+    omKeyLocationInfoGroupList.forEach((group) -> {
+      List<OmKeyLocationInfo> locationInfo = group.getLocationList();
+      locationInfo.forEach(
+          (info) -> cluster.getHddsDatanodes().get(0).getDatanodeStateMachine()
+              .getContainer().getContainerSet()
+              .getContainer(info.getContainerID()).getContainerData()
+              .setState(ContainerProtos.ContainerDataProto.State.CLOSED));
+    });
+
+    ContainerStateManager containerStateManager = scm.getContainerManager()
+        .getContainerStateManager();
+    ContainerID containerId = ContainerID.valueOf(
+        containerInfos.get(0).getContainerID());
+    // Get all the replicas state from SCM
+    Set<ContainerReplica> replicas
+        = scm.getContainerManager().getContainerReplicas(containerId);
+
+    // Ensure for all replica isEmpty are false in SCM
+    Assert.assertTrue(scm.getContainerManager().getContainerReplicas(
+            containerId).stream().
+        allMatch(replica -> !replica.isEmpty()));
+
+    // Delete key
+    writeClient.deleteKey(keyArgs);
+    Thread.sleep(5000);
+
+    // Ensure isEmpty are true for all replica after delete key
+    GenericTestUtils.waitFor(() -> {
+      try {
+        return scm.getContainerManager().getContainerReplicas(
+            containerId).stream()
+            .allMatch(replica -> replica.isEmpty());
+      } catch (ContainerNotFoundException e) {
+        throw new RuntimeException(e);
+      }
+    },
+        500, 5 * 2000);

Review Comment:
   Once this runs, the container could be deleted at any time. We probably want to explicitly set the replication manager run interval very high so we know it won't run before we invoke it manually.



##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java:
##########
@@ -1298,64 +1298,21 @@ private void deleteInternal(Container container, boolean force)
         // If the container is not empty, it should not be deleted unless the
         // container is being forcefully deleted (which happens when
         // container is unhealthy or over-replicated).
-        if (container.getContainerData().getBlockCount() != 0) {
-          metrics.incContainerDeleteFailedBlockCountNotZero();
+        if (!container.isEmpty()) {

Review Comment:
   Should we change the name of this method so it is not confused with `KeyValueContainerData#isEmpty`? Maybe changing this method to `hasBlocks` or something like that?



##########
hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/keyvalue/TestKeyValueContainer.java:
##########
@@ -769,4 +786,93 @@ public void testAutoCompactionSmallSstFile() throws IOException {
       }
     }
   }
+
+  @Test
+  public void testIsEmptyContainerStateWhileImport() throws Exception {
+    long containerId = keyValueContainer.getContainerData().getContainerID();
+    createContainer();
+    long numberOfKeysToWrite = 1;
+    closeContainer();
+    populate(numberOfKeysToWrite);
+
+    //destination path
+    File folderToExport = folder.newFile("export.tar");
+    for (CopyContainerCompression compr : CopyContainerCompression.values()) {
+      TarContainerPacker packer = new TarContainerPacker(compr);
+
+      //export the container
+      try (FileOutputStream fos = new FileOutputStream(folderToExport)) {
+        keyValueContainer
+            .exportContainerData(fos, packer);
+      }
+
+      //delete the original one
+      keyValueContainer.delete();
+
+      //create a new one
+      KeyValueContainerData containerData =
+          new KeyValueContainerData(containerId,
+              keyValueContainerData.getLayoutVersion(),
+              keyValueContainerData.getMaxSize(), UUID.randomUUID().toString(),
+              datanodeId.toString());
+      containerData.setSchemaVersion(keyValueContainerData.getSchemaVersion());
+      KeyValueContainer container = new KeyValueContainer(containerData, CONF);
+
+      HddsVolume containerVolume = volumeChoosingPolicy.chooseVolume(
+          StorageVolumeUtil.getHddsVolumesList(volumeSet.getVolumesList()), 1);
+
+      container.populatePathFields(scmId, containerVolume);

Review Comment:
   Is this step necessary? Shouldn't this be handled by `KeyValueContainer#importContainerData` like in `testEmptyContainerImportExport`?



##########
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/container/common/statemachine/commandhandler/TestBlockDeletion.java:
##########
@@ -435,6 +441,257 @@ public void testContainerStatisticsAfterDelete() throws Exception {
     LOG.info(metrics.toString());
   }
 
+  @Test
+  public void testContainerStateAfterDNRestart() throws Exception {
+    ReplicationManager replicationManager = scm.getReplicationManager();
+
+    String volumeName = UUID.randomUUID().toString();
+    String bucketName = UUID.randomUUID().toString();
+
+    String value = RandomStringUtils.random(10 * 10);
+    store.createVolume(volumeName);
+    OzoneVolume volume = store.getVolume(volumeName);
+    volume.createBucket(bucketName);
+    OzoneBucket bucket = volume.getBucket(bucketName);
+
+    String keyName = UUID.randomUUID().toString();
+    OzoneOutputStream out = bucket.createKey(keyName,
+        value.getBytes(UTF_8).length, ReplicationType.RATIS,
+        ReplicationFactor.THREE, new HashMap<>());
+    out.write(value.getBytes(UTF_8));
+    out.close();
+
+    OmKeyArgs keyArgs = new OmKeyArgs.Builder().setVolumeName(volumeName)
+        .setBucketName(bucketName).setKeyName(keyName).setDataSize(0)
+        .setReplicationConfig(
+            RatisReplicationConfig
+                .getInstance(HddsProtos.ReplicationFactor.THREE))
+        .build();
+    List<OmKeyLocationInfoGroup> omKeyLocationInfoGroupList =
+        om.lookupKey(keyArgs).getKeyLocationVersions();
+    Thread.sleep(5000);
+    List<ContainerInfo> containerInfos =
+        scm.getContainerManager().getContainers();
+    final int valueSize = value.getBytes(UTF_8).length;
+    final int keyCount = 1;
+    containerInfos.stream().forEach(container -> {
+      Assertions.assertEquals(valueSize, container.getUsedBytes());
+      Assertions.assertEquals(keyCount, container.getNumberOfKeys());
+    });
+
+    OzoneTestUtils.closeAllContainers(scm.getEventQueue(), scm);
+    // Wait for container to close
+    Thread.sleep(2000);

Review Comment:
   Can we use `TestHelper#waitForContainerClose` to make this more reliable?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] siddhantsangwan commented on a diff in pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "siddhantsangwan (via GitHub)" <gi...@apache.org>.
siddhantsangwan commented on code in PR #4655:
URL: https://github.com/apache/ozone/pull/4655#discussion_r1191996901


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/replication/health/EmptyContainerHandler.java:
##########
@@ -88,9 +88,9 @@ public boolean handle(ContainerCheckRequest request) {
   private boolean isContainerEmptyAndClosed(final ContainerInfo container,
       final Set<ContainerReplica> replicas) {
     return container.getState() == HddsProtos.LifeCycleState.CLOSED &&
-        container.getNumberOfKeys() == 0 && replicas.stream()

Review Comment:
   Why remove the check for number of keys being zero?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ashishkumar50 commented on a diff in pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "ashishkumar50 (via GitHub)" <gi...@apache.org>.
ashishkumar50 commented on code in PR #4655:
URL: https://github.com/apache/ozone/pull/4655#discussion_r1192245188


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/replication/health/EmptyContainerHandler.java:
##########
@@ -88,9 +88,9 @@ public boolean handle(ContainerCheckRequest request) {
   private boolean isContainerEmptyAndClosed(final ContainerInfo container,
       final Set<ContainerReplica> replicas) {
     return container.getState() == HddsProtos.LifeCycleState.CLOSED &&
-        container.getNumberOfKeys() == 0 && replicas.stream()

Review Comment:
   @siddhantsangwan, @sodonnel Thanks for pointing out. I have added flag for container info as well to determine whether all the replicas are empty. Can you please look again.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ashishkumar50 commented on a diff in pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "ashishkumar50 (via GitHub)" <gi...@apache.org>.
ashishkumar50 commented on code in PR #4655:
URL: https://github.com/apache/ozone/pull/4655#discussion_r1212764409


##########
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/container/common/statemachine/commandhandler/TestDeleteContainerHandler.java:
##########
@@ -301,26 +319,13 @@ public void testDeleteNonEmptyContainerBlockTable()
     Assert.assertFalse(isContainerDeleted(hddsDatanodeService,
         containerId.getId()));
 
-    // Set container blockCount to 0 to mock that it is empty as per RocksDB
-    getContainerfromDN(hddsDatanodeService, containerId.getId())
-        .getContainerData().setBlockCount(0);
-    // Write entries to the block Table.
-    try (DBHandle dbHandle
-             = BlockUtils.getDB(
-                 (KeyValueContainerData)getContainerfromDN(hddsDatanodeService,
-                     containerId.getId()).getContainerData(),
-        conf)) {
-      BlockData blockData = new BlockData(new BlockID(1, 1));
-      dbHandle.getStore().getBlockDataTable().put("block1", blockData);
-    }
-
-    long containerDeleteFailedNonEmptyBefore =
-        metrics.getContainerDeleteFailedNonEmptyDir();

Review Comment:
   Since we have already created key before and not deleted, Key will be in block table. So again writing into rocks db block table may not be required. We are just deleting blocks from the disk here. In this case block table has entry but disk don't have, So in this case container will not be empty and delete is not allowed unless we do force delete.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ashishkumar50 commented on pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "ashishkumar50 (via GitHub)" <gi...@apache.org>.
ashishkumar50 commented on PR #4655:
URL: https://github.com/apache/ozone/pull/4655#issuecomment-1534854799

   @errose28 , Can you please help to review.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ashishkumar50 commented on a diff in pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "ashishkumar50 (via GitHub)" <gi...@apache.org>.
ashishkumar50 commented on code in PR #4655:
URL: https://github.com/apache/ozone/pull/4655#discussion_r1185668569


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/KeyValueContainerUtil.java:
##########
@@ -392,6 +397,24 @@ private static void initializeUsedBytesAndBlockCount(DatanodeStore store,
     kvData.setBlockCount(blockCount);
   }
 
+  /**
+   * A container is empty if:
+   * - The container is closed
+   * - There are no blocks in its block table.
+   *
+   * Empty containers are eligible for deletion.
+   */
+  public static boolean isEmpty(DatanodeStore store,
+      KeyValueContainerData container) throws IOException {
+    if (container.isOpen()) {
+      return false;

Review Comment:
   When container is open it is considered as not empty. When in any other state(other than open state) we are checking in block table to see if there are any block present and then we decide container is empty or not.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ashishkumar50 commented on a diff in pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "ashishkumar50 (via GitHub)" <gi...@apache.org>.
ashishkumar50 commented on code in PR #4655:
URL: https://github.com/apache/ozone/pull/4655#discussion_r1221766204


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java:
##########
@@ -1298,64 +1298,21 @@ private void deleteInternal(Container container, boolean force)
         // If the container is not empty, it should not be deleted unless the
         // container is being forcefully deleted (which happens when
         // container is unhealthy or over-replicated).
-        if (container.getContainerData().getBlockCount() != 0) {
-          metrics.incContainerDeleteFailedBlockCountNotZero();
+        if (container.hasBlocks()) {
+          metrics.incContainerDeleteFailedNonEmpty();
           LOG.error("Received container deletion command for container {} but" +
                   " the container is not empty with blockCount {}",
               container.getContainerData().getContainerID(),
               container.getContainerData().getBlockCount());
+          // blocks table for future debugging.
+          // List blocks
+          logBlocksIfNonZero(container);
+          // Log chunks
+          logBlocksFoundOnDisk(container);
           throw new StorageContainerException("Non-force deletion of " +
               "non-empty container is not allowed.",
               DELETE_ON_NON_EMPTY_CONTAINER);
         }
-
-        // This is a defensive check to make sure there is no data loss if
-        // 1. There are one or more blocks on the filesystem
-        // 2. There are one or more blocks in the block table
-        // This can lead to false positives as
-        // 1. Chunks written to disk that did not get recorded in RocksDB can
-        //    occur due to failures during write
-        // 2. Blocks that were deleted from blocks table but the deletion of
-        //    the underlying file could not be completed
-        // 3. Failures between files being deleted from disk but not being
-        //    cleaned up.
-        // 4. Bugs in the code.
-        // Blocks stored on disk represent data written by a client and should
-        // be treated with care at the expense of creating artifacts on disk
-        // that  might be unreferenced.
-        // https://issues.apache.org/jira/browse/HDDS-8138 will move the
-        // implementation to only depend on consistency of the chunks folder
-
-        // First check if any files are in the chunks folder. If there are
-        // to help with debugging also dump the blocks table data.
-        if (checkIfNoBlockFiles) {

Review Comment:
   Removed, Thanks.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ashishkumar50 commented on pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "ashishkumar50 (via GitHub)" <gi...@apache.org>.
ashishkumar50 commented on PR #4655:
URL: https://github.com/apache/ozone/pull/4655#issuecomment-1593275243

   > LGTM. We still need to resolve [this thread](https://github.com/apache/ozone/pull/4655#discussion_r1195824586) on upgrades before deciding to merge though.
   
   @errose28, This conversation has been resolved. As currently we don't support rolling upgrade, incompatible code fix has been removed.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] Xushaohong commented on a diff in pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "Xushaohong (via GitHub)" <gi...@apache.org>.
Xushaohong commented on code in PR #4655:
URL: https://github.com/apache/ozone/pull/4655#discussion_r1186132844


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/KeyValueContainerUtil.java:
##########
@@ -392,6 +397,24 @@ private static void initializeUsedBytesAndBlockCount(DatanodeStore store,
     kvData.setBlockCount(blockCount);
   }
 
+  /**
+   * A container is empty if:
+   * - The container is closed
+   * - There are no blocks in its block table.
+   *
+   * Empty containers are eligible for deletion.
+   */
+  public static boolean isEmpty(DatanodeStore store,
+      KeyValueContainerData container) throws IOException {
+    if (container.isOpen()) {
+      return false;

Review Comment:
   Got the intention now. 
   One problem seems to be the inconsistent block count across replicas in some cases. Using the real block count from DB instead of the in-memory block count could determine the real block count in the container and then help SCM to delete empty containers. Is it right?
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ashishkumar50 commented on a diff in pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "ashishkumar50 (via GitHub)" <gi...@apache.org>.
ashishkumar50 commented on code in PR #4655:
URL: https://github.com/apache/ozone/pull/4655#discussion_r1187255006


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/KeyValueContainerUtil.java:
##########
@@ -392,6 +397,24 @@ private static void initializeUsedBytesAndBlockCount(DatanodeStore store,
     kvData.setBlockCount(blockCount);
   }
 
+  /**
+   * A container is empty if:
+   * - The container is closed
+   * - There are no blocks in its block table.
+   *
+   * Empty containers are eligible for deletion.
+   */
+  public static boolean isEmpty(DatanodeStore store,
+      KeyValueContainerData container) throws IOException {
+    if (container.isOpen()) {
+      return false;
+    }
+    try (BlockIterator<BlockData> blockIterator =

Review Comment:
   We keep block count metadata information separately about container, which can go wrong sometime due to various reasons. Now we decouple metadata information to judge if container is empty or not by checking directly block table if any block exists or not for the particular container.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ashishkumar50 commented on a diff in pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "ashishkumar50 (via GitHub)" <gi...@apache.org>.
ashishkumar50 commented on code in PR #4655:
URL: https://github.com/apache/ozone/pull/4655#discussion_r1187251218


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/KeyValueContainerUtil.java:
##########
@@ -392,6 +397,24 @@ private static void initializeUsedBytesAndBlockCount(DatanodeStore store,
     kvData.setBlockCount(blockCount);
   }
 
+  /**
+   * A container is empty if:
+   * - The container is closed
+   * - There are no blocks in its block table.
+   *
+   * Empty containers are eligible for deletion.
+   */
+  public static boolean isEmpty(DatanodeStore store,
+      KeyValueContainerData container) throws IOException {
+    if (container.isOpen()) {
+      return false;

Review Comment:
   Yes keeping block count metadata can go wrong sometime which leads to wrong interpretation of empty container. 
   Now we check if any real block exists in container(not counting all block count), Which is done in isEmpty() method.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ashishkumar50 commented on a diff in pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "ashishkumar50 (via GitHub)" <gi...@apache.org>.
ashishkumar50 commented on code in PR #4655:
URL: https://github.com/apache/ozone/pull/4655#discussion_r1190245279


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/AbstractContainerReportHandler.java:
##########
@@ -361,6 +361,7 @@ private void updateContainerReplica(final DatanodeDetails datanodeDetails,
         .setKeyCount(replicaProto.getKeyCount())
         .setReplicaIndex(replicaProto.getReplicaIndex())
         .setBytesUsed(replicaProto.getUsed())
+        .setEmpty(replicaProto.getIsEmpty())

Review Comment:
   Thanks for the review @sodonnel, I have handled this scenario now.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sodonnel closed pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "sodonnel (via GitHub)" <gi...@apache.org>.
sodonnel closed pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.
URL: https://github.com/apache/ozone/pull/4655


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] errose28 commented on a diff in pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "errose28 (via GitHub)" <gi...@apache.org>.
errose28 commented on code in PR #4655:
URL: https://github.com/apache/ozone/pull/4655#discussion_r1228797806


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/KeyValueContainerUtil.java:
##########
@@ -181,21 +182,37 @@ public static void removeContainer(KeyValueContainerData containerData,
 
   /**
    * Returns if there are no blocks in the container.
+   * @param store DBStore
    * @param containerData Container to check
+   * @param bCheckChunksFilePath Whether to check chunksfilepath has any blocks
    * @return true if the directory containing blocks is empty
    * @throws IOException
    */
-  public static boolean noBlocksInContainer(KeyValueContainerData
-                                                containerData)
+  public static boolean noBlocksInContainer(DatanodeStore store,
+                                            KeyValueContainerData
+                                            containerData,
+                                            boolean bCheckChunksFilePath)
       throws IOException {
+    Preconditions.checkNotNull(store);
     Preconditions.checkNotNull(containerData);
-    File chunksPath = new File(containerData.getChunksPath());
-    Preconditions.checkArgument(chunksPath.isDirectory());
-
-    try (DirectoryStream<Path> dir
-             = Files.newDirectoryStream(chunksPath.toPath())) {
-      return !dir.iterator().hasNext();
+    if (containerData.isOpen()) {
+      return false;
+    }
+    try (BlockIterator<BlockData> blockIterator =
+             store.getBlockIterator(containerData.getContainerID())) {
+      if (blockIterator.hasNext()) {
+        return false;
+      }
     }
+    if (bCheckChunksFilePath) {

Review Comment:
   We can resolve this by defaulting `hdds.datanode.check.empty.container.delete` to false. That might better in a separate PR. We could also change the config name to specify it only controls checking the directory. RocksDB will always be checked.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ashishkumar50 commented on a diff in pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "ashishkumar50 (via GitHub)" <gi...@apache.org>.
ashishkumar50 commented on code in PR #4655:
URL: https://github.com/apache/ozone/pull/4655#discussion_r1212756769


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java:
##########
@@ -1298,64 +1298,21 @@ private void deleteInternal(Container container, boolean force)
         // If the container is not empty, it should not be deleted unless the
         // container is being forcefully deleted (which happens when
         // container is unhealthy or over-replicated).
-        if (container.getContainerData().getBlockCount() != 0) {
-          metrics.incContainerDeleteFailedBlockCountNotZero();
+        if (!container.isEmpty()) {

Review Comment:
   Updated to `hasBlocks`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ashishkumar50 commented on a diff in pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "ashishkumar50 (via GitHub)" <gi...@apache.org>.
ashishkumar50 commented on code in PR #4655:
URL: https://github.com/apache/ozone/pull/4655#discussion_r1212757086


##########
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/container/common/statemachine/commandhandler/TestDeleteContainerHandler.java:
##########
@@ -216,8 +236,6 @@ public void testDeleteNonEmptyContainerDir()
         5 * 2000);
     Assert.assertTrue(!isContainerDeleted(hddsDatanodeService,
         containerId.getId()));
-    Assert.assertEquals(1,
-        metrics.getContainerDeleteFailedNonEmptyDir());

Review Comment:
   Added updated metric.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ashishkumar50 commented on a diff in pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "ashishkumar50 (via GitHub)" <gi...@apache.org>.
ashishkumar50 commented on code in PR #4655:
URL: https://github.com/apache/ozone/pull/4655#discussion_r1199248358


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/AbstractContainerReportHandler.java:
##########
@@ -361,6 +361,7 @@ private void updateContainerReplica(final DatanodeDetails datanodeDetails,
         .setKeyCount(replicaProto.getKeyCount())
         .setReplicaIndex(replicaProto.getReplicaIndex())
         .setBytesUsed(replicaProto.getUsed())
+        .setEmpty(replicaProto.getIsEmpty())

Review Comment:
   Agree, If rolling upgrade is planned now or in future this can impact. I think we can handle this change now. This change is an extra precaution and doesn't impact anything else.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sodonnel commented on a diff in pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "sodonnel (via GitHub)" <gi...@apache.org>.
sodonnel commented on code in PR #4655:
URL: https://github.com/apache/ozone/pull/4655#discussion_r1190014508


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/AbstractContainerReportHandler.java:
##########
@@ -361,6 +361,7 @@ private void updateContainerReplica(final DatanodeDetails datanodeDetails,
         .setKeyCount(replicaProto.getKeyCount())
         .setReplicaIndex(replicaProto.getReplicaIndex())
         .setBytesUsed(replicaProto.getUsed())
+        .setEmpty(replicaProto.getIsEmpty())

Review Comment:
   I wonder if there is a compatibility problem between versions here?
   
   If the DN software is newer than SCM, then things would work as before. That is fine.
   
   If SCM is upgraded to have this logic, but the DN is on an old version, it will never send "empty=true". and I think that if the container becomes empty at this time, it will never set empty=true later. So we could get a scenario where the container goes empty on the DN, the ICR is sent, but this logic defaults the isEmpty to false and then the container never gets its remove triggered by SCM.
   
   I think we need to check if the DN is sending isEmpty(), and if it is not, put back the old logic here, eg something like:
   
   ```
   if (replicaProto.hasIsEmpty()) {
       builder.setEmpty(replicaProto.getIsEmpty())
   } else {
       if (replicaProto.getKeyCount() == 0 && replicaProto.getUsed == 0) {
           builder.setEmpty(true);
       } else {
           builder.setEmpty(false);
       }
   }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sodonnel commented on pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "sodonnel (via GitHub)" <gi...@apache.org>.
sodonnel commented on PR #4655:
URL: https://github.com/apache/ozone/pull/4655#issuecomment-1544589568

   Somehow github has only run the title check, and not all the others. Could you push an empty commit to this PR trigger the jobs again please?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ashishkumar50 commented on pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "ashishkumar50 (via GitHub)" <gi...@apache.org>.
ashishkumar50 commented on PR #4655:
URL: https://github.com/apache/ozone/pull/4655#issuecomment-1552329317

   > Thanks for taking this up @ashishkumar50. I have some comments on the delete flag handling on SCM. Also the following cases need test coverage:
   > 
   > * isEmpty flag is correctly set on a container when it is imported as part of replication.
   > * isEmpty flag is correctly set on a container after loading it from memory (restart)
   > * SCM will only send delete command when isEmpty is true on all replicas, regardless of block count or other metadata.
   
   Hi @errose28, Thanks for the review. All review comments are addressed and test cases added. Can you please look again.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] errose28 commented on pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "errose28 (via GitHub)" <gi...@apache.org>.
errose28 commented on PR #4655:
URL: https://github.com/apache/ozone/pull/4655#issuecomment-1590191032

   > I also wonder if the default state for a containerData object should be "empty" and then it gets toggled to "non empty" on the first successful put block to it - that may be one way to catch this scenario, 
   
   This would complicate SCM code. Instead of sending delete commands to containers with the empty flag set, SCM would have to figure out if the container is empty because it needs to be deleted, or empty because it has not accepted any writes yet. I guess the allocated state could be used to make this distinction, but it would add another complication on the delete path. It also does not resolve your other point below.
   
   > it is possible for a container to have an orphan block like this, and other blocks. The other blocks could get removed over time, leaving the container empty except for this. orphan block.
   
   As I mentioned above, I think changing the default value of the config to check the directory on container delete is the best solution here.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] errose28 commented on a diff in pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "errose28 (via GitHub)" <gi...@apache.org>.
errose28 commented on code in PR #4655:
URL: https://github.com/apache/ozone/pull/4655#discussion_r1212427230


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/AbstractContainerReportHandler.java:
##########
@@ -370,6 +384,26 @@ private void updateContainerReplica(final DatanodeDetails datanodeDetails,
     }
   }
 
+  /**
+   * Determines whether container replica is empty or not.
+   *
+   * @param replicaProto container replica details.
+   * @return true if container replica is empty else false
+   */
+  private boolean fillEmpty(final ContainerReplicaProto replicaProto) {
+    if (replicaProto.hasIsEmpty()) {
+      return replicaProto.getIsEmpty();
+    } else {
+      // Handled when DN version is old then there will not be isEmpty field.
+      // In this case judge container empty based on keyCount
+      // and bytesUsed field.
+      if (replicaProto.getKeyCount() == 0 && replicaProto.getUsed() == 0) {
+        return true;
+      }

Review Comment:
   Compatibility guarantees have to be strictly defined and tested. For example, we support non-rolling upgrades from all versions since 1.0.0 and non-roling downgrades pre-finalization to all versions since 1.1.0. We have tests for this and require changes to adhere to these rules. The compatibility code you are referring to is to comply to these guidelines.
   
   Rolling upgrade has not been planned yet. When it is, we will define an earliest version that supports rolling upgrade, tests, and requirements that developers must follow, just like we have for non-rolling upgrade. Without this, enforcement of rolling upgrade compatibility will be arbitrary. We cannot attempt to support the feature in this state. There are also issues that guarantee no code getting checked in to master right now will be eligible for rolling upgrade support:
   
   - OM HA commit model is incompatible with rolling upgrades. There may be other areas of the code as well that currently prevent us from doing rolling upgrades from this version, but we have not done an invovled investigation yet.
   
   - There are no test suites to catch any rolling upgrade incompatibilities that are introduced, so it is impossible to guarantee the current version would work with a rolling upgrade. If it happend to work it would be purely by coincidence.
   
   - There is no agreement among developers on rolling upgrade version guarantees so requirements are not being enforced in code reviews.
   
   Once the rolling upgrade feature is proposed and requirements are defined then we will have explicit rules to guide changes and versions. Without this framework in place, enforcing rolling upgrade requirements in PRs just leads to confusing code, because it is documenting and designing for a situation that cannot happen. For that reason I propose removing the extra check, even though it does not affect correctness in the current version.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ashishkumar50 commented on a diff in pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "ashishkumar50 (via GitHub)" <gi...@apache.org>.
ashishkumar50 commented on code in PR #4655:
URL: https://github.com/apache/ozone/pull/4655#discussion_r1212757648


##########
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/container/common/statemachine/commandhandler/TestBlockDeletion.java:
##########
@@ -435,6 +441,257 @@ public void testContainerStatisticsAfterDelete() throws Exception {
     LOG.info(metrics.toString());
   }
 
+  @Test
+  public void testContainerStateAfterDNRestart() throws Exception {
+    ReplicationManager replicationManager = scm.getReplicationManager();
+
+    String volumeName = UUID.randomUUID().toString();
+    String bucketName = UUID.randomUUID().toString();
+
+    String value = RandomStringUtils.random(10 * 10);
+    store.createVolume(volumeName);
+    OzoneVolume volume = store.getVolume(volumeName);
+    volume.createBucket(bucketName);
+    OzoneBucket bucket = volume.getBucket(bucketName);
+
+    String keyName = UUID.randomUUID().toString();
+    OzoneOutputStream out = bucket.createKey(keyName,
+        value.getBytes(UTF_8).length, ReplicationType.RATIS,
+        ReplicationFactor.THREE, new HashMap<>());
+    out.write(value.getBytes(UTF_8));
+    out.close();
+
+    OmKeyArgs keyArgs = new OmKeyArgs.Builder().setVolumeName(volumeName)
+        .setBucketName(bucketName).setKeyName(keyName).setDataSize(0)
+        .setReplicationConfig(
+            RatisReplicationConfig
+                .getInstance(HddsProtos.ReplicationFactor.THREE))
+        .build();
+    List<OmKeyLocationInfoGroup> omKeyLocationInfoGroupList =
+        om.lookupKey(keyArgs).getKeyLocationVersions();
+    Thread.sleep(5000);
+    List<ContainerInfo> containerInfos =
+        scm.getContainerManager().getContainers();
+    final int valueSize = value.getBytes(UTF_8).length;
+    final int keyCount = 1;
+    containerInfos.stream().forEach(container -> {
+      Assertions.assertEquals(valueSize, container.getUsedBytes());
+      Assertions.assertEquals(keyCount, container.getNumberOfKeys());
+    });
+
+    OzoneTestUtils.closeAllContainers(scm.getEventQueue(), scm);
+    // Wait for container to close
+    Thread.sleep(2000);
+    // make sure the containers are closed on the dn
+    omKeyLocationInfoGroupList.forEach((group) -> {
+      List<OmKeyLocationInfo> locationInfo = group.getLocationList();
+      locationInfo.forEach(
+          (info) -> cluster.getHddsDatanodes().get(0).getDatanodeStateMachine()
+              .getContainer().getContainerSet()
+              .getContainer(info.getContainerID()).getContainerData()
+              .setState(ContainerProtos.ContainerDataProto.State.CLOSED));
+    });
+
+    ContainerID containerId = ContainerID.valueOf(
+        containerInfos.get(0).getContainerID());
+    // Before restart container state is non-empty
+    Assertions.assertFalse(getContainerFromDN(
+        cluster.getHddsDatanodes().get(0), containerId.getId())
+        .getContainerData().isEmpty());
+    // Restart DataNode
+    cluster.restartHddsDatanode(0, true);
+
+    // After restart also container state remains non-empty.
+    Assertions.assertFalse(getContainerFromDN(
+        cluster.getHddsDatanodes().get(0), containerId.getId())
+        .getContainerData().isEmpty());
+
+    // Delete key
+    writeClient.deleteKey(keyArgs);
+    Thread.sleep(10000);
+
+    GenericTestUtils.waitFor(() -> {
+      try {
+        return scm.getContainerManager().getContainerReplicas(
+            containerId).stream().
+            allMatch(replica -> replica.isEmpty());
+      } catch (ContainerNotFoundException e) {
+        throw new RuntimeException(e);
+      }
+    },
+        100, 10 * 1000);
+
+    // Container state should be empty now as key got deleted
+    Assertions.assertTrue(getContainerFromDN(
+        cluster.getHddsDatanodes().get(0), containerId.getId())
+        .getContainerData().isEmpty());
+
+    // Restart DataNode
+    cluster.restartHddsDatanode(0, true);
+    // Container state should be empty even after restart
+    Assertions.assertTrue(getContainerFromDN(
+        cluster.getHddsDatanodes().get(0), containerId.getId())
+        .getContainerData().isEmpty());
+
+    GenericTestUtils.waitFor(() -> {
+      replicationManager.processAll();
+      ((EventQueue)scm.getEventQueue()).processAll(1000);
+      List<ContainerInfo> infos = scm.getContainerManager().getContainers();
+      try {
+        infos.stream().forEach(container -> {
+          Assertions.assertEquals(HddsProtos.LifeCycleState.DELETED,
+              container.getState());
+          try {
+            Assertions.assertEquals(HddsProtos.LifeCycleState.DELETED,
+                scm.getScmMetadataStore().getContainerTable()
+                    .get(container.containerID()).getState());
+          } catch (IOException e) {
+            Assertions.fail(
+                "Container from SCM DB should be marked as DELETED");
+          }
+        });
+      } catch (Throwable e) {
+        LOG.info(e.getMessage());
+        return false;
+      }
+      return true;
+    }, 500, 30000);
+    LOG.info(metrics.toString());
+  }
+
+  /**
+   * Return the container for the given containerID from the given DN.
+   */
+  private Container getContainerFromDN(HddsDatanodeService hddsDatanodeService,
+                                       long containerID) {
+    return hddsDatanodeService.getDatanodeStateMachine().getContainer()
+        .getContainerSet().getContainer(containerID);
+  }
+
+  @Test
+  public void testContainerDeleteWithInvalidKeyCount()
+      throws Exception {
+    ReplicationManager replicationManager = scm.getReplicationManager();
+    String volumeName = UUID.randomUUID().toString();
+    String bucketName = UUID.randomUUID().toString();
+
+    String value = RandomStringUtils.random(1024 * 1024);
+    store.createVolume(volumeName);
+    OzoneVolume volume = store.getVolume(volumeName);
+    volume.createBucket(bucketName);
+    OzoneBucket bucket = volume.getBucket(bucketName);
+
+    String keyName = UUID.randomUUID().toString();
+    OzoneOutputStream out = bucket.createKey(keyName,
+        value.getBytes(UTF_8).length, ReplicationType.RATIS,
+        ReplicationFactor.THREE, new HashMap<>());
+    out.write(value.getBytes(UTF_8));
+    out.close();
+
+    OmKeyArgs keyArgs = new OmKeyArgs.Builder().setVolumeName(volumeName)
+        .setBucketName(bucketName).setKeyName(keyName).setDataSize(0)
+        .setReplicationConfig(
+            RatisReplicationConfig
+                .getInstance(HddsProtos.ReplicationFactor.THREE))
+        .build();
+    List<OmKeyLocationInfoGroup> omKeyLocationInfoGroupList =
+        om.lookupKey(keyArgs).getKeyLocationVersions();
+    Thread.sleep(5000);
+    List<ContainerInfo> containerInfos =
+        scm.getContainerManager().getContainers();
+    final int valueSize = value.getBytes(UTF_8).length;
+    final int keyCount = 1;
+    containerInfos.stream().forEach(container -> {
+      Assertions.assertEquals(valueSize, container.getUsedBytes());
+      Assertions.assertEquals(keyCount, container.getNumberOfKeys());
+    });
+
+    OzoneTestUtils.closeAllContainers(scm.getEventQueue(), scm);
+    // Wait for container to close
+    Thread.sleep(2000);
+    // make sure the containers are closed on the dn
+    omKeyLocationInfoGroupList.forEach((group) -> {
+      List<OmKeyLocationInfo> locationInfo = group.getLocationList();
+      locationInfo.forEach(
+          (info) -> cluster.getHddsDatanodes().get(0).getDatanodeStateMachine()
+              .getContainer().getContainerSet()
+              .getContainer(info.getContainerID()).getContainerData()
+              .setState(ContainerProtos.ContainerDataProto.State.CLOSED));
+    });
+
+    ContainerStateManager containerStateManager = scm.getContainerManager()
+        .getContainerStateManager();
+    ContainerID containerId = ContainerID.valueOf(
+        containerInfos.get(0).getContainerID());
+    // Get all the replicas state from SCM
+    Set<ContainerReplica> replicas
+        = scm.getContainerManager().getContainerReplicas(containerId);
+
+    // Ensure for all replica isEmpty are false in SCM
+    Assert.assertTrue(scm.getContainerManager().getContainerReplicas(
+            containerId).stream().
+        allMatch(replica -> !replica.isEmpty()));
+
+    // Delete key
+    writeClient.deleteKey(keyArgs);
+    Thread.sleep(5000);
+
+    // Ensure isEmpty are true for all replica after delete key
+    GenericTestUtils.waitFor(() -> {
+      try {
+        return scm.getContainerManager().getContainerReplicas(
+            containerId).stream()
+            .allMatch(replica -> replica.isEmpty());
+      } catch (ContainerNotFoundException e) {
+        throw new RuntimeException(e);
+      }
+    },
+        500, 5 * 2000);

Review Comment:
   Added high time interval for RM.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] Xushaohong commented on a diff in pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "Xushaohong (via GitHub)" <gi...@apache.org>.
Xushaohong commented on code in PR #4655:
URL: https://github.com/apache/ozone/pull/4655#discussion_r1185658173


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/KeyValueContainerUtil.java:
##########
@@ -392,6 +397,24 @@ private static void initializeUsedBytesAndBlockCount(DatanodeStore store,
     kvData.setBlockCount(blockCount);
   }
 
+  /**
+   * A container is empty if:
+   * - The container is closed
+   * - There are no blocks in its block table.
+   *
+   * Empty containers are eligible for deletion.
+   */
+  public static boolean isEmpty(DatanodeStore store,
+      KeyValueContainerData container) throws IOException {
+    if (container.isOpen()) {
+      return false;

Review Comment:
   Why the container is considered to be empty when it is not open?  It is common that the container has blocks when it transitions into states like  CLOSING.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ashishkumar50 commented on pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "ashishkumar50 (via GitHub)" <gi...@apache.org>.
ashishkumar50 commented on PR #4655:
URL: https://github.com/apache/ozone/pull/4655#issuecomment-1594028681

   > Thanks for the continuous updates @ashishkumar50. LGTM. I triggered CI and we should be good to merge when it passes.
   > 
   > Can you do a follow up PR to change the default value of the directory check to false, leaving only the RocksDB check? It should hopefully be a small change.
   
   Ok @errose28, I will create a Jira and raise PR for that.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sodonnel commented on a diff in pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "sodonnel (via GitHub)" <gi...@apache.org>.
sodonnel commented on code in PR #4655:
URL: https://github.com/apache/ozone/pull/4655#discussion_r1228397909


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/KeyValueContainerUtil.java:
##########
@@ -181,21 +182,37 @@ public static void removeContainer(KeyValueContainerData containerData,
 
   /**
    * Returns if there are no blocks in the container.
+   * @param store DBStore
    * @param containerData Container to check
+   * @param bCheckChunksFilePath Whether to check chunksfilepath has any blocks
    * @return true if the directory containing blocks is empty
    * @throws IOException
    */
-  public static boolean noBlocksInContainer(KeyValueContainerData
-                                                containerData)
+  public static boolean noBlocksInContainer(DatanodeStore store,
+                                            KeyValueContainerData
+                                            containerData,
+                                            boolean bCheckChunksFilePath)
       throws IOException {
+    Preconditions.checkNotNull(store);
     Preconditions.checkNotNull(containerData);
-    File chunksPath = new File(containerData.getChunksPath());
-    Preconditions.checkArgument(chunksPath.isDirectory());
-
-    try (DirectoryStream<Path> dir
-             = Files.newDirectoryStream(chunksPath.toPath())) {
-      return !dir.iterator().hasNext();
+    if (containerData.isOpen()) {
+      return false;
+    }
+    try (BlockIterator<BlockData> blockIterator =
+             store.getBlockIterator(containerData.getContainerID())) {
+      if (blockIterator.hasNext()) {
+        return false;
+      }
     }
+    if (bCheckChunksFilePath) {

Review Comment:
   Today I came across a scenario:
   
   1. Client tries to write an EC block, but at roughly the same time a node is stopped it was about to write to.
   2. The client tries to write out the stripe to the data and parity nodes, but if fails to goto all the nodes as one node is down. The client notices this and abandons the stripe and writes the data again to a new container.
   3. Now we have a case were some containers have a chunk written, but no "put block" got executed so RocksDB has no entry to the chunk / blocks.
   4. The container is reported to SCM and closed (trigged by the stale node handler closing the pipeline) with zero keys and zero bytes (due to the missing put block).
   5. Goes into delete handing in SCM, which sends a delete command over and over, as it is skipped by the DN due to the container not being empty.
   
   Feels like the existing check is too conservative, as this is a scenario that is valid to happen, and we need to be able to remove these replicas.
   
   I am wondering if the changes here would help or would the behavior be the same, leaving the container stuck in the DELETING state forever?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ashishkumar50 commented on a diff in pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "ashishkumar50 (via GitHub)" <gi...@apache.org>.
ashishkumar50 commented on code in PR #4655:
URL: https://github.com/apache/ozone/pull/4655#discussion_r1229007856


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/AbstractContainerReportHandler.java:
##########
@@ -370,6 +384,26 @@ private void updateContainerReplica(final DatanodeDetails datanodeDetails,
     }
   }
 
+  /**
+   * Determines whether container replica is empty or not.
+   *
+   * @param replicaProto container replica details.
+   * @return true if container replica is empty else false
+   */
+  private boolean fillEmpty(final ContainerReplicaProto replicaProto) {
+    if (replicaProto.hasIsEmpty()) {
+      return replicaProto.getIsEmpty();
+    } else {
+      // Handled when DN version is old then there will not be isEmpty field.
+      // In this case judge container empty based on keyCount
+      // and bytesUsed field.
+      if (replicaProto.getKeyCount() == 0 && replicaProto.getUsed() == 0) {
+        return true;
+      }

Review Comment:
   Since rolling upgrade is not yet supported, Once it starts supporting that will be first version and future code needs to handle accordingly. So before support there will not be recommendation to do rolling upgrade. 
   @sodonnel @errose28, Please confirm shall I remove this compatibility change? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sodonnel commented on pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "sodonnel (via GitHub)" <gi...@apache.org>.
sodonnel commented on PR #4655:
URL: https://github.com/apache/ozone/pull/4655#issuecomment-1542327900

   This change looks mostly good. I have added one comment about a potential compatibility issue we might have, but it should be easily fixed.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ashishkumar50 commented on a diff in pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "ashishkumar50 (via GitHub)" <gi...@apache.org>.
ashishkumar50 commented on code in PR #4655:
URL: https://github.com/apache/ozone/pull/4655#discussion_r1212780740


##########
hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/replication/TestLegacyReplicationManager.java:
##########
@@ -544,8 +544,7 @@ public void testContainerWithMissingReplicas()
             throws IOException, TimeoutException {
       createContainer(LifeCycleState.CLOSED);
       assertReplicaScheduled(0);
-      assertUnderReplicatedCount(1);
-      assertMissingCount(1);
+      assertEmptyReplicatedCount(1);

Review Comment:
   Added check in `isContainerEmpty()` to also verify replica should not be empty. Restored this test case.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ashishkumar50 commented on a diff in pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "ashishkumar50 (via GitHub)" <gi...@apache.org>.
ashishkumar50 commented on code in PR #4655:
URL: https://github.com/apache/ozone/pull/4655#discussion_r1212731462


##########
hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/keyvalue/TestKeyValueContainer.java:
##########
@@ -769,4 +786,93 @@ public void testAutoCompactionSmallSstFile() throws IOException {
       }
     }
   }
+
+  @Test
+  public void testIsEmptyContainerStateWhileImport() throws Exception {
+    long containerId = keyValueContainer.getContainerData().getContainerID();
+    createContainer();
+    long numberOfKeysToWrite = 1;
+    closeContainer();
+    populate(numberOfKeysToWrite);
+
+    //destination path
+    File folderToExport = folder.newFile("export.tar");
+    for (CopyContainerCompression compr : CopyContainerCompression.values()) {
+      TarContainerPacker packer = new TarContainerPacker(compr);
+
+      //export the container
+      try (FileOutputStream fos = new FileOutputStream(folderToExport)) {
+        keyValueContainer
+            .exportContainerData(fos, packer);
+      }
+
+      //delete the original one
+      keyValueContainer.delete();
+
+      //create a new one
+      KeyValueContainerData containerData =
+          new KeyValueContainerData(containerId,
+              keyValueContainerData.getLayoutVersion(),
+              keyValueContainerData.getMaxSize(), UUID.randomUUID().toString(),
+              datanodeId.toString());
+      containerData.setSchemaVersion(keyValueContainerData.getSchemaVersion());
+      KeyValueContainer container = new KeyValueContainer(containerData, CONF);
+
+      HddsVolume containerVolume = volumeChoosingPolicy.chooseVolume(
+          StorageVolumeUtil.getHddsVolumesList(volumeSet.getVolumesList()), 1);
+
+      container.populatePathFields(scmId, containerVolume);

Review Comment:
   We need that you can look in `KeyValueHandler#importContainer`
   we first call `populateContainerPathFields(...)`
   and then `container.importContainerData(...)`
   Even the other test(`testContainerImportExport`) has done the same.
   
   In `testEmptyContainerImportExport` we are creating new container, and then reusing the same which was already initialised in `createContainer()`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ashishkumar50 commented on a diff in pull request #4655: HDDS-8115. Do not use block count to determine if containers are empty.

Posted by "ashishkumar50 (via GitHub)" <gi...@apache.org>.
ashishkumar50 commented on code in PR #4655:
URL: https://github.com/apache/ozone/pull/4655#discussion_r1192046443


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/replication/health/EmptyContainerHandler.java:
##########
@@ -88,9 +88,9 @@ public boolean handle(ContainerCheckRequest request) {
   private boolean isContainerEmptyAndClosed(final ContainerInfo container,
       final Set<ContainerReplica> replicas) {
     return container.getState() == HddsProtos.LifeCycleState.CLOSED &&
-        container.getNumberOfKeys() == 0 && replicas.stream()

Review Comment:
   This is the intention of change, We don't want to depend on keyCount(which can be non zero even though container is empty) to determine whether container is empty or not, in some situation keyCount has gone wrong. Instead now we use empty flag for checking whether replica is empty or not based on the results from all replica, and that will judge container is empty or not.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org