You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2022/04/28 01:37:41 UTC

[GitHub] [ozone] JacksonYao287 opened a new pull request, #3360: HDDS-6663. remove container reference from scm if its state is DELETED

JacksonYao287 opened a new pull request, #3360:
URL: https://github.com/apache/ozone/pull/3360

   ## What changes were proposed in this pull request?
   
   when the state of a container in scm is deleting , and the all the replica of this container is deleted from the corresponding datanode, a CLEANUP event will be fired in scm, and the state of the container in scm will be changed to DELETED. when the state of a container is DELETED, the reference should be removed from scm.  
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-6663
   
   ## How was this patch tested?
   
   UT
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] errose28 commented on pull request #3360: HDDS-6663. remove container reference from scm if its state is DELETED

Posted by GitBox <gi...@apache.org>.
errose28 commented on PR #3360:
URL: https://github.com/apache/ozone/pull/3360#issuecomment-1287340191

   Hi @JacksonYao287 @sodonnel I still have some concerns about the way this change is implemented.
   
   ## No clear problem statement
   
   Is the motive to reduce memory usage or RocksDB usage in SCM? If it is memory usage, then we can leave the deleted containers in RocksDB only and not load them into the `ContainerStateMap`. If it is to reduce RocksDB footprint I think this might be a pre-mature optimization as I'm not aware of any RocksDB scalability issues encountered on SCM currently. 
   
   ## Changing the config key fragments the container delete path
   
   Containers deleted before this change is applied remain in SCM forever. Similarly if the config is turned off and on and containers are deleted, they will not be cleaned up when the config is restored to on. I guess this is more of a debugging inconvenience than an error, but will make examining existing clusters confusing none the less.
   
   ## No integration with unknown container handling
   
   - This change only works when unknown container handling is set to delete. There would need to be some validation that the two configs are set in a way that works with each other, or they would need to be combined to a new config.
   
   - Even if the configs are unified, turning the unified config on then off will cause problems. SCM would log reports of unknown containers that were deleted while the config was on. Currently this log only happens when there is a bug in the system, but now there is no way to know whether the config changes or a bug caused the unknown containers.
   
   ## Confusion about container delete vs. block delete
   
   This is not really a comment on the code but a response to some of the discussion on this PR. The second half of[ this comment](https://github.com/apache/ozone/pull/3360#issuecomment-1117404215) from Stephen and [this comment](https://github.com/apache/ozone/pull/3360#issuecomment-1119204766) from Jackson are definitely valid points, but they concern block deletion, not container deletion. Changes proposed here will not solve these problems. A change in this area will only affect SCM memory usage or rocksDB usage with potential debugging tradeoffs.
   
   IMO the best thing to do is:
   - Do not load the deleted containers into SCM memory. Leave them in RocksDB to be queried if needed.
   - No config key for this behavior.
   - Leave unknown container handling as is.
       - The code path would only be triggered if there is a bug in the system.
       - In this case the config can be used to resolve the issue if it is determined the containers are not relevant.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] errose28 commented on pull request #3360: HDDS-6663. remove container reference from scm if its state is DELETED

Posted by GitBox <gi...@apache.org>.
errose28 commented on PR #3360:
URL: https://github.com/apache/ozone/pull/3360#issuecomment-1308066256

   Hi @JacksonYao287 thanks for continuing the work on this. I do still have some concerns with this proposal though.
   
   > first, in this patch, DELETED container is removed from scm by default(this can be configured). and for those unknown container , no matter it is empty or not, we will deleted it by default(this can be configured).
   
   This doesn't address my [original point](https://github.com/apache/ozone/pull/3360#issuecomment-1287340191) that these two configs conflict for some values. I also do not think that config keys should be used to fragment core internal operations like deletion. Ozone only needs one delete path and we should be confident that the one we choose works. If we are not confident in this delete container proposal and are trying to hide it behind a config then that is a sign we should not go forward with the change.
   
   > although Deleting a non-empty unknown container by default is an aggressive operation, i think it makes sense here. in development environment, we can set the default operation to log WARN, so it give us a chance to investigate the problem.
   
   This assumes the issue shows up in dev first, which may not be the case. There's so many variables in an Ozone deployment it is possible that the dev environments work fine, but the perfect storm of issues does not happen until some production deployment. Having debugged multiple dev and prod Ozone deployments I can confirm this does happen for corner case bugs. This is why I have been advocating for a more cautious approach.
   
   > my idea here is after container is closed, only deleteblockLog can shrink the container size, not calculateUsage in containerReport handler. the size of container is reduced only after the deleteTransaction is committed in deleteblockLog. so even if a container replica has some orphan blocks and a bigger size, scm has a definite size of the container. when the size of a container in scm is shrinked to zero, scm will delete all the replicas of the container.
   
   We should use block count to determine if the container is empty, as used bytes is not exact like you mentioned. Block count is currently used by SCM to determine if a container is empty. There are some invariants that I believe make it impossible to solve this problem without using Recon:
   - Deleted block log entries must eventually be deleted.
   - SCM cannot know all possible replicas at a given time. An older one could be resurrected at any time.
   - SCM's delete block API used by OM must be idempotent.
   
   Given these properties I believe we cannot know the definite size of the container once blocks start getting deleted. We can only know the current size of each replica. For example, say a block is deleted from all known container replicas and SCM's "definite" block count decreases by 1. SCM removes the entry from the block log. Now maybe OM missed the ack for that delete so it resends it. SCM would repeat the same steps, incorrectly decrementing the "definite" block count again. It has no permanent record of which blocks were already deleted. Since old datanodes can be resurrected, they are also not a source of truth for which block deletions have already been issued by SCM.
   
   In summary, I feel that my [proposal](https://github.com/apache/ozone/pull/3360#issuecomment-1297759990) has the following advantages:
   
   - No config changes
   Existing config keys remain with their original defaults and functionality. The only difference is that `hdds.scm.unknown-container.action` now only applies to unknown *and empty* containers. No new configs need to be added. No need to define behavior for unintuitive configuration combinations.
   
   - Only confirmed empty containers are deleted.
   This minimizes the risk of data loss due to current code bugs or future regressions that could for any reason cause SCM to "forget" a container, or datanodes to misreport a container.
   
   The disadvantage of my proposal is that for the [block deletion corner case](https://github.com/apache/ozone/pull/3360#issuecomment-1119204766) discussed earlier, containers with those blocks will remain in the system. However this is true even today with the current container delete flow, since all replicas must have zero block count for deletion to happen. The orphan block problem is a block problem, not a container problem. IMO this means it should not be solved as an extension of container deletion by deleting non-empty containers. I think orphan block cleanup with Recon is the correct answer to this, even if it may not be implemented in the near future.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] adoroszlai commented on pull request #3360: HDDS-6663. remove container reference from scm if its state is DELETED

Posted by GitBox <gi...@apache.org>.
adoroszlai commented on PR #3360:
URL: https://github.com/apache/ozone/pull/3360#issuecomment-1398484867

   /pending


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] JacksonYao287 commented on pull request #3360: HDDS-6663. remove container reference from scm if its state is DELETED

Posted by GitBox <gi...@apache.org>.
JacksonYao287 commented on PR #3360:
URL: https://github.com/apache/ozone/pull/3360#issuecomment-1286417215

   @sodonnel @errose28 do you have any other comments ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] JacksonYao287 commented on pull request #3360: HDDS-6663. remove container reference from scm if its state is DELETED

Posted by GitBox <gi...@apache.org>.
JacksonYao287 commented on PR #3360:
URL: https://github.com/apache/ozone/pull/3360#issuecomment-1118807667

   thanks @sodonnel for the comments, i agree with your opinion.
   >Then the SCM memory will get used with not just empty container references, but far more containers than we expected (small container problem).
   
   this is a very good point.
   
   >One suggestion was to create a mapping in SCM. Another idea may be to find the keys associated with a container in Recon (reverse index of containerID -> key) and then update OM.
   
   since recon may hold stale  inform, creating a mapping in scm may be better.  
   
   >perhaps removing the deleted container reference in SCM could be tied to the same config 
   
   yea, agree.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] swamirishi commented on pull request #3360: HDDS-6663. remove container reference from scm if its state is DELETED

Posted by "swamirishi (via GitHub)" <gi...@apache.org>.
swamirishi commented on PR #3360:
URL: https://github.com/apache/ozone/pull/3360#issuecomment-1407752073

   @JacksonYao287 Are you working on this PR?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sodonnel commented on pull request #3360: HDDS-6663. remove container reference from scm if its state is DELETED

Posted by GitBox <gi...@apache.org>.
sodonnel commented on PR #3360:
URL: https://github.com/apache/ozone/pull/3360#issuecomment-1290598823

   I can kind of see this both ways. 
   
   If we take it to one extreme - if we want to ensure stuff is kept around for debugging, why are we letting the empty replicas get deleted? On a real cluster blocks will get deleted all the time, and we will end up with empty containers. Some bug (which I think we already had) could erroneously delete replicas it should not, but its also not scalable to keep all empty replicas forever. They will grow slowly but surely to a large number over months and years and cause problems in the reporting and report processing.
   
   So if we have deleted the replicas, what use is a ContainerInfo reference in SCM? Even if we have it, we cannot do anything with it - there is no replica for it. All it tells us is that a container once existed with that ID, and its empty.
   
   Its bad that it will use up memory slowly over time. Replication Manager will also need to check it on each pass.
   
   We can keep it in the Rocks DB table unloaded on SCM startup, but (I think) we will still need to read out the record and decide on whether or not to load it into memory. Even will millions of such references the overhead should be small - it will only slowdown the SCM startup by a tiny amount, but its still a small overhead.
   
   I'm in favor of just removing them, as I see little value in keeping them. We should certainly log they have been removed - of course the logs will roll off, but at least that is some evidence they have been removed for a reason.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] JacksonYao287 commented on pull request #3360: HDDS-6663. remove container reference from scm if its state is DELETED

Posted by GitBox <gi...@apache.org>.
JacksonYao287 commented on PR #3360:
URL: https://github.com/apache/ozone/pull/3360#issuecomment-1197702875

   @sodonnel i have updated this patch, please take a look!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] JacksonYao287 commented on pull request #3360: HDDS-6663. remove container reference from scm if its state is DELETED

Posted by GitBox <gi...@apache.org>.
JacksonYao287 commented on PR #3360:
URL: https://github.com/apache/ozone/pull/3360#issuecomment-1119204766

   Another question comes to my mind. suppose we have contianer c1 and three replicas r1, r2,r3 on datanode d1, d2,d3 ,respectively. if d3 is dead, r3 will lost, so RM will find this situation and copy r2 to datanode d4 as r4. after a while,  some block deletion commands comes to d1, d2,d4 to delete some blocks in r1, r2, r4 and the delete transaction will be removed from the deleteBlockLog. if d3 resurrects , r3 comes back and has more blocks than r1, r2 ,r4, since d3 does not receive those deletion commands. when d3 sends a container report to scm , the following function in containerReportHandler will change the container stats to the same as r3, since r3 has a bigger KeyCounts and UsedBytes.
   ```
     private long calculateUsage(ContainerInfo containerInfo, long lastValue,
         long thisValue) {
       if (containerInfo.getState().equals(HddsProtos.LifeCycleState.OPEN)) {
         // Open containers are generally growing in key count and size, the
         // overall size should be the min of all reported replicas.
         return Math.min(lastValue, thisValue);
       } else {
         // Containers which are not open can only shrink in size, so use the
         // largest values reported.
         return Math.max(lastValue, thisValue);
       }
     }
   ```
   since those zombie blocks will not be deleted from r3(no deletion command for these blocks will be received) , the KeyCounts and UsedBytes will not be zero, and thus , the container replica at datanodes and the reference at scm will never be removed. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sodonnel commented on a diff in pull request #3360: HDDS-6663. remove container reference from scm if its state is DELETED

Posted by GitBox <gi...@apache.org>.
sodonnel commented on code in PR #3360:
URL: https://github.com/apache/ozone/pull/3360#discussion_r1001599903


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerStateManagerImpl.java:
##########
@@ -260,11 +260,23 @@ private void initialize() throws IOException {
   }
 
   private Map<LifeCycleEvent, CheckedConsumer<ContainerInfo, IOException>>
-      getContainerStateChangeActions() {
+      getContainerStateChangeActions(final Configuration conf) {
     final Map<LifeCycleEvent, CheckedConsumer<ContainerInfo, IOException>>
         actions = new EnumMap<>(LifeCycleEvent.class);
     actions.put(FINALIZE, info -> pipelineManager
         .removeContainerFromPipeline(info.getPipelineID(), info.containerID()));
+
+    boolean shouldRemoveDeletedContainer = conf.getBoolean(
+        ScmConfigKeys.OZONE_SCM_REMOVE_DELETED_CONTAINER_ENABLED,
+        ScmConfigKeys.OZONE_SCM_REMOVE_DELETED_CONTAINER_ENABLED_DEFAULT);
+    if (shouldRemoveDeletedContainer) {
+      // when a CLEANUP event if fired , the state of the container will be
+      // changed to DELETED. at this time, we should remove the reference
+      // of the container from scm. this happens immediately, thus DELETED
+      // is a transient state and hardly be captured.
+      actions.put(CLEANUP, info ->
+          removeContainer(info.containerID().getProtobuf()));

Review Comment:
   I am not sure if this is logged anywhere else, but could we ensure a lot message is written at INFO level saying container xxx has been deleted as it is empty.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] JacksonYao287 commented on pull request #3360: HDDS-6663. remove container reference from scm if its state is DELETED

Posted by GitBox <gi...@apache.org>.
JacksonYao287 commented on PR #3360:
URL: https://github.com/apache/ozone/pull/3360#issuecomment-1302895601

   my plan is to complete this work in two steps.
   first, in this patch, DELETED container is removed from scm by default(this can be configured). and for those unknown container ,  no matter it is empty or not, we will deleted it by default(this can be configured).  in production environment,which use the default configuration, no matter an unknown container is empty, it will not be referenced by any key, we can just delete it. so although Deleting a non-empty unknown container by default is an aggressive operation, i think it makes sense here. in development environment, we can set the default operation to log WARN, so it give us a chance to investigate the problem. and for those old DELETED container reference in rocksDB, we can just left them there and do nothing and later newly deployed cluster will not have this problem.
   
   second, i will create a new patch to solve [this problem.](https://github.com/apache/ozone/pull/3360#issuecomment-1119204766). my idea here is after container is closed, only deleteblockLog can shrink the container size, not `calculateUsage ` in containerReport handler. the size of container is reduced only after the deleteTransaction is committed in deleteblockLog. so even if a container replica has some orphan blocks and a bigger size, scm has a definite size of the container. when the size of a container in scm is shrinked to zero, scm will delete all the replicas of the container . if a container is deleted , and scm receives a container report from a resurrected datanode, it will be treated as an unknown container , and will be deleted by scm by default.
   
   @errose28 @sodonnel what do you think?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] JacksonYao287 closed pull request #3360: HDDS-6663. remove container reference from scm if its state is DELETED

Posted by "JacksonYao287 (via GitHub)" <gi...@apache.org>.
JacksonYao287 closed pull request #3360: HDDS-6663. remove container reference from scm if its state is DELETED
URL: https://github.com/apache/ozone/pull/3360


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sodonnel commented on pull request #3360: HDDS-6663. remove container reference from scm if its state is DELETED

Posted by GitBox <gi...@apache.org>.
sodonnel commented on PR #3360:
URL: https://github.com/apache/ozone/pull/3360#issuecomment-1117404215

   I think it is reasonable to remove containers that are empty. There isn't a good reason to keep them beyond debugging and if a DN comes back up with a container on it, we should probably handle it as an unknown container and remove it. I know the default Ozone behaviour is to ignore unknown containers and not remove them. I understand why (we don't want to accidentality delete data), but I feel that over time the number of empty containers in a cluster will grow. Also if the cluster is out of safemode, it seems overly pessimistic to keep unknown containers.
   
   It may seem unrealistic that we will end up with a log of empty containers, but if we have Hive workloads for example. They can have many phases and stage the temp data from between jobs on the cluster. This will fill containers and then they will be purged. A job could stage 100's of GB potentially in a large multi-table join.
   
   Related to this, and a problem I foresee, is that a cluster running these Hive type workloads may end up filling containers partly with temp data and partly with real data. The containers get closed. The tmp data gets removed and we end up with lots of small containers that are never totally empty, but may have much less than 5GB of data in them. Then the SCM memory will get used with not just empty container references, but far more containers than we expected (small container problem).
   
   We ultimately need to solve that small container problem by merging containers, but there is at least one major problem. We cannot easily update OM with the new containerID. One suggestion was to create a mapping in SCM. Another idea may be to find the keys associated with a container in Recon (reverse index of containerID -> key) and then update OM.
   
   For this PR, perhaps removing the deleted container reference in SCM could be tied to the same config that lets us delete or not delete unknown containers, so it either logs a warning or deletes the SCM reference depending on the value?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] errose28 commented on pull request #3360: HDDS-6663. remove container reference from scm if its state is DELETED

Posted by GitBox <gi...@apache.org>.
errose28 commented on PR #3360:
URL: https://github.com/apache/ozone/pull/3360#issuecomment-1297759990

   Ok we can continue with the proposal to remove the deleted containers from RocksDB. Does this description line up with what is being proposed?
   
   - When replication manager finds an empty container to delete, it will send the delete commands to replicas and remove the container from in memory and the DB.
       - If datanodes miss the command, the container will be deleted if empty on the next FCR by the container report handler.
   - When SCM gets a report for a container that it does not have in the DB it will delete it if it is empty (block count is 0), otherwise it will handle the container based on the value of `hdds.scm.unknown-container.action`.
   - DELETED containers already in SCM DB will remain there forever.
       - If we want to get rid of them on startup, we would need a layout feature for this as it would affect downgrades otherwise.
   - This flow will not be behind a config key.
   
   The catch here is that the a container will get removed from SCM if all replicas in the current replica set are empty, but SCM cannot know about all replicas that may have a copy of the container. As @JacksonYao287  pointed out in [this comment](https://github.com/apache/ozone/pull/3360#issuecomment-1119204766), an old replica whose container has been deleted could resurface, and have blocks that already got deleted and cleared. If we are only deleting unknown empty containers, then this container would not get deleted. However, deleting unknown non-empty containers seems dangerous. Removal of these containers might have to wait until we have orphan block cleanup on Recon.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] errose28 commented on pull request #3360: HDDS-6663. remove container reference from scm if its state is DELETED

Posted by GitBox <gi...@apache.org>.
errose28 commented on PR #3360:
URL: https://github.com/apache/ozone/pull/3360#issuecomment-1116767515

   @JacksonYao287 what is the motivation behind this change? Maintaining a record of all deleted containers in SCM has the following benefits:
   - If a dead node comes back online after a container is deleted and starts reporting a deleted container, SCM can delete it instead of falling back to the logic for unknown containers in `ContainerReportHandler`.
   - Easier debug-ability when working through potential HDDS bugs.
   
   I do not see problems with keeping deleted container entries. Even if we had to track 10 billion containers during the system's lifetime (which I suspect might be pushing the limits of what RocksDB can handle) that would correspond to about 5gb * 10billion = 50 exabytes of data deleted from the cluster during its lifetime, which is a massive amount of churn. If we are concerned about SCM memory usage since it is tracking all the containers in memory, a better approach might be to track deleted containers in RocksDB only.
   
   This is actually a somewhat major change in the way HDDS functions, so if we want to go through with this, I think we would need to answer some more questions:
   1. What are the specific problems/use cases where the current approach will cause problems?
   2. How serious are the problems with the existing approach compared to the potential problems of this new approach?
   3. What happens when a dead datanode comes back and starts reporting a container that was deleted and removed?
   4. How is case 3 reconciled with legitimate unknown containers that may be reported to SCM due to bugs or improper configuration?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] Xushaohong commented on pull request #3360: HDDS-6663. remove container reference from scm if its state is DELETED

Posted by GitBox <gi...@apache.org>.
Xushaohong commented on PR #3360:
URL: https://github.com/apache/ozone/pull/3360#issuecomment-1231303893

   > a better approach might be to track deleted containers in RocksDB only.
   
   @errose28 
   My thought is also to do this. Some heavy load scenarios could be benefited from this. If the cluster runs for more than several months, it would have lots of deleted container info in memory, which is nearly not used anymore. Saving the record in the RDB, we could have a backup path to search the container. This backup solution would bring some query overhead but saves the memory.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] JacksonYao287 commented on pull request #3360: HDDS-6663. remove container reference from scm if its state is DELETED

Posted by "JacksonYao287 (via GitHub)" <gi...@apache.org>.
JacksonYao287 commented on PR #3360:
URL: https://github.com/apache/ozone/pull/3360#issuecomment-1411931647

   > @JacksonYao287 Are you working on this PR?
   
   @swamirishi , sorry, i am now busy with some internal stuff and not have enough time to continue this work! I think it is better to close this patch for now, and reopen it later. also, anyone interesting with this issue can take this work after it is cloesd


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] JacksonYao287 commented on pull request #3360: HDDS-6663. remove container reference from scm if its state is DELETED

Posted by GitBox <gi...@apache.org>.
JacksonYao287 commented on PR #3360:
URL: https://github.com/apache/ozone/pull/3360#issuecomment-1111646031

   @sodonnel @ChenSammi  PTAL, thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] JacksonYao287 commented on pull request #3360: HDDS-6663. remove container reference from scm if its state is DELETED

Posted by GitBox <gi...@apache.org>.
JacksonYao287 commented on PR #3360:
URL: https://github.com/apache/ozone/pull/3360#issuecomment-1293099153

   > 
   
   @errose28 does this make sense to you? or do you have any other points?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] JacksonYao287 commented on pull request #3360: HDDS-6663. remove container reference from scm if its state is DELETED

Posted by GitBox <gi...@apache.org>.
JacksonYao287 commented on PR #3360:
URL: https://github.com/apache/ozone/pull/3360#issuecomment-1117104859

   thanks @errose28 for the review!
   the main motivation behind this change is the SCM memory usage. i originally thought it will take too much RAM if there are two many deleted containers.  here, seems i have overthought it. will close this PR shortly.
   
   >a better approach might be to track deleted containers in RocksDB only.
   
   can you please explain this in more detail? do you mean if we have to track a massive amount of containers, no matter put or get the state of container , we just hit rocksdb directly, but not maintain a state map in RAM? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sodonnel commented on pull request #3360: HDDS-6663. remove container reference from scm if its state is DELETED

Posted by GitBox <gi...@apache.org>.
sodonnel commented on PR #3360:
URL: https://github.com/apache/ozone/pull/3360#issuecomment-1286723890

   IMO I think this change looks good. Perhaps we should add one more test that confirms the reference is not deleted if the config flag is false, just to catch any future changes which may break this.
   
   I also believe that the best thing is to have this "delete empty container references" on by default as in the normal case, there is no real reason to want to keep an empty reference around.
   
   We should probably log it is being removed however, so there is some trace or indication about why it disappeared.
   
   @errose28 what do you think? You had some differing opinions on this before?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org