Posted to commits@samza.apache.org by GitBox <gi...@apache.org> on 2020/02/20 00:43:10 UTC

[GitHub] [samza] bkonold opened a new pull request #1283: SAMZA-2464: Container shuts down when task fails to remove old state checkpoint dirs

bkonold opened a new pull request #1283: SAMZA-2464: Container shuts down when task fails to remove old state checkpoint dirs
URL: https://github.com/apache/samza/pull/1283
 
 
   **Symptom:** The container shuts down with an exception when TaskInstance.commit invokes TaskStorageManager.removeOldCheckpoints.
    
   **Cause:** Concurrent modification of checkpoint directories by other processes or threads may cause a FileNotFoundException to be thrown, which shuts down the container. An IOException may also be thrown for other miscellaneous failures. Such failures should not shut down the container; they should be logged and processing allowed to continue.
    
   **Tests:** Added a unit test that fails when an exception from removeOldCheckpoints is not caught.
   
   **API Changes:** None
   **Upgrade instructions:** None
   **Usage instructions:** None
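
   The behaviour described under Cause and Tests can be sketched in isolation as follows. This is a minimal illustration, not the code in this PR: `CheckpointCleaner`, `CommitSketch`, the `errors` buffer (standing in for a logger), and the `String` checkpoint id are all hypothetical names and types chosen for the example.

```scala
import java.io.FileNotFoundException
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for the storage manager; not Samza's actual interface.
trait CheckpointCleaner {
  def removeOldCheckpoints(checkpointId: String): Unit
}

object CommitSketch {
  // Collected error messages, standing in for a real logger (assumption).
  val errors: ArrayBuffer[String] = ArrayBuffer.empty[String]

  // Returns true when the commit path ran to completion.
  def commit(cleaner: CheckpointCleaner, checkpointId: String): Boolean = {
    if (cleaner != null) {
      try {
        cleaner.removeOldCheckpoints(checkpointId)
      } catch {
        // Cleanup is treated as best effort: record the failure and continue.
        case e: Exception =>
          errors += s"Failed to remove old checkpoints: ${e.getMessage}"
      }
    }
    true
  }

  // A cleaner that always fails, mimicking a directory removed concurrently.
  val failingCleaner: CheckpointCleaner = new CheckpointCleaner {
    def removeOldCheckpoints(checkpointId: String): Unit =
      throw new FileNotFoundException("checkpoint dir vanished")
  }
}
```

   Calling `CommitSketch.commit(CommitSketch.failingCleaner, "42")` completes normally and leaves one entry in `errors`, which is roughly the property the added unit test is described as checking.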

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [samza] prateekm commented on a change in pull request #1283: SAMZA-2464: Container shuts down when task fails to remove old state checkpoint dirs

Posted by GitBox <gi...@apache.org>.
prateekm commented on a change in pull request #1283: SAMZA-2464: Container shuts down when task fails to remove old state checkpoint dirs
URL: https://github.com/apache/samza/pull/1283#discussion_r381630922
 
 

 ##########
 File path: samza-core/src/main/scala/org/apache/samza/container/TaskInstance.scala
 ##########
 @@ -274,7 +274,11 @@ class TaskInstance(
 
     if (storageManager != null) {
       trace("Remove old checkpoint stores for taskName: %s" format taskName)
-      storageManager.removeOldCheckpoints(checkpointId)
+      try {
+        storageManager.removeOldCheckpoints(checkpointId)
+      } catch {
+        case e: Exception => error("Failed to remove old checkpoints", e)
 
 Review comment:
   Minor: include current checkpoint id and task name in message.
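
   One possible shape for such a message is sketched below. The helper name, the message wording, and the use of `String` for both values are assumptions made for illustration; the wording that was actually committed is not shown in this thread.

```scala
object CleanupLogMessage {
  // Hypothetical helper: builds an error message that carries both the
  // task name and the current checkpoint id, as the review comment asks.
  def failureMessage(taskName: String, checkpointId: String): String =
    "Failed to remove old checkpoints for taskName: %s, current checkpointId: %s"
      .format(taskName, checkpointId)
}
```

   In the catch block from the diff above, the result could be passed as the first argument to `error(...)` alongside the caught exception.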


[GitHub] [samza] bkonold commented on a change in pull request #1283: SAMZA-2464: Container shuts down when task fails to remove old state checkpoint dirs

Posted by GitBox <gi...@apache.org>.
bkonold commented on a change in pull request #1283: SAMZA-2464: Container shuts down when task fails to remove old state checkpoint dirs
URL: https://github.com/apache/samza/pull/1283#discussion_r381633029
 
 

 ##########
 File path: samza-core/src/main/scala/org/apache/samza/container/TaskInstance.scala
 ##########
 @@ -274,7 +274,11 @@ class TaskInstance(
 
     if (storageManager != null) {
       trace("Remove old checkpoint stores for taskName: %s" format taskName)
-      storageManager.removeOldCheckpoints(checkpointId)
+      try {
+        storageManager.removeOldCheckpoints(checkpointId)
+      } catch {
+        case e: Exception => error("Failed to remove old checkpoints", e)
 
 Review comment:
   Ack, I'll add those.


[GitHub] [samza] prateekm merged pull request #1283: SAMZA-2464: Container shuts down when task fails to remove old state checkpoint dirs

Posted by GitBox <gi...@apache.org>.
prateekm merged pull request #1283: SAMZA-2464: Container shuts down when task fails to remove old state checkpoint dirs
URL: https://github.com/apache/samza/pull/1283
 
 
   
