Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2020/02/25 01:27:26 UTC

[GitHub] [hadoop-ozone] bharatviswa504 opened a new pull request #598: HDDS-3067. Fix Bug in Scrub Pipeline causing destory pipelines after SCM restart.

bharatviswa504 opened a new pull request #598: HDDS-3067. Fix Bug in Scrub Pipeline causing destory pipelines after SCM restart.
URL: https://github.com/apache/hadoop-ozone/pull/598
 
 
   ## What changes were proposed in this pull request?
   
   Fix the destruction of pipelines after an SCM restart. The cause is a bug in pipeline loading: the creation timestamp is set to the pipeline's original creation time. After a restart, the timestamp should instead be set to the time at which the pipeline is loaded from the pipeline DB.
    
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-3067
   
   ## How was this patch tested?
   
   Tested on a cluster: SCM no longer destroys pipelines after a restart. Also fixed the RatisPipelineProvider test.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] adoroszlai commented on a change in pull request #598: HDDS-3067. Fix Bug in Scrub Pipeline causing destory pipelines after SCM restart.

Posted by GitBox <gi...@apache.org>.
adoroszlai commented on a change in pull request #598: HDDS-3067. Fix Bug in Scrub Pipeline causing destory pipelines after SCM restart.
URL: https://github.com/apache/hadoop-ozone/pull/598#discussion_r383679859
 
 

 ##########
 File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/pipeline/SCMPipelineManager.java
 ##########
 @@ -153,6 +153,8 @@ protected void initializePipelineState() throws IOException {
           .newBuilder(HddsProtos.Pipeline.PARSER.parseFrom(entry.getValue()));
       Pipeline pipeline = Pipeline.getFromProtobuf(pipelineBuilder.setState(
           HddsProtos.PipelineState.PIPELINE_ALLOCATED).build());
+      // When SCM is restarted, set Creation time with current time.
+      pipeline.setCreationTimestamp(Instant.now());
 
 Review comment:
   Nit: this should go after `Preconditions.checkNotNull` in the next line, otherwise the check becomes unnecessary (will never fail).  The result is the same anyway (NPE).
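
A minimal sketch of the ordering suggested in this review comment: validate the reference first, then call the setter. `PipelineStub`, `NullCheckOrderSketch`, and the error message are hypothetical names for illustration, and `java.util.Objects.requireNonNull` stands in for Guava's `Preconditions.checkNotNull` so the example has no dependencies.

```java
import java.time.Instant;
import java.util.Objects;

// Hypothetical stand-in for the real Pipeline class; only the timestamp is modeled.
final class PipelineStub {
  private Instant creationTimestamp;

  void setCreationTimestamp(Instant timestamp) {
    this.creationTimestamp = timestamp;
  }

  Instant getCreationTimestamp() {
    return creationTimestamp;
  }
}

final class NullCheckOrderSketch {
  // Check first, then use. If the setter were called first, a null pipeline
  // would already throw there and the explicit check could never fail.
  static PipelineStub restore(PipelineStub pipeline, Instant now) {
    Objects.requireNonNull(pipeline, "Pipeline cannot be null"); // assumed message
    pipeline.setCreationTimestamp(now);
    return pipeline;
  }

  public static void main(String[] args) {
    PipelineStub restored = restore(new PipelineStub(), Instant.now());
    System.out.println("creation timestamp reset to " + restored.getCreationTimestamp());
  }
}
```

Either order ends in a `NullPointerException` for a null pipeline, as the comment notes; checking first only ensures the exception carries the intended message.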



[GitHub] [hadoop-ozone] bharatviswa504 commented on issue #598: HDDS-3067. Fix Bug in Scrub Pipeline causing destory pipelines after SCM restart.

Posted by GitBox <gi...@apache.org>.
bharatviswa504 commented on issue #598: HDDS-3067. Fix Bug in Scrub Pipeline causing destory pipelines after SCM restart.
URL: https://github.com/apache/hadoop-ozone/pull/598#issuecomment-590970141
 
 
   > I'm not sure about the other part: maybe it's just me, but it seems to be a workaround for the behavior of the scrubber, and we are losing information (maybe not important information, I don't know). I think other possible fixes should be considered (eg. adding a new state for the restored pipelines, or introducing a "grace period" in the scrubber after startup).
   
   
   As I understand it, the creation timestamp is set when a pipeline is created. This is what happens when SCM is set up fresh and started.
   After an SCM restart, all pipelines are in the allocated state until pipeline reports are received from the datanodes. So I assume we can set the creation time to the SCM start time and use that to scrub the pipeline, since the main purpose of the scrubber is to detect pipelines that have been in the allocated state for longer than the configured ozone.scm.pipeline.allocated.timeout. One thing to note: we don't change the timestamp in the DB, so it always holds the original creation timestamp as long as the pipeline is not destroyed.
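
The scrubbing rule described in the paragraph above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual SCM code: `ScrubSketch`, `State`, and `shouldScrub` are hypothetical names, the 5-minute timeout is an assumed value standing in for ozone.scm.pipeline.allocated.timeout, and the strict "greater than" comparison is a guess.

```java
import java.time.Duration;
import java.time.Instant;

final class ScrubSketch {
  // Simplified, hypothetical subset of pipeline states.
  enum State { ALLOCATED, OPEN }

  // True when a pipeline has stayed ALLOCATED for longer than the timeout.
  static boolean shouldScrub(State state, Instant creation, Instant now, Duration timeout) {
    return state == State.ALLOCATED
        && Duration.between(creation, now).compareTo(timeout) > 0;
  }

  public static void main(String[] args) {
    Duration timeout = Duration.ofMinutes(5); // assumed value, for illustration only
    Instant restart = Instant.parse("2020-02-25T01:00:00Z");
    Instant persisted = restart.minus(Duration.ofDays(1)); // timestamp kept in the DB
    Instant firstScrub = restart.plusSeconds(30);

    // With the persisted timestamp, the pipeline looks a day old and is scrubbed.
    System.out.println(shouldScrub(State.ALLOCATED, persisted, firstScrub, timeout)); // true
    // With the timestamp reset at load time, it gets the full timeout again.
    System.out.println(shouldScrub(State.ALLOCATED, restart, firstScrub, timeout)); // false
  }
}
```

Because every pipeline is loaded as allocated after a restart, the first case applies to any pipeline older than the timeout, which matches the destruction this PR addresses.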
   
   Let me know if you have another approach in mind; I have not clearly understood the proposal from the comment above.
   
   
   > It seems that the two changes are not strictly related: the fix for the test is correct by itself, the test passes with or without the call to pipeline.setCreationTimestamp.
   
   Yes, the test fix is not related to this change. While looking for tests that cover Ratis pipelines, I found this one and fixed it as part of this PR.
   
   




[GitHub] [hadoop-ozone] bharatviswa504 merged pull request #598: HDDS-3067. Fix Bug in Scrub Pipeline causing destory pipelines after SCM restart.

Posted by GitBox <gi...@apache.org>.
bharatviswa504 merged pull request #598: HDDS-3067. Fix Bug in Scrub Pipeline causing destory pipelines after SCM restart.
URL: https://github.com/apache/hadoop-ozone/pull/598
 
 
   



[GitHub] [hadoop-ozone] bharatviswa504 commented on issue #598: HDDS-3067. Fix Bug in Scrub Pipeline causing destory pipelines after SCM restart.

Posted by GitBox <gi...@apache.org>.
bharatviswa504 commented on issue #598: HDDS-3067. Fix Bug in Scrub Pipeline causing destory pipelines after SCM restart.
URL: https://github.com/apache/hadoop-ozone/pull/598#issuecomment-590978711
 
 
   Thank you @adoroszlai for the review and offline discussion.
