Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2020/02/20 23:51:44 UTC

[GitHub] [hadoop-ozone] swagle opened a new pull request #576: HDDS-2716. Add integration test to verify pipeline closed on read statemachine failure

swagle opened a new pull request #576: HDDS-2716. Add integration test to verify pipeline closed on read statemachine failure
URL: https://github.com/apache/hadoop-ozone/pull/576
 
 
   ## What changes were proposed in this pull request?
   If a readStateMachine call fails, there is no way for the follower to recover, and the RATIS-795 change results in a call to notifyLogFailed(), which closes the pipeline.
   
   ## What is the link to the Apache JIRA?
   https://issues.apache.org/jira/browse/HDDS-2716
   
   ## How was this patch tested?
   Only an integration test was added.
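
The failure path described in the pull request can be sketched as a small stand-alone model. This is only an illustration under stated assumptions: `Pipeline`, `FollowerStateMachine`, and `appendEntry` are hypothetical stand-ins, not the Ozone or Ratis classes, and the method names merely mirror the wording of the description.

```java
import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

public class ReadFailureSketch {
    /** Hypothetical stand-in for a Ratis pipeline; not the real Ozone class. */
    static class Pipeline {
        private volatile boolean open = true;
        void close() { open = false; }
        boolean isOpen() { return open; }
    }

    /** Hypothetical stand-in for the state machine running on a follower. */
    static class FollowerStateMachine {
        private final Pipeline pipeline;
        private final boolean dataReadable;

        FollowerStateMachine(Pipeline pipeline, boolean dataReadable) {
            this.pipeline = pipeline;
            this.dataReadable = dataReadable;
        }

        /** Returns the data for a log index, or a failed future if it cannot be read. */
        CompletableFuture<String> readStateMachine(long logIndex) {
            if (!dataReadable) {
                CompletableFuture<String> failed = new CompletableFuture<>();
                failed.completeExceptionally(
                    new IOException("cannot read data for index " + logIndex));
                return failed;
            }
            return CompletableFuture.completedFuture("data@" + logIndex);
        }

        /** Assumed reaction to a log failure in this model: close the pipeline. */
        void notifyLogFailed(Throwable cause) {
            pipeline.close();
        }
    }

    /** Models the log layer: a failed read is reported through notifyLogFailed. */
    static void appendEntry(FollowerStateMachine sm, long logIndex) {
        try {
            sm.readStateMachine(logIndex).join();
        } catch (CompletionException e) {
            sm.notifyLogFailed(e.getCause());
        }
    }

    public static void main(String[] args) {
        Pipeline healthy = new Pipeline();
        appendEntry(new FollowerStateMachine(healthy, true), 1L);
        System.out.println("healthy pipeline open: " + healthy.isOpen());

        Pipeline broken = new Pipeline();
        appendEntry(new FollowerStateMachine(broken, false), 1L);
        System.out.println("broken pipeline open: " + broken.isOpen());
    }
}
```

In this model a successful read leaves the pipeline open, while a failed read ends with the pipeline closed, which is the condition the integration test is meant to verify.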

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] bshashikant commented on issue #576: HDDS-2716. Add integration test to verify pipeline closed on read statemachine failure

Posted by GitBox <gi...@apache.org>.
bshashikant commented on issue #576: HDDS-2716. Add integration test to verify pipeline closed on read statemachine failure
URL: https://github.com/apache/hadoop-ozone/pull/576#issuecomment-591943391
 
 
   > Hi @adoroszlai, I am not able to repro this problem and the test exception trace is not sufficient to identify what is wrong:
   > java.io.IOException: INTERNAL_ERROR org.apache.hadoop.ozone.om.exceptions.OMException: Allocated 0 blocks. Requested 1 blocks
   > @bharatviswa504 any idea if a recent regression was introduced?
   
   Thanks @swagle for continuing with this and @adoroszlai for testing it out. @swagle, let's add a container creation call to the test after creating the ozone key and before shutting down the datanode, and use the ContainerTestHelper.isRatisFollower/isRatisLeader API to selectively shut down a dn in the test instead of relying on the leader info from the pipeline object. Creating the container before writing data will make sure the pipeline is healthy well before writing, and using the dn to determine the Ratis role will ensure you are always shutting down a follower, rather than relying on SCM to tell, which may be outdated.
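
The suggestion above, asking each datanode for its own role rather than trusting leader info cached elsewhere, can be illustrated with a minimal sketch. Everything here is hypothetical: `Datanode`, `Role`, `pickFollower`, and `pickNonLeaderFromCache` are illustrative names and do not reflect the actual signatures of ContainerTestHelper or any SCM API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class FollowerSelectionSketch {
    enum Role { LEADER, FOLLOWER }

    /** Hypothetical datanode view: each node reports its own current role. */
    static class Datanode {
        final String name;
        final Role currentRole;

        Datanode(String name, Role currentRole) {
            this.name = name;
            this.currentRole = currentRole;
        }
    }

    /** Picks the first node that itself reports being a follower. */
    static Optional<Datanode> pickFollower(List<Datanode> nodes) {
        return nodes.stream()
            .filter(dn -> dn.currentRole == Role.FOLLOWER)
            .findFirst();
    }

    /** Picks the first node that is not the cached leader; the cache may be stale. */
    static Optional<Datanode> pickNonLeaderFromCache(List<Datanode> nodes,
                                                     String cachedLeaderName) {
        return nodes.stream()
            .filter(dn -> !dn.name.equals(cachedLeaderName))
            .findFirst();
    }

    public static void main(String[] args) {
        // Leadership has moved to dn1, but the cached leader is still dn2.
        List<Datanode> nodes = Arrays.asList(
            new Datanode("dn1", Role.LEADER),
            new Datanode("dn2", Role.FOLLOWER),
            new Datanode("dn3", Role.FOLLOWER));

        Datanode fromCache = pickNonLeaderFromCache(nodes, "dn2").get();
        Datanode fromNodes = pickFollower(nodes).get();

        System.out.println("cache-based choice: " + fromCache.name
            + " (" + fromCache.currentRole + ")");
        System.out.println("node-based choice: " + fromNodes.name
            + " (" + fromNodes.currentRole + ")");
    }
}
```

With a stale cache, the cache-based choice lands on the current leader, while the node-based choice always returns a follower, which is why the test should consult the datanodes directly.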


[GitHub] [hadoop-ozone] swagle commented on issue #576: HDDS-2716. Add integration test to verify pipeline closed on read statemachine failure

Posted by GitBox <gi...@apache.org>.
swagle commented on issue #576: HDDS-2716. Add integration test to verify pipeline closed on read statemachine failure
URL: https://github.com/apache/hadoop-ozone/pull/576#issuecomment-592664780
 
 
   > Thanks @swagle for the latest update. I think it is OK: 19/20 runs passed and the remaining one crashed. I filed [HDDS-3104](https://issues.apache.org/jira/browse/HDDS-3104) for that; I observed it elsewhere, too.
   > 
   > https://github.com/adoroszlai/hadoop-ozone/runs/474218807
   
   Thanks @adoroszlai, can this be merged?


[GitHub] [hadoop-ozone] swagle commented on issue #576: HDDS-2716. Add integration test to verify pipeline closed on read statemachine failure

Posted by GitBox <gi...@apache.org>.
swagle commented on issue #576: HDDS-2716. Add integration test to verify pipeline closed on read statemachine failure
URL: https://github.com/apache/hadoop-ozone/pull/576#issuecomment-592584294
 
 
   /retest


[GitHub] [hadoop-ozone] adoroszlai commented on issue #576: HDDS-2716. Add integration test to verify pipeline closed on read statemachine failure

Posted by GitBox <gi...@apache.org>.
adoroszlai commented on issue #576: HDDS-2716. Add integration test to verify pipeline closed on read statemachine failure
URL: https://github.com/apache/hadoop-ozone/pull/576#issuecomment-591384895
 
 
   Thanks @swagle for updating the patch. It looks more stable now; it improved to 1 failure out of 20 runs:
   
   https://github.com/adoroszlai/hadoop-ozone/runs/469165226


[GitHub] [hadoop-ozone] swagle commented on issue #576: HDDS-2716. Add integration test to verify pipeline closed on read statemachine failure

Posted by GitBox <gi...@apache.org>.
swagle commented on issue #576: HDDS-2716. Add integration test to verify pipeline closed on read statemachine failure
URL: https://github.com/apache/hadoop-ozone/pull/576#issuecomment-591673022
 
 
   Hi @adoroszlai, I am not able to repro this problem and the test exception trace is not sufficient to identify what is wrong:
   java.io.IOException: INTERNAL_ERROR org.apache.hadoop.ozone.om.exceptions.OMException: Allocated 0 blocks. Requested 1 blocks
   @bharatviswa504 any idea if a recent regression was introduced?


[GitHub] [hadoop-ozone] swagle commented on issue #576: HDDS-2716. Add integration test to verify pipeline closed on read statemachine failure

Posted by GitBox <gi...@apache.org>.
swagle commented on issue #576: HDDS-2716. Add integration test to verify pipeline closed on read statemachine failure
URL: https://github.com/apache/hadoop-ozone/pull/576#issuecomment-592217917
 
 
   > > Hi @adoroszlai, I am not able to repro this problem and the test exception trace is not sufficient to identify what is wrong:
   > > java.io.IOException: INTERNAL_ERROR org.apache.hadoop.ozone.om.exceptions.OMException: Allocated 0 blocks. Requested 1 blocks
   > > @bharatviswa504 any idea if a recent regression was introduced?
   > 
   > Thanks @swagle for continuing with this and @adoroszlai for testing it out. @swagle, let's add a container creation call to the test after creating the ozone key and before shutting down the datanode, and use the ContainerTestHelper.isRatisFollower/isRatisLeader API to selectively shut down a dn in the test instead of relying on the leader info from the pipeline object. Creating the container before writing data will make sure the pipeline is healthy well before writing, and using the dn to determine the Ratis role will ensure you are always shutting down a follower, rather than relying on SCM to tell, which may be outdated.
   
   Thanks @bshashikant, I made those changes; hopefully that helps!


[GitHub] [hadoop-ozone] adoroszlai merged pull request #576: HDDS-2716. Add integration test to verify pipeline closed on read statemachine failure

Posted by GitBox <gi...@apache.org>.
adoroszlai merged pull request #576: HDDS-2716. Add integration test to verify pipeline closed on read statemachine failure
URL: https://github.com/apache/hadoop-ozone/pull/576
 
 
   


[GitHub] [hadoop-ozone] swagle commented on issue #576: HDDS-2716. Add integration test to verify pipeline closed on read statemachine failure

Posted by GitBox <gi...@apache.org>.
swagle commented on issue #576: HDDS-2716. Add integration test to verify pipeline closed on read statemachine failure
URL: https://github.com/apache/hadoop-ozone/pull/576#issuecomment-591123475
 
 
   > https://github.com/adoroszlai/hadoop-ozone/runs/459688379
   
   Hi Attila, I rebased this on master and ran it 20 times with minor changes similar to TestDeletedWithSlowFollower, and I got it to work locally. Could you please relaunch the 20-run test you did?
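
Repeating a test many times to estimate its flakiness, as done in this thread, can be sketched as a small harness. This is an assumption-laden illustration: `countFailures` is a hypothetical helper, and the check in `main` is a deterministic stand-in that fails on its seventh invocation, not the real integration test.

```java
import java.util.function.BooleanSupplier;

public class RepeatRunSketch {
    /** Runs the check the given number of times and returns how many runs failed. */
    static int countFailures(BooleanSupplier check, int runs) {
        int failures = 0;
        for (int i = 0; i < runs; i++) {
            boolean passed;
            try {
                passed = check.getAsBoolean();
            } catch (RuntimeException e) {
                // A crash counts as a failed run.
                passed = false;
            }
            if (!passed) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // Deterministic stand-in for a flaky test: only the 7th invocation fails.
        int[] invocations = {0};
        int failures = countFailures(() -> ++invocations[0] != 7, 20);
        System.out.println(failures + " failure(s) out of 20 runs");
    }
}
```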
