Posted to notifications@accumulo.apache.org by GitBox <gi...@apache.org> on 2020/10/27 17:31:11 UTC

[GitHub] [accumulo] Manno15 opened a new issue #1753: `admin stop` created deadlock in TabletServer being stopped

Manno15 opened a new issue #1753:
URL: https://github.com/apache/accumulo/issues/1753


   Taken from https://issues.apache.org/jira/browse/ACCUMULO-3898 and confirmed to still exist in Accumulo. If you have multiple tservers and use `admin stop` to shut down each individual tserver, the last active tserver will hang while trying to shut down. The Jira issue goes into more detail on the specifics of what happens during this process. 
   
   I updated the IT that was attached to test this and that is on my fork here: 
   [AdminStopTabletServersIT](https://github.com/Manno15/accumulo/blob/dcfb85a7a1096c238e3d694e49aa7ebd1f944c41/test/src/main/java/org/apache/accumulo/test/AdminStopTabletServersIT.java)
   
   This IT sets up three tservers; when it tries to shut down the last one, the shutdown hangs and the test times out. 
   
   -Affected version(s) of this project: 2.0.0 for certain; it is a safe assumption that previous versions are affected as well. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] puchreiner commented on issue #1753: `admin stop` created deadlock in TabletServer being stopped

Posted by GitBox <gi...@apache.org>.
puchreiner commented on issue #1753:
URL: https://github.com/apache/accumulo/issues/1753#issuecomment-750490607


   `1.10.0` seems affected too.
   I had no problem with `1.9.0`, but after upgrading to `1.10.0`, "admin stop" often hangs.





[GitHub] [accumulo] Manno15 commented on issue #1753: `admin stop` created deadlock in TabletServer being stopped

Posted by GitBox <gi...@apache.org>.
Manno15 commented on issue #1753:
URL: https://github.com/apache/accumulo/issues/1753#issuecomment-872611269


   @ctubbsii Sorry, this was a mistake. I was cleaning out branches, going mainly by staleness, and closed this one by accident.





[GitHub] [accumulo] ctubbsii commented on issue #1753: `admin stop` created deadlock in TabletServer being stopped

Posted by GitBox <gi...@apache.org>.
ctubbsii commented on issue #1753:
URL: https://github.com/apache/accumulo/issues/1753#issuecomment-872604475


   @Manno15 I didn't see a comment about this, but saw that this WIP PR #1770 was automatically closed when you deleted your branch. Are you going to table this work, or are you going to take a different approach?





[GitHub] [accumulo] Manno15 commented on issue #1753: `admin stop` created deadlock in TabletServer being stopped

Posted by GitBox <gi...@apache.org>.
Manno15 commented on issue #1753:
URL: https://github.com/apache/accumulo/issues/1753#issuecomment-718769232


   It seems [ShutdownTServer.java](https://github.com/apache/accumulo/blob/c43471a725bc2259aa80d041d233997c8844c37a/server/manager/src/main/java/org/apache/accumulo/master/tserverOps/ShutdownTServer.java) is what is called right before this hangs. We wait for completion in [MasterClientServiceHandler.java](https://github.com/apache/accumulo/blob/f88cb3bcebb744d7d1f3150877243c756d717ddb/server/manager/src/main/java/org/apache/accumulo/master/MasterClientServiceHandler.java#L283) before proceeding, which is where I expect the hang to be. 
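
To illustrate why an unconditional wait for completion can hang forever, here is a minimal sketch that contrasts it with a bounded wait. It is not Accumulo code: `BoundedWaitSketch`, `waitForCompletion`, and the use of `CompletableFuture` as a stand-in for the shutdown operation are all assumptions made for illustration only.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedWaitSketch {

  // Hypothetical helper: waits up to timeoutMillis for the operation.
  // Returns true if it completed in time, false if it is still pending.
  public static boolean waitForCompletion(CompletableFuture<Void> op, long timeoutMillis)
      throws InterruptedException, ExecutionException {
    try {
      op.get(timeoutMillis, TimeUnit.MILLISECONDS);
      return true;
    } catch (TimeoutException e) {
      // A plain op.get() would block here indefinitely.
      return false;
    }
  }

  public static void main(String[] args) throws Exception {
    // Stand-in for a shutdown operation that finishes normally.
    CompletableFuture<Void> finished = CompletableFuture.completedFuture(null);
    // Stand-in for a shutdown operation that never completes.
    CompletableFuture<Void> neverFinishes = new CompletableFuture<>();

    System.out.println("finished in time: " + waitForCompletion(finished, 100));
    System.out.println("pending in time:  " + waitForCompletion(neverFinishes, 100));
  }
}
```

A bounded wait like this only makes a hang visible to the caller; it does not address whatever keeps the operation from completing.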





[GitHub] [accumulo] puchreiner edited a comment on issue #1753: `admin stop` created deadlock in TabletServer being stopped

Posted by GitBox <gi...@apache.org>.
puchreiner edited a comment on issue #1753:
URL: https://github.com/apache/accumulo/issues/1753#issuecomment-750490607


   `1.10.0` seems affected too.
   I had no problem with `1.9.0`, but after upgrading to `1.10.0`, "admin stop" often hangs.
   
   UPD: my bug is probably different: the hanging tserver does not have to be the last one





[GitHub] [accumulo] Manno15 commented on issue #1753: `admin stop` created deadlock in TabletServer being stopped

Posted by GitBox <gi...@apache.org>.
Manno15 commented on issue #1753:
URL: https://github.com/apache/accumulo/issues/1753#issuecomment-718911000


   My current solution is to check whether there is more than one tablet server before attempting to shut down each tserver. I also added an error log so the user knows that `admin stop` cannot stop the last tserver. I believe [ShutdownTServer.java](https://github.com/apache/accumulo/blob/c43471a725bc2259aa80d041d233997c8844c37a/server/manager/src/main/java/org/apache/accumulo/master/tserverOps/ShutdownTServer.java) is only used by the Admin class, so the change should affect only this path and no other tserver stop method in Accumulo. 
   
   I will keep investigating to find a better way to handle this issue.
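
The guard described above could look roughly like the following sketch. It is not the code from the PR: `LastTserverGuard`, `canStop`, `stopAll`, and the in-memory set of live servers are hypothetical names used only to show the check, and removing an entry from the set stands in for the real shutdown request.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class LastTserverGuard {

  // Hypothetical check: a server may be stopped only if it is live
  // and at least one other live server would remain afterwards.
  public static boolean canStop(Set<String> liveTservers, String target) {
    return liveTservers.contains(target) && liveTservers.size() > 1;
  }

  // Attempts to stop each target in order and returns the ones actually stopped.
  public static Set<String> stopAll(Set<String> liveTservers, Iterable<String> targets) {
    Set<String> stopped = new LinkedHashSet<>();
    for (String target : targets) {
      if (!canStop(liveTservers, target)) {
        // Stand-in for the error log mentioned in the comment above.
        System.err.println("Not stopping " + target
            + ": it is not live or it is the last live tablet server");
        continue;
      }
      liveTservers.remove(target); // stand-in for the real shutdown request
      stopped.add(target);
    }
    return stopped;
  }

  public static void main(String[] args) {
    List<String> all = List.of("ts1:9997", "ts2:9997", "ts3:9997");
    Set<String> live = new LinkedHashSet<>(all);
    Set<String> stopped = stopAll(live, all);
    // prints: stopped=[ts1:9997, ts2:9997] remaining=[ts3:9997]
    System.out.println("stopped=" + stopped + " remaining=" + live);
  }
}
```

With three servers, the first two stop requests succeed and the third is refused with an error message, which matches the behavior the comment proposes for the last tserver.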





[GitHub] [accumulo] puchreiner edited a comment on issue #1753: `admin stop` created deadlock in TabletServer being stopped

Posted by GitBox <gi...@apache.org>.
puchreiner edited a comment on issue #1753:
URL: https://github.com/apache/accumulo/issues/1753#issuecomment-750490607


   `1.10.0` seems affected too.
   I had no problem with `1.9.0`, but after upgrading to `1.10.0`, "admin stop" often hangs.
   
   UPD: my bug is probably different: the stopped tserver does not have to be the last one

