You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hbase.apache.org by "Todd Lipcon (JIRA)" <ji...@apache.org> on 2011/08/30 01:40:43 UTC

[jira] [Created] (HBASE-4288) "Server not running" exception during meta verification causes RS abort

"Server not running" exception during meta verification causes RS abort
-----------------------------------------------------------------------

                 Key: HBASE-4288
                 URL: https://issues.apache.org/jira/browse/HBASE-4288
             Project: HBase
          Issue Type: Bug
          Components: master
    Affects Versions: 0.90.4
            Reporter: Todd Lipcon
            Priority: Critical
             Fix For: 0.92.0, 0.90.5


The master tried to verify the META location just as that server was shutting down due to an abort. This caused the "Server not running" exception to get thrown, which wasn't handled properly in the master, causing it to abort.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-4288) "Server not running" exception during meta verification causes RS abort

Posted by "stack (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-4288:
-------------------------

    Status: Patch Available  (was: Open)
    
> "Server not running" exception during meta verification causes RS abort
> -----------------------------------------------------------------------
>
>                 Key: HBASE-4288
>                 URL: https://issues.apache.org/jira/browse/HBASE-4288
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.90.4
>            Reporter: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.92.0, 0.90.5
>
>         Attachments: 4288-v2.txt, 4288.txt
>
>
> The master tried to verify the META location just as that server was shutting down due to an abort. This caused the "Server not running" exception to get thrown, which wasn't handled properly in the master, causing it to abort.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4288) "Server not running" exception during meta verification causes RS abort

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093343#comment-13093343 ] 

Todd Lipcon commented on HBASE-4288:
------------------------------------

This is a little tricky. I don't think we can just catch this and return false, since there's no verification that the server is dead _yet_, just that it's shutting down. If we were to return false, then the users of this code would delete the root location from ZK and start re-assigning even though the old server may have unflushed edits, etc.

Though, this makes me think: why is it _ever_ safe to delete the root location and reassign it before the old location's logs have split?

> "Server not running" exception during meta verification causes RS abort
> -----------------------------------------------------------------------
>
>                 Key: HBASE-4288
>                 URL: https://issues.apache.org/jira/browse/HBASE-4288
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.90.4
>            Reporter: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.92.0, 0.90.5
>
>
> The master tried to verify the META location just as that server was shutting down due to an abort. This caused the "Server not running" exception to get thrown, which wasn't handled properly in the master, causing it to abort.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4288) "Server not running" exception during meta verification causes RS abort

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13147508#comment-13147508 ] 

Hadoop QA commented on HBASE-4288:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12503165/4288-v2.txt
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    -1 javadoc.  The javadoc tool appears to have generated -164 warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 52 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

     -1 core tests.  The patch failed these unit tests:
                       org.apache.hadoop.hbase.regionserver.wal.TestLogRolling

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/221//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/221//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/221//console

This message is automatically generated.
                
> "Server not running" exception during meta verification causes RS abort
> -----------------------------------------------------------------------
>
>                 Key: HBASE-4288
>                 URL: https://issues.apache.org/jira/browse/HBASE-4288
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.90.4
>            Reporter: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.92.0, 0.90.5
>
>         Attachments: 4288-v2.txt, 4288.txt
>
>
> The master tried to verify the META location just as that server was shutting down due to an abort. This caused the "Server not running" exception to get thrown, which wasn't handled properly in the master, causing it to abort.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4288) "Server not running" exception during meta verification causes RS abort

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093312#comment-13093312 ] 

Todd Lipcon commented on HBASE-4288:
------------------------------------

2011-08-29 16:37:50,485 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
org.apache.hadoop.ipc.RemoteException: java.io.IOException: Server not running, aborting
        at org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2407)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionInfo(HRegionServer.java:1648)
        at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)

        at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:771)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:258)
        at $Proxy7.getRegionInfo(Unknown Source)
        at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyRegionLocation(CatalogTracker.java:426)
        at org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:273)
        at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMeta(CatalogTracker.java:333)
        at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMetaServerConnectionDefault(CatalogTracker.java:366)
        at org.apache.hadoop.hbase.catalog.MetaReader.getRegion(MetaReader.java:355)
        at org.apache.hadoop.hbase.master.AssignmentManager.processRegionInTransition(AssignmentManager.java:312)
        at org.apache.hadoop.hbase.master.AssignmentManager.processRegionsInTransition(AssignmentManager.java:263)
        at org.apache.hadoop.hbase.master.AssignmentManager.processFailover(AssignmentManager.java:227)
        at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:421)
        at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:294)


> "Server not running" exception during meta verification causes RS abort
> -----------------------------------------------------------------------
>
>                 Key: HBASE-4288
>                 URL: https://issues.apache.org/jira/browse/HBASE-4288
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.90.4
>            Reporter: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.92.0, 0.90.5
>
>
> The master tried to verify the META location just as that server was shutting down due to an abort. This caused the "Server not running" exception to get thrown, which wasn't handled properly in the master, causing it to abort.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-4288) "Server not running" exception during meta verification causes RS abort

Posted by "ramkrishna.s.vasudevan (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-4288:
------------------------------------------

    Fix Version/s: 0.92.0

Already committed to 0.92 and trunk  But not into 0.90.  Hence not resolving just updating fix version.
                
> "Server not running" exception during meta verification causes RS abort
> -----------------------------------------------------------------------
>
>                 Key: HBASE-4288
>                 URL: https://issues.apache.org/jira/browse/HBASE-4288
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.90.4
>            Reporter: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.90.7, 0.92.0
>
>         Attachments: 4288-v2.txt, 4288.txt
>
>
> The master tried to verify the META location just as that server was shutting down due to an abort. This caused the "Server not running" exception to get thrown, which wasn't handled properly in the master, causing it to abort.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4288) "Server not running" exception during meta verification causes RS abort

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148229#comment-13148229 ] 

Hudson commented on HBASE-4288:
-------------------------------

Integrated in HBase-TRUNK #2427 (See [https://builds.apache.org/job/HBase-TRUNK/2427/])
    HBASE-4288 "Server not running" exception during meta verification causes RS abort

stack : 
Files : 
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java

                
> "Server not running" exception during meta verification causes RS abort
> -----------------------------------------------------------------------
>
>                 Key: HBASE-4288
>                 URL: https://issues.apache.org/jira/browse/HBASE-4288
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.90.4
>            Reporter: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.90.5
>
>         Attachments: 4288-v2.txt, 4288.txt
>
>
> The master tried to verify the META location just as that server was shutting down due to an abort. This caused the "Server not running" exception to get thrown, which wasn't handled properly in the master, causing it to abort.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Reopened] (HBASE-4288) "Server not running" exception during meta verification causes RS abort

Posted by "stack (Reopened) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack reopened HBASE-4288:
--------------------------


Hmm... yeah, still issue in 0.90 branch so leaving open.
                
> "Server not running" exception during meta verification causes RS abort
> -----------------------------------------------------------------------
>
>                 Key: HBASE-4288
>                 URL: https://issues.apache.org/jira/browse/HBASE-4288
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.90.4
>            Reporter: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.90.5
>
>         Attachments: 4288-v2.txt, 4288.txt
>
>
> The master tried to verify the META location just as that server was shutting down due to an abort. This caused the "Server not running" exception to get thrown, which wasn't handled properly in the master, causing it to abort.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-4288) "Server not running" exception during meta verification causes RS abort

Posted by "stack (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-4288:
-------------------------

    Attachment: 4288.txt

I think this scenario unlikely now we have retries in place.  Writing a unit test to prove it (Its not done yet).
                
> "Server not running" exception during meta verification causes RS abort
> -----------------------------------------------------------------------
>
>                 Key: HBASE-4288
>                 URL: https://issues.apache.org/jira/browse/HBASE-4288
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.90.4
>            Reporter: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.92.0, 0.90.5
>
>         Attachments: 4288.txt
>
>
> The master tried to verify the META location just as that server was shutting down due to an abort. This caused the "Server not running" exception to get thrown, which wasn't handled properly in the master, causing it to abort.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4288) "Server not running" exception during meta verification causes RS abort

Posted by "stack (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13147481#comment-13147481 ] 

stack commented on HBASE-4288:
------------------------------

Oops... let me reattach the change description:

{code}
M src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java                       
  Fix javadoc warnings.  Change some logging that could be profuse                        
  from DEBUG to TRACE level (now it works).                                               
  (setMetaLocation) Changed from private to package protected so could                    
  use in tests.                                                                           
                                                                                          
M src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java                   
  (constructAndStartCatalogTracker): Removed.  Has null Connection so NPEs.               
  (testInterruptWaitOnMetaAndRoot): Now we use HTables instead                            
  of server interfaces, we need to do a bit more work getting the right                   
  HConnection into place.  Use new mockConnection method.  Make sure we                   
  cleanup the connection when we are done.                                                
  (testServerNotRunningIOException): Added test to prove hbase-4288                       
  is fixed.  This is the important addition here.                                         
  (testGetMetaServerConnectionFails): Refactored to use new mockConnection                
  utility methods.                                                                        
  (mockConnection): utility mocking up a connection.                                      
  (getMetaTableRowResult): New utility method that fakes a Result.      
{code}


Here is the proof.  Here's logs from the test:

{code}
2011-11-09 19:50:29,433 INFO  [main] catalog.CatalogTracker(597): Failed verification of .META.,,1 at address=example.org,1234,1320897021185; java.io.IOException: Server not running, aborting
2011-11-09 19:50:29,433 DEBUG [main] catalog.CatalogTracker(492): Current cached META location, example.org,1234,1320897021185, is not valid, resetting
2011-11-09 19:50:29,438 DEBUG [main] catalog.CatalogTracker(504): Set new cached META location: example.org,1234,1320897021185
{code}

See how now when we get an IOE 'Server not running', it doesn't kill us -- instead, we have the .META. location as bad and we clear the bad value (In the third log line, we found a good value and set the new meta location).
                
> "Server not running" exception during meta verification causes RS abort
> -----------------------------------------------------------------------
>
>                 Key: HBASE-4288
>                 URL: https://issues.apache.org/jira/browse/HBASE-4288
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.90.4
>            Reporter: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.92.0, 0.90.5
>
>         Attachments: 4288-v2.txt, 4288.txt
>
>
> The master tried to verify the META location just as that server was shutting down due to an abort. This caused the "Server not running" exception to get thrown, which wasn't handled properly in the master, causing it to abort.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4288) "Server not running" exception during meta verification causes RS abort

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148298#comment-13148298 ] 

Hudson commented on HBASE-4288:
-------------------------------

Integrated in HBase-0.92 #123 (See [https://builds.apache.org/job/HBase-0.92/123/])
    HBASE-4288 "Server not running" exception during meta verification causes RS abort

stack : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java

                
> "Server not running" exception during meta verification causes RS abort
> -----------------------------------------------------------------------
>
>                 Key: HBASE-4288
>                 URL: https://issues.apache.org/jira/browse/HBASE-4288
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.90.4
>            Reporter: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.90.5
>
>         Attachments: 4288-v2.txt, 4288.txt
>
>
> The master tried to verify the META location just as that server was shutting down due to an abort. This caused the "Server not running" exception to get thrown, which wasn't handled properly in the master, causing it to abort.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-4288) "Server not running" exception during meta verification causes RS abort

Posted by "stack (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-4288:
-------------------------

    Attachment: 4288-v2.txt

Test to prove that this no longer an issue.  I think we should commit this test because it does some test refactoring and fixes some tests NPEs.

First, here is description of the patch:


M src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java                       
  Fix javadoc warnings.  Change some logging that could be profuse                        
  from DEBUG to TRACE level (now it works).                                               
  (setMetaLocation) Changed from private to package protected so could                    
  use in tests.                                                                           
                                                                                          
M src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java                   
  (constructAndStartCatalogTracker): Removed.  Has null Connection so NPEs.               
  (testInterruptWaitOnMetaAndRoot): Now we use HTables instead                            
  of server interfaces, we need to do a bit more work getting the right                   
  HConnection into place.  Use new mockConnection method.  Make sure we                   
  cleanup the connection when we are done.                                                
  (testServerNotRunningIOException): Added test to prove hbase-4288                       
  is fixed.  This is the important addition here.                                         
  (testGetMetaServerConnectionFails): Refactored to use new mockConnection                
  utility methods.                                                                        
  (mockConnection): utility mocking up a connection.                                      
  (getMetaTableRowResult): New utility method that fakes a Result.   
                
> "Server not running" exception during meta verification causes RS abort
> -----------------------------------------------------------------------
>
>                 Key: HBASE-4288
>                 URL: https://issues.apache.org/jira/browse/HBASE-4288
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.90.4
>            Reporter: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.92.0, 0.90.5
>
>         Attachments: 4288-v2.txt, 4288.txt
>
>
> The master tried to verify the META location just as that server was shutting down due to an abort. This caused the "Server not running" exception to get thrown, which wasn't handled properly in the master, causing it to abort.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-4288) "Server not running" exception during meta verification causes RS abort

Posted by "stack (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-4288:
-------------------------

       Resolution: Fixed
    Fix Version/s:     (was: 0.90.5)
           Status: Resolved  (was: Patch Available)

Committed the verification test to trunk and 0.92 branch.
                
> "Server not running" exception during meta verification causes RS abort
> -----------------------------------------------------------------------
>
>                 Key: HBASE-4288
>                 URL: https://issues.apache.org/jira/browse/HBASE-4288
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.90.4
>            Reporter: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.92.0
>
>         Attachments: 4288-v2.txt, 4288.txt
>
>
> The master tried to verify the META location just as that server was shutting down due to an abort. This caused the "Server not running" exception to get thrown, which wasn't handled properly in the master, causing it to abort.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-4288) "Server not running" exception during meta verification causes RS abort

Posted by "stack (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-4288:
-------------------------

    Fix Version/s:     (was: 0.92.0)
                   0.90.5
    
> "Server not running" exception during meta verification causes RS abort
> -----------------------------------------------------------------------
>
>                 Key: HBASE-4288
>                 URL: https://issues.apache.org/jira/browse/HBASE-4288
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.90.4
>            Reporter: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.90.5
>
>         Attachments: 4288-v2.txt, 4288.txt
>
>
> The master tried to verify the META location just as that server was shutting down due to an abort. This caused the "Server not running" exception to get thrown, which wasn't handled properly in the master, causing it to abort.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-4288) "Server not running" exception during meta verification causes RS abort

Posted by "ramkrishna.s.vasudevan (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-4288:
------------------------------------------

    Fix Version/s:     (was: 0.90.6)
                   0.90.7

Moving into 0.90.7
                
> "Server not running" exception during meta verification causes RS abort
> -----------------------------------------------------------------------
>
>                 Key: HBASE-4288
>                 URL: https://issues.apache.org/jira/browse/HBASE-4288
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.90.4
>            Reporter: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.90.7
>
>         Attachments: 4288-v2.txt, 4288.txt
>
>
> The master tried to verify the META location just as that server was shutting down due to an abort. This caused the "Server not running" exception to get thrown, which wasn't handled properly in the master, causing it to abort.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4288) "Server not running" exception during meta verification causes RS abort

Posted by "stack (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148126#comment-13148126 ] 

stack commented on HBASE-4288:
------------------------------

Seems odd that adding a test to TestCatalogTracker would call TestLogRolling fail.  I tried it local w/ patch applied and TLR passes.  Going to commit this test (We can open new issue if you find hole in my reasoning above post-commit)
                
> "Server not running" exception during meta verification causes RS abort
> -----------------------------------------------------------------------
>
>                 Key: HBASE-4288
>                 URL: https://issues.apache.org/jira/browse/HBASE-4288
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.90.4
>            Reporter: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.92.0, 0.90.5
>
>         Attachments: 4288-v2.txt, 4288.txt
>
>
> The master tried to verify the META location just as that server was shutting down due to an abort. This caused the "Server not running" exception to get thrown, which wasn't handled properly in the master, causing it to abort.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira