
Posted to issues@hbase.apache.org by "nkeywal (JIRA)" <ji...@apache.org> on 2012/05/04 18:28:49 UTC

[jira] [Created] (HBASE-5939) Add an autorestart option in the start scripts

nkeywal created HBASE-5939:
------------------------------

             Summary: Add an autorestart option in the start scripts
                 Key: HBASE-5939
                 URL: https://issues.apache.org/jira/browse/HBASE-5939
             Project: HBase
          Issue Type: Improvement
          Components: master, regionserver, scripts
    Affects Versions: 0.96.0
            Reporter: nkeywal
            Assignee: nkeywal
            Priority: Minor


When a binary dies on a server, we don't try to restart it, although a restart would be possible in most cases.

We could have something like:
loop
 start
 wait
 if cleanStop then exit
 if already stopped less than 5 minutes ago sleep 1 minute
endloop
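
To make the loop above concrete, here is a minimal bash sketch. It is not the attached patch. The function name autorestart, the two variables, and the rule "exit status 0 means clean stop" are all assumptions made for illustration; the real scripts may detect a clean stop differently.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the proposed loop; names and thresholds are illustrative.
# ASSUMPTION: the wrapped command returns exit status 0 on a clean stop.

MIN_INTERVAL_SECS=${MIN_INTERVAL_SECS:-300}   # "stopped less than 5 minutes ago"
RESTART_DELAY_SECS=${RESTART_DELAY_SECS:-60}  # "sleep 1 minute"

autorestart() {
  local last_stop=0 now
  while true; do
    # start + wait: run the command in the foreground until it exits
    if "$@"; then
      return 0                      # cleanStop: leave the loop
    fi
    now=$(date +%s)
    # If the previous unclean stop was recent, back off before restarting.
    if [ "$last_stop" -ne 0 ] && [ $((now - last_stop)) -lt "$MIN_INTERVAL_SECS" ]; then
      sleep "$RESTART_DELAY_SECS"
    fi
    last_stop=$now
  done
}
```

It would be called as "autorestart some-command args...", where some-command is whatever starts the process in the foreground. The first unclean stop restarts immediately; only repeated stops inside the interval trigger the delay.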

This is simple for the master and backup master, but a little more complex for the region server, since it can be stopped either by a script or by the shutdown procedure.

In the long term, it could allow a restart with exactly the same assignments.





--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5939) Add an autorestart option in the start scripts

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-5939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273152#comment-13273152 ] 

Hadoop QA commented on HBASE-5939:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12526498/5939.v4.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    +1 hadoop23.  The patch compiles against the hadoop 0.23.x profile.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests:
                    org.apache.hadoop.hbase.TestDrainingServer

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1846//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1846//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1846//console

This message is automatically generated.
                
> Add an autorestart option in the start scripts
> ----------------------------------------------
>
>                 Key: HBASE-5939
>                 URL: https://issues.apache.org/jira/browse/HBASE-5939
>             Project: HBase
>          Issue Type: Improvement
>          Components: master, regionserver, scripts
>    Affects Versions: 0.96.0
>            Reporter: nkeywal
>            Assignee: nkeywal
>            Priority: Minor
>             Fix For: 0.96.0
>
>         Attachments: 5939.v4.patch
