You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hbase.apache.org by "Roman Shaposhnik (JIRA)" <ji...@apache.org> on 2011/08/16 23:43:27 UTC

[jira] [Created] (HBASE-4209) The HBase hbase-daemon.sh SIGKILLs master when stopping it

The HBase hbase-daemon.sh SIGKILLs master when stopping it
----------------------------------------------------------

                 Key: HBASE-4209
                 URL: https://issues.apache.org/jira/browse/HBASE-4209
             Project: HBase
          Issue Type: Bug
          Components: master
            Reporter: Roman Shaposhnik


There's a bit of code in hbase-daemon.sh that makes HBase master being SIGKILLed when stopping it rather than trying SIGTERM (like it does for other daemons). When HBase is executed in a standalone mode (and the only daemon you need to run is master) that causes newly created tables to go missing as unflushed data is thrown out. If there was not a good reason to kill master with SIGKILL perhaps we can take that special case out and rely on SIGTERM.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4209) The HBase hbase-daemon.sh SIGKILLs master when stopping it

Posted by "Roman Shaposhnik (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101674#comment-13101674 ] 

Roman Shaposhnik commented on HBASE-4209:
-----------------------------------------

stack, I'm sorry for putting this on a backburner, but at least I now have a better understanding of what's going on. 

Basically I got confused in a situation where suppressHdfsShutdownHook would be called multiple times on the same filesystem object. The first call would succeed, but all the other ones would fail. This is, obviously, just a problem with my patch, not the minihdfs cluster. I'll cook up an alternative and once I run the tests will attach an updated version.

P.S. Thanks for the encouragement!

> The HBase hbase-daemon.sh SIGKILLs master when stopping it
> ----------------------------------------------------------
>
>                 Key: HBASE-4209
>                 URL: https://issues.apache.org/jira/browse/HBASE-4209
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: Roman Shaposhnik
>
> There's a bit of code in hbase-daemon.sh that makes HBase master being SIGKILLed when stopping it rather than trying SIGTERM (like it does for other daemons). When HBase is executed in a standalone mode (and the only daemon you need to run is master) that causes newly created tables to go missing as unflushed data is thrown out. If there was not a good reason to kill master with SIGKILL perhaps we can take that special case out and rely on SIGTERM.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4209) The HBase hbase-daemon.sh SIGKILLs master when stopping it

Posted by "Roman Shaposhnik (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091246#comment-13091246 ] 

Roman Shaposhnik commented on HBASE-4209:
-----------------------------------------

stack, I have tried it with a standalone HBase writing to a local filesystem (I think there's no way making a standalone instance of HBase use HDFS). And that works quite nicely. Tables get flushed when I SIGTERM the process, etc. Once it came to testing I had to run it with miniDFS and that's where things got interesting. suppressHdfsShutdownHook can't disable the HDFS shutdown hook of the miniDFS, simply because there's nothing to disable. 

So here's the question, should I work around this one special case by simply testing for the 
{noformat}
LocalHBaseCluster.isLocal(hrs.getConfiguration())
{noformat}
or would you propose something else?

I'd appreciate your advice, since I'm just learning my way around HBase.

> The HBase hbase-daemon.sh SIGKILLs master when stopping it
> ----------------------------------------------------------
>
>                 Key: HBASE-4209
>                 URL: https://issues.apache.org/jira/browse/HBASE-4209
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: Roman Shaposhnik
>
> There's a bit of code in hbase-daemon.sh that makes HBase master being SIGKILLed when stopping it rather than trying SIGTERM (like it does for other daemons). When HBase is executed in a standalone mode (and the only daemon you need to run is master) that causes newly created tables to go missing as unflushed data is thrown out. If there was not a good reason to kill master with SIGKILL perhaps we can take that special case out and rely on SIGTERM.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4209) The HBase hbase-daemon.sh SIGKILLs master when stopping it

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13113616#comment-13113616 ] 

stack commented on HBASE-4209:
------------------------------

This is not good:

{code}
+    } catch (Throwable ex) {}
{code}

At least log the exception.

And similar here:

{code}
+        try {
+          HRegionServer.startRegionServer(t.getRegionServer());
+        } catch (Throwable ex) {}
+        // t.start();
{code}

What is the exception we are suppressing?  Should we at least log it here too?

Otherwise, I'm fine with this referencecounting stuff.  Its contained to the shutdown handler class.

Good stuff.

> The HBase hbase-daemon.sh SIGKILLs master when stopping it
> ----------------------------------------------------------
>
>                 Key: HBASE-4209
>                 URL: https://issues.apache.org/jira/browse/HBASE-4209
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: Roman Shaposhnik
>            Assignee: Roman Shaposhnik
>         Attachments: HBASE-4209.patch.txt
>
>
> There's a bit of code in hbase-daemon.sh that makes HBase master being SIGKILLed when stopping it rather than trying SIGTERM (like it does for other daemons). When HBase is executed in a standalone mode (and the only daemon you need to run is master) that causes newly created tables to go missing as unflushed data is thrown out. If there was not a good reason to kill master with SIGKILL perhaps we can take that special case out and rely on SIGTERM.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4209) The HBase hbase-daemon.sh SIGKILLs master when stopping it

Posted by "Roman Shaposhnik (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13113859#comment-13113859 ] 

Roman Shaposhnik commented on HBASE-4209:
-----------------------------------------

@stack

> At least log the exception.
> What is the exception we are suppressing?  Should we at least log it here too?

Basically, the only root cause of possible exceptions here is the code in 
suppressHdfsShutdownHook() (which is quite fragile and overly aggressive in
throwing exceptions if things don't look quite right). Now, if you look at 
how this code is executed in a non-standalone case -- HBase does bail on exceptions 
thrown from there. We can adopt the same approach in a standalone and unit testing
case, but I wasn't sure it was the right thing to do. After all, NOT bailing out
(and lets say simply logging these things) will be no worse than what we currently
have (not calling shutdown hooks at all). On the other hand -- if we do bail out
aggressively we get a better chance of catching incompatibilities between 
suppressHdfsShutdownHook() logic and future hadoop releases during unit tests.

So perhaps -- bailing out would make the most sense of all.

> The HBase hbase-daemon.sh SIGKILLs master when stopping it
> ----------------------------------------------------------
>
>                 Key: HBASE-4209
>                 URL: https://issues.apache.org/jira/browse/HBASE-4209
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: Roman Shaposhnik
>            Assignee: Roman Shaposhnik
>         Attachments: HBASE-4209.patch.txt
>
>
> There's a bit of code in hbase-daemon.sh that makes HBase master being SIGKILLed when stopping it rather than trying SIGTERM (like it does for other daemons). When HBase is executed in a standalone mode (and the only daemon you need to run is master) that causes newly created tables to go missing as unflushed data is thrown out. If there was not a good reason to kill master with SIGKILL perhaps we can take that special case out and rely on SIGTERM.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4209) The HBase hbase-daemon.sh SIGKILLs master when stopping it

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13087564#comment-13087564 ] 

stack commented on HBASE-4209:
------------------------------

stop-hbase.sh does shutdown hook plus shutting down the cluster.  Its different.  The hbase-daemon.sh sends signal to kill the daemon (triggering shutdown hook).  Thats how I remember it.  If I'm off, say so and I'll dig in and give you a more detailed response.

> The HBase hbase-daemon.sh SIGKILLs master when stopping it
> ----------------------------------------------------------
>
>                 Key: HBASE-4209
>                 URL: https://issues.apache.org/jira/browse/HBASE-4209
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: Roman Shaposhnik
>
> There's a bit of code in hbase-daemon.sh that makes HBase master being SIGKILLed when stopping it rather than trying SIGTERM (like it does for other daemons). When HBase is executed in a standalone mode (and the only daemon you need to run is master) that causes newly created tables to go missing as unflushed data is thrown out. If there was not a good reason to kill master with SIGKILL perhaps we can take that special case out and rely on SIGTERM.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4209) The HBase hbase-daemon.sh SIGKILLs master when stopping it

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13118629#comment-13118629 ] 

Hudson commented on HBASE-4209:
-------------------------------

Integrated in HBase-TRUNK #2274 (See [https://builds.apache.org/job/HBase-TRUNK/2274/])
    HBASE-4209 The HBase hbase-daemon.sh SIGKILLs master when stopping it

stack : 
Files : 
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/LocalHBaseCluster.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ShutdownHook.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/JVMClusterUtil.java

                
> The HBase hbase-daemon.sh SIGKILLs master when stopping it
> ----------------------------------------------------------
>
>                 Key: HBASE-4209
>                 URL: https://issues.apache.org/jira/browse/HBASE-4209
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: Roman Shaposhnik
>            Assignee: Roman Shaposhnik
>             Fix For: 0.92.0
>
>         Attachments: 4209-v3.txt, HBASE-4209.final.patch.txt, HBASE-4209.patch.txt
>
>
> There's a bit of code in hbase-daemon.sh that makes HBase master being SIGKILLed when stopping it rather than trying SIGTERM (like it does for other daemons). When HBase is executed in a standalone mode (and the only daemon you need to run is master) that causes newly created tables to go missing as unflushed data is thrown out. If there was not a good reason to kill master with SIGKILL perhaps we can take that special case out and rely on SIGTERM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4209) The HBase hbase-daemon.sh SIGKILLs master when stopping it

Posted by "Roman Shaposhnik (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13086513#comment-13086513 ] 

Roman Shaposhnik commented on HBASE-4209:
-----------------------------------------

stack, before I submit the patch, I would really appreciate if you could let me know the relationship between bin/stop-hbase.sh and bin/hbase-daemon.sh. I was under the impression that whatever stop-hbase.sh triggers in the master code would also be triggered by JVM shutdown hook upon receiving SIGTERM, but it doesn't seem to be that way. Do we have to call bin/stop-hbase.sh manually from within the hbase-daemon.sh before stopping daemons?

> The HBase hbase-daemon.sh SIGKILLs master when stopping it
> ----------------------------------------------------------
>
>                 Key: HBASE-4209
>                 URL: https://issues.apache.org/jira/browse/HBASE-4209
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: Roman Shaposhnik
>
> There's a bit of code in hbase-daemon.sh that makes HBase master being SIGKILLed when stopping it rather than trying SIGTERM (like it does for other daemons). When HBase is executed in a standalone mode (and the only daemon you need to run is master) that causes newly created tables to go missing as unflushed data is thrown out. If there was not a good reason to kill master with SIGKILL perhaps we can take that special case out and rely on SIGTERM.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4209) The HBase hbase-daemon.sh SIGKILLs master when stopping it

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13118587#comment-13118587 ] 

Hudson commented on HBASE-4209:
-------------------------------

Integrated in HBase-0.92 #35 (See [https://builds.apache.org/job/HBase-0.92/35/])
    HBASE-4209 The HBase hbase-daemon.sh SIGKILLs master when stopping it

stack : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/LocalHBaseCluster.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/ShutdownHook.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/util/JVMClusterUtil.java

                
> The HBase hbase-daemon.sh SIGKILLs master when stopping it
> ----------------------------------------------------------
>
>                 Key: HBASE-4209
>                 URL: https://issues.apache.org/jira/browse/HBASE-4209
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: Roman Shaposhnik
>            Assignee: Roman Shaposhnik
>             Fix For: 0.92.0
>
>         Attachments: 4209-v3.txt, HBASE-4209.final.patch.txt, HBASE-4209.patch.txt
>
>
> There's a bit of code in hbase-daemon.sh that makes HBase master being SIGKILLed when stopping it rather than trying SIGTERM (like it does for other daemons). When HBase is executed in a standalone mode (and the only daemon you need to run is master) that causes newly created tables to go missing as unflushed data is thrown out. If there was not a good reason to kill master with SIGKILL perhaps we can take that special case out and rely on SIGTERM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-4209) The HBase hbase-daemon.sh SIGKILLs master when stopping it

Posted by "stack (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-4209:
-------------------------

    Attachment: 4209-v3.txt

Patch with the Todd fix and I swapped an unorthodox public AFTER a static qualifier.  I tried this patch and it works for me.  Committing this.
                
> The HBase hbase-daemon.sh SIGKILLs master when stopping it
> ----------------------------------------------------------
>
>                 Key: HBASE-4209
>                 URL: https://issues.apache.org/jira/browse/HBASE-4209
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: Roman Shaposhnik
>            Assignee: Roman Shaposhnik
>         Attachments: 4209-v3.txt, HBASE-4209.final.patch.txt, HBASE-4209.patch.txt
>
>
> There's a bit of code in hbase-daemon.sh that makes HBase master being SIGKILLed when stopping it rather than trying SIGTERM (like it does for other daemons). When HBase is executed in a standalone mode (and the only daemon you need to run is master) that causes newly created tables to go missing as unflushed data is thrown out. If there was not a good reason to kill master with SIGKILL perhaps we can take that special case out and rely on SIGTERM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4209) The HBase hbase-daemon.sh SIGKILLs master when stopping it

Posted by "Roman Shaposhnik (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13087966#comment-13087966 ] 

Roman Shaposhnik commented on HBASE-4209:
-----------------------------------------

stack, thanks for the clarifications -- much appreciated! Now, as I said, I only see addShutdownHook call in regionserver code:
{noformat}
$ git grep addShutdownHook
src/main/java/org/apache/hadoop/hbase/regionserver/ShutdownHook.java:    Runtime.getRuntime().addShutdownHook(t);
{noformat}
and then
{noformat}
 /**
   * @param hrs
   * @param name
   * @return Thread the RegionServer is running in correctly named.
   * @throws IOException
   */
  public static Thread startRegionServer(final HRegionServer hrs,
      final String name) throws IOException {
    Thread t = new Thread(hrs);
    t.setName(name);
    t.start();
    // Install shutdown hook that will catch signals and run an orderly shutdown
    // of the hrs.
    ShutdownHook.install(hrs.getConfiguration(), FileSystem.get(hrs
        .getConfiguration()), hrs, t);
    return t;
  }
{noformat}

Now, the behavior that I observe is that when HBase is running in a distributed fashion, flushing of the newly created tables happen when regionserve receives a SIGTERM. When it is running in a standalone configuration tables don't get flushed UNLESS I explicitly call stop-hbase.sh before sending a SIGTERM.

Once would assume that the same code would be called in both cases since it is registering that hook unconditionally. What am I missing here?

> The HBase hbase-daemon.sh SIGKILLs master when stopping it
> ----------------------------------------------------------
>
>                 Key: HBASE-4209
>                 URL: https://issues.apache.org/jira/browse/HBASE-4209
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: Roman Shaposhnik
>
> There's a bit of code in hbase-daemon.sh that makes HBase master being SIGKILLed when stopping it rather than trying SIGTERM (like it does for other daemons). When HBase is executed in a standalone mode (and the only daemon you need to run is master) that causes newly created tables to go missing as unflushed data is thrown out. If there was not a good reason to kill master with SIGKILL perhaps we can take that special case out and rely on SIGTERM.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4209) The HBase hbase-daemon.sh SIGKILLs master when stopping it

Posted by "Todd Lipcon (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13118319#comment-13118319 ] 

Todd Lipcon commented on HBASE-4209:
------------------------------------

+   * to be executed after the last regioserver referring to a given filesystem
typo (regioserver)

Otherwise looks pretty good. Thanks for fixing this messy bug.
                
> The HBase hbase-daemon.sh SIGKILLs master when stopping it
> ----------------------------------------------------------
>
>                 Key: HBASE-4209
>                 URL: https://issues.apache.org/jira/browse/HBASE-4209
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: Roman Shaposhnik
>            Assignee: Roman Shaposhnik
>         Attachments: HBASE-4209.final.patch.txt, HBASE-4209.patch.txt
>
>
> There's a bit of code in hbase-daemon.sh that makes HBase master being SIGKILLed when stopping it rather than trying SIGTERM (like it does for other daemons). When HBase is executed in a standalone mode (and the only daemon you need to run is master) that causes newly created tables to go missing as unflushed data is thrown out. If there was not a good reason to kill master with SIGKILL perhaps we can take that special case out and rely on SIGTERM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4209) The HBase hbase-daemon.sh SIGKILLs master when stopping it

Posted by "Roman Shaposhnik (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13109824#comment-13109824 ] 

Roman Shaposhnik commented on HBASE-4209:
-----------------------------------------

I was asked yesterday how this can be tested. I'm not quite sure I can come up with a traditional unit test for this change, but here's how to verify it manually:
{noformat}
$ bin/hbase master start & PID=$! ; sleep 15 ; kill $PID
{noformat}

And look for LOG messages of the following nature:

{noformat}
INFO regionserver.ShutdownHook: Shutdown hook starting; hbase.shutdown.hook=true; fsShutdownHook=Thread[Thread-18,5,main]
..............
INFO regionserver.ShutdownHook: Starting fs shutdown hook thread.
INFO regionserver.ShutdownHook: Shutdown hook finished.
{noformat}

Without the patch -- the HBase master simply dies (as in -- no messages get produced)

> The HBase hbase-daemon.sh SIGKILLs master when stopping it
> ----------------------------------------------------------
>
>                 Key: HBASE-4209
>                 URL: https://issues.apache.org/jira/browse/HBASE-4209
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: Roman Shaposhnik
>            Assignee: Roman Shaposhnik
>         Attachments: HBASE-4209.patch.txt
>
>
> There's a bit of code in hbase-daemon.sh that makes HBase master being SIGKILLed when stopping it rather than trying SIGTERM (like it does for other daemons). When HBase is executed in a standalone mode (and the only daemon you need to run is master) that causes newly created tables to go missing as unflushed data is thrown out. If there was not a good reason to kill master with SIGKILL perhaps we can take that special case out and rely on SIGTERM.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-4209) The HBase hbase-daemon.sh SIGKILLs master when stopping it

Posted by "Roman Shaposhnik (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roman Shaposhnik updated HBASE-4209:
------------------------------------

    Attachment: HBASE-4209.patch.txt

I'm attaching a sample patch that seems to be working well enough. The idea behind it is to reference count the fsShutdownHooks that we have to run from shutdown sequences of multiple RegionServers and only *really* run each when the last registered regionserver shuts down. Please let me know what do you think.

> The HBase hbase-daemon.sh SIGKILLs master when stopping it
> ----------------------------------------------------------
>
>                 Key: HBASE-4209
>                 URL: https://issues.apache.org/jira/browse/HBASE-4209
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: Roman Shaposhnik
>         Attachments: HBASE-4209.patch.txt
>
>
> There's a bit of code in hbase-daemon.sh that makes HBase master being SIGKILLed when stopping it rather than trying SIGTERM (like it does for other daemons). When HBase is executed in a standalone mode (and the only daemon you need to run is master) that causes newly created tables to go missing as unflushed data is thrown out. If there was not a good reason to kill master with SIGKILL perhaps we can take that special case out and rely on SIGTERM.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-4209) The HBase hbase-daemon.sh SIGKILLs master when stopping it

Posted by "Roman Shaposhnik (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roman Shaposhnik updated HBASE-4209:
------------------------------------

    Attachment: HBASE-4209.final.patch.txt

Latest version of patch attached. Unfortunately, it has grown since last time I've submitted it to incorporated feedback from this JIRA. 

All tests pass on my local workstation.
                
> The HBase hbase-daemon.sh SIGKILLs master when stopping it
> ----------------------------------------------------------
>
>                 Key: HBASE-4209
>                 URL: https://issues.apache.org/jira/browse/HBASE-4209
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: Roman Shaposhnik
>            Assignee: Roman Shaposhnik
>         Attachments: HBASE-4209.final.patch.txt, HBASE-4209.patch.txt
>
>
> There's a bit of code in hbase-daemon.sh that makes HBase master being SIGKILLed when stopping it rather than trying SIGTERM (like it does for other daemons). When HBase is executed in a standalone mode (and the only daemon you need to run is master) that causes newly created tables to go missing as unflushed data is thrown out. If there was not a good reason to kill master with SIGKILL perhaps we can take that special case out and rely on SIGTERM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4209) The HBase hbase-daemon.sh SIGKILLs master when stopping it

Posted by "Roman Shaposhnik (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090423#comment-13090423 ] 

Roman Shaposhnik commented on HBASE-4209:
-----------------------------------------

Perhaps this will be a naive suggestion, but wouldn't it make sense to install Shutdown hooks regardless of whether the Regioserver was started in a standalone mode. Will a trivial change like this one make sense:

{noformat}
--- src/main/java/org/apache/hadoop/hbase/util/JVMClusterUtil.java
+++ src/main/java/org/apache/hadoop/hbase/util/JVMClusterUtil.java
@@ -179,7 +179,11 @@ public class JVMClusterUtil {
     }
     if (regionservers != null) {
       for (JVMClusterUtil.RegionServerThread t: regionservers) {
-        t.start();
+          try {
+              HRegionServer.startRegionServer(t.getRegionServer());
+          } catch (IOException e) {
+              // Nothing to do, really
+          }
       }
     }
     if (masters == null || masters.isEmpty()) {
{noformat}

> The HBase hbase-daemon.sh SIGKILLs master when stopping it
> ----------------------------------------------------------
>
>                 Key: HBASE-4209
>                 URL: https://issues.apache.org/jira/browse/HBASE-4209
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: Roman Shaposhnik
>
> There's a bit of code in hbase-daemon.sh that makes HBase master being SIGKILLed when stopping it rather than trying SIGTERM (like it does for other daemons). When HBase is executed in a standalone mode (and the only daemon you need to run is master) that causes newly created tables to go missing as unflushed data is thrown out. If there was not a good reason to kill master with SIGKILL perhaps we can take that special case out and rely on SIGTERM.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-4209) The HBase hbase-daemon.sh SIGKILLs master when stopping it

Posted by "stack (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-4209:
-------------------------

       Resolution: Fixed
    Fix Version/s: 0.92.0
     Hadoop Flags: Reviewed
           Status: Resolved  (was: Patch Available)

Committed to branch 0.92 and trunk.  Thank you for the patch Roman.
                
> The HBase hbase-daemon.sh SIGKILLs master when stopping it
> ----------------------------------------------------------
>
>                 Key: HBASE-4209
>                 URL: https://issues.apache.org/jira/browse/HBASE-4209
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: Roman Shaposhnik
>            Assignee: Roman Shaposhnik
>             Fix For: 0.92.0
>
>         Attachments: 4209-v3.txt, HBASE-4209.final.patch.txt, HBASE-4209.patch.txt
>
>
> There's a bit of code in hbase-daemon.sh that makes HBase master being SIGKILLed when stopping it rather than trying SIGTERM (like it does for other daemons). When HBase is executed in a standalone mode (and the only daemon you need to run is master) that causes newly created tables to go missing as unflushed data is thrown out. If there was not a good reason to kill master with SIGKILL perhaps we can take that special case out and rely on SIGTERM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4209) The HBase hbase-daemon.sh SIGKILLs master when stopping it

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090793#comment-13090793 ] 

stack commented on HBASE-4209:
------------------------------

Have you tried it Roman?  Do all the tests pass?  Seems like an innocuous enough change.

> The HBase hbase-daemon.sh SIGKILLs master when stopping it
> ----------------------------------------------------------
>
>                 Key: HBASE-4209
>                 URL: https://issues.apache.org/jira/browse/HBASE-4209
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: Roman Shaposhnik
>
> There's a bit of code in hbase-daemon.sh that makes HBase master being SIGKILLed when stopping it rather than trying SIGTERM (like it does for other daemons). When HBase is executed in a standalone mode (and the only daemon you need to run is master) that causes newly created tables to go missing as unflushed data is thrown out. If there was not a good reason to kill master with SIGKILL perhaps we can take that special case out and rely on SIGTERM.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4209) The HBase hbase-daemon.sh SIGKILLs master when stopping it

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13086004#comment-13086004 ] 

stack commented on HBASE-4209:
------------------------------

I do not not remember why it SIGKILLs (I was going to blame hadoop saying we copied all from there but I just checked and don't see it there so it must be an addition of ours). If plain kill works for you, make a patch and we'll add it.  Thanks Roman.

> The HBase hbase-daemon.sh SIGKILLs master when stopping it
> ----------------------------------------------------------
>
>                 Key: HBASE-4209
>                 URL: https://issues.apache.org/jira/browse/HBASE-4209
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: Roman Shaposhnik
>
> There's a bit of code in hbase-daemon.sh that makes HBase master being SIGKILLed when stopping it rather than trying SIGTERM (like it does for other daemons). When HBase is executed in a standalone mode (and the only daemon you need to run is master) that causes newly created tables to go missing as unflushed data is thrown out. If there was not a good reason to kill master with SIGKILL perhaps we can take that special case out and rely on SIGTERM.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4209) The HBase hbase-daemon.sh SIGKILLs master when stopping it

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092174#comment-13092174 ] 

stack commented on HBASE-4209:
------------------------------

bq. suppressHdfsShutdownHook can't disable the HDFS shutdown hook of the miniDFS, simply because there's nothing to disable.

There is no hdfs shutdown hook when running minihdfs?

And I don't follow how testing for localfs helps.  I'm not clear on why testing for local fs makes a differenc when the shutdown hook is not in minidfs (?).  I'm missing something.

bq. I'd appreciate your advice, since I'm just learning my way around HBase.

You seem to be doing just fine to me.  Obviously from above exchange you know more about this area of the code than I do (and I think I wrote it?).  That suppressHdfsShutdownHook is really pretty (smile).  Thats some ugly gymnastics going on in there.


> The HBase hbase-daemon.sh SIGKILLs master when stopping it
> ----------------------------------------------------------
>
>                 Key: HBASE-4209
>                 URL: https://issues.apache.org/jira/browse/HBASE-4209
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: Roman Shaposhnik
>
> There's a bit of code in hbase-daemon.sh that makes HBase master being SIGKILLed when stopping it rather than trying SIGTERM (like it does for other daemons). When HBase is executed in a standalone mode (and the only daemon you need to run is master) that causes newly created tables to go missing as unflushed data is thrown out. If there was not a good reason to kill master with SIGKILL perhaps we can take that special case out and rely on SIGTERM.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4209) The HBase hbase-daemon.sh SIGKILLs master when stopping it

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088152#comment-13088152 ] 

stack commented on HBASE-4209:
------------------------------

Is it because no shutdown hook in master and when in standalone mode all runs in the one jvm, the master's effectively?

In start-hbase.sh, if distmode is false, we ONLY start master:

{code}
if [ "$distMode" == 'false' ]
then
  "$bin"/hbase-daemon.sh start master
else
  "$bin"/hbase-daemons.sh --config "${HBASE_CONF_DIR}" start zookeeper
  "$bin"/hbase-daemon.sh --config "${HBASE_CONF_DIR}" start master
  "$bin"/hbase-daemons.sh --config "${HBASE_CONF_DIR}" \
    --hosts "${HBASE_REGIONSERVERS}" start regionserver
  "$bin"/hbase-daemons.sh --config "${HBASE_CONF_DIR}" \
    --hosts "${HBASE_BACKUP_MASTERS}" start master-backup
fi
{code}

Inside in master it will take care of starting up all the other beasties if distmode == false.

> The HBase hbase-daemon.sh SIGKILLs master when stopping it
> ----------------------------------------------------------
>
>                 Key: HBASE-4209
>                 URL: https://issues.apache.org/jira/browse/HBASE-4209
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: Roman Shaposhnik
>
> There's a bit of code in hbase-daemon.sh that makes HBase master being SIGKILLed when stopping it rather than trying SIGTERM (like it does for other daemons). When HBase is executed in a standalone mode (and the only daemon you need to run is master) that causes newly created tables to go missing as unflushed data is thrown out. If there was not a good reason to kill master with SIGKILL perhaps we can take that special case out and rely on SIGTERM.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-4209) The HBase hbase-daemon.sh SIGKILLs master when stopping it

Posted by "Roman Shaposhnik (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roman Shaposhnik updated HBASE-4209:
------------------------------------

    Assignee: Roman Shaposhnik
      Status: Patch Available  (was: Open)

> The HBase hbase-daemon.sh SIGKILLs master when stopping it
> ----------------------------------------------------------
>
>                 Key: HBASE-4209
>                 URL: https://issues.apache.org/jira/browse/HBASE-4209
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: Roman Shaposhnik
>            Assignee: Roman Shaposhnik
>         Attachments: HBASE-4209.patch.txt
>
>
> There's a bit of code in hbase-daemon.sh that makes HBase master being SIGKILLed when stopping it rather than trying SIGTERM (like it does for other daemons). When HBase is executed in a standalone mode (and the only daemon you need to run is master) that causes newly created tables to go missing as unflushed data is thrown out. If there was not a good reason to kill master with SIGKILL perhaps we can take that special case out and rely on SIGTERM.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira