Posted to user@hbase.apache.org by Matthias Hofschen <ho...@gmail.com> on 2011/08/11 11:33:13 UTC

Check permissions on unix filesystem?

Hi,
we had an interesting failure yesterday on the old 0.20.4 version of HBase.
I realize that this is a very old version, but I am wondering whether this
issue is still present and should be fixed.
We added a new node to a 44-node cluster and started the datanode and
regionserver processes on it. The Unix filesystem was configured
incorrectly, i.e. /tmp was not writable by the hadoop process. Both the
datanode and regionserver processes ran into the permission problem.
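
As a rough illustration of the check that was missing on the new node, the
sketch below probes the JVM temp directory with the same File.createTempFile
call that fails in the traces. It is not HBase or Hadoop code; the class name
TmpDirCheck and the method canCreateTempFile are hypothetical names chosen
for this example.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical pre-flight check, not part of HBase or Hadoop.
class TmpDirCheck {

    /** Returns true if a temp file can be created in dir and removed again. */
    static boolean canCreateTempFile(File dir) {
        try {
            // Same JDK call that throws "Permission denied" in the traces.
            File probe = File.createTempFile("probe", ".tmp", dir);
            return probe.delete();
        } catch (IOException | SecurityException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        File tmp = new File(System.getProperty("java.io.tmpdir"));
        if (!canCreateTempFile(tmp)) {
            System.err.println("Temp dir not writable: " + tmp);
            System.exit(1); // refuse to start rather than run half-initialized
        }
        System.out.println("Temp dir writable: " + tmp);
    }
}
```

Run as the same Unix user that starts the daemons, something like this would
presumably have flagged the /tmp problem before either process was started.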

The datanode process stopped with an error message:

2011-08-06 23:37:20,469 WARN org.mortbay.log: tmpdir
java.io.IOException: Permission denied
        at java.io.UnixFileSystem.createFileExclusively(Native Method)
        at java.io.File.checkAndCreate(File.java:1704)
        at java.io.File.createTempFile(File.java:1792)
        at java.io.File.createTempFile(File.java:1828)
        at org.mortbay.jetty.webapp.WebAppContext.getTempDirectory(WebAppContext.java:745)
        ....
2011-08-06 23:37:20,471 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at hdpxxx
************************************************************/

The regionserver did not stop even though the error was logged:

2011-08-07 00:07:39.742::WARN:  tmpdir
java.io.IOException: Permission denied
        at java.io.UnixFileSystem.createFileExclusively(Native Method)
        at java.io.File.checkAndCreate(File.java:1704)
        at java.io.File.createTempFile(File.java:1792)
        at java.io.File.createTempFile(File.java:1828)
        at org.mortbay.jetty.webapp.WebAppContext.getTempDirectory(WebAppContext.java:745)
        .......
        at org.apache.hadoop.http.HttpServer.start(HttpServer.java:461)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.startServiceThreads(HRegionServer.java:1168)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.init(HRegionServer.java:792)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:430)

In fact, to the master process the regionserver looked fine, so the master
kept trying to assign regions to it, and the regionserver rejected them. As a
result the master/balancer went into an assign/reassign cycle that
destabilized the cluster. Many puts and gets simply failed with
NotServingRegionExceptions and took a long time to complete.
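
To make the difference between the two behaviours concrete, here is a minimal
sketch of a fail-fast startup sequence: the first step that throws aborts the
whole start instead of being logged while the process keeps running. This is
only an illustration, not how HRegionServer is written; FailFastStartup,
StartupStep and startAll are hypothetical names.

```java
import java.io.IOException;

// Hypothetical illustration of fail-fast startup, not HBase code.
class FailFastStartup {

    /** One startup action that may fail with an I/O error. */
    interface StartupStep {
        void run() throws IOException;
    }

    /** Runs steps in order; stops and returns false at the first failure. */
    static boolean startAll(StartupStep... steps) {
        for (StartupStep step : steps) {
            try {
                step.run();
            } catch (IOException e) {
                System.err.println("Startup step failed, aborting: " + e.getMessage());
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        boolean ok = startAll(
            () -> System.out.println("first service started"),
            () -> { throw new IOException("Permission denied"); },
            () -> System.out.println("never reached"));
        if (!ok) {
            System.exit(1); // a dead process is easier for a master to handle than a half-started one
        }
    }
}
```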

Please advise whether this may still be a problem in the 0.90.x code.

Cheers Matthias

Re: Check permissions on unix filesystem?

Posted by Matthias Hofschen <ho...@gmail.com>.
Hi,
I added HBASE-4202 for this.
I hope the stack traces are enough.
Matthias


Re: Check permissions on unix filesystem?

Posted by Stack <st...@duboce.net>.
Mind making an issue and pasting full stack traces with some
surrounding log? My guess is we likely do the same in 0.90. Your
snippets will help us figure out where to dig in.

Thanks Matthias,
St.Ack
