Posted to dev@hbase.apache.org by Cristian Ivascu <ci...@adobe.com> on 2008/08/21 17:04:28 UTC

HBASE-766 still seems to reproduce on hbase 0.2.0

Hi all,

I have been experimenting with HBase for the last few days and hit a snag: while filling an HBase table via 25 Thrift-generated clients (the injectors), it crashed after a few hours. After reading through the logs and the issue tracker, it looks just like HBASE-766.

Errors: in the log file I get a bunch of NotServingRegionExceptions:

error: org.apache.hadoop.hbase.NotServingRegionException: Region test_user,81B09FA2-9A9F-42CD-9D15-D1FE608D39D4,1219115748711 closed.

Also, in the Hbase-hadoop-master log file, I get "File does not exist" exceptions, like the one below (but with different row keys).
2008-08-20 02:13:24,135 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_CLOSE: test_user,C8441FEF-2809-4B1B-881A-6CDC5F21BECA,1219127080127: java.io.FileNotFoundException: File does not exist: hdfs://192.168.1.101:54310/hbase/test_user/203999734/subscribed_feed/mapfiles/5434845147549741098/data
    at org.apache.hadoop.dfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:369)
    at org.apache.hadoop.hbase.regionserver.HStoreFile.length(HStoreFile.java:444)
    at org.apache.hadoop.hbase.regionserver.HStore.loadHStoreFiles(HStore.java:436)
    at org.apache.hadoop.hbase.regionserver.HStore.<init>(HStore.java:218)
    at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:1653)
    at org.apache.hadoop.hbase.regionserver.HRegion.<init>(HRegion.java:470)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.instantiateRegion(HRegionServer.java:902)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:877)
    at org.apache.hadoop.hbase.regionserver.HRegionServer$Worker.run(HRegionServer.java:817)
    at java.lang.Thread.run(Thread.java:619)
 from 192.168.1.103:60020

After a few of these, the entire cluster (5 machines) went down.
I'm using HBase 0.2.0 and Hadoop 0.17.1.
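
For anyone triaging a similar log, here is a minimal Python sketch (not part of HBase) that pulls the region name and the missing HDFS path out of MSG_REPORT_CLOSE lines like the one above. The regular expression is an assumption based only on that single line, and the names `missing_files` and `count_by_table` are hypothetical.

```python
import re
from collections import Counter

# Pattern assumed from the one master-log line quoted above; other HBase
# versions may format this message differently.
CLOSE_RE = re.compile(
    r"Received MSG_REPORT_CLOSE: (?P<region>\S+): "
    r"java\.io\.FileNotFoundException: File does not exist: (?P<path>\S+)"
)


def missing_files(lines):
    """Yield (region_name, hdfs_path) for each matching master-log line."""
    for line in lines:
        match = CLOSE_RE.search(line)
        if match:
            yield match.group("region"), match.group("path")


def count_by_table(pairs):
    """Count failures per table; the table is the text before the first comma."""
    return Counter(region.split(",", 1)[0] for region, _ in pairs)
```

Passing the open master log file to `missing_files` and the result to `count_by_table` should show whether the failures cluster on one table or are spread across several.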

Any ideas?

Thanks,
Cristian Ivascu

Re: FW: HBASE-766 still seems to reproduce on hbase 0.2.0

Posted by Ryan Smith <ry...@gmail.com>.
Try building the latest from branches/0.2 in the repository and re-run.  I
got this issue on the 0.2.0 release myself.  I rebuilt the branch at
revision 686335 and had better stability and performance than with the 0.2.0
release.  I'm going to try revision 688817, which is the current one right
now.

-Ryan


Re: FW: HBASE-766 still seems to reproduce on hbase 0.2.0

Posted by Andrew Purtell <ap...@yahoo.com>.
Welcome back Stack.


--- On Mon, 8/25/08, stack <st...@duboce.net> wrote:

> From: stack <st...@duboce.net>
> Subject: Re: FW: HBASE-766 still seems to reproduce on hbase 0.2.0
> To: hbase-user@hadoop.apache.org
> Date: Monday, August 25, 2008, 1:23 PM
> If it's not an HDFS issue, as Andrew speculates (check back
> earlier in 
> the logs particularly for region 
> test_user,81B09FA2-9A9F-42CD-9D15-D1FE608D39D4,1219115748711),
> it looks 
> like hbase-766 -- though this is supposed to be fixed.
> 
> If you try to do a './bin/hadoop fs -lsr 
> hdfs://192.168.1.101:54310/hbase/test_user/203999734/subscribed_feed/mapfiles/5434845147549741098/',
> 
> there is nothing there?
> 
> Is this a fresh upload or an upload against data that has
> been migrated 
> or created with a version of hbase previous to 0.2.0
> release?
> 
> Thanks,
> St.Ack
> 
> 

Re: FW: HBASE-766 still seems to reproduce on hbase 0.2.0

Posted by stack <st...@duboce.net>.
If it's not an HDFS issue, as Andrew speculates (check back earlier in 
the logs particularly for region 
test_user,81B09FA2-9A9F-42CD-9D15-D1FE608D39D4,1219115748711), it looks 
like hbase-766 -- though this is supposed to be fixed.

If you run './bin/hadoop fs -lsr 
hdfs://192.168.1.101:54310/hbase/test_user/203999734/subscribed_feed/mapfiles/5434845147549741098/', 
is there nothing there?
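
A small Python sketch of the check above, assuming you start from the full path of the missing file as it appears in the FileNotFoundException. It only builds the argument list for the listing command; `lsr_command` is a hypothetical helper name, and `./bin/hadoop` is taken from the command quoted above.

```python
import posixpath


def lsr_command(missing_file, hadoop_bin="./bin/hadoop"):
    """Return argv for recursively listing the directory that should hold missing_file."""
    # Drop the final path component (e.g. ".../data") to get the mapfile directory.
    parent = posixpath.dirname(missing_file.rstrip("/"))
    return [hadoop_bin, "fs", "-lsr", parent + "/"]
```

The returned list can be handed to `subprocess.run` from the HBase/Hadoop install directory; an empty listing or an error would suggest the whole mapfile directory is gone rather than just the `data` file.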

Is this a fresh upload, or an upload against data that was migrated 
or created with a version of HBase prior to the 0.2.0 release?

Thanks,
St.Ack




Re: FW: HBASE-766 still seems to reproduce on hbase 0.2.0

Posted by Andrew Purtell <ap...@yahoo.com>.
Hello Cristian,

Is there anything that appears relevant in the DFS namenode or datanode logs? Also, please use Hadoop 0.17.2 instead of 0.17.1 to avoid having the datanode logs filled with noise; 0.17.2 also contains other critical fixes for DFS problems.

On our clusters, I have found that the first failures we see under load are often at the DFS layer, and what manifests at the HBase layer is only a symptom of that.
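
One way to look for DFS-level trouble that precedes the HBase symptom is to filter the namenode and datanode logs by time. The Python sketch below is only an illustration: it assumes the DFS logs use the same "yyyy-MM-dd HH:mm:ss,SSS LEVEL ..." prefix as the master log line quoted in this thread, and `parse_ts` and `suspicious_before` are hypothetical names.

```python
from datetime import datetime, timedelta

# Assumed timestamp layout, e.g. "2008-08-20 02:13:24,135" (23 characters).
TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def parse_ts(line):
    """Return the leading timestamp of a log line, or None if it has none."""
    try:
        return datetime.strptime(line[:23], TS_FORMAT)
    except ValueError:
        return None


def suspicious_before(lines, failure_time, window=timedelta(minutes=10)):
    """Return WARN/ERROR/FATAL lines logged within `window` before failure_time."""
    hits = []
    for line in lines:
        ts = parse_ts(line)
        if ts is None:
            continue  # stack-trace continuation lines carry no timestamp
        parts = line[23:].split(None, 1)
        level = parts[0] if parts else ""
        in_window = failure_time - window <= ts <= failure_time
        if level in {"WARN", "ERROR", "FATAL"} and in_window:
            hits.append(line.rstrip("\n"))
    return hits
```

Here `failure_time` would be the timestamp of the first FileNotFoundException in the master log, and the ten-minute window is an arbitrary starting point to widen or narrow as needed.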

Hope that helps,

   - Andy


--- On Sun, 8/24/08, Cristian Ivascu <ci...@adobe.com> wrote:

> From: Cristian Ivascu <ci...@adobe.com>
> Subject: FW: HBASE-766 still seems to reproduce on hbase 0.2.0
> To: "hbase-user@hadoop.apache.org" <hb...@hadoop.apache.org>
> Date: Sunday, August 24, 2008, 11:28 PM

FW: HBASE-766 still seems to reproduce on hbase 0.2.0

Posted by Cristian Ivascu <ci...@adobe.com>.
Re-posting this to a more appropriate list - this is something that's more likely to appear when using HBase.
