Posted to user@hbase.apache.org by Andrey Timerbaev <at...@gmx.net> on 2010/08/26 17:16:34 UTC

RegionServer can't recover after a failure

Dear experts,

Could you kindly suggest how to help the RegionServer complete
initialization in the following situation?

After a failure of one of the RegionServers, which runs on a dedicated node in
an HBase/Hadoop cluster (HBase v.0.20.3), the RegionServer can't initialize the
available tables. The region server's log contains this exception:

2010-08-26 18:56:49,073 INFO org.apache.hadoop.hbase.regionserver.transactional.THLogRecoveryManager: Region log has 9 unfinished transactions. Going to the transaction log to resolve
2010-08-26 18:56:49,091 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Cache hit for row <> in tableName .META.: location server 10.2.146.41:60020, location region name .META.,,1
2010-08-26 18:56:49,178 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Error opening STAT_STARTUPS_TABLE,,1282738349665
java.lang.RuntimeException: Table not created. Call createTable() first
        at org.apache.hadoop.hbase.client.transactional.HBaseBackedTransactionLogger.initTable(HBaseBackedTransactionLogger.java:76)
        at org.apache.hadoop.hbase.client.transactional.HBaseBackedTransactionLogger.<init>(HBaseBackedTransactionLogger.java:69)
        at org.apache.hadoop.hbase.regionserver.transactional.THLogRecoveryManager.getGlobalTransactionLog(THLogRecoveryManager.java:256)
        at org.apache.hadoop.hbase.regionserver.transactional.THLogRecoveryManager.resolvePendingTransaction(THLogRecoveryManager.java:225)
        at org.apache.hadoop.hbase.regionserver.transactional.THLogRecoveryManager.getCommitsFromLog(THLogRecoveryManager.java:206)
        at org.apache.hadoop.hbase.regionserver.transactional.TransactionalRegion.doReconstructionLog(TransactionalRegion.java:145)
        at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:326)
        at org.apache.hadoop.hbase.regionserver.transactional.TransactionalRegionServer.instantiateRegion(TransactionalRegionServer.java:121)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:1531)
        at org.apache.hadoop.hbase.regionserver.HRegionServer$Worker.run(HRegionServer.java:1451)
        at java.lang.Thread.run(Thread.java:619)

After looking into the HBase source code, I found that the "Table not created.
Call createTable()" message appears when the HBaseBackedTransactionLogger is
unable to find the __GLOBAL_TRX_LOG__ table. However, I have no idea where the
table should be, whether it is critical, or what I should do in this situation.

Any help is appreciated.

Andrey



Re: RegionServer can't recover after a failure

Posted by Andrey Timerbaev <at...@gmx.net>.
> The issue is that the global transaction log table is not created yet. You can
> do so simply by calling HBaseBackedTransactionLogger.createTable() at the time
> when you are seeding the rest of your tables.

> James Kennedy
> Project Manager
> Troove Inc.

Hello James,

Thank you for the comment. Do I understand correctly that the createTable()
method should be called before the very first transaction is performed? In other
words, in our situation, where the region servers report "Region log has 9
unfinished transactions" and the newly created __GLOBAL_TRX_LOG__ table exists
but is empty, creating the table now won't help?

Perhaps, for the moment, we could clean the log where the unfinished
transactions are recorded, so that the RegionServers no longer try to recover
them? Could you kindly suggest where they are logged?

Thank you,
Andrey


Re: RegionServer can't recover after a failure

Posted by Naresh Rapolu <nr...@purdue.edu>.
Hello James,

Can you explain a bit more about the design of this global transaction
log table? When is it supposed to be created, and how does it solve the
problem of a COMMIT entry missing from the region server's WAL when the
server dies?

Thanks,
Naresh.

On 08/27/2010 07:26 PM, James Kennedy wrote:
> I can help. I'm a developer on the transactional hbase extension which you must be using.
>
> The issue is that the global transaction log table is not created yet. You can do so simply by calling
> HBaseBackedTransactionLogger.createTable() at the time when you are seeding the rest of your tables.
>
> I apologize that the extension as given in GitHub is not yet mature.  While it works (HBase 0.21) it is poorly documented and needs more thorough testing.
>
> We have recently updated it to work with HBase 0.89.20100726 and it is much more stable and very slightly better documented. We are waiting for an HBase patch submission before we push it to hbase-trx at github.
>
> Thanks,
>
> James Kennedy
> Project Manager
> Troove Inc.


Re: RegionServer can't recover after a failure

Posted by James Kennedy <jk...@troove.net>.
I can help. I'm a developer on the transactional hbase extension which you must be using.

The issue is that the global transaction log table is not created yet. You can do so simply by calling
HBaseBackedTransactionLogger.createTable() at the time when you are seeding the rest of your tables.
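
As a rough illustration of the advice above (this is not code from the hbase-trx extension), the sketch below models the failure and the fix in plain Java without any HBase classes. The Catalog and TransactionLogger classes, the seed() helper, and the idempotent behaviour of createTable() are hypothetical stand-ins and assumptions; only the table name, the exception message, and the idea of creating the log table while seeding the other tables come from this thread.

```java
import java.util.HashSet;
import java.util.Set;

/** Stand-alone model of the failure and the fix; no HBase classes are used. */
public class TrxLogSeedingSketch {

    // Table name as quoted in the thread.
    static final String GLOBAL_TRX_LOG = "__GLOBAL_TRX_LOG__";

    /** Hypothetical in-memory stand-in for the cluster's table catalog. */
    static class Catalog {
        private final Set<String> tables = new HashSet<>();

        boolean tableExists(String name) {
            return tables.contains(name);
        }

        void createTable(String name) {
            tables.add(name);
        }
    }

    /** Hypothetical stand-in for HBaseBackedTransactionLogger. */
    static class TransactionLogger {
        TransactionLogger(Catalog catalog) {
            // Mirrors the stack trace: construction fails when the table is missing.
            if (!catalog.tableExists(GLOBAL_TRX_LOG)) {
                throw new RuntimeException("Table not created. Call createTable() first");
            }
        }

        /** Assumed to be idempotent in this sketch; the real method may differ. */
        static void createTable(Catalog catalog) {
            if (!catalog.tableExists(GLOBAL_TRX_LOG)) {
                catalog.createTable(GLOBAL_TRX_LOG);
            }
        }
    }

    /** Seeding step: create the log table together with the application tables. */
    static void seed(Catalog catalog, String... appTables) {
        TransactionLogger.createTable(catalog);
        for (String table : appTables) {
            catalog.createTable(table);
        }
    }

    /** Returns true if a logger can be constructed against this catalog. */
    static boolean canOpenLogger(Catalog catalog) {
        try {
            new TransactionLogger(catalog);
            return true;
        } catch (RuntimeException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Catalog catalog = new Catalog();
        System.out.println("logger usable before seeding: " + canOpenLogger(catalog));
        seed(catalog, "STAT_STARTUPS_TABLE");
        System.out.println("logger usable after seeding: " + canOpenLogger(catalog));
    }
}
```

The sketch only covers the ordering problem (the log table must exist before a logger is constructed). It says nothing about how transactions that were already pending before the table existed get resolved.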

I apologize that the extension as given in GitHub is not yet mature.  While it works (HBase 0.21) it is poorly documented and needs more thorough testing.

We have recently updated it to work with HBase 0.89.20100726 and it is much more stable and very slightly better documented. We are waiting for an HBase patch submission before we push it to hbase-trx at github.

Thanks,

James Kennedy
Project Manager
Troove Inc.



Re: RegionServer can't recover after a failure

Posted by Andrey Timerbaev <at...@gmx.net>.
> You are running transactional hbase?  This is intentional I take it.

> Me neither.  Let me poke the transactional fellows and see if they can
> offer help.
> 
> Thanks,
> St.Ack

Yes, I'm running transactional HBase.

Thanks in advance for involving any of the transactional experts.

Andrey





Re: RegionServer can't recover after a failure

Posted by Stack <st...@duboce.net>.
On Thu, Aug 26, 2010 at 8:16 AM, Andrey Timerbaev <at...@gmx.net> wrote:
> Dear experts,
>
> Could you kindly suggest how to help the RegionServer complete
> initialization in the following situation?
>
> After a failure of one of the RegionServers, which runs on a dedicated node in
> an HBase/Hadoop cluster (HBase v.0.20.3), the RegionServer can't initialize the
> available tables. The region server's log contains this exception:
>

You are running transactional hbase?  This is intentional I take it.

> After looking into the HBase source code, I found that the "Table not created.
> Call createTable()" message appears when the HBaseBackedTransactionLogger is
> unable to find the __GLOBAL_TRX_LOG__ table. However, I have no idea where the
> table should be, whether it is critical, or what I should do in this situation.
>
Me neither.  Let me poke the transactional fellows and see if they can
offer help.

Thanks,
St.Ack