Posted to user@hbase.apache.org by Ramkrishna S Vasudevan <ra...@huawei.com> on 2011/04/14 15:09:43 UTC

Cannot transition the unassigned node in opening state due to version mismatch

We ran the HBase cluster for one whole night.

Then we tried to restart the region server, and we got an error saying that
the transition of a region failed due to a version mismatch in the
unassigned node.

The META region was being loaded as part of log replay when we got the
transition error.

The surprising part was that the size of the mismatch was sometimes 1 and
sometimes 2 for the same region.

 

2011-04-14 11:07:19,786 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=106.56 KB, free=12.28 MB, max=12.39 MB, blocks=2, accesses=2, hits=0, hitRatio=0.00%%, cachingAccesses=2, cachingHits=0, cachingHitsRatio=0.00%%, evictions=0, evicted=0, evictedPerRun=NaN
2011-04-14 11:07:35,339 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x12f52766dcd0001 Attempting to transition node 1028785192/.META. from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENING
2011-04-14 11:07:35,350 WARN org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x12f52766dcd0001 Attempt to transition the unassigned node for 1028785192 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENING failed, the node existed but was version 2 not the expected version 1
2011-04-14 11:07:35,350 WARN org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed refreshing OPENING; region=1028785192, context=open_region_progress
2011-04-14 11:07:35,350 WARN org.apache.hadoop.hbase.regionserver.HRegion: Progressable reporter failed, stopping replay
2011-04-14 11:07:35,386 ERROR org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open of region=.META.,,1.1028785192

regionserver:60020-0x12f52766dcd0001 Attempting to transition node 1028785192/.META. from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENING
2011-04-14 11:08:33,744 WARN org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x12f52766dcd0001 Attempt to transition the unassigned node for 1028785192 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENING failed, the node existed but was version 5 not the expected version 3
2011-04-14 11:08:33,744 WARN org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed refreshing OPENING; region=1028785192, context=open_region_progress
2011-04-14 11:08:33,744 WARN org.apache.hadoop.hbase.regionserver.HRegion: Progressable reporter failed, stopping replay
2011-04-14 11:08:33,775 ERROR org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open of region=.META.,,1.1028785192

 

 

****************************************************************************
***********
This e-mail and attachments contain confidential information from HUAWEI,
which is intended only for the person or entity whose address is listed
above. Any use of the information contained herein in any way (including,
but not limited to, total or partial disclosure, reproduction, or
dissemination) by persons other than the intended recipient's) is
prohibited. If you receive this e-mail in error, please notify the sender by
phone or email immediately and delete it!

 


Re: Cannot transition the unassigned node in opening state due to version mismatch

Posted by Stack <st...@duboce.net>.
The 'version' above is the znode version.  One RS tried to move a
state from OPENING to OPENING, but it looks like it failed because the
version it expected to find in zk had moved on.  Likely the region has
been picked up by another server because the first one took too long
opening it.

St.Ack
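
The versioned write described above can be sketched in a few lines. This is
only an in-memory illustration under stated assumptions: it does not talk to
ZooKeeper or HBase, and VersionedNode, BadVersionError, simulate and the
"OFFLINE" starting state are hypothetical names made up for the example. In
the real ZooKeeper Java client the equivalent conditional write is
setData(path, data, version), which fails with
KeeperException.BadVersionException when the version does not match.

```python
class BadVersionError(Exception):
    """Raised when the caller's expected version no longer matches the node."""

    def __init__(self, actual, expected):
        super().__init__(
            f"node was version {actual} not the expected version {expected}"
        )
        self.actual = actual
        self.expected = expected


class VersionedNode:
    """Hypothetical stand-in for a znode: some data plus a version counter."""

    def __init__(self, data):
        self.data = data
        self.version = 0

    def set_data(self, data, expected_version):
        # Conditional write: only succeeds if nobody wrote since our last read.
        if expected_version != self.version:
            raise BadVersionError(self.version, expected_version)
        self.data = data
        # Every successful write bumps the version, even if the data is unchanged.
        self.version += 1
        return self.version


def simulate(intervening_writes):
    """Return (actual, expected) if the OPENING refresh fails, else None."""
    node = VersionedNode("OFFLINE")  # placeholder initial state (assumption)
    # The region server claims the region and remembers the resulting version.
    held = node.set_data("RS_ZK_REGION_OPENING", expected_version=0)
    # Some other actor writes to the same node while the open is in progress.
    for _ in range(intervening_writes):
        node.set_data("RS_ZK_REGION_OPENING", expected_version=node.version)
    # The region server refreshes OPENING -> OPENING with its stale version.
    try:
        node.set_data("RS_ZK_REGION_OPENING", expected_version=held)
        return None
    except BadVersionError as err:
        return (err.actual, err.expected)


for writes in (0, 1, 2):
    print(writes, simulate(writes))
```

With one intervening write the refresh fails with an actual version one ahead
of the expected one; with two intervening writes the gap is two. In this toy
model, that is all a mismatch of 1 versus 2 means: how many other writes
landed between the region server's last successful write and its refresh.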

On Fri, Apr 15, 2011 at 2:17 AM, Ramkrishna S Vasudevan
<ra...@huawei.com> wrote:
> Yes, I am looking into the logs as well.
>
> Here we are simulating a client with 25 threads that is trying to
> insert data.
>
> It would be great if you could tell us when a version mismatch can happen
> in the unassigned node.
>
> Regards
> Ram

RE: Cannot transition the unassigned node in opening state due to version mismatch

Posted by Ramkrishna S Vasudevan <ra...@huawei.com>.
Yes, I am looking into the logs as well.

Here we are simulating a client with 25 threads that is trying to
insert data.

It would be great if you could tell us when a version mismatch can happen
in the unassigned node.

Regards
Ram



-----Original Message-----
From: jdcryans@gmail.com [mailto:jdcryans@gmail.com] On Behalf Of
Jean-Daniel Cryans
Sent: Friday, April 15, 2011 12:23 AM
To: user@hbase.apache.org; ramakrishnas@huawei.com
Subject: Re: Cannot transition the unassigned node in opening state due to
version mismatch

Ok, so this is the result of a chain of events, without which we can't
really tell what was going on. You need to find more information about
that region in the master log and the region server logs; try to find
its story.

BTW which version of HBase is this?

J-D



Re: Cannot transition the unassigned node in opening state due to version mismatch

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Ok, so this is the result of a chain of events, without which we can't
really tell what was going on. You need to find more information about
that region in the master log and the region server logs; try to find
its story.

BTW which version of HBase is this?

J-D
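
One way to piece together a region's story is to pull every line that
mentions the region out of each log and merge them into a single timeline.
The sketch below is a rough illustration, not an HBase tool: region_story
and parse_ts are hypothetical helper names, and it assumes each log entry
sits on one line starting with a timestamp in the
"2011-04-14 11:07:35,350" form seen in this thread. The sample lines are
copied from the original post.

```python
from datetime import datetime

TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def parse_ts(line):
    """Return the leading timestamp of a log line, or None if it has none."""
    try:
        return datetime.strptime(line[:23], TS_FORMAT)
    except ValueError:
        return None


def region_story(sources, region):
    """Merge lines mentioning `region` from several logs into one timeline.

    `sources` maps a label (for example "master") to an iterable of lines,
    such as an open file. Lines without a leading timestamp are skipped.
    """
    events = []
    for name, lines in sources.items():
        for line in lines:
            ts = parse_ts(line)
            if ts is not None and region in line:
                events.append((ts, name, line.rstrip()))
    return sorted(events)


# Region server lines taken from the original post.
sample = [
    "2011-04-14 11:07:35,350 WARN org.apache.hadoop.hbase.regionserver.HRegion: "
    "Progressable reporter failed, stopping replay",
    "2011-04-14 11:07:35,350 WARN org.apache.hadoop.hbase.regionserver.handler."
    "OpenRegionHandler: Failed refreshing OPENING; region=1028785192, "
    "context=open_region_progress",
    "2011-04-14 11:07:35,386 ERROR org.apache.hadoop.hbase.regionserver.handler."
    "OpenRegionHandler: Failed open of region=.META.,,1.1028785192",
]

story = region_story({"regionserver": sample}, "1028785192")
for ts, name, line in story:
    print(name, line)
```

In practice the dictionary would hold open file handles for the master log
and each region server log, so that assignment decisions on the master line
up in time with the failed refreshes on the region server.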
