You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@hbase.apache.org by Doug Meil <do...@explorysmedical.com> on 2011/05/07 01:44:03 UTC

HBase book/Wiki update

Hi folks,

The HBase wiki PerformanceTuning page has been ported to the HBase book in 2 sections:  Performance (for tuning) and Troubleshooting (for the GC log reading part).  Therefore this page is now obsolete and is not on the main HBase Wiki page anymore.  Historians can find it on the "obsolete pages" link.

Next week, the HBase wiki Troubleshooting page will be ported to the HBase book and then that will be obsolete.

The wiki FAQ page will continue, though.  This page was refreshed this week pointing to the HBase book for a number of things, and it will also be reorganized a bit to be grouped by "question/problem area" (sort of like the Troubleshooting section in the HBase book).


Doug Meil
Chief Software Architect, Explorys
doug.meil@explorys.com


Re: Table can't disable

Posted by Gaojinchao <ga...@huawei.com>.
I See. It seems like a bug. This is my analysis.

// disable table and master sent Close message to region server, Region state was set PENDING_CLOSE

2011-05-08 17:44:25,745 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Sent CLOSE to serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175) for region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
2011-05-08 17:44:45,530 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
2011-05-08 17:45:45,542 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467

// received splitting message and cleared Region state (PENDING_CLOSE)

2011-05-08 17:46:45,303 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
2011-05-08 17:46:45,538 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
2011-05-08 17:47:45,548 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
2011-05-08 17:48:45,545 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
2011-05-08 17:49:46,108 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
2011-05-08 17:50:46,105 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
2011-05-08 17:51:46,117 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
2011-05-08 17:52:46,112 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
2011-05-08 17:52:47,309 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22fcd582836003d Retrieved 125 byte(s) of data from znode /hbase/unassigned/4418fb197685a21f77e151e401cf8b66 and set watcher; region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66., server=C4C4.site,60020,1304820199467, state=RS_ZK_REGION_CLOSED
2011-05-08 17:52:47,388 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned node: /hbase/unassigned/4418fb197685a21f77e151e401cf8b66 (region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66., server=C4C4.site,60020,1304820199467, state=RS_ZK_REGION_CLOSED)
2011-05-08 17:52:47,388 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_CLOSED, server=C4C4.site,60020,1304820199467, region=4418fb197685a21f77e151e401cf8b66

// region server had closed region, but the region state had cleared. So it printed warning log.

2011-05-08 17:52:47,388 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSED for region 4418fb197685a21f77e151e401cf8b66 from server C4C4.site,60020,1304820199467 but region was in  the state null and not in expected PENDING_CLOSE or CLOSING states
2011-05-08 17:52:47,397 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)

// The region state was set PENDING_CLOSE again.  the table couldn't disable and enable.
2011-05-08 17:52:47,398 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)



-----邮件原件-----
发件人: Gaojinchao 
发送时间: 2011年5月8日 19:59
收件人: dev@hbase.apache.org
抄送: Chenjian
主题: Table can't disable

Today I test Hbase verison 0.90.3
I seems like some bugs
1, if node exists and node state is RS_ZK_REGION_CLOSED 
  We should call ClosedRegionHandle to close region. In that case the reigon has closed by region server 

case PENDING_CLOSE:
                LOG.info("Region has been PENDING_CLOSE for too " +
                    "long, running forced unassign again on region=" +
                    regionInfo.getRegionNameAsString());
                  try {
                    // If the server got the RPC, it will transition the node
                    // to CLOSING, so only do something here if no node exists
                    if (!ZKUtil.watchAndCheckExists(watcher,
                      ZKAssign.getNodeName(watcher, regionInfo.getEncodedName()))) {
                      // Queue running of an unassign -- do actual unassign
                      // outside of the regionsInTransition lock.
                      unassigns.add(regionInfo);
                    }
					else
                    {
						//It need to process for node state is RS_ZK_REGION_CLOSED 
					}
                  } catch (NoNodeException e) {
                    LOG.debug("Node no longer existed so not forcing another " +
                      "unassignment");
                  } catch (KeeperException e) {
                    LOG.warn("Unexpected ZK exception timing out a region " +
                      "close", e);
                  }
2, other It seems like zk message losing or there is some bugs for disabling table when region server is splitting region.
above is the logs. I need to keep digging.

Shell logs:
hbase(main):003:0*
hbase(main):004:0* disable 'ufdr'
11/05/08 18:20:40 DEBUG zookeeper.ZKUtil: hconnection-0x2fcd58283c0037 Set watcher on existing znode /hbase/root-region-server
11/05/08 18:20:40 DEBUG zookeeper.ZKUtil: hconnection-0x2fcd58283c0037 Retrieved 15 byte(s) of data from znode /hbase/root-region-server and set watcher; C4C5.site:60020
11/05/08 18:53:19 DEBUG zookeeper.ZKUtil: hconnection-0x2fcd58283c0037 Retrieved 9 byte(s) of data from znode /hbase/table/ufdr; data=DISABLING

ERROR: org.apache.hadoop.hbase.RegionException: Retries exhausted, it took too long to wait for the table ufdr to be disabled.


Region server logs:

2011-05-08 17:42:44,862 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Instantiated ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
2011-05-08 17:42:45,468 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://C4C1:9000/hbase/ufdr/4418fb197685a21f77e151e401cf8b66/value/3442771824694350714.8e9a3b05abe1c3a692999cf5e8dfd9dd, isReference=true, isBulkLoadResult=false, seqid=627830, majorCompaction=false
2011-05-08 17:42:45,471 INFO org.apache.hadoop.hbase.regionserver.HRegion: Onlined ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.; next sequenceid=627831
2011-05-08 17:42:45,471 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction requested for ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. because Region has references on open; priority=9, compaction queue size=40
2011-05-08 17:42:45,476 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Added daughter ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. in region .META.,,1, serverInfo=C4C4.site,60020,1304820199467
2011-05-08 17:42:45,476 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: Region split, META updated, and report to master. Parent=ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd., new regions: ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.. Split took 0sec
2011-05-08 17:44:25,731 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received close region: ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
2011-05-08 17:48:11,066 INFO org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
2011-05-08 17:48:11,067 INFO org.apache.hadoop.hbase.regionserver.Store: Started compaction of 1 file(s) in cf=value, hasReferences=true, into hdfs://C4C1:9000/hbase/ufdr/4418fb197685a21f77e151e401cf8b66/.tmp, seqid=627830, totalSize=892.2m
2011-05-08 17:48:11,067 DEBUG org.apache.hadoop.hbase.regionserver.Store: Compacting hdfs://C4C1:9000/hbase/ufdr/4418fb197685a21f77e151e401cf8b66/value/3442771824694350714.8e9a3b05abe1c3a692999cf5e8dfd9dd-hdfs://C4C1:9000/hbase/ufdr/8e9a3b05abe1c3a692999cf5e8dfd9dd/value/3442771824694350714-top, keycount=2843998, bloomtype=NONE, size=892.2m
2011-05-08 17:48:32,340 INFO org.apache.hadoop.hbase.regionserver.Store: Completed compaction of 1 file(s), new file=hdfs://C4C1:9000/hbase/ufdr/4418fb197685a21f77e151e401cf8b66/value/4640085674257062204, size=446.2m; total size for store is 446.2m
2011-05-08 17:48:32,340 INFO org.apache.hadoop.hbase.regionserver.HRegion: completed compaction on region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. after 21sec
2011-05-08 17:52:46,626 DEBUG org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler: Processing close of ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
2011-05-08 17:52:46,627 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x12fcd58280c0003-0x12fcd58280c0003 Creating unassigned node for 4418fb197685a21f77e151e401cf8b66 in a CLOSING state
2011-05-08 17:52:46,986 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Closing ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.: disabling compactions & flushes
2011-05-08 17:52:46,986 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Updates disabled for region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
2011-05-08 17:52:46,986 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
2011-05-08 17:52:46,986 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x12fcd58280c0003-0x12fcd58280c0003 Attempting to transition node 4418fb197685a21f77e151e401cf8b66 from RS_ZK_REGION_CLOSING to RS_ZK_REGION_CLOSED
2011-05-08 17:52:47,052 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: regionserver:60020-0x12fcd58280c0003-0x12fcd58280c0003 Retrieved 125 byte(s) of data from znode /hbase/unassigned/4418fb197685a21f77e151e401cf8b66; data=region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66., server=C4C4.site,60020,1304820199467, state=RS_ZK_REGION_CLOSING
2011-05-08 17:52:47,143 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: regionserver:60020-0x12fcd58280c0003-0x12fcd58280c0003 Received ZooKeeper Event, type=NodeDataChanged, state=SyncConnected, path=/hbase/unassigned/4418fb197685a21f77e151e401cf8b66
2011-05-08 17:52:47,143 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x12fcd58280c0003-0x12fcd58280c0003 Successfully transitioned node 4418fb197685a21f77e151e401cf8b66 from RS_ZK_REGION_CLOSING to RS_ZK_REGION_CLOSED
2011-05-08 17:52:47,143 DEBUG org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler: Closed region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
2011-05-08 17:52:47,696 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received close region: ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
2011-05-08 17:52:47,696 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Received close for region we are not serving; 4418fb197685a21f77e151e401cf8b66
2011-05-08 17:52:47,814 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received close region: ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
2011-05-08 17:52:47,814 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Received close for region we are not serving; 4418fb197685a21f77e151e401cf8b66
2011-05-08 17:52:48,167 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received close region: ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
2011-05-08 17:52:48,167 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Received close for region we are not serving; 4418fb197685a21f77e151e401cf8b66


Hmaster logs:
2011-05-08 17:42:45,514 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
2011-05-08 17:43:37,599 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
2011-05-08 17:43:45,525 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
2011-05-08 17:44:25,745 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Sent CLOSE to serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175) for region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
2011-05-08 17:44:45,530 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
2011-05-08 17:45:45,542 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
2011-05-08 17:46:45,303 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
2011-05-08 17:46:45,538 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
2011-05-08 17:47:45,548 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
2011-05-08 17:48:45,545 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
2011-05-08 17:49:46,108 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
2011-05-08 17:50:46,105 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
2011-05-08 17:51:46,117 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
2011-05-08 17:52:46,112 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
2011-05-08 17:52:47,309 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22fcd582836003d Retrieved 125 byte(s) of data from znode /hbase/unassigned/4418fb197685a21f77e151e401cf8b66 and set watcher; region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66., server=C4C4.site,60020,1304820199467, state=RS_ZK_REGION_CLOSED
2011-05-08 17:52:47,388 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned node: /hbase/unassigned/4418fb197685a21f77e151e401cf8b66 (region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66., server=C4C4.site,60020,1304820199467, state=RS_ZK_REGION_CLOSED)
2011-05-08 17:52:47,388 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_CLOSED, server=C4C4.site,60020,1304820199467, region=4418fb197685a21f77e151e401cf8b66
2011-05-08 17:52:47,388 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSED for region 4418fb197685a21f77e151e401cf8b66 from server C4C4.site,60020,1304820199467 but region was in  the state null and not in expected PENDING_CLOSE or CLOSING states
2011-05-08 17:52:47,397 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
2011-05-08 17:52:47,398 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
2011-05-08 17:52:47,400 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
2011-05-08 17:52:47,501 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
2011-05-08 17:52:47,501 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
2011-05-08 17:52:47,702 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
2011-05-08 17:52:47,704 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Server serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175) returned org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Received close for ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but we are not serving it for 4418fb197685a21f77e151e401cf8b66
2011-05-08 17:52:47,816 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
2011-05-08 17:52:47,819 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
2011-05-08 17:52:47,820 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
2011-05-08 17:52:47,820 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
2011-05-08 17:52:47,820 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
2011-05-08 17:52:47,820 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
2011-05-08 17:52:47,821 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
2011-05-08 17:52:47,821 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Server serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175) returned org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Received close for ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but we are not serving it for 4418fb197685a21f77e151e401cf8b66
2011-05-08 17:52:47,821 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
2011-05-08 17:52:47,822 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
2011-05-08 17:52:48,172 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
2011-05-08 17:52:48,173 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
2011-05-08 17:52:48,174 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
2011-05-08 17:52:48,174 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
2011-05-08 17:52:48,175 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Server serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175) returned org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Received close for ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but we are not serving it for 4418fb197685a21f77e151e401cf8b66
2011-05-08 17:52:48,175 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
2011-05-08 17:52:48,176 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
2011-05-08 17:52:48,176 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
2011-05-08 17:52:48,177 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
2011-05-08 17:52:48,178 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
2011-05-08 17:52:48,178 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
2011-05-08 17:52:48,178 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
2011-05-08 17:52:48,690 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Deleted daughter reference ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66., qualifier=splitB, from parent ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.
2011-05-08 17:55:51,787 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. state=PENDING_CLOSE, ts=1304848368173
2011-05-08 17:55:51,787 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
2011-05-08 17:55:51,787 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22fcd582836003d Set watcher on existing znode /hbase/unassigned/4418fb197685a21f77e151e401cf8b66
2011-05-08 17:56:01,774 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. state=PENDING_CLOSE, ts=1304848368173
2011-05-08 17:56:01,774 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
2011-05-08 17:56:01,775 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22fcd582836003d Set watcher on existing znode /hbase/unassigned/4418fb197685a21f77e151e401cf8b66
2011-05-08 17:56:11,774 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. state=PENDING_CLOSE, ts=1304848368173
2011-05-08 17:56:11,774 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
2011-05-08 17:56:11,775 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22fcd582836003d Set watcher on existing znode /hbase/unassigned/4418fb197685a21f77e151e401cf8b66
2011-05-08 17:56:21,775 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. state=PENDING_CLOSE, ts=1304848368173
2011-05-08 17:5


re: Table can't disable

Posted by Gaojinchao <ga...@huawei.com>.
Hi, Stack. 

I try to make a patch and process it. 
I want to send a zk message again when close region is timeout.
I try to reproduce and verify it. But it is difficulty.
Do you give me some help and review it. 

Thanks.

case PENDING_CLOSE:
                LOG.info("Region has been PENDING_CLOSE for too " +
                    "long, running forced unassign again on region=" +
                    regionInfo.getRegionNameAsString());
                  try {
                    // If the server got the RPC, it will transition the node
                    // to CLOSING, so only do something here if no node exists
                    if (!ZKUtil.watchAndCheckExists(watcher,
                      ZKAssign.getNodeName(watcher, regionInfo.getEncodedName()))) {
                      // Queue running of an unassign -- do actual unassign
                      // outside of the regionsInTransition lock.
                      unassigns.add(regionInfo);
                    }
                    else
                    {
										// I add some code 
                       try {
                          String node = ZKAssign.getNodeName(watcher,
                             regionInfo.getEncodedName());
                          Stat stat = new Stat();
                          RegionTransitionData data = ZKAssign.getDataNoWatch(watcher,
                             node, stat);
                          if (data == null) {
                             LOG.warn("Data is null, node " + node + " no longer exists");
                             break;
                          }
                          if (data.getEventType() != EventType.RS_ZK_REGION_CLOSED) {
                             LOG.debug("Region has transitioned to CLOEING, allowing " +
                                  "watched event handlers to process");
                             break;
                          } 
                            
                         //In this case, Region server has ClOSED 
                         data = new RegionTransitionData(
                                 EventType.RS_ZK_REGION_CLOSED, regionInfo.getRegionName(),
                                    master.getServerName());
                         if (ZKUtil.setData(watcher, node, data.getBytes(),
                                   stat.getVersion()+1 )) {
                            	
                            // Node is now closed, let's trigger another close handler
                            LOG.info("Try to finish closed region=" +
                               regionInfo.getRegionNameAsString() + "again" );
                         }
                            
                      } catch (KeeperException ke) {
                         LOG.error("Unexpected ZK exception timing out PENDING_CLOSE region",
                            ke);
                         break;
                        }
                    	
                     }
                  } catch (NoNodeException e) {
                    LOG.debug("Node no longer existed so not forcing another " +
                      "unassignment");
                  } catch (KeeperException e) {
                    LOG.warn("Unexpected ZK exception timing out a region " +
                      "close", e);
                  }
                  break;

-----邮件原件-----
发件人: saint.ack@gmail.com [mailto:saint.ack@gmail.com] 代表 Stack
发送时间: 2011年5月10日 1:38
收件人: dev@hbase.apache.org
主题: Re: Table can't disable

I'm looking into this Gao.  Something similar seems to have happened
here over the w/e.  Let me look at our systems first and then I'l be
back to you on this below.  Thanks,
St.Ack

On Sun, May 8, 2011 at 4:58 AM, Gaojinchao <ga...@huawei.com> wrote:
> Today I test Hbase verison 0.90.3
> I seems like some bugs
> 1, if node exists and node state is RS_ZK_REGION_CLOSED
>  We should call ClosedRegionHandle to close region. In that case the reigon has closed by region server
>
> case PENDING_CLOSE:
>                LOG.info("Region has been PENDING_CLOSE for too " +
>                    "long, running forced unassign again on region=" +
>                    regionInfo.getRegionNameAsString());
>                  try {
>                    // If the server got the RPC, it will transition the node
>                    // to CLOSING, so only do something here if no node exists
>                    if (!ZKUtil.watchAndCheckExists(watcher,
>                      ZKAssign.getNodeName(watcher, regionInfo.getEncodedName()))) {
>                      // Queue running of an unassign -- do actual unassign
>                      // outside of the regionsInTransition lock.
>                      unassigns.add(regionInfo);
>                    }
>                                        else
>                    {
>                                                //It need to process for node state is RS_ZK_REGION_CLOSED
>                                        }
>                  } catch (NoNodeException e) {
>                    LOG.debug("Node no longer existed so not forcing another " +
>                      "unassignment");
>                  } catch (KeeperException e) {
>                    LOG.warn("Unexpected ZK exception timing out a region " +
>                      "close", e);
>                  }
> 2, other It seems like zk message losing or there is some bugs for disabling table when region server is splitting region.
> above is the logs. I need to keep digging.
>
> Shell logs:
> hbase(main):003:0*
> hbase(main):004:0* disable 'ufdr'
> 11/05/08 18:20:40 DEBUG zookeeper.ZKUtil: hconnection-0x2fcd58283c0037 Set watcher on existing znode /hbase/root-region-server
> 11/05/08 18:20:40 DEBUG zookeeper.ZKUtil: hconnection-0x2fcd58283c0037 Retrieved 15 byte(s) of data from znode /hbase/root-region-server and set watcher; C4C5.site:60020
> 11/05/08 18:53:19 DEBUG zookeeper.ZKUtil: hconnection-0x2fcd58283c0037 Retrieved 9 byte(s) of data from znode /hbase/table/ufdr; data=DISABLING
>
> ERROR: org.apache.hadoop.hbase.RegionException: Retries exhausted, it took too long to wait for the table ufdr to be disabled.
>
>
> Region server logs:
>
> 2011-05-08 17:42:44,862 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Instantiated ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:42:45,468 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://C4C1:9000/hbase/ufdr/4418fb197685a21f77e151e401cf8b66/value/3442771824694350714.8e9a3b05abe1c3a692999cf5e8dfd9dd, isReference=true, isBulkLoadResult=false, seqid=627830, majorCompaction=false
> 2011-05-08 17:42:45,471 INFO org.apache.hadoop.hbase.regionserver.HRegion: Onlined ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.; next sequenceid=627831
> 2011-05-08 17:42:45,471 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction requested for ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. because Region has references on open; priority=9, compaction queue size=40
> 2011-05-08 17:42:45,476 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Added daughter ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. in region .META.,,1, serverInfo=C4C4.site,60020,1304820199467
> 2011-05-08 17:42:45,476 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: Region split, META updated, and report to master. Parent=ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd., new regions: ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.. Split took 0sec
> 2011-05-08 17:44:25,731 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received close region: ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:48:11,066 INFO org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:48:11,067 INFO org.apache.hadoop.hbase.regionserver.Store: Started compaction of 1 file(s) in cf=value, hasReferences=true, into hdfs://C4C1:9000/hbase/ufdr/4418fb197685a21f77e151e401cf8b66/.tmp, seqid=627830, totalSize=892.2m
> 2011-05-08 17:48:11,067 DEBUG org.apache.hadoop.hbase.regionserver.Store: Compacting hdfs://C4C1:9000/hbase/ufdr/4418fb197685a21f77e151e401cf8b66/value/3442771824694350714.8e9a3b05abe1c3a692999cf5e8dfd9dd-hdfs://C4C1:9000/hbase/ufdr/8e9a3b05abe1c3a692999cf5e8dfd9dd/value/3442771824694350714-top, keycount=2843998, bloomtype=NONE, size=892.2m
> 2011-05-08 17:48:32,340 INFO org.apache.hadoop.hbase.regionserver.Store: Completed compaction of 1 file(s), new file=hdfs://C4C1:9000/hbase/ufdr/4418fb197685a21f77e151e401cf8b66/value/4640085674257062204, size=446.2m; total size for store is 446.2m
> 2011-05-08 17:48:32,340 INFO org.apache.hadoop.hbase.regionserver.HRegion: completed compaction on region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. after 21sec
> 2011-05-08 17:52:46,626 DEBUG org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler: Processing close of ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:52:46,627 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x12fcd58280c0003-0x12fcd58280c0003 Creating unassigned node for 4418fb197685a21f77e151e401cf8b66 in a CLOSING state
> 2011-05-08 17:52:46,986 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Closing ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.: disabling compactions & flushes
> 2011-05-08 17:52:46,986 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Updates disabled for region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:52:46,986 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:52:46,986 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x12fcd58280c0003-0x12fcd58280c0003 Attempting to transition node 4418fb197685a21f77e151e401cf8b66 from RS_ZK_REGION_CLOSING to RS_ZK_REGION_CLOSED
> 2011-05-08 17:52:47,052 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: regionserver:60020-0x12fcd58280c0003-0x12fcd58280c0003 Retrieved 125 byte(s) of data from znode /hbase/unassigned/4418fb197685a21f77e151e401cf8b66; data=region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66., server=C4C4.site,60020,1304820199467, state=RS_ZK_REGION_CLOSING
> 2011-05-08 17:52:47,143 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: regionserver:60020-0x12fcd58280c0003-0x12fcd58280c0003 Received ZooKeeper Event, type=NodeDataChanged, state=SyncConnected, path=/hbase/unassigned/4418fb197685a21f77e151e401cf8b66
> 2011-05-08 17:52:47,143 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x12fcd58280c0003-0x12fcd58280c0003 Successfully transitioned node 4418fb197685a21f77e151e401cf8b66 from RS_ZK_REGION_CLOSING to RS_ZK_REGION_CLOSED
> 2011-05-08 17:52:47,143 DEBUG org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler: Closed region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:52:47,696 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received close region: ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:52:47,696 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Received close for region we are not serving; 4418fb197685a21f77e151e401cf8b66
> 2011-05-08 17:52:47,814 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received close region: ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:52:47,814 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Received close for region we are not serving; 4418fb197685a21f77e151e401cf8b66
> 2011-05-08 17:52:48,167 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received close region: ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:52:48,167 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Received close for region we are not serving; 4418fb197685a21f77e151e401cf8b66
>
>
> Hmaster logs:
> 2011-05-08 17:42:45,514 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
> 2011-05-08 17:43:37,599 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
> 2011-05-08 17:43:45,525 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
> 2011-05-08 17:44:25,745 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Sent CLOSE to serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175) for region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:44:45,530 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
> 2011-05-08 17:45:45,542 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
> 2011-05-08 17:46:45,303 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
> 2011-05-08 17:46:45,538 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
> 2011-05-08 17:47:45,548 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
> 2011-05-08 17:48:45,545 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
> 2011-05-08 17:49:46,108 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
> 2011-05-08 17:50:46,105 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
> 2011-05-08 17:51:46,117 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
> 2011-05-08 17:52:46,112 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
> 2011-05-08 17:52:47,309 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22fcd582836003d Retrieved 125 byte(s) of data from znode /hbase/unassigned/4418fb197685a21f77e151e401cf8b66 and set watcher; region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66., server=C4C4.site,60020,1304820199467, state=RS_ZK_REGION_CLOSED
> 2011-05-08 17:52:47,388 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned node: /hbase/unassigned/4418fb197685a21f77e151e401cf8b66 (region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66., server=C4C4.site,60020,1304820199467, state=RS_ZK_REGION_CLOSED)
> 2011-05-08 17:52:47,388 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_CLOSED, server=C4C4.site,60020,1304820199467, region=4418fb197685a21f77e151e401cf8b66
> 2011-05-08 17:52:47,388 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSED for region 4418fb197685a21f77e151e401cf8b66 from server C4C4.site,60020,1304820199467 but region was in  the state null and not in expected PENDING_CLOSE or CLOSING states
> 2011-05-08 17:52:47,397 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
> 2011-05-08 17:52:47,398 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
> 2011-05-08 17:52:47,400 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
> 2011-05-08 17:52:47,501 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
> 2011-05-08 17:52:47,501 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
> 2011-05-08 17:52:47,702 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
> 2011-05-08 17:52:47,704 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Server serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175) returned org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Received close for ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but we are not serving it for 4418fb197685a21f77e151e401cf8b66
> 2011-05-08 17:52:47,816 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
> 2011-05-08 17:52:47,819 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
> 2011-05-08 17:52:47,820 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
> 2011-05-08 17:52:47,820 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
> 2011-05-08 17:52:47,820 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
> 2011-05-08 17:52:47,820 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
> 2011-05-08 17:52:47,821 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
> 2011-05-08 17:52:47,821 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Server serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175) returned org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Received close for ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but we are not serving it for 4418fb197685a21f77e151e401cf8b66
> 2011-05-08 17:52:47,821 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
> 2011-05-08 17:52:47,822 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
> 2011-05-08 17:52:48,172 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
> 2011-05-08 17:52:48,173 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
> 2011-05-08 17:52:48,174 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
> 2011-05-08 17:52:48,174 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
> 2011-05-08 17:52:48,175 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Server serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175) returned org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Received close for ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but we are not serving it for 4418fb197685a21f77e151e401cf8b66
> 2011-05-08 17:52:48,175 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
> 2011-05-08 17:52:48,176 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
> 2011-05-08 17:52:48,176 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
> 2011-05-08 17:52:48,177 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
> 2011-05-08 17:52:48,178 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
> 2011-05-08 17:52:48,178 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
> 2011-05-08 17:52:48,178 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
> 2011-05-08 17:52:48,690 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Deleted daughter reference ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66., qualifier=splitB, from parent ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.
> 2011-05-08 17:55:51,787 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. state=PENDING_CLOSE, ts=1304848368173
> 2011-05-08 17:55:51,787 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:55:51,787 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22fcd582836003d Set watcher on existing znode /hbase/unassigned/4418fb197685a21f77e151e401cf8b66
> 2011-05-08 17:56:01,774 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. state=PENDING_CLOSE, ts=1304848368173
> 2011-05-08 17:56:01,774 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:56:01,775 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22fcd582836003d Set watcher on existing znode /hbase/unassigned/4418fb197685a21f77e151e401cf8b66
> 2011-05-08 17:56:11,774 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. state=PENDING_CLOSE, ts=1304848368173
> 2011-05-08 17:56:11,774 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:56:11,775 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22fcd582836003d Set watcher on existing znode /hbase/unassigned/4418fb197685a21f77e151e401cf8b66
> 2011-05-08 17:56:21,775 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. state=PENDING_CLOSE, ts=1304848368173
> 2011-05-08 17:5
>
>

Re: Table can't disable

Posted by Gaojinchao <ga...@huawei.com>.
Thanks, Stack.
You are very busy. :)
I hope to try my best to do something for community.  



-----邮件原件-----
发件人: saint.ack@gmail.com [mailto:saint.ack@gmail.com] 代表 Stack
发送时间: 2011年5月17日 3:21
收件人: dev@hbase.apache.org
抄送: Chenjian
主题: Re: Table can't disable

On Sat, May 14, 2011 at 7:08 PM, Gaojinchao <ga...@huawei.com> wrote:
> I try to make a patch and process it.


OK.  What are you trying to do in the patch?  (Turns out my issue was
different to what you see).



> I want to send a zk message again when close region is timeout.
> I try to reproduce and verify it. But it is difficulty.

Yes, this stuff is tricky.  Its not easy making a test.  Have you seen
TestZKBasedOpenCloseRegion and TestMasterFailover.  These do messing
about with zk state.  Perhaps they help?

> Do you give me some help and review it.
>

Did you change code in the below?  If so, can you send a patch only?
Or better, attach it to a JIRA?  Patch is good because it is the
difference between what is hbase and what you have added.  Otherwise,
it takes a bit of work figuring what you have added.

Thank you Gao,
St.Ack

> Thanks.
>
> case PENDING_CLOSE:
>                LOG.info("Region has been PENDING_CLOSE for too " +
>                    "long, running forced unassign again on region=" +
>                    regionInfo.getRegionNameAsString());
>                  try {
>                    // If the server got the RPC, it will transition the node
>                    // to CLOSING, so only do something here if no node exists
>                    if (!ZKUtil.watchAndCheckExists(watcher,
>                      ZKAssign.getNodeName(watcher, regionInfo.getEncodedName()))) {
>                      // Queue running of an unassign -- do actual unassign
>                      // outside of the regionsInTransition lock.
>                      unassigns.add(regionInfo);
>                    }
>                    else
>                    {
>                                                                                // I add some code
>                       try {
>                          String node = ZKAssign.getNodeName(watcher,
>                             regionInfo.getEncodedName());
>                          Stat stat = new Stat();
>                          RegionTransitionData data = ZKAssign.getDataNoWatch(watcher,
>                             node, stat);
>                          if (data == null) {
>                             LOG.warn("Data is null, node " + node + " no longer exists");
>                             break;
>                          }
>                          if (data.getEventType() != EventType.RS_ZK_REGION_CLOSED) {
>                             LOG.debug("Region has transitioned to CLOEING, allowing " +
>                                  "watched event handlers to process");
>                             break;
>                          }
>
>                         //In this case, Region server has ClOSED
>                         data = new RegionTransitionData(
>                                 EventType.RS_ZK_REGION_CLOSED, regionInfo.getRegionName(),
>                                    master.getServerName());
>                         if (ZKUtil.setData(watcher, node, data.getBytes(),
>                                   stat.getVersion()+1 )) {
>
>                            // Node is now closed, let's trigger another close handler
>                            LOG.info("Try to finish closed region=" +
>                               regionInfo.getRegionNameAsString() + "again" );
>                         }
>
>                      } catch (KeeperException ke) {
>                         LOG.error("Unexpected ZK exception timing out PENDING_CLOSE region",
>                            ke);
>                         break;
>                        }
>
>                     }
>                  } catch (NoNodeException e) {
>                    LOG.debug("Node no longer existed so not forcing another " +
>                      "unassignment");
>                  } catch (KeeperException e) {
>                    LOG.warn("Unexpected ZK exception timing out a region " +
>                      "close", e);
>                  }
>                  break;
>
> -----邮件原件-----
> 发件人: saint.ack@gmail.com [mailto:saint.ack@gmail.com] 代表 Stack
> 发送时间: 2011年5月10日 1:38
> 收件人: dev@hbase.apache.org
> 主题: Re: Table can't disable
>
> I'm looking into this Gao.  Something similar seems to have happened
> here over the w/e.  Let me look at our systems first and then I'l be
> back to you on this below.  Thanks,
> St.Ack
>
> On Sun, May 8, 2011 at 4:58 AM, Gaojinchao <ga...@huawei.com> wrote:
>> Today I test Hbase verison 0.90.3
>> I seems like some bugs
>> 1, if node exists and node state is RS_ZK_REGION_CLOSED
>>  We should call ClosedRegionHandle to close region. In that case the reigon has closed by region server
>>
>> case PENDING_CLOSE:
>>                LOG.info("Region has been PENDING_CLOSE for too " +
>>                    "long, running forced unassign again on region=" +
>>                    regionInfo.getRegionNameAsString());
>>                  try {
>>                    // If the server got the RPC, it will transition the node
>>                    // to CLOSING, so only do something here if no node exists
>>                    if (!ZKUtil.watchAndCheckExists(watcher,
>>                      ZKAssign.getNodeName(watcher, regionInfo.getEncodedName()))) {
>>                      // Queue running of an unassign -- do actual unassign
>>                      // outside of the regionsInTransition lock.
>>                      unassigns.add(regionInfo);
>>                    }
>>                                        else
>>                    {
>>                                                //It need to process for node state is RS_ZK_REGION_CLOSED
>>                                        }
>>                  } catch (NoNodeException e) {
>>                    LOG.debug("Node no longer existed so not forcing another " +
>>                      "unassignment");
>>                  } catch (KeeperException e) {
>>                    LOG.warn("Unexpected ZK exception timing out a region " +
>>                      "close", e);
>>                  }
>> 2, other It seems like zk message losing or there is some bugs for disabling table when region server is splitting region.
>> above is the logs. I need to keep digging.
>>
>> Shell logs:
>> hbase(main):003:0*
>> hbase(main):004:0* disable 'ufdr'
>> 11/05/08 18:20:40 DEBUG zookeeper.ZKUtil: hconnection-0x2fcd58283c0037 Set watcher on existing znode /hbase/root-region-server
>> 11/05/08 18:20:40 DEBUG zookeeper.ZKUtil: hconnection-0x2fcd58283c0037 Retrieved 15 byte(s) of data from znode /hbase/root-region-server and set watcher; C4C5.site:60020
>> 11/05/08 18:53:19 DEBUG zookeeper.ZKUtil: hconnection-0x2fcd58283c0037 Retrieved 9 byte(s) of data from znode /hbase/table/ufdr; data=DISABLING
>>
>> ERROR: org.apache.hadoop.hbase.RegionException: Retries exhausted, it took too long to wait for the table ufdr to be disabled.
>>
>>
>> Region server logs:
>>
>> 2011-05-08 17:42:44,862 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Instantiated ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
>> 2011-05-08 17:42:45,468 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://C4C1:9000/hbase/ufdr/4418fb197685a21f77e151e401cf8b66/value/3442771824694350714.8e9a3b05abe1c3a692999cf5e8dfd9dd, isReference=true, isBulkLoadResult=false, seqid=627830, majorCompaction=false
>> 2011-05-08 17:42:45,471 INFO org.apache.hadoop.hbase.regionserver.HRegion: Onlined ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.; next sequenceid=627831
>> 2011-05-08 17:42:45,471 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction requested for ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. because Region has references on open; priority=9, compaction queue size=40
>> 2011-05-08 17:42:45,476 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Added daughter ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. in region .META.,,1, serverInfo=C4C4.site,60020,1304820199467
>> 2011-05-08 17:42:45,476 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: Region split, META updated, and report to master. Parent=ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd., new regions: ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.. Split took 0sec
>> 2011-05-08 17:44:25,731 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received close region: ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
>> 2011-05-08 17:48:11,066 INFO org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
>> 2011-05-08 17:48:11,067 INFO org.apache.hadoop.hbase.regionserver.Store: Started compaction of 1 file(s) in cf=value, hasReferences=true, into hdfs://C4C1:9000/hbase/ufdr/4418fb197685a21f77e151e401cf8b66/.tmp, seqid=627830, totalSize=892.2m
>> 2011-05-08 17:48:11,067 DEBUG org.apache.hadoop.hbase.regionserver.Store: Compacting hdfs://C4C1:9000/hbase/ufdr/4418fb197685a21f77e151e401cf8b66/value/3442771824694350714.8e9a3b05abe1c3a692999cf5e8dfd9dd-hdfs://C4C1:9000/hbase/ufdr/8e9a3b05abe1c3a692999cf5e8dfd9dd/value/3442771824694350714-top, keycount=2843998, bloomtype=NONE, size=892.2m
>> 2011-05-08 17:48:32,340 INFO org.apache.hadoop.hbase.regionserver.Store: Completed compaction of 1 file(s), new file=hdfs://C4C1:9000/hbase/ufdr/4418fb197685a21f77e151e401cf8b66/value/4640085674257062204, size=446.2m; total size for store is 446.2m
>> 2011-05-08 17:48:32,340 INFO org.apache.hadoop.hbase.regionserver.HRegion: completed compaction on region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. after 21sec
>> 2011-05-08 17:52:46,626 DEBUG org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler: Processing close of ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
>> 2011-05-08 17:52:46,627 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x12fcd58280c0003-0x12fcd58280c0003 Creating unassigned node for 4418fb197685a21f77e151e401cf8b66 in a CLOSING state
>> 2011-05-08 17:52:46,986 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Closing ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.: disabling compactions & flushes
>> 2011-05-08 17:52:46,986 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Updates disabled for region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
>> 2011-05-08 17:52:46,986 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
>> 2011-05-08 17:52:46,986 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x12fcd58280c0003-0x12fcd58280c0003 Attempting to transition node 4418fb197685a21f77e151e401cf8b66 from RS_ZK_REGION_CLOSING to RS_ZK_REGION_CLOSED
>> 2011-05-08 17:52:47,052 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: regionserver:60020-0x12fcd58280c0003-0x12fcd58280c0003 Retrieved 125 byte(s) of data from znode /hbase/unassigned/4418fb197685a21f77e151e401cf8b66; data=region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66., server=C4C4.site,60020,1304820199467, state=RS_ZK_REGION_CLOSING
>> 2011-05-08 17:52:47,143 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: regionserver:60020-0x12fcd58280c0003-0x12fcd58280c0003 Received ZooKeeper Event, type=NodeDataChanged, state=SyncConnected, path=/hbase/unassigned/4418fb197685a21f77e151e401cf8b66
>> 2011-05-08 17:52:47,143 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x12fcd58280c0003-0x12fcd58280c0003 Successfully transitioned node 4418fb197685a21f77e151e401cf8b66 from RS_ZK_REGION_CLOSING to RS_ZK_REGION_CLOSED
>> 2011-05-08 17:52:47,143 DEBUG org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler: Closed region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
>> 2011-05-08 17:52:47,696 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received close region: ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
>> 2011-05-08 17:52:47,696 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Received close for region we are not serving; 4418fb197685a21f77e151e401cf8b66
>> 2011-05-08 17:52:47,814 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received close region: ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
>> 2011-05-08 17:52:47,814 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Received close for region we are not serving; 4418fb197685a21f77e151e401cf8b66
>> 2011-05-08 17:52:48,167 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received close region: ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
>> 2011-05-08 17:52:48,167 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Received close for region we are not serving; 4418fb197685a21f77e151e401cf8b66
>>
>>
>> Hmaster logs:
>> 2011-05-08 17:42:45,514 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
>> 2011-05-08 17:43:37,599 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
>> 2011-05-08 17:43:45,525 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
>> 2011-05-08 17:44:25,745 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Sent CLOSE to serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175) for region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
>> 2011-05-08 17:44:45,530 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
>> 2011-05-08 17:45:45,542 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
>> 2011-05-08 17:46:45,303 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
>> 2011-05-08 17:46:45,538 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
>> 2011-05-08 17:47:45,548 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
>> 2011-05-08 17:48:45,545 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
>> 2011-05-08 17:49:46,108 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
>> 2011-05-08 17:50:46,105 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
>> 2011-05-08 17:51:46,117 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
>> 2011-05-08 17:52:46,112 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
>> 2011-05-08 17:52:47,309 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22fcd582836003d Retrieved 125 byte(s) of data from znode /hbase/unassigned/4418fb197685a21f77e151e401cf8b66 and set watcher; region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66., server=C4C4.site,60020,1304820199467, state=RS_ZK_REGION_CLOSED
>> 2011-05-08 17:52:47,388 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned node: /hbase/unassigned/4418fb197685a21f77e151e401cf8b66 (region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66., server=C4C4.site,60020,1304820199467, state=RS_ZK_REGION_CLOSED)
>> 2011-05-08 17:52:47,388 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_CLOSED, server=C4C4.site,60020,1304820199467, region=4418fb197685a21f77e151e401cf8b66
>> 2011-05-08 17:52:47,388 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSED for region 4418fb197685a21f77e151e401cf8b66 from server C4C4.site,60020,1304820199467 but region was in  the state null and not in expected PENDING_CLOSE or CLOSING states
>> 2011-05-08 17:52:47,397 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
>> 2011-05-08 17:52:47,398 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
>> 2011-05-08 17:52:47,400 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
>> 2011-05-08 17:52:47,501 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
>> 2011-05-08 17:52:47,501 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
>> 2011-05-08 17:52:47,702 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
>> 2011-05-08 17:52:47,704 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Server serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175) returned org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Received close for ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but we are not serving it for 4418fb197685a21f77e151e401cf8b66
>> 2011-05-08 17:52:47,816 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
>> 2011-05-08 17:52:47,819 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
>> 2011-05-08 17:52:47,820 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
>> 2011-05-08 17:52:47,820 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
>> 2011-05-08 17:52:47,820 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
>> 2011-05-08 17:52:47,820 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
>> 2011-05-08 17:52:47,821 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
>> 2011-05-08 17:52:47,821 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Server serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175) returned org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Received close for ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but we are not serving it for 4418fb197685a21f77e151e401cf8b66
>> 2011-05-08 17:52:47,821 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
>> 2011-05-08 17:52:47,822 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
>> 2011-05-08 17:52:48,172 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
>> 2011-05-08 17:52:48,173 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
>> 2011-05-08 17:52:48,174 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
>> 2011-05-08 17:52:48,174 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
>> 2011-05-08 17:52:48,175 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Server serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175) returned org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Received close for ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but we are not serving it for 4418fb197685a21f77e151e401cf8b66
>> 2011-05-08 17:52:48,175 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
>> 2011-05-08 17:52:48,176 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
>> 2011-05-08 17:52:48,176 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
>> 2011-05-08 17:52:48,177 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
>> 2011-05-08 17:52:48,178 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
>> 2011-05-08 17:52:48,178 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
>> 2011-05-08 17:52:48,178 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
>> 2011-05-08 17:52:48,690 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Deleted daughter reference ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66., qualifier=splitB, from parent ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.
>> 2011-05-08 17:55:51,787 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. state=PENDING_CLOSE, ts=1304848368173
>> 2011-05-08 17:55:51,787 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
>> 2011-05-08 17:55:51,787 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22fcd582836003d Set watcher on existing znode /hbase/unassigned/4418fb197685a21f77e151e401cf8b66
>> 2011-05-08 17:56:01,774 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. state=PENDING_CLOSE, ts=1304848368173
>> 2011-05-08 17:56:01,774 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
>> 2011-05-08 17:56:01,775 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22fcd582836003d Set watcher on existing znode /hbase/unassigned/4418fb197685a21f77e151e401cf8b66
>> 2011-05-08 17:56:11,774 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. state=PENDING_CLOSE, ts=1304848368173
>> 2011-05-08 17:56:11,774 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
>> 2011-05-08 17:56:11,775 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22fcd582836003d Set watcher on existing znode /hbase/unassigned/4418fb197685a21f77e151e401cf8b66
>> 2011-05-08 17:56:21,775 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. state=PENDING_CLOSE, ts=1304848368173
>> 2011-05-08 17:5
>>
>>
>

Re: Table can't disable

Posted by Stack <st...@duboce.net>.
On Sat, May 14, 2011 at 7:08 PM, Gaojinchao <ga...@huawei.com> wrote:
> I try to make a patch and process it.


OK.  What are you trying to do in the patch?  (Turns out my issue was
different to what you see).



> I want to send a zk message again when close region is timeout.
> I try to reproduce and verify it. But it is difficulty.

Yes, this stuff is tricky.  Its not easy making a test.  Have you seen
TestZKBasedOpenCloseRegion and TestMasterFailover.  These do messing
about with zk state.  Perhaps they help?

> Do you give me some help and review it.
>

Did you change code in the below?  If so, can you send a patch only?
Or better, attach it to a JIRA?  Patch is good because it is the
difference between what is hbase and what you have added.  Otherwise,
it takes a bit of work figuring what you have added.

Thank you Gao,
St.Ack

> Thanks.
>
> case PENDING_CLOSE:
>                LOG.info("Region has been PENDING_CLOSE for too " +
>                    "long, running forced unassign again on region=" +
>                    regionInfo.getRegionNameAsString());
>                  try {
>                    // If the server got the RPC, it will transition the node
>                    // to CLOSING, so only do something here if no node exists
>                    if (!ZKUtil.watchAndCheckExists(watcher,
>                      ZKAssign.getNodeName(watcher, regionInfo.getEncodedName()))) {
>                      // Queue running of an unassign -- do actual unassign
>                      // outside of the regionsInTransition lock.
>                      unassigns.add(regionInfo);
>                    }
>                    else
>                    {
>                                                                                // I add some code
>                       try {
>                          String node = ZKAssign.getNodeName(watcher,
>                             regionInfo.getEncodedName());
>                          Stat stat = new Stat();
>                          RegionTransitionData data = ZKAssign.getDataNoWatch(watcher,
>                             node, stat);
>                          if (data == null) {
>                             LOG.warn("Data is null, node " + node + " no longer exists");
>                             break;
>                          }
>                          if (data.getEventType() != EventType.RS_ZK_REGION_CLOSED) {
>                             LOG.debug("Region has transitioned to CLOEING, allowing " +
>                                  "watched event handlers to process");
>                             break;
>                          }
>
>                         //In this case, Region server has ClOSED
>                         data = new RegionTransitionData(
>                                 EventType.RS_ZK_REGION_CLOSED, regionInfo.getRegionName(),
>                                    master.getServerName());
>                         if (ZKUtil.setData(watcher, node, data.getBytes(),
>                                   stat.getVersion()+1 )) {
>
>                            // Node is now closed, let's trigger another close handler
>                            LOG.info("Try to finish closed region=" +
>                               regionInfo.getRegionNameAsString() + "again" );
>                         }
>
>                      } catch (KeeperException ke) {
>                         LOG.error("Unexpected ZK exception timing out PENDING_CLOSE region",
>                            ke);
>                         break;
>                        }
>
>                     }
>                  } catch (NoNodeException e) {
>                    LOG.debug("Node no longer existed so not forcing another " +
>                      "unassignment");
>                  } catch (KeeperException e) {
>                    LOG.warn("Unexpected ZK exception timing out a region " +
>                      "close", e);
>                  }
>                  break;
>
> -----邮件原件-----
> 发件人: saint.ack@gmail.com [mailto:saint.ack@gmail.com] 代表 Stack
> 发送时间: 2011年5月10日 1:38
> 收件人: dev@hbase.apache.org
> 主题: Re: Table can't disable
>
> I'm looking into this Gao.  Something similar seems to have happened
> here over the w/e.  Let me look at our systems first and then I'l be
> back to you on this below.  Thanks,
> St.Ack
>
> On Sun, May 8, 2011 at 4:58 AM, Gaojinchao <ga...@huawei.com> wrote:
>> Today I test Hbase verison 0.90.3
>> I seems like some bugs
>> 1, if node exists and node state is RS_ZK_REGION_CLOSED
>>  We should call ClosedRegionHandle to close region. In that case the reigon has closed by region server
>>
>> case PENDING_CLOSE:
>>                LOG.info("Region has been PENDING_CLOSE for too " +
>>                    "long, running forced unassign again on region=" +
>>                    regionInfo.getRegionNameAsString());
>>                  try {
>>                    // If the server got the RPC, it will transition the node
>>                    // to CLOSING, so only do something here if no node exists
>>                    if (!ZKUtil.watchAndCheckExists(watcher,
>>                      ZKAssign.getNodeName(watcher, regionInfo.getEncodedName()))) {
>>                      // Queue running of an unassign -- do actual unassign
>>                      // outside of the regionsInTransition lock.
>>                      unassigns.add(regionInfo);
>>                    }
>>                                        else
>>                    {
>>                                                //It need to process for node state is RS_ZK_REGION_CLOSED
>>                                        }
>>                  } catch (NoNodeException e) {
>>                    LOG.debug("Node no longer existed so not forcing another " +
>>                      "unassignment");
>>                  } catch (KeeperException e) {
>>                    LOG.warn("Unexpected ZK exception timing out a region " +
>>                      "close", e);
>>                  }
>> 2, other It seems like zk message losing or there is some bugs for disabling table when region server is splitting region.
>> above is the logs. I need to keep digging.
>>
>> Shell logs:
>> hbase(main):003:0*
>> hbase(main):004:0* disable 'ufdr'
>> 11/05/08 18:20:40 DEBUG zookeeper.ZKUtil: hconnection-0x2fcd58283c0037 Set watcher on existing znode /hbase/root-region-server
>> 11/05/08 18:20:40 DEBUG zookeeper.ZKUtil: hconnection-0x2fcd58283c0037 Retrieved 15 byte(s) of data from znode /hbase/root-region-server and set watcher; C4C5.site:60020
>> 11/05/08 18:53:19 DEBUG zookeeper.ZKUtil: hconnection-0x2fcd58283c0037 Retrieved 9 byte(s) of data from znode /hbase/table/ufdr; data=DISABLING
>>
>> ERROR: org.apache.hadoop.hbase.RegionException: Retries exhausted, it took too long to wait for the table ufdr to be disabled.
>>
>>
>> Region server logs:
>>
>> 2011-05-08 17:42:44,862 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Instantiated ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
>> 2011-05-08 17:42:45,468 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://C4C1:9000/hbase/ufdr/4418fb197685a21f77e151e401cf8b66/value/3442771824694350714.8e9a3b05abe1c3a692999cf5e8dfd9dd, isReference=true, isBulkLoadResult=false, seqid=627830, majorCompaction=false
>> 2011-05-08 17:42:45,471 INFO org.apache.hadoop.hbase.regionserver.HRegion: Onlined ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.; next sequenceid=627831
>> 2011-05-08 17:42:45,471 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction requested for ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. because Region has references on open; priority=9, compaction queue size=40
>> 2011-05-08 17:42:45,476 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Added daughter ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. in region .META.,,1, serverInfo=C4C4.site,60020,1304820199467
>> 2011-05-08 17:42:45,476 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: Region split, META updated, and report to master. Parent=ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd., new regions: ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.. Split took 0sec
>> 2011-05-08 17:44:25,731 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received close region: ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
>> 2011-05-08 17:48:11,066 INFO org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
>> 2011-05-08 17:48:11,067 INFO org.apache.hadoop.hbase.regionserver.Store: Started compaction of 1 file(s) in cf=value, hasReferences=true, into hdfs://C4C1:9000/hbase/ufdr/4418fb197685a21f77e151e401cf8b66/.tmp, seqid=627830, totalSize=892.2m
>> 2011-05-08 17:48:11,067 DEBUG org.apache.hadoop.hbase.regionserver.Store: Compacting hdfs://C4C1:9000/hbase/ufdr/4418fb197685a21f77e151e401cf8b66/value/3442771824694350714.8e9a3b05abe1c3a692999cf5e8dfd9dd-hdfs://C4C1:9000/hbase/ufdr/8e9a3b05abe1c3a692999cf5e8dfd9dd/value/3442771824694350714-top, keycount=2843998, bloomtype=NONE, size=892.2m
>> 2011-05-08 17:48:32,340 INFO org.apache.hadoop.hbase.regionserver.Store: Completed compaction of 1 file(s), new file=hdfs://C4C1:9000/hbase/ufdr/4418fb197685a21f77e151e401cf8b66/value/4640085674257062204, size=446.2m; total size for store is 446.2m
>> 2011-05-08 17:48:32,340 INFO org.apache.hadoop.hbase.regionserver.HRegion: completed compaction on region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. after 21sec
>> 2011-05-08 17:52:46,626 DEBUG org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler: Processing close of ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
>> 2011-05-08 17:52:46,627 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x12fcd58280c0003-0x12fcd58280c0003 Creating unassigned node for 4418fb197685a21f77e151e401cf8b66 in a CLOSING state
>> 2011-05-08 17:52:46,986 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Closing ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.: disabling compactions & flushes
>> 2011-05-08 17:52:46,986 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Updates disabled for region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
>> 2011-05-08 17:52:46,986 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
>> 2011-05-08 17:52:46,986 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x12fcd58280c0003-0x12fcd58280c0003 Attempting to transition node 4418fb197685a21f77e151e401cf8b66 from RS_ZK_REGION_CLOSING to RS_ZK_REGION_CLOSED
>> 2011-05-08 17:52:47,052 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: regionserver:60020-0x12fcd58280c0003-0x12fcd58280c0003 Retrieved 125 byte(s) of data from znode /hbase/unassigned/4418fb197685a21f77e151e401cf8b66; data=region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66., server=C4C4.site,60020,1304820199467, state=RS_ZK_REGION_CLOSING
>> 2011-05-08 17:52:47,143 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: regionserver:60020-0x12fcd58280c0003-0x12fcd58280c0003 Received ZooKeeper Event, type=NodeDataChanged, state=SyncConnected, path=/hbase/unassigned/4418fb197685a21f77e151e401cf8b66
>> 2011-05-08 17:52:47,143 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x12fcd58280c0003-0x12fcd58280c0003 Successfully transitioned node 4418fb197685a21f77e151e401cf8b66 from RS_ZK_REGION_CLOSING to RS_ZK_REGION_CLOSED
>> 2011-05-08 17:52:47,143 DEBUG org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler: Closed region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
>> 2011-05-08 17:52:47,696 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received close region: ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
>> 2011-05-08 17:52:47,696 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Received close for region we are not serving; 4418fb197685a21f77e151e401cf8b66
>> 2011-05-08 17:52:47,814 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received close region: ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
>> 2011-05-08 17:52:47,814 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Received close for region we are not serving; 4418fb197685a21f77e151e401cf8b66
>> 2011-05-08 17:52:48,167 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received close region: ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
>> 2011-05-08 17:52:48,167 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Received close for region we are not serving; 4418fb197685a21f77e151e401cf8b66
>>
>>
>> Hmaster logs:
>> 2011-05-08 17:42:45,514 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
>> 2011-05-08 17:43:37,599 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
>> 2011-05-08 17:43:45,525 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
>> 2011-05-08 17:44:25,745 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Sent CLOSE to serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175) for region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
>> 2011-05-08 17:44:45,530 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
>> 2011-05-08 17:45:45,542 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
>> 2011-05-08 17:46:45,303 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
>> 2011-05-08 17:46:45,538 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
>> 2011-05-08 17:47:45,548 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
>> 2011-05-08 17:48:45,545 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
>> 2011-05-08 17:49:46,108 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
>> 2011-05-08 17:50:46,105 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
>> 2011-05-08 17:51:46,117 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
>> 2011-05-08 17:52:46,112 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
>> 2011-05-08 17:52:47,309 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22fcd582836003d Retrieved 125 byte(s) of data from znode /hbase/unassigned/4418fb197685a21f77e151e401cf8b66 and set watcher; region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66., server=C4C4.site,60020,1304820199467, state=RS_ZK_REGION_CLOSED
>> 2011-05-08 17:52:47,388 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned node: /hbase/unassigned/4418fb197685a21f77e151e401cf8b66 (region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66., server=C4C4.site,60020,1304820199467, state=RS_ZK_REGION_CLOSED)
>> 2011-05-08 17:52:47,388 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_CLOSED, server=C4C4.site,60020,1304820199467, region=4418fb197685a21f77e151e401cf8b66
>> 2011-05-08 17:52:47,388 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSED for region 4418fb197685a21f77e151e401cf8b66 from server C4C4.site,60020,1304820199467 but region was in  the state null and not in expected PENDING_CLOSE or CLOSING states
>> 2011-05-08 17:52:47,397 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
>> 2011-05-08 17:52:47,398 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
>> 2011-05-08 17:52:47,400 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
>> 2011-05-08 17:52:47,501 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
>> 2011-05-08 17:52:47,501 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
>> 2011-05-08 17:52:47,702 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
>> 2011-05-08 17:52:47,704 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Server serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175) returned org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Received close for ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but we are not serving it for 4418fb197685a21f77e151e401cf8b66
>> 2011-05-08 17:52:47,816 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
>> 2011-05-08 17:52:47,819 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
>> 2011-05-08 17:52:47,820 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
>> 2011-05-08 17:52:47,820 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
>> 2011-05-08 17:52:47,820 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
>> 2011-05-08 17:52:47,820 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
>> 2011-05-08 17:52:47,821 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
>> 2011-05-08 17:52:47,821 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Server serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175) returned org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Received close for ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but we are not serving it for 4418fb197685a21f77e151e401cf8b66
>> 2011-05-08 17:52:47,821 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
>> 2011-05-08 17:52:47,822 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
>> 2011-05-08 17:52:48,172 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
>> 2011-05-08 17:52:48,173 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
>> 2011-05-08 17:52:48,174 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
>> 2011-05-08 17:52:48,174 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
>> 2011-05-08 17:52:48,175 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Server serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175) returned org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Received close for ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but we are not serving it for 4418fb197685a21f77e151e401cf8b66
>> 2011-05-08 17:52:48,175 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
>> 2011-05-08 17:52:48,176 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
>> 2011-05-08 17:52:48,176 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
>> 2011-05-08 17:52:48,177 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
>> 2011-05-08 17:52:48,178 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
>> 2011-05-08 17:52:48,178 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
>> 2011-05-08 17:52:48,178 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
>> 2011-05-08 17:52:48,690 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Deleted daughter reference ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66., qualifier=splitB, from parent ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.
>> 2011-05-08 17:55:51,787 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. state=PENDING_CLOSE, ts=1304848368173
>> 2011-05-08 17:55:51,787 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
>> 2011-05-08 17:55:51,787 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22fcd582836003d Set watcher on existing znode /hbase/unassigned/4418fb197685a21f77e151e401cf8b66
>> 2011-05-08 17:56:01,774 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. state=PENDING_CLOSE, ts=1304848368173
>> 2011-05-08 17:56:01,774 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
>> 2011-05-08 17:56:01,775 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22fcd582836003d Set watcher on existing znode /hbase/unassigned/4418fb197685a21f77e151e401cf8b66
>> 2011-05-08 17:56:11,774 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. state=PENDING_CLOSE, ts=1304848368173
>> 2011-05-08 17:56:11,774 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
>> 2011-05-08 17:56:11,775 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22fcd582836003d Set watcher on existing znode /hbase/unassigned/4418fb197685a21f77e151e401cf8b66
>> 2011-05-08 17:56:21,775 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. state=PENDING_CLOSE, ts=1304848368173
>> 2011-05-08 17:5
>>
>>
>

re: Table can't disable

Posted by Gaojinchao <ga...@huawei.com>.
Hi, Stack. 

I try to make a patch and process it. 
I want to send a zk message again when close region is timeout.
I try to reproduce and verify it. But it is difficulty.
Do you give me some help and review it. 

Thanks.

case PENDING_CLOSE:
                LOG.info("Region has been PENDING_CLOSE for too " +
                    "long, running forced unassign again on region=" +
                    regionInfo.getRegionNameAsString());
                  try {
                    // If the server got the RPC, it will transition the node
                    // to CLOSING, so only do something here if no node exists
                    if (!ZKUtil.watchAndCheckExists(watcher,
                      ZKAssign.getNodeName(watcher, regionInfo.getEncodedName()))) {
                      // Queue running of an unassign -- do actual unassign
                      // outside of the regionsInTransition lock.
                      unassigns.add(regionInfo);
                    }
                    else
                    {
										// I add some code 
                       try {
                          String node = ZKAssign.getNodeName(watcher,
                             regionInfo.getEncodedName());
                          Stat stat = new Stat();
                          RegionTransitionData data = ZKAssign.getDataNoWatch(watcher,
                             node, stat);
                          if (data == null) {
                             LOG.warn("Data is null, node " + node + " no longer exists");
                             break;
                          }
                          if (data.getEventType() != EventType.RS_ZK_REGION_CLOSED) {
                             LOG.debug("Region has transitioned to CLOEING, allowing " +
                                  "watched event handlers to process");
                             break;
                          } 
                            
                         //In this case, Region server has ClOSED 
                         data = new RegionTransitionData(
                                 EventType.RS_ZK_REGION_CLOSED, regionInfo.getRegionName(),
                                    master.getServerName());
                         if (ZKUtil.setData(watcher, node, data.getBytes(),
                                   stat.getVersion()+1 )) {
                            	
                            // Node is now closed, let's trigger another close handler
                            LOG.info("Try to finish closed region=" +
                               regionInfo.getRegionNameAsString() + "again" );
                         }
                            
                      } catch (KeeperException ke) {
                         LOG.error("Unexpected ZK exception timing out PENDING_CLOSE region",
                            ke);
                         break;
                        }
                    	
                     }
                  } catch (NoNodeException e) {
                    LOG.debug("Node no longer existed so not forcing another " +
                      "unassignment");
                  } catch (KeeperException e) {
                    LOG.warn("Unexpected ZK exception timing out a region " +
                      "close", e);
                  }
                  break;

-----邮件原件-----
发件人: saint.ack@gmail.com [mailto:saint.ack@gmail.com] 代表 Stack
发送时间: 2011年5月10日 1:38
收件人: dev@hbase.apache.org
主题: Re: Table can't disable

I'm looking into this Gao.  Something similar seems to have happened
here over the w/e.  Let me look at our systems first and then I'l be
back to you on this below.  Thanks,
St.Ack

On Sun, May 8, 2011 at 4:58 AM, Gaojinchao <ga...@huawei.com> wrote:
> Today I test Hbase verison 0.90.3
> I seems like some bugs
> 1, if node exists and node state is RS_ZK_REGION_CLOSED
>  We should call ClosedRegionHandle to close region. In that case the reigon has closed by region server
>
> case PENDING_CLOSE:
>                LOG.info("Region has been PENDING_CLOSE for too " +
>                    "long, running forced unassign again on region=" +
>                    regionInfo.getRegionNameAsString());
>                  try {
>                    // If the server got the RPC, it will transition the node
>                    // to CLOSING, so only do something here if no node exists
>                    if (!ZKUtil.watchAndCheckExists(watcher,
>                      ZKAssign.getNodeName(watcher, regionInfo.getEncodedName()))) {
>                      // Queue running of an unassign -- do actual unassign
>                      // outside of the regionsInTransition lock.
>                      unassigns.add(regionInfo);
>                    }
>                                        else
>                    {
>                                                //It need to process for node state is RS_ZK_REGION_CLOSED
>                                        }
>                  } catch (NoNodeException e) {
>                    LOG.debug("Node no longer existed so not forcing another " +
>                      "unassignment");
>                  } catch (KeeperException e) {
>                    LOG.warn("Unexpected ZK exception timing out a region " +
>                      "close", e);
>                  }
> 2, other It seems like zk message losing or there is some bugs for disabling table when region server is splitting region.
> above is the logs. I need to keep digging.
>
> Shell logs:
> hbase(main):003:0*
> hbase(main):004:0* disable 'ufdr'
> 11/05/08 18:20:40 DEBUG zookeeper.ZKUtil: hconnection-0x2fcd58283c0037 Set watcher on existing znode /hbase/root-region-server
> 11/05/08 18:20:40 DEBUG zookeeper.ZKUtil: hconnection-0x2fcd58283c0037 Retrieved 15 byte(s) of data from znode /hbase/root-region-server and set watcher; C4C5.site:60020
> 11/05/08 18:53:19 DEBUG zookeeper.ZKUtil: hconnection-0x2fcd58283c0037 Retrieved 9 byte(s) of data from znode /hbase/table/ufdr; data=DISABLING
>
> ERROR: org.apache.hadoop.hbase.RegionException: Retries exhausted, it took too long to wait for the table ufdr to be disabled.
>
>
> Region server logs:
>
> 2011-05-08 17:42:44,862 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Instantiated ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:42:45,468 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://C4C1:9000/hbase/ufdr/4418fb197685a21f77e151e401cf8b66/value/3442771824694350714.8e9a3b05abe1c3a692999cf5e8dfd9dd, isReference=true, isBulkLoadResult=false, seqid=627830, majorCompaction=false
> 2011-05-08 17:42:45,471 INFO org.apache.hadoop.hbase.regionserver.HRegion: Onlined ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.; next sequenceid=627831
> 2011-05-08 17:42:45,471 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction requested for ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. because Region has references on open; priority=9, compaction queue size=40
> 2011-05-08 17:42:45,476 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Added daughter ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. in region .META.,,1, serverInfo=C4C4.site,60020,1304820199467
> 2011-05-08 17:42:45,476 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: Region split, META updated, and report to master. Parent=ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd., new regions: ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.. Split took 0sec
> 2011-05-08 17:44:25,731 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received close region: ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:48:11,066 INFO org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:48:11,067 INFO org.apache.hadoop.hbase.regionserver.Store: Started compaction of 1 file(s) in cf=value, hasReferences=true, into hdfs://C4C1:9000/hbase/ufdr/4418fb197685a21f77e151e401cf8b66/.tmp, seqid=627830, totalSize=892.2m
> 2011-05-08 17:48:11,067 DEBUG org.apache.hadoop.hbase.regionserver.Store: Compacting hdfs://C4C1:9000/hbase/ufdr/4418fb197685a21f77e151e401cf8b66/value/3442771824694350714.8e9a3b05abe1c3a692999cf5e8dfd9dd-hdfs://C4C1:9000/hbase/ufdr/8e9a3b05abe1c3a692999cf5e8dfd9dd/value/3442771824694350714-top, keycount=2843998, bloomtype=NONE, size=892.2m
> 2011-05-08 17:48:32,340 INFO org.apache.hadoop.hbase.regionserver.Store: Completed compaction of 1 file(s), new file=hdfs://C4C1:9000/hbase/ufdr/4418fb197685a21f77e151e401cf8b66/value/4640085674257062204, size=446.2m; total size for store is 446.2m
> 2011-05-08 17:48:32,340 INFO org.apache.hadoop.hbase.regionserver.HRegion: completed compaction on region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. after 21sec
> 2011-05-08 17:52:46,626 DEBUG org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler: Processing close of ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:52:46,627 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x12fcd58280c0003-0x12fcd58280c0003 Creating unassigned node for 4418fb197685a21f77e151e401cf8b66 in a CLOSING state
> 2011-05-08 17:52:46,986 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Closing ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.: disabling compactions & flushes
> 2011-05-08 17:52:46,986 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Updates disabled for region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:52:46,986 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:52:46,986 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x12fcd58280c0003-0x12fcd58280c0003 Attempting to transition node 4418fb197685a21f77e151e401cf8b66 from RS_ZK_REGION_CLOSING to RS_ZK_REGION_CLOSED
> 2011-05-08 17:52:47,052 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: regionserver:60020-0x12fcd58280c0003-0x12fcd58280c0003 Retrieved 125 byte(s) of data from znode /hbase/unassigned/4418fb197685a21f77e151e401cf8b66; data=region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66., server=C4C4.site,60020,1304820199467, state=RS_ZK_REGION_CLOSING
> 2011-05-08 17:52:47,143 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: regionserver:60020-0x12fcd58280c0003-0x12fcd58280c0003 Received ZooKeeper Event, type=NodeDataChanged, state=SyncConnected, path=/hbase/unassigned/4418fb197685a21f77e151e401cf8b66
> 2011-05-08 17:52:47,143 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x12fcd58280c0003-0x12fcd58280c0003 Successfully transitioned node 4418fb197685a21f77e151e401cf8b66 from RS_ZK_REGION_CLOSING to RS_ZK_REGION_CLOSED
> 2011-05-08 17:52:47,143 DEBUG org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler: Closed region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:52:47,696 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received close region: ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:52:47,696 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Received close for region we are not serving; 4418fb197685a21f77e151e401cf8b66
> 2011-05-08 17:52:47,814 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received close region: ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:52:47,814 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Received close for region we are not serving; 4418fb197685a21f77e151e401cf8b66
> 2011-05-08 17:52:48,167 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received close region: ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:52:48,167 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Received close for region we are not serving; 4418fb197685a21f77e151e401cf8b66
>
>
> Hmaster logs:
> 2011-05-08 17:42:45,514 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
> 2011-05-08 17:43:37,599 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
> 2011-05-08 17:43:45,525 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
> 2011-05-08 17:44:25,745 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Sent CLOSE to serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175) for region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:44:45,530 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
> 2011-05-08 17:45:45,542 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
> 2011-05-08 17:46:45,303 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
> 2011-05-08 17:46:45,538 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
> 2011-05-08 17:47:45,548 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
> 2011-05-08 17:48:45,545 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
> 2011-05-08 17:49:46,108 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
> 2011-05-08 17:50:46,105 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
> 2011-05-08 17:51:46,117 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
> 2011-05-08 17:52:46,112 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
> 2011-05-08 17:52:47,309 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22fcd582836003d Retrieved 125 byte(s) of data from znode /hbase/unassigned/4418fb197685a21f77e151e401cf8b66 and set watcher; region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66., server=C4C4.site,60020,1304820199467, state=RS_ZK_REGION_CLOSED
> 2011-05-08 17:52:47,388 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned node: /hbase/unassigned/4418fb197685a21f77e151e401cf8b66 (region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66., server=C4C4.site,60020,1304820199467, state=RS_ZK_REGION_CLOSED)
> 2011-05-08 17:52:47,388 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_CLOSED, server=C4C4.site,60020,1304820199467, region=4418fb197685a21f77e151e401cf8b66
> 2011-05-08 17:52:47,388 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSED for region 4418fb197685a21f77e151e401cf8b66 from server C4C4.site,60020,1304820199467 but region was in  the state null and not in expected PENDING_CLOSE or CLOSING states
> 2011-05-08 17:52:47,397 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
> 2011-05-08 17:52:47,398 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
> 2011-05-08 17:52:47,400 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
> 2011-05-08 17:52:47,501 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
> 2011-05-08 17:52:47,501 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
> 2011-05-08 17:52:47,702 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
> 2011-05-08 17:52:47,704 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Server serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175) returned org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Received close for ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but we are not serving it for 4418fb197685a21f77e151e401cf8b66
> 2011-05-08 17:52:47,816 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
> 2011-05-08 17:52:47,819 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
> 2011-05-08 17:52:47,820 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
> 2011-05-08 17:52:47,820 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
> 2011-05-08 17:52:47,820 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
> 2011-05-08 17:52:47,820 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
> 2011-05-08 17:52:47,821 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
> 2011-05-08 17:52:47,821 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Server serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175) returned org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Received close for ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but we are not serving it for 4418fb197685a21f77e151e401cf8b66
> 2011-05-08 17:52:47,821 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
> 2011-05-08 17:52:47,822 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
> 2011-05-08 17:52:48,172 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
> 2011-05-08 17:52:48,173 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
> 2011-05-08 17:52:48,174 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
> 2011-05-08 17:52:48,174 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
> 2011-05-08 17:52:48,175 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Server serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175) returned org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Received close for ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but we are not serving it for 4418fb197685a21f77e151e401cf8b66
> 2011-05-08 17:52:48,175 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
> 2011-05-08 17:52:48,176 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
> 2011-05-08 17:52:48,176 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
> 2011-05-08 17:52:48,177 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
> 2011-05-08 17:52:48,178 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
> 2011-05-08 17:52:48,178 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
> 2011-05-08 17:52:48,178 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
> 2011-05-08 17:52:48,690 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Deleted daughter reference ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66., qualifier=splitB, from parent ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.
> 2011-05-08 17:55:51,787 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. state=PENDING_CLOSE, ts=1304848368173
> 2011-05-08 17:55:51,787 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:55:51,787 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22fcd582836003d Set watcher on existing znode /hbase/unassigned/4418fb197685a21f77e151e401cf8b66
> 2011-05-08 17:56:01,774 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. state=PENDING_CLOSE, ts=1304848368173
> 2011-05-08 17:56:01,774 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:56:01,775 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22fcd582836003d Set watcher on existing znode /hbase/unassigned/4418fb197685a21f77e151e401cf8b66
> 2011-05-08 17:56:11,774 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. state=PENDING_CLOSE, ts=1304848368173
> 2011-05-08 17:56:11,774 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:56:11,775 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22fcd582836003d Set watcher on existing znode /hbase/unassigned/4418fb197685a21f77e151e401cf8b66
> 2011-05-08 17:56:21,775 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. state=PENDING_CLOSE, ts=1304848368173
> 2011-05-08 17:5
>
>

Re: Table can't disable

Posted by Stack <st...@duboce.net>.
I'm looking into this Gao.  Something similar seems to have happened
here over the w/e.  Let me look at our systems first and then I'l be
back to you on this below.  Thanks,
St.Ack

On Sun, May 8, 2011 at 4:58 AM, Gaojinchao <ga...@huawei.com> wrote:
> Today I test Hbase verison 0.90.3
> I seems like some bugs
> 1, if node exists and node state is RS_ZK_REGION_CLOSED
>  We should call ClosedRegionHandle to close region. In that case the reigon has closed by region server
>
> case PENDING_CLOSE:
>                LOG.info("Region has been PENDING_CLOSE for too " +
>                    "long, running forced unassign again on region=" +
>                    regionInfo.getRegionNameAsString());
>                  try {
>                    // If the server got the RPC, it will transition the node
>                    // to CLOSING, so only do something here if no node exists
>                    if (!ZKUtil.watchAndCheckExists(watcher,
>                      ZKAssign.getNodeName(watcher, regionInfo.getEncodedName()))) {
>                      // Queue running of an unassign -- do actual unassign
>                      // outside of the regionsInTransition lock.
>                      unassigns.add(regionInfo);
>                    }
>                                        else
>                    {
>                                                //It need to process for node state is RS_ZK_REGION_CLOSED
>                                        }
>                  } catch (NoNodeException e) {
>                    LOG.debug("Node no longer existed so not forcing another " +
>                      "unassignment");
>                  } catch (KeeperException e) {
>                    LOG.warn("Unexpected ZK exception timing out a region " +
>                      "close", e);
>                  }
> 2, other It seems like zk message losing or there is some bugs for disabling table when region server is splitting region.
> above is the logs. I need to keep digging.
>
> Shell logs:
> hbase(main):003:0*
> hbase(main):004:0* disable 'ufdr'
> 11/05/08 18:20:40 DEBUG zookeeper.ZKUtil: hconnection-0x2fcd58283c0037 Set watcher on existing znode /hbase/root-region-server
> 11/05/08 18:20:40 DEBUG zookeeper.ZKUtil: hconnection-0x2fcd58283c0037 Retrieved 15 byte(s) of data from znode /hbase/root-region-server and set watcher; C4C5.site:60020
> 11/05/08 18:53:19 DEBUG zookeeper.ZKUtil: hconnection-0x2fcd58283c0037 Retrieved 9 byte(s) of data from znode /hbase/table/ufdr; data=DISABLING
>
> ERROR: org.apache.hadoop.hbase.RegionException: Retries exhausted, it took too long to wait for the table ufdr to be disabled.
>
>
> Region server logs:
>
> 2011-05-08 17:42:44,862 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Instantiated ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:42:45,468 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://C4C1:9000/hbase/ufdr/4418fb197685a21f77e151e401cf8b66/value/3442771824694350714.8e9a3b05abe1c3a692999cf5e8dfd9dd, isReference=true, isBulkLoadResult=false, seqid=627830, majorCompaction=false
> 2011-05-08 17:42:45,471 INFO org.apache.hadoop.hbase.regionserver.HRegion: Onlined ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.; next sequenceid=627831
> 2011-05-08 17:42:45,471 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction requested for ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. because Region has references on open; priority=9, compaction queue size=40
> 2011-05-08 17:42:45,476 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Added daughter ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. in region .META.,,1, serverInfo=C4C4.site,60020,1304820199467
> 2011-05-08 17:42:45,476 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: Region split, META updated, and report to master. Parent=ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd., new regions: ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.. Split took 0sec
> 2011-05-08 17:44:25,731 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received close region: ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:48:11,066 INFO org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:48:11,067 INFO org.apache.hadoop.hbase.regionserver.Store: Started compaction of 1 file(s) in cf=value, hasReferences=true, into hdfs://C4C1:9000/hbase/ufdr/4418fb197685a21f77e151e401cf8b66/.tmp, seqid=627830, totalSize=892.2m
> 2011-05-08 17:48:11,067 DEBUG org.apache.hadoop.hbase.regionserver.Store: Compacting hdfs://C4C1:9000/hbase/ufdr/4418fb197685a21f77e151e401cf8b66/value/3442771824694350714.8e9a3b05abe1c3a692999cf5e8dfd9dd-hdfs://C4C1:9000/hbase/ufdr/8e9a3b05abe1c3a692999cf5e8dfd9dd/value/3442771824694350714-top, keycount=2843998, bloomtype=NONE, size=892.2m
> 2011-05-08 17:48:32,340 INFO org.apache.hadoop.hbase.regionserver.Store: Completed compaction of 1 file(s), new file=hdfs://C4C1:9000/hbase/ufdr/4418fb197685a21f77e151e401cf8b66/value/4640085674257062204, size=446.2m; total size for store is 446.2m
> 2011-05-08 17:48:32,340 INFO org.apache.hadoop.hbase.regionserver.HRegion: completed compaction on region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. after 21sec
> 2011-05-08 17:52:46,626 DEBUG org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler: Processing close of ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:52:46,627 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x12fcd58280c0003-0x12fcd58280c0003 Creating unassigned node for 4418fb197685a21f77e151e401cf8b66 in a CLOSING state
> 2011-05-08 17:52:46,986 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Closing ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.: disabling compactions & flushes
> 2011-05-08 17:52:46,986 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Updates disabled for region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:52:46,986 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:52:46,986 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x12fcd58280c0003-0x12fcd58280c0003 Attempting to transition node 4418fb197685a21f77e151e401cf8b66 from RS_ZK_REGION_CLOSING to RS_ZK_REGION_CLOSED
> 2011-05-08 17:52:47,052 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: regionserver:60020-0x12fcd58280c0003-0x12fcd58280c0003 Retrieved 125 byte(s) of data from znode /hbase/unassigned/4418fb197685a21f77e151e401cf8b66; data=region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66., server=C4C4.site,60020,1304820199467, state=RS_ZK_REGION_CLOSING
> 2011-05-08 17:52:47,143 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: regionserver:60020-0x12fcd58280c0003-0x12fcd58280c0003 Received ZooKeeper Event, type=NodeDataChanged, state=SyncConnected, path=/hbase/unassigned/4418fb197685a21f77e151e401cf8b66
> 2011-05-08 17:52:47,143 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x12fcd58280c0003-0x12fcd58280c0003 Successfully transitioned node 4418fb197685a21f77e151e401cf8b66 from RS_ZK_REGION_CLOSING to RS_ZK_REGION_CLOSED
> 2011-05-08 17:52:47,143 DEBUG org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler: Closed region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:52:47,696 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received close region: ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:52:47,696 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Received close for region we are not serving; 4418fb197685a21f77e151e401cf8b66
> 2011-05-08 17:52:47,814 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received close region: ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:52:47,814 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Received close for region we are not serving; 4418fb197685a21f77e151e401cf8b66
> 2011-05-08 17:52:48,167 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received close region: ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:52:48,167 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Received close for region we are not serving; 4418fb197685a21f77e151e401cf8b66
>
>
> Hmaster logs:
> 2011-05-08 17:42:45,514 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
> 2011-05-08 17:43:37,599 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
> 2011-05-08 17:43:45,525 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
> 2011-05-08 17:44:25,745 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Sent CLOSE to serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175) for region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:44:45,530 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
> 2011-05-08 17:45:45,542 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
> 2011-05-08 17:46:45,303 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
> 2011-05-08 17:46:45,538 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
> 2011-05-08 17:47:45,548 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
> 2011-05-08 17:48:45,545 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
> 2011-05-08 17:49:46,108 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
> 2011-05-08 17:50:46,105 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
> 2011-05-08 17:51:46,117 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
> 2011-05-08 17:52:46,112 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
> 2011-05-08 17:52:47,309 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22fcd582836003d Retrieved 125 byte(s) of data from znode /hbase/unassigned/4418fb197685a21f77e151e401cf8b66 and set watcher; region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66., server=C4C4.site,60020,1304820199467, state=RS_ZK_REGION_CLOSED
> 2011-05-08 17:52:47,388 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned node: /hbase/unassigned/4418fb197685a21f77e151e401cf8b66 (region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66., server=C4C4.site,60020,1304820199467, state=RS_ZK_REGION_CLOSED)
> 2011-05-08 17:52:47,388 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_CLOSED, server=C4C4.site,60020,1304820199467, region=4418fb197685a21f77e151e401cf8b66
> 2011-05-08 17:52:47,388 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSED for region 4418fb197685a21f77e151e401cf8b66 from server C4C4.site,60020,1304820199467 but region was in  the state null and not in expected PENDING_CLOSE or CLOSING states
> 2011-05-08 17:52:47,397 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
> 2011-05-08 17:52:47,398 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
> 2011-05-08 17:52:47,400 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
> 2011-05-08 17:52:47,501 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
> 2011-05-08 17:52:47,501 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
> 2011-05-08 17:52:47,702 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
> 2011-05-08 17:52:47,704 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Server serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175) returned org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Received close for ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but we are not serving it for 4418fb197685a21f77e151e401cf8b66
> 2011-05-08 17:52:47,816 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
> 2011-05-08 17:52:47,819 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
> 2011-05-08 17:52:47,820 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
> 2011-05-08 17:52:47,820 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
> 2011-05-08 17:52:47,820 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
> 2011-05-08 17:52:47,820 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
> 2011-05-08 17:52:47,821 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
> 2011-05-08 17:52:47,821 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Server serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175) returned org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Received close for ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but we are not serving it for 4418fb197685a21f77e151e401cf8b66
> 2011-05-08 17:52:47,821 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
> 2011-05-08 17:52:47,822 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
> 2011-05-08 17:52:48,172 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
> 2011-05-08 17:52:48,173 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
> 2011-05-08 17:52:48,174 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
> 2011-05-08 17:52:48,174 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
> 2011-05-08 17:52:48,175 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Server serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175) returned org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Received close for ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but we are not serving it for 4418fb197685a21f77e151e401cf8b66
> 2011-05-08 17:52:48,175 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
> 2011-05-08 17:52:48,176 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
> 2011-05-08 17:52:48,176 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
> 2011-05-08 17:52:48,177 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
> 2011-05-08 17:52:48,178 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
> 2011-05-08 17:52:48,178 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
> 2011-05-08 17:52:48,178 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
> 2011-05-08 17:52:48,690 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Deleted daughter reference ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66., qualifier=splitB, from parent ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.
> 2011-05-08 17:55:51,787 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. state=PENDING_CLOSE, ts=1304848368173
> 2011-05-08 17:55:51,787 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:55:51,787 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22fcd582836003d Set watcher on existing znode /hbase/unassigned/4418fb197685a21f77e151e401cf8b66
> 2011-05-08 17:56:01,774 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. state=PENDING_CLOSE, ts=1304848368173
> 2011-05-08 17:56:01,774 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:56:01,775 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22fcd582836003d Set watcher on existing znode /hbase/unassigned/4418fb197685a21f77e151e401cf8b66
> 2011-05-08 17:56:11,774 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. state=PENDING_CLOSE, ts=1304848368173
> 2011-05-08 17:56:11,774 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
> 2011-05-08 17:56:11,775 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22fcd582836003d Set watcher on existing znode /hbase/unassigned/4418fb197685a21f77e151e401cf8b66
> 2011-05-08 17:56:21,775 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. state=PENDING_CLOSE, ts=1304848368173
> 2011-05-08 17:5
>
>

Table can't disable

Posted by Gaojinchao <ga...@huawei.com>.
Today I test Hbase verison 0.90.3
I seems like some bugs
1, if node exists and node state is RS_ZK_REGION_CLOSED 
  We should call ClosedRegionHandle to close region. In that case the reigon has closed by region server 

case PENDING_CLOSE:
                LOG.info("Region has been PENDING_CLOSE for too " +
                    "long, running forced unassign again on region=" +
                    regionInfo.getRegionNameAsString());
                  try {
                    // If the server got the RPC, it will transition the node
                    // to CLOSING, so only do something here if no node exists
                    if (!ZKUtil.watchAndCheckExists(watcher,
                      ZKAssign.getNodeName(watcher, regionInfo.getEncodedName()))) {
                      // Queue running of an unassign -- do actual unassign
                      // outside of the regionsInTransition lock.
                      unassigns.add(regionInfo);
                    }
					else
                    {
						//It need to process for node state is RS_ZK_REGION_CLOSED 
					}
                  } catch (NoNodeException e) {
                    LOG.debug("Node no longer existed so not forcing another " +
                      "unassignment");
                  } catch (KeeperException e) {
                    LOG.warn("Unexpected ZK exception timing out a region " +
                      "close", e);
                  }
2, other It seems like zk message losing or there is some bugs for disabling table when region server is splitting region.
above is the logs. I need to keep digging.

Shell logs:
hbase(main):003:0*
hbase(main):004:0* disable 'ufdr'
11/05/08 18:20:40 DEBUG zookeeper.ZKUtil: hconnection-0x2fcd58283c0037 Set watcher on existing znode /hbase/root-region-server
11/05/08 18:20:40 DEBUG zookeeper.ZKUtil: hconnection-0x2fcd58283c0037 Retrieved 15 byte(s) of data from znode /hbase/root-region-server and set watcher; C4C5.site:60020
11/05/08 18:53:19 DEBUG zookeeper.ZKUtil: hconnection-0x2fcd58283c0037 Retrieved 9 byte(s) of data from znode /hbase/table/ufdr; data=DISABLING

ERROR: org.apache.hadoop.hbase.RegionException: Retries exhausted, it took too long to wait for the table ufdr to be disabled.


Region server logs:

2011-05-08 17:42:44,862 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Instantiated ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
2011-05-08 17:42:45,468 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://C4C1:9000/hbase/ufdr/4418fb197685a21f77e151e401cf8b66/value/3442771824694350714.8e9a3b05abe1c3a692999cf5e8dfd9dd, isReference=true, isBulkLoadResult=false, seqid=627830, majorCompaction=false
2011-05-08 17:42:45,471 INFO org.apache.hadoop.hbase.regionserver.HRegion: Onlined ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.; next sequenceid=627831
2011-05-08 17:42:45,471 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction requested for ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. because Region has references on open; priority=9, compaction queue size=40
2011-05-08 17:42:45,476 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Added daughter ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. in region .META.,,1, serverInfo=C4C4.site,60020,1304820199467
2011-05-08 17:42:45,476 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: Region split, META updated, and report to master. Parent=ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd., new regions: ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.. Split took 0sec
2011-05-08 17:44:25,731 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received close region: ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
2011-05-08 17:48:11,066 INFO org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
2011-05-08 17:48:11,067 INFO org.apache.hadoop.hbase.regionserver.Store: Started compaction of 1 file(s) in cf=value, hasReferences=true, into hdfs://C4C1:9000/hbase/ufdr/4418fb197685a21f77e151e401cf8b66/.tmp, seqid=627830, totalSize=892.2m
2011-05-08 17:48:11,067 DEBUG org.apache.hadoop.hbase.regionserver.Store: Compacting hdfs://C4C1:9000/hbase/ufdr/4418fb197685a21f77e151e401cf8b66/value/3442771824694350714.8e9a3b05abe1c3a692999cf5e8dfd9dd-hdfs://C4C1:9000/hbase/ufdr/8e9a3b05abe1c3a692999cf5e8dfd9dd/value/3442771824694350714-top, keycount=2843998, bloomtype=NONE, size=892.2m
2011-05-08 17:48:32,340 INFO org.apache.hadoop.hbase.regionserver.Store: Completed compaction of 1 file(s), new file=hdfs://C4C1:9000/hbase/ufdr/4418fb197685a21f77e151e401cf8b66/value/4640085674257062204, size=446.2m; total size for store is 446.2m
2011-05-08 17:48:32,340 INFO org.apache.hadoop.hbase.regionserver.HRegion: completed compaction on region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. after 21sec
2011-05-08 17:52:46,626 DEBUG org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler: Processing close of ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
2011-05-08 17:52:46,627 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x12fcd58280c0003-0x12fcd58280c0003 Creating unassigned node for 4418fb197685a21f77e151e401cf8b66 in a CLOSING state
2011-05-08 17:52:46,986 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Closing ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.: disabling compactions & flushes
2011-05-08 17:52:46,986 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Updates disabled for region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
2011-05-08 17:52:46,986 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
2011-05-08 17:52:46,986 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x12fcd58280c0003-0x12fcd58280c0003 Attempting to transition node 4418fb197685a21f77e151e401cf8b66 from RS_ZK_REGION_CLOSING to RS_ZK_REGION_CLOSED
2011-05-08 17:52:47,052 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: regionserver:60020-0x12fcd58280c0003-0x12fcd58280c0003 Retrieved 125 byte(s) of data from znode /hbase/unassigned/4418fb197685a21f77e151e401cf8b66; data=region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66., server=C4C4.site,60020,1304820199467, state=RS_ZK_REGION_CLOSING
2011-05-08 17:52:47,143 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: regionserver:60020-0x12fcd58280c0003-0x12fcd58280c0003 Received ZooKeeper Event, type=NodeDataChanged, state=SyncConnected, path=/hbase/unassigned/4418fb197685a21f77e151e401cf8b66
2011-05-08 17:52:47,143 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x12fcd58280c0003-0x12fcd58280c0003 Successfully transitioned node 4418fb197685a21f77e151e401cf8b66 from RS_ZK_REGION_CLOSING to RS_ZK_REGION_CLOSED
2011-05-08 17:52:47,143 DEBUG org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler: Closed region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
2011-05-08 17:52:47,696 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received close region: ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
2011-05-08 17:52:47,696 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Received close for region we are not serving; 4418fb197685a21f77e151e401cf8b66
2011-05-08 17:52:47,814 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received close region: ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
2011-05-08 17:52:47,814 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Received close for region we are not serving; 4418fb197685a21f77e151e401cf8b66
2011-05-08 17:52:48,167 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received close region: ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
2011-05-08 17:52:48,167 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Received close for region we are not serving; 4418fb197685a21f77e151e401cf8b66


Hmaster logs:
2011-05-08 17:42:45,514 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
2011-05-08 17:43:37,599 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
2011-05-08 17:43:45,525 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
2011-05-08 17:44:25,745 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Sent CLOSE to serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175) for region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
2011-05-08 17:44:45,530 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
2011-05-08 17:45:45,542 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
2011-05-08 17:46:45,303 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
2011-05-08 17:46:45,538 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
2011-05-08 17:47:45,548 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
2011-05-08 17:48:45,545 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
2011-05-08 17:49:46,108 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
2011-05-08 17:50:46,105 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
2011-05-08 17:51:46,117 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
2011-05-08 17:52:46,112 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
2011-05-08 17:52:47,309 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22fcd582836003d Retrieved 125 byte(s) of data from znode /hbase/unassigned/4418fb197685a21f77e151e401cf8b66 and set watcher; region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66., server=C4C4.site,60020,1304820199467, state=RS_ZK_REGION_CLOSED
2011-05-08 17:52:47,388 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned node: /hbase/unassigned/4418fb197685a21f77e151e401cf8b66 (region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66., server=C4C4.site,60020,1304820199467, state=RS_ZK_REGION_CLOSED)
2011-05-08 17:52:47,388 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_CLOSED, server=C4C4.site,60020,1304820199467, region=4418fb197685a21f77e151e401cf8b66
2011-05-08 17:52:47,388 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSED for region 4418fb197685a21f77e151e401cf8b66 from server C4C4.site,60020,1304820199467 but region was in  the state null and not in expected PENDING_CLOSE or CLOSING states
2011-05-08 17:52:47,397 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
2011-05-08 17:52:47,398 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
2011-05-08 17:52:47,400 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
2011-05-08 17:52:47,501 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
2011-05-08 17:52:47,501 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
2011-05-08 17:52:47,702 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
2011-05-08 17:52:47,704 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Server serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175) returned org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Received close for ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but we are not serving it for 4418fb197685a21f77e151e401cf8b66
2011-05-08 17:52:47,816 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
2011-05-08 17:52:47,819 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
2011-05-08 17:52:47,820 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
2011-05-08 17:52:47,820 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
2011-05-08 17:52:47,820 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
2011-05-08 17:52:47,820 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
2011-05-08 17:52:47,821 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
2011-05-08 17:52:47,821 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Server serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175) returned org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Received close for ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but we are not serving it for 4418fb197685a21f77e151e401cf8b66
2011-05-08 17:52:47,821 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
2011-05-08 17:52:47,822 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
2011-05-08 17:52:48,172 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
2011-05-08 17:52:48,173 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
2011-05-08 17:52:48,174 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
2011-05-08 17:52:48,174 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
2011-05-08 17:52:48,175 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Server serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175) returned org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Received close for ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but we are not serving it for 4418fb197685a21f77e151e401cf8b66
2011-05-08 17:52:48,175 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
2011-05-08 17:52:48,176 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. (offlining)
2011-05-08 17:52:48,176 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
2011-05-08 17:52:48,177 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
2011-05-08 17:52:48,178 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
2011-05-08 17:52:48,178 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
2011-05-08 17:52:48,178 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. but it is already in transition (PENDING_CLOSE)
2011-05-08 17:52:48,690 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Deleted daughter reference ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66., qualifier=splitB, from parent ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.
2011-05-08 17:55:51,787 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. state=PENDING_CLOSE, ts=1304848368173
2011-05-08 17:55:51,787 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
2011-05-08 17:55:51,787 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22fcd582836003d Set watcher on existing znode /hbase/unassigned/4418fb197685a21f77e151e401cf8b66
2011-05-08 17:56:01,774 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. state=PENDING_CLOSE, ts=1304848368173
2011-05-08 17:56:01,774 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
2011-05-08 17:56:01,775 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22fcd582836003d Set watcher on existing znode /hbase/unassigned/4418fb197685a21f77e151e401cf8b66
2011-05-08 17:56:11,774 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. state=PENDING_CLOSE, ts=1304848368173
2011-05-08 17:56:11,774 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.
2011-05-08 17:56:11,775 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22fcd582836003d Set watcher on existing znode /hbase/unassigned/4418fb197685a21f77e151e401cf8b66
2011-05-08 17:56:21,775 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. state=PENDING_CLOSE, ts=1304848368173
2011-05-08 17:5


Re: HBase book/Wiki update

Posted by Stack <st...@duboce.net>.
You are a good man Doug.  Thanks.
St.Ack

On Fri, May 6, 2011 at 4:44 PM, Doug Meil <do...@explorysmedical.com> wrote:
> Hi folks,
>
> The HBase wiki PerformanceTuning page has been ported to the HBase book in 2 sections:  Performance (for tuning) and Troubleshooting (for the GC log reading part).  Therefore this page is now obsolete and is not on the main HBase Wiki page anymore.  Historians can find it on the "obsolete pages" link.
>
> Next week, the HBase wiki Troubleshooting page will be ported to the HBase book and then that will be obsolete.
>
> The wiki FAQ page will continue, though.  This page was refreshed this week pointing to the HBase book for a number of things, and it will also be reorganized a bit to be grouped by "question/problem area" (sort of like the Troubleshooting section in the HBase book).
>
>
> Doug Meil
> Chief Software Architect, Explorys
> doug.meil@explorys.com
>
>