You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@hbase.apache.org by Ted Yu <yu...@gmail.com> on 2017/05/07 23:11:03 UTC

Re: Regarding issue of Master is initializing after formatting HDFS

Dropping dev@

bq. Assigning hbase:meta,,1.1588230740 to hnode4.informatik.uni-kl.de

Have you checked the log on hnode4 to see what happened around 2017-05-07 23:36
?

Have you run hbck ?

Pastebin more of the server hosting hbase:meta if possible.

On Sun, May 7, 2017 at 3:05 PM, Weiping Qu <qu...@informatik.uni-kl.de> wrote:

> Dear HBase mailing list,
>
> We got following error all the time once after the HDFS is reformatted.
> =====================================================================
> ERROR: org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
>         at org.apache.hadoop.hbase.master.HMaster.checkInitialized(
> HMaster.java:2293)
>         at org.apache.hadoop.hbase.master.MasterRpcServices.getTableNam
> es(MasterRpcServices.java:900)
>         at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$
> MasterService$2.callBlockingMethod(MasterProtos.java:55650)
>         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2180)
>         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExec
> utor.java:133)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.ja
> va:108)
>         at java.lang.Thread.run(Thread.java:745)
> =====================================================================
>
> The configuration is the following:
>
> hadoop: 2.6.5
> hbase: 1.2.4
> hbase use its own zookeeper: 3.4.6
>
> All regionservers work fine while the master could not complete
> initlization. After checking some articles from online, it seems that there
> are some problems in zookeeper.
> I attach the log from master and zookeeper in the following:
>
> hbase-hadoop-master.log
> ============================================================
> =================
> 2017-05-07 23:36:10,551 INFO  [atbara:16000.activeMasterManager]
> zookeeper.MetaTableLocator: Deleting hbase:meta region location in ZooKeeper
> 2017-05-07 23:36:10,573 INFO  [atbara:16000.activeMasterManager]
> master.AssignmentManager: Setting node as OFFLINED in ZooKeeper for region
> {ENCODED => 1588230740, NAME => 'hbase:meta,,1', STARTKEY => '', ENDKEY =>
> ''}
> 2017-05-07 23:36:10,586 INFO  [atbara:16000.activeMasterManager]
> zookeeper.ZKTableStateManager: Moving table hbase:meta state from null to
> ENABLED
> 2017-05-07 23:36:10,599 INFO  [atbara:16000.activeMasterManager]
> master.AssignmentManager: Assigning hbase:meta,,1.1588230740 to
> hnode4.informatik.uni-kl.de,16020,1494192962751
> 2017-05-07 23:36:10,599 INFO  [atbara:16000.activeMasterManager]
> master.RegionStates: Transition {1588230740 state=OFFLINE,
> ts=1494192970573, server=null} to {1588230740 state=PENDING_OPEN,
> ts=1494192970599, server=hnode4.informatik.uni-kl.de,16020,1494192962751}
> 2017-05-07 23:36:10,808 INFO  [atbara:16000.activeMasterManager]
> master.ServerManager: AssignmentManager hasn't finished failover cleanup;
> waiting
> 2017-05-07 23:36:10,842 INFO  [AM.ZK.Worker-pool2-t1] master.RegionStates:
> Transition {1588230740 state=PENDING_OPEN, ts=1494192970599, server=
> hnode4.informatik.uni-kl.de,16020,1494192962751} to {1588230740
> state=OPENING, ts=1494192970842, server=hnode4.informatik.uni-kl.de
> ,16020,1494192962751}
> 2017-05-07 23:36:11,201 INFO  [AM.ZK.Worker-pool2-t2] master.RegionStates:
> Transition {1588230740 state=OPENING, ts=1494192970842, server=
> hnode4.informatik.uni-kl.de,16020,1494192962751} to {1588230740
> state=OPEN, ts=1494192971201, server=hnode4.informatik.uni-kl.de
> ,16020,1494192962751}
> 2017-05-07 23:36:11,204 INFO  [AM.ZK.Worker-pool2-t2]
> coordination.ZkOpenRegionCoordination: Handling OPENED of 1588230740 from
> atbara.informatik.uni-kl.de,16000,1494192960469; deleting unassigned node
> 2017-05-07 23:36:11,208 INFO  [atbara:16000.activeMasterManager]
> master.HMaster: hbase:meta with replicaId 0 assigned=1, rit=false, location=
> hnode4.informatik.uni-kl.de,16020,1494192962751
> 2017-05-07 23:36:49,727 INFO [hconnection-0x47c57609-shared--pool3-t2]
> client.RpcRetryingCaller: Call exception, tries=10, retries=350,
> started=38384 ms ago, cancelled=false, msg=row '' on table 'hbase:meta' at
> region=hbase:meta,,1.1588230740, hostname=hnode4.informatik.uni-kl.de,16020,1494192962751,
> seqNum=0
> 2017-05-07 23:36:59,791 INFO [hconnection-0x47c57609-shared--pool3-t2]
> client.RpcRetryingCaller: Call exception, tries=11, retries=350,
> started=48448 ms ago, cancelled=false, msg=row '' on table 'hbase:meta' at
> region=hbase:meta,,1.1588230740, hostname=hnode4.informatik.uni-kl.de,16020,1494192962751,
> seqNum=0
> 2017-05-07 23:37:19,808 INFO [hconnection-0x47c57609-shared--pool3-t2]
> client.RpcRetryingCaller: Call exception, tries=12, retries=350,
> started=68465 ms ago, cancelled=false, msg=row '' on table 'hbase:meta' at
> region=hbase:meta,,1.1588230740, hostname=hnode4.informatik.uni-kl.de,16020,1494192962751,
> seqNum=0
> 2017-05-07 23:37:39,885 INFO [hconnection-0x47c57609-shared--pool3-t2]
> client.RpcRetryingCaller: Call exception, tries=13, retries=350,
> started=88542 ms ago, cancelled=false, msg=row '' on table 'hbase:meta' at
> region=hbase:meta,,1.1588230740, hostname=hnode4.informatik.uni-kl.de,16020,1494192962751,
> seqNum=0
> 2017-05-07 23:38:00,059 INFO [hconnection-0x47c57609-shared--pool3-t2]
> client.RpcRetryingCaller: Call exception, tries=14, retries=350,
> started=108716 ms ago, cancelled=false, msg=row '' on table 'hbase:meta' at
> region=hbase:meta,,1.1588230740, hostname=hnode4.informatik.uni-kl.de,16020,1494192962751,
> seqNum=0
> 2017-05-07 23:38:20,113 INFO [hconnection-0x47c57609-shared--pool3-t2]
> client.RpcRetryingCaller: Call exception, tries=15, retries=350,
> started=128770 ms ago, cancelled=false, msg=row '' on table 'hbase:meta' at
> region=hbase:meta,,1.1588230740, hostname=hnode4.informatik.uni-kl.de,16020,1494192962751,
> seqNum=0
> 2017-05-07 23:38:40,166 INFO [hconnection-0x47c57609-shared--pool3-t2]
> client.RpcRetryingCaller: Call exception, tries=16, retries=350,
> started=148823 ms ago, cancelled=false, msg=row '' on table 'hbase:meta' at
> region=hbase:meta,,1.1588230740, hostname=hnode4.informatik.uni-kl.de,16020,1494192962751,
> seqNum=0
> 2017-05-07 23:39:00,279 INFO [hconnection-0x47c57609-shared--pool3-t2]
> client.RpcRetryingCaller: Call exception, tries=17, retries=350,
> started=168936 ms ago, cancelled=false, msg=row '' on table 'hbase:meta' at
> region=hbase:meta,,1.1588230740, hostname=hnode4.informatik.uni-kl.de,16020,1494192962751,
> seqNum=0
> 2017-05-07 23:39:20,448 INFO [hconnection-0x47c57609-shared--pool3-t2]
> client.RpcRetryingCaller: Call exception, tries=18, retries=350,
> started=189105 ms ago, cancelled=false, msg=row '' on table 'hbase:meta' at
> region=hbase:meta,,1.1588230740, hostname=hnode4.informatik.uni-kl.de,16020,1494192962751,
> seqNum=0
> 2017-05-07 23:39:40,566 INFO [hconnection-0x47c57609-shared--pool3-t2]
> client.RpcRetryingCaller: Call exception, tries=19, retries=350,
> started=209223 ms ago, cancelled=false, msg=row '' on table 'hbase:meta' at
> region=hbase:meta,,1.1588230740, hostname=hnode4.informatik.uni-kl.de,16020,1494192962751,
> seqNum=0
> 2017-05-07 23:40:00,770 INFO [hconnection-0x47c57609-shared--pool3-t2]
> client.RpcRetryingCaller: Call exception, tries=20, retries=350,
> started=229427 ms ago, cancelled=false, msg=row '' on table 'hbase:meta' at
> region=hbase:meta,,1.1588230740, hostname=hnode4.informatik.uni-kl.de,16020,1494192962751,
> seqNum=0
> 2017-05-07 23:40:20,900 INFO [hconnection-0x47c57609-shared--pool3-t2]
> client.RpcRetryingCaller: Call exception, tries=21, retries=350,
> started=249557 ms ago, cancelled=false, msg=row '' on table 'hbase:meta' at
> region=hbase:meta,,1.1588230740, hostname=hnode4.informatik.uni-kl.de,16020,1494192962751,
> seqNum=0
> 2017-05-07 23:40:40,962 INFO [hconnection-0x47c57609-shared--pool3-t2]
> client.RpcRetryingCaller: Call exception, tries=22, retries=350,
> started=269619 ms ago, cancelled=false, msg=row '' on table 'hbase:meta' at
> region=hbase:meta,,1.1588230740, hostname=hnode4.informatik.uni-kl.de,16020,1494192962751,
> seqNum=0
> 2017-05-07 23:41:01,020 INFO [hconnection-0x47c57609-shared--pool3-t2]
> client.RpcRetryingCaller: Call exception, tries=23, retries=350,
> started=289677 ms ago, cancelled=false, msg=row '' on table 'hbase:meta' at
> region=hbase:meta,,1.1588230740, hostname=hnode4.informatik.uni-kl.de,16020,1494192962751,
> seqNum=0
> 2017-05-07 23:41:04,765 INFO  [LruBlockCacheStatsExecutor]
> hfile.LruBlockCache: totalSize=3.24 MB, freeSize=3.09 GB, max=3.09 GB,
> blockCount=0, accesses=0, hits=0, hitRatio=0, cachingAccesses=0,
> cachingHits=0, cachingHitsRatio=0,evictions=29, evicted=0,
> evictedPerRun=0.0
> ==============================================================
>
> hbase-hadoop-zookeeper.log
> ==============================================================
> 2017-05-07 23:36:04,977 INFO  [SyncThread:0] server.ZooKeeperServer:
> Established session 0x15be4d72bfa000c with negotiated timeout 90000 for
> client /192.168.18.8:38830
> 2017-05-07 23:36:05,070 INFO  [ProcessThread(sid:0 cport:-1):]
> server.PrepRequestProcessor: Got user-level KeeperException when processing
> sessionid:0x15be4d72bfa0000 type:create cxid:0x2c zxid:0x1c txntype:-1
> reqpath:n/a Error Path:/hbase/flush-table-proc Error:KeeperErrorCode =
> NoNode for /hbase/flush-table-proc
> 2017-05-07 23:36:05,091 INFO  [ProcessThread(sid:0 cport:-1):]
> server.PrepRequestProcessor: Got user-level KeeperException when processing
> sessionid:0x15be4d72bfa0000 type:create cxid:0x36 zxid:0x21 txntype:-1
> reqpath:n/a Error Path:/hbase/online-snapshot Error:KeeperErrorCode =
> NoNode for /hbase/online-snapshot
> 2017-05-07 23:36:05,153 INFO  [ProcessThread(sid:0 cport:-1):]
> server.PrepRequestProcessor: Got user-level KeeperException when processing
> sessionid:0x15be4d72bfa0003 type:create cxid:0x7 zxid:0x26 txntype:-1
> reqpath:n/a Error Path:/hbase/flush-table-proc/acquired
> Error:KeeperErrorCode = NodeExists for /hbase/flush-table-proc/acquired
> 2017-05-07 23:36:05,154 INFO  [ProcessThread(sid:0 cport:-1):]
> server.PrepRequestProcessor: Got user-level KeeperException when processing
> sessionid:0x15be4d72bfa0006 type:create cxid:0x7 zxid:0x27 txntype:-1
> reqpath:n/a Error Path:/hbase/flush-table-proc/acquired
> Error:KeeperErrorCode = NodeExists for /hbase/flush-table-proc/acquired
> 2017-05-07 23:36:05,155 INFO  [ProcessThread(sid:0 cport:-1):]
> server.PrepRequestProcessor: Got user-level KeeperException when processing
> sessionid:0x15be4d72bfa0005 type:create cxid:0x7 zxid:0x28 txntype:-1
> reqpath:n/a Error Path:/hbase/flush-table-proc/acquired
> Error:KeeperErrorCode = NodeExists for /hbase/flush-table-proc/acquired
> 2017-05-07 23:36:05,158 INFO  [ProcessThread(sid:0 cport:-1):]
> server.PrepRequestProcessor: Got user-level KeeperException when processing
> sessionid:0x15be4d72bfa0002 type:create cxid:0x7 zxid:0x29 txntype:-1
> reqpath:n/a Error Path:/hbase/flush-table-proc/acquired
> Error:KeeperErrorCode = NodeExists for /hbase/flush-table-proc/acquired
> 2017-05-07 23:36:05,173 INFO  [ProcessThread(sid:0 cport:-1):]
> server.PrepRequestProcessor: Got user-level KeeperException when processing
> sessionid:0x15be4d72bfa0003 type:create cxid:0xa zxid:0x2a txntype:-1
> reqpath:n/a Error Path:/hbase/online-snapshot/acquired
> Error:KeeperErrorCode = NodeExists for /hbase/online-snapshot/acquired
> 2017-05-07 23:36:05,173 INFO  [ProcessThread(sid:0 cport:-1):]
> server.PrepRequestProcessor: Got user-level KeeperException when processing
> sessionid:0x15be4d72bfa0006 type:create cxid:0xa zxid:0x2b txntype:-1
> reqpath:n/a Error Path:/hbase/online-snapshot/acquired
> Error:KeeperErrorCode = NodeExists for /hbase/online-snapshot/acquired
> 2017-05-07 23:36:05,177 INFO  [ProcessThread(sid:0 cport:-1):]
> server.PrepRequestProcessor: Got user-level KeeperException when processing
> sessionid:0x15be4d72bfa0005 type:create cxid:0xa zxid:0x2c txntype:-1
> reqpath:n/a Error Path:/hbase/online-snapshot/acquired
> Error:KeeperErrorCode = NodeExists for /hbase/online-snapshot/acquired
> 2017-05-07 23:36:05,180 INFO  [ProcessThread(sid:0 cport:-1):]
> server.PrepRequestProcessor: Got user-level KeeperException when processing
> sessionid:0x15be4d72bfa0002 type:create cxid:0xa zxid:0x2d txntype:-1
> reqpath:n/a Error Path:/hbase/online-snapshot/acquired
> Error:KeeperErrorCode = NodeExists for /hbase/online-snapshot/acquired
> 2017-05-07 23:36:05,184 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2222]
> server.NIOServerCnxnFactory: Accepted socket connection from /
> 192.168.18.8:38832
> 2017-05-07 23:36:05,185 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2222]
> server.ZooKeeperServer: Client attempting to establish new session at /
> 192.168.18.8:38832
> 2017-05-07 23:36:05,190 INFO  [SyncThread:0] server.ZooKeeperServer:
> Established session 0x15be4d72bfa000d with negotiated timeout 90000 for
> client /192.168.18.8:38832
> 2017-05-07 23:36:05,192 INFO  [ProcessThread(sid:0 cport:-1):]
> server.PrepRequestProcessor: Got user-level KeeperException when processing
> sessionid:0x15be4d72bfa000d type:create cxid:0x1 zxid:0x2f txntype:-1
> reqpath:n/a Error Path:/hbase/replication Error:KeeperErrorCode = NoNode
> for /hbase/replication
> 2017-05-07 23:36:05,202 INFO  [ProcessThread(sid:0 cport:-1):]
> server.PrepRequestProcessor: Got user-level KeeperException when processing
> sessionid:0x15be4d72bfa0004 type:create cxid:0x7 zxid:0x32 txntype:-1
> reqpath:n/a Error Path:/hbase/flush-table-proc/acquired
> Error:KeeperErrorCode = NodeExists for /hbase/flush-table-proc/acquired
> 2017-05-07 23:36:05,212 INFO  [ProcessThread(sid:0 cport:-1):]
> server.PrepRequestProcessor: Got user-level KeeperException when processing
> sessionid:0x15be4d72bfa0004 type:create cxid:0xa zxid:0x33 txntype:-1
> reqpath:n/a Error Path:/hbase/online-snapshot/acquired
> Error:KeeperErrorCode = NodeExists for /hbase/online-snapshot/acquired
> 2017-05-07 23:36:05,743 INFO  [ProcessThread(sid:0 cport:-1):]
> server.PrepRequestProcessor: Got user-level KeeperException when processing
> sessionid:0x15be4d72bfa0002 type:create cxid:0x12 zxid:0x3e txntype:-1
> reqpath:n/a Error Path:/hbase/replication/peers Error:KeeperErrorCode =
> NodeExists for /hbase/replication/peers
> 2017-05-07 23:36:10,552 INFO  [ProcessThread(sid:0 cport:-1):]
> server.PrepRequestProcessor: Got user-level KeeperException when processing
> sessionid:0x15be4d72bfa0000 type:delete cxid:0x6e zxid:0x40 txntype:-1
> reqpath:n/a Error Path:/hbase/meta-region-server Error:KeeperErrorCode =
> NoNode for /hbase/meta-region-server
> 2017-05-07 23:36:11,194 INFO  [ProcessThread(sid:0 cport:-1):]
> server.PrepRequestProcessor: Got user-level KeeperException when processing
> sessionid:0x15be4d72bfa0006 type:setData cxid:0x29 zxid:0x45 txntype:-1
> reqpath:n/a Error Path:/hbase/meta-region-server Error:KeeperErrorCode =
> NoNode for /hbase/meta-region-server
> 2017-05-07 23:39:33,056 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2222]
> server.NIOServerCnxnFactory: Accepted socket connection from /
> 192.168.18.8:38858
> 2017-05-07 23:39:33,061 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2222]
> server.ZooKeeperServer: Client attempting to establish new session at /
> 192.168.18.8:38858
> 2017-05-07 23:39:33,063 INFO  [SyncThread:0] server.ZooKeeperServer:
> Established session 0x15be4d72bfa000e with negotiated timeout 90000 for
> client /192.168.18.8:38858
> 2017-05-07 23:44:18,506 WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2222]
> server.NIOServerCnxn: caught end of stream exception
> EndOfStreamException: Unable to read additional data from client sessionid
> 0x15be4d72bfa000e, likely client has closed socket
>         at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn
> .java:228)
>         at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServ
> erCnxnFactory.java:208)
>         at java.lang.Thread.run(Thread.java:745)
> 2017-05-07 23:44:18,510 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2222]
> server.NIOServerCnxn: Closed socket connection for client /
> 192.168.18.8:38858 which had sessionid 0x15be4d72bfa000e
> 2017-05-07 23:45:36,000 INFO  [SessionTracker] server.ZooKeeperServer:
> Expiring session 0x15be4d72bfa000e, timeout of 90000ms exceeded
> 2017-05-07 23:45:36,001 INFO  [ProcessThread(sid:0 cport:-1):]
> server.PrepRequestProcessor: Processed session termination for sessionid:
> 0x15be4d72bfa000e
> ======================================================================
>
> I tried the following:
> 1. wipe hadoop tmp dir, data dir across all nodes and then hadoop namenode
> -format
> 2. wipe zookeeper.property.dataDir set in hbase-site.xml
>
> still not working..
> Thank you for your help in advance,
> Best,
> Weiping
>
>