Posted to user@hbase.apache.org by sreenivasulu y <sr...@huawei.com> on 2014/08/25 03:37:30 UTC

RegionServer failed in logsplitting, wal.HLogSplitter: Got while writing log entry to log

Hi,

I am running a 4-node cluster (machine1, machine2, machine3, and machine4) with HDFS in HA mode,
and I am performing the following operations in order (a rough client-side sketch of these
steps follows the list):
1. Create a table with pre-created regions.
2. Insert 1000 records into the table.
3. Disable the table.
4. Modify the table.
5. Enable the table.
6. Disable and delete the table.
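
For reference, this is roughly what those six steps look like with the HBase 0.98 Java
client API. It is only a sketch: the table name, column family, split keys, and row
contents below are made-up placeholders, not the exact ones from my test.

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.HColumnDescriptor;
  import org.apache.hadoop.hbase.HTableDescriptor;
  import org.apache.hadoop.hbase.TableName;
  import org.apache.hadoop.hbase.client.HBaseAdmin;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.client.Put;
  import org.apache.hadoop.hbase.util.Bytes;

  public class TableLifecycleSketch {
    public static void main(String[] args) throws Exception {
      Configuration conf = HBaseConfiguration.create();
      HBaseAdmin admin = new HBaseAdmin(conf);

      // 1. Create a table with pre-created (pre-split) regions.
      HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("example_table"));
      desc.addFamily(new HColumnDescriptor("cf"));
      byte[][] splitKeys = { Bytes.toBytes("row250"), Bytes.toBytes("row500"), Bytes.toBytes("row750") };
      admin.createTable(desc, splitKeys);

      // 2. Insert 1000 records.
      HTable table = new HTable(conf, "example_table");
      for (int i = 0; i < 1000; i++) {
        Put put = new Put(Bytes.toBytes("row" + i));
        put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("value" + i));
        table.put(put);
      }
      table.close();

      // 3. Disable, 4. modify (add a second column family), 5. enable.
      admin.disableTable("example_table");
      desc.addFamily(new HColumnDescriptor("cf2"));
      admin.modifyTable(Bytes.toBytes("example_table"), desc);
      admin.enableTable("example_table");

      // 6. Disable and delete the table.
      admin.disableTable("example_table");
      admin.deleteTable("example_table");
      admin.close();
    }
  }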

While performing steps 3 and 4 above, the machine3 region server went down.
But machine1 threw the following error:
2014-07-11 08:23:58,933 FATAL [RS_LOG_REPLAY_OPS-HOST-10-18-40-69:60020-0-Writer-10] wal.HLogSplitter:  Got while writing log entry to log
java.io.IOException: cannot get log writer
        at org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createWriter(HLogFactory.java:197)
        at org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createRecoveredEditsWriter(HLogFactory.java:182)
        at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.createWriter(HLogSplitter.java:643)
        at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.createWAP(HLogSplitter.java:1223)
        at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.getWriterAndPath(HLogSplitter.java:1200)
        at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.append(HLogSplitter.java:1243)
        at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.writeBuffer(HLogSplitter.java:851)
        at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.doRun(HLogSplitter.java:843)
        at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.run(HLogSplitter.java:813)
and machine1 also went down.

Please help me understand why the machine1 region server went down.



-------------------------------------------------------------------------------------------------------------------------------------
This e-mail and its attachments contain confidential information from HUAWEI, which is intended only for the person or entity whose address is listed above. Any use of the information contained herein in any way (including, but not limited to, total or partial disclosure, reproduction, or dissemination) by persons other than the intended
recipient(s) is prohibited. If you receive this e-mail in error, please notify the sender by phone or email immediately and delete it!


Re: RegionServer failed in logsplitting, wal.HLogSplitter: Got while writing log entry to log

Posted by ramkrishna vasudevan <ra...@gmail.com>.
>> Here my doubt is that the machine1 region server went down even though I am
performing normal operations.
Yes. So there is no data loss, right? Because if the recovered.edits are deleted
before they are replayed, that is a very serious issue. Please confirm.

If the assumed scenario above is correct, then it is an issue, but we need to
see the entire logs to confirm.
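
As a quick check (a rough, untested sketch using the plain HDFS FileSystem API; the
table path below is just the one from your logs and may already be gone if the table
was dropped), something like this would list any recovered.edits still sitting under
the region directories, i.e. edits that have not yet been replayed:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileStatus;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class RecoveredEditsCheck {
    public static void main(String[] args) throws Exception {
      Configuration conf = new Configuration();
      FileSystem fs = FileSystem.get(conf);   // assumes fs.defaultFS points at the HDFS cluster

      Path tableDir = new Path("/hbase/data/default/2025Thread0_table");
      if (!fs.exists(tableDir)) {
        System.out.println("Table dir no longer exists (table already deleted): " + tableDir);
        return;
      }
      // Each child of the table dir is a region dir; look for a recovered.edits subdir.
      for (FileStatus region : fs.listStatus(tableDir)) {
        if (!region.isDirectory()) continue;
        Path recoveredEdits = new Path(region.getPath(), "recovered.edits");
        if (fs.exists(recoveredEdits)) {
          System.out.println("Unreplayed edits still present under: " + recoveredEdits);
          for (FileStatus edits : fs.listStatus(recoveredEdits)) {
            System.out.println("  " + edits.getPath() + " (" + edits.getLen() + " bytes)");
          }
        }
      }
      fs.close();
    }
  }

If nothing is listed and the row count after the regions come back online matches what
was inserted, then no data was lost.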

Regards
Ram


On Mon, Aug 25, 2014 at 4:53 PM, sreenivasulu y <sr...@huawei.com>
wrote:

> Hi Ram,
>
> The rest of the machines, machine2 and machine4, are running fine.
> I am performing a set of operations, and as part of that I deleted the table.
> Here my doubt is that the machine1 region server went down even though I am
> performing normal operations.
>
> Your assumed scenario may be correct.
> Then this is an issue, right?
>
> Regards
> seenu
>
> -----Original Message-----
> From: ramkrishna vasudevan [mailto:ramkrishna.s.vasudevan@gmail.com]
> Sent: 25 August 2014 PM 03:15
> To: user@hbase.apache.org
> Subject: Re: RegionServer failed in logsplitting, wal.HLogSplitter: Got
> while writing log entry to log
>
> Parent directory doesn't exist: /hbase/data/default/2025Thread0_table/
> df2dae34231829adc6ac10b43f5decb2/recovered.edits
>
> Are the other two machines, machine2 and machine4, up and running? Were
> you able to get back the table and the number of records you inserted?
> It seems to me that after the recovery was successful, the recovered.edits dir
> for that region was deleted by the log-splitting thread, and the other machine
> tried to use that path and failed.
>
> Regards
> Ram
>
>
>
> On Mon, Aug 25, 2014 at 12:22 PM, sreenivasulu y <
> sreenivasulu.y@huawei.com>
> wrote:
>
> > Thanks for the reply, Ted.
> > I am using the following versions:
> > HBase 0.98.3
> > HDFS 2.4.3
> >
> > Do you see other exceptions in machine1 server log ?
> > Yes, it throws errors saying that the following files do not exist:
> >
> > 2014-07-11 08:23:58,921 WARN  [Thread-39475] hdfs.DFSClient:
> > DataStreamer Exception
> >
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException):
> > No lease on
> >
> /hbase/data/default/2025Thread0_table/247b5b359ab81ed5deb5379e0f07ba56/recovered.edits/0000000000000000002.temp:
> > File does not exist. [Lease.  Holder:
> > DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29,
> > pendingcreates: 11]
> >         at
> >
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2952)
> >         at
> >
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:2772)
> >         at
> >
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2680)
> >         at
> >
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:584)
> >         at
> >
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:440)
> >         at
> >
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> >         at
> >
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> >         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> >         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
> >         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
> >         at java.security.AccessController.doPrivileged(Native Method)
> >         at javax.security.auth.Subject.doAs(Subject.java:396)
> >         at
> >
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
> >         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
> >
> >         at org.apache.hadoop.ipc.Client.call(Client.java:1410)
> >         at org.apache.hadoop.ipc.Client.call(Client.java:1363)
> >         at
> >
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> >         at $Proxy18.addBlock(Unknown Source)
> >         at
> >
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:361)
> >         at sun.reflect.GeneratedMethodAccessor27.invoke(Unknown Source)
> >         at
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >         at java.lang.reflect.Method.invoke(Method.java:597)
> >         at
> >
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190)
> >         at
> >
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
> >         at $Proxy19.addBlock(Unknown Source)
> >         at sun.reflect.GeneratedMethodAccessor27.invoke(Unknown Source)
> >         at
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >         at java.lang.reflect.Method.invoke(Method.java:597)
> >         at
> > org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:294)
> >         at $Proxy20.addBlock(Unknown Source)
> >         at
> >
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1439)
> >         at
> >
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1261)
> >         at
> > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStrea
> > m.java:525)
> > 2014-07-11 08:23:58,933 FATAL
> > [RS_LOG_REPLAY_OPS-HOST-10-18-40-69:60020-0-Writer-10] wal.HLogSplitter:
> > Got while writing log entry to log
> > java.io.IOException: cannot get log writer
> >         at
> >
> org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createWriter(HLogFactory.java:197)
> >         at
> >
> org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createRecoveredEditsWriter(HLogFactory.java:182)
> >         at
> >
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.createWriter(HLogSplitter.java:643)
> >         at
> >
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.createWAP(HLogSplitter.java:1223)
> >         at
> >
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.getWriterAndPath(HLogSplitter.java:1200)
> >         at
> >
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.append(HLogSplitter.java:1243)
> >         at
> >
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.writeBuffer(HLogSplitter.java:851)
> >         at
> >
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.doRun(HLogSplitter.java:843)
> >         at
> > org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.run
> > (HLogSplitter.java:813) Caused by: java.io.FileNotFoundException:
> > Parent directory doesn't exist:
> >
> /hbase/data/default/2025Thread0_table/df2dae34231829adc6ac10b43f5decb2/recovered.edits
> >         at
> >
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.verifyParentDir(FSNamesystem.java:2156)
> >         at
> >
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:2289)
> >         at
> >
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2237)
> >         at
> >
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2190)
> >         at
> >
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:520)
> >         at
> >
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:354)
> >         at
> >
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> >         at
> >
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> >         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> >         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
> >         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
> >         at java.security.AccessController.doPrivileged(Native Method)
> >         at javax.security.auth.Subject.doAs(Subject.java:396)
> >         at
> >
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
> >         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
> >
> >         at
> > sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> > Method)
> >         at
> >
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
> >         at
> >
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
> >         at
> java.lang.reflect.Constructor.newInstance(Constructor.java:513)
> >         at
> >
> org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
> >         at
> >
> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
> >         at
> >
> org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1604)
> >         at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1465)
> >         at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1425)
> >         at
> >
> org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:437)
> >         at
> >
> org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:433)
> >         at
> >
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> >         at
> >
> org.apache.hadoop.hdfs.DistributedFileSystem.createNonRecursive(DistributedFileSystem.java:433)
> >         at
> > org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1110)
> >         at
> > org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1086)
> >         at
> >
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.init(ProtobufLogWriter.java:78)
> >         at
> >
> org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createWriter(HLogFactory.java:194)
> >         ... 8 more
> >
> > At the same time, the NameNode log shows the file being deleted:
> >
> > 2014-07-11 08:23:57,983 INFO BlockStateChange: BLOCK* addToInvalidates:
> > blk_1073848427_107656 10.18.40.69:50076 10.18.40.89:50076
> > 2014-07-11 08:23:58,786 INFO org.apache.hadoop.hdfs.StateChange:
> > BLOCK*
> > allocateBlock:
> >
> /hbase/data/default/2025Thread0_table/6cd0083d54dbd8a8b010c00a2770a321/recovered.edits/0000000000000000002.temp.
> > BP-89134156-10.18.40.69-1405003788606
> > blk_1073849564_108794{blockUCState=UNDER_CONSTRUCTION,
> > primaryNodeIndex=-1,
> > replicas=[ReplicaUnderConstruction[[DISK]DS-de56b04c-75dc-4884-aede-22
> > 12c1a1305e:NORMAL|RBW],
> > ReplicaUnderConstruction[[DISK]DS-6a01309e-a4f2-4ad2-a100-38ae0c947427
> > :NORMAL|RBW]]}
> > 2014-07-11 08:23:58,826 INFO BlockStateChange: BLOCK* addToInvalidates:
> > blk_1073848423_107652 10.18.40.89:50076 10.18.40.69:50076
> > 2014-07-11 08:23:58,832 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 8 on 65110, call
> > org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from
> > 10.18.40.69:37208 Call#589066 Retry#0:
> > org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease
> > on
> >
> /hbase/data/default/2025Thread0_table/7d5032128770926448aa7e02c3cf963e/recovered.edits/0000000000000000002.temp:
> > File does not exist. [Lease.  Holder:
> > DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29,
> > pendingcreates: 11]
> > 2014-07-11 08:23:58,833 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 5 on 65110, call
> > org.apache.hadoop.hdfs.protocol.ClientProtocol.create from
> > 10.18.40.69:37208 Call#589067 Retry#0: java.io.FileNotFoundException:
> > Parent directory doesn't exist:
> > /hbase/data/default/2025Thread0_table/df2dae34231829adc6ac10b43f5decb2
> > /recovered.edits
> > 2014-07-11 08:23:58,834 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 6 on 65110, call
> > org.apache.hadoop.hdfs.protocol.ClientProtocol.create from
> > 10.18.40.69:37208 Call#589068 Retry#0: java.io.FileNotFoundException:
> > Parent directory doesn't exist:
> > /hbase/data/default/2025Thread0_table/f17021778ea258ffb5edde2d86b0237f
> > /recovered.edits
> > 2014-07-11 08:23:58,834 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 1 on 65110, call
> > org.apache.hadoop.hdfs.protocol.ClientProtocol.create from
> > 10.18.40.69:37208 Call#589069 Retry#0: java.io.FileNotFoundException:
> > Parent directory doesn't exist:
> > /hbase/data/default/2025Thread0_table/046da18c6ddeae7df459c42daf711c78
> > /recovered.edits
> > 2014-07-11 08:23:58,844 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 5 on 65110, call
> > org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from
> > 10.18.40.69:37208 Call#589072 Retry#0:
> > org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease
> > on
> >
> /hbase/data/default/2025Thread0_table/995a56512784d109352d9dcbbeeebfbc/recovered.edits/0000000000000000002.temp:
> > File does not exist. [Lease.  Holder:
> > DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29,
> > pendingcreates: 11]
> > 2014-07-11 08:23:58,844 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 8 on 65110, call
> > org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from
> > 10.18.40.69:37208 Call#589070 Retry#0:
> > org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease
> > on
> >
> /hbase/data/default/2025Thread0_table/247b5b359ab81ed5deb5379e0f07ba56/recovered.edits/0000000000000000002.temp:
> > File does not exist. [Lease.  Holder:
> > DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29,
> > pendingcreates: 11]
> > 2014-07-11 08:23:58,844 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 6 on 65110, call
> > org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from
> > 10.18.40.69:37208 Call#589071 Retry#0:
> > org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease
> > on
> >
> /hbase/data/default/2025Thread0_table/6b8b92d73b51e1fd245854e62111fc8c/recovered.edits/0000000000000000002.temp:
> > File does not exist. [Lease.  Holder:
> > DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29,
> > pendingcreates: 11]
> > 2014-07-11 08:23:58,857 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 1 on 65110, call
> > org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from
> > 10.18.40.69:37208 Call#589073 Retry#0:
> > org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease
> > on
> >
> /hbase/data/default/2025Thread0_table/591d084674854bb913f419046e7f7d0c/recovered.edits/0000000000000000002.temp:
> > File does not exist. [Lease.  Holder:
> > DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29,
> > pendingcreates: 11]
> > 2014-07-11 08:23:58,960 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 8 on 65110, call
> > org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from
> > 10.18.40.69:37208 Call#589074 Retry#0:
> > org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease
> > on
> >
> /hbase/data/default/2025Thread0_table/5c514b93ed15b38cdc210abed58907df/recovered.edits/0000000000000000002.temp:
> > File does not exist. [Lease.  Holder:
> > DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29,
> > pendingcreates: 11]
> > 2014-07-11 08:23:58,966 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 7 on 65110, call
> > org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from
> > 10.18.40.69:37208 Call#589076 Retry#0:
> > org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease
> > on
> >
> /hbase/data/default/2025Thread0_table/b6d129691540af012a388590bf1e8629/recovered.edits/0000000000000000002.temp:
> > File does not exist. [Lease.  Holder:
> > DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29,
> > pendingcreates: 11]
> > 2014-07-11 08:23:58,977 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 3 on 65110, call
> > org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from
> > 10.18.40.69:37208 Call#589077 Retry#0:
> > org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease
> > on
> >
> /hbase/data/default/2025Thread0_table/190414760587ee6adbbb518e7e0c719f/recovered.edits/0000000000000000002.temp:
> > File does not exist. [Lease.  Holder:
> > DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29,
> > pendingcreates: 11]
> > 2014-07-11 08:23:58,996 INFO BlockStateChange: BLOCK* addToInvalidates:
> > blk_1073849517_108746 10.18.40.69:50076 10.18.40.89:50076
> > 2014-07-11 08:23:59,014 INFO BlockStateChange: BLOCK* addToInvalidates:
> > blk_1073848446_107675 10.18.40.89:50076 10.18.40.69:50076
> >
> >
> > -----Original Message-----
> > From: Ted Yu [mailto:yuzhihong@gmail.com]
> > Sent: 25 August 2014 AM 11:03
> > To: user@hbase.apache.org
> > Subject: Re: RegionServer failed in logsplitting, wal.HLogSplitter:
> > Got while writing log entry to log
> >
> > What release of HBase are you using ?
> >
> > Do you see other exceptions in machine1 server log ?
> >
> > Please check namenode log as well.
> >
> > Cheers
> >
> >
> > On Sun, Aug 24, 2014 at 6:37 PM, sreenivasulu y
> > <sreenivasulu.y@huawei.com
> > >
> > wrote:
> >
> > > Hi,
> > >
> > > I am running a 4-node cluster (machine1, machine2, machine3, and machine4)
> > > with HDFS in HA mode, and I am performing the following operations in order:
> > > 1. create table with precreated regions.
> > > 2. insert 1000 records into the table.
> > > 3. Disable the table.
> > > 4. Modify the table.
> > > 5. enable the table.
> > > 6. disable and delete the table.
> > >
> > > While performing steps 3 and 4 above, the machine3 region server went down.
> > > But machine1 threw the following error:
> > > 2014-07-11 08:23:58,933 FATAL
> > > [RS_LOG_REPLAY_OPS-HOST-10-18-40-69:60020-0-Writer-10]
> wal.HLogSplitter:
> > > Got while writing log entry to log
> > > java.io.IOException: cannot get log writer
> > >         at
> > >
> > org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createWriter(HLog
> > Factory.java:197)
> > >         at
> > >
> > org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createRecoveredEd
> > itsWriter(HLogFactory.java:182)
> > >         at
> > >
> > org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.createWriter(HLo
> > gSplitter.java:643)
> > >         at
> > >
> > org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEdit
> > sOutputSink.createWAP(HLogSplitter.java:1223)
> > >         at
> > >
> > org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEdit
> > sOutputSink.getWriterAndPath(HLogSplitter.java:1200)
> > >         at
> > >
> > org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEdit
> > sOutputSink.append(HLogSplitter.java:1243)
> > >         at
> > >
> > org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.wri
> > teBuffer(HLogSplitter.java:851)
> > >         at
> > >
> > org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.doR
> > un(HLogSplitter.java:843)
> > >         at
> > > org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.r
> > > un
> > > (HLogSplitter.java:813)
> > > and machine1 also went down.
> > >
> > > Please help me understand why the machine1 region server went down.
> > >
> > >
> > >
> > >
> > >
> > >
> >
>

RE: RegionServer failed in logsplitting, wal.HLogSplitter: Got while writing log entry to log

Posted by sreenivasulu y <sr...@huawei.com>.
Hi Ram,

The rest of the machines, machine2 and machine4, are running fine.
I am performing a set of operations, and as part of that I deleted the table.
Here my doubt is that the machine1 region server went down even though I am performing normal operations.

Your assumed scenario may be correct.
Then this is an issue, right?

Regards
seenu

-----Original Message-----
From: ramkrishna vasudevan [mailto:ramkrishna.s.vasudevan@gmail.com] 
Sent: 25 August 2014 PM 03:15
To: user@hbase.apache.org
Subject: Re: RegionServer failed in logsplitting, wal.HLogSplitter: Got while writing log entry to log

Parent directory doesn't exist: /hbase/data/default/2025Thread0_table/
df2dae34231829adc6ac10b43f5decb2/recovered.edits

Are the other two machines, machine2 and machine4, up and running? Were you able to get back the table and the number of records you inserted?
It seems to me that after the recovery was successful, the recovered.edits dir for that region was deleted by the log-splitting thread, and the other machine tried to use that path and failed.

Regards
Ram



On Mon, Aug 25, 2014 at 12:22 PM, sreenivasulu y <sr...@huawei.com>
wrote:

> Thanks for the reply, Ted.
> I am using the following versions:
> HBase 0.98.3
> HDFS 2.4.3
>
> Do you see other exceptions in machine1 server log ?
> Yes, it throws errors saying that the following files do not exist:
>
> 2014-07-11 08:23:58,921 WARN  [Thread-39475] hdfs.DFSClient: 
> DataStreamer Exception
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException):
> No lease on
> /hbase/data/default/2025Thread0_table/247b5b359ab81ed5deb5379e0f07ba56/recovered.edits/0000000000000000002.temp:
> File does not exist. [Lease.  Holder:
> DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29,
> pendingcreates: 11]
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2952)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:2772)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2680)
>         at
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:584)
>         at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:440)
>         at
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
>
>         at org.apache.hadoop.ipc.Client.call(Client.java:1410)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1363)
>         at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>         at $Proxy18.addBlock(Unknown Source)
>         at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:361)
>         at sun.reflect.GeneratedMethodAccessor27.invoke(Unknown Source)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190)
>         at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
>         at $Proxy19.addBlock(Unknown Source)
>         at sun.reflect.GeneratedMethodAccessor27.invoke(Unknown Source)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at
> org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:294)
>         at $Proxy20.addBlock(Unknown Source)
>         at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1439)
>         at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1261)
>         at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStrea
> m.java:525)
> 2014-07-11 08:23:58,933 FATAL
> [RS_LOG_REPLAY_OPS-HOST-10-18-40-69:60020-0-Writer-10] wal.HLogSplitter:
> Got while writing log entry to log
> java.io.IOException: cannot get log writer
>         at
> org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createWriter(HLogFactory.java:197)
>         at
> org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createRecoveredEditsWriter(HLogFactory.java:182)
>         at
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.createWriter(HLogSplitter.java:643)
>         at
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.createWAP(HLogSplitter.java:1223)
>         at
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.getWriterAndPath(HLogSplitter.java:1200)
>         at
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.append(HLogSplitter.java:1243)
>         at
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.writeBuffer(HLogSplitter.java:851)
>         at
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.doRun(HLogSplitter.java:843)
>         at
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.run
> (HLogSplitter.java:813) Caused by: java.io.FileNotFoundException: 
> Parent directory doesn't exist:
> /hbase/data/default/2025Thread0_table/df2dae34231829adc6ac10b43f5decb2/recovered.edits
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.verifyParentDir(FSNamesystem.java:2156)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:2289)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2237)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2190)
>         at
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:520)
>         at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:354)
>         at
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
>
>         at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> Method)
>         at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>         at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>         at
> org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
>         at
> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
>         at
> org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1604)
>         at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1465)
>         at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1425)
>         at
> org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:437)
>         at
> org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:433)
>         at
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>         at
> org.apache.hadoop.hdfs.DistributedFileSystem.createNonRecursive(DistributedFileSystem.java:433)
>         at
> org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1110)
>         at
> org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1086)
>         at
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.init(ProtobufLogWriter.java:78)
>         at
> org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createWriter(HLogFactory.java:194)
>         ... 8 more
>
> At the same time, the NameNode log shows the file being deleted:
>
> 2014-07-11 08:23:57,983 INFO BlockStateChange: BLOCK* addToInvalidates:
> blk_1073848427_107656 10.18.40.69:50076 10.18.40.89:50076
> 2014-07-11 08:23:58,786 INFO org.apache.hadoop.hdfs.StateChange: 
> BLOCK*
> allocateBlock:
> /hbase/data/default/2025Thread0_table/6cd0083d54dbd8a8b010c00a2770a321/recovered.edits/0000000000000000002.temp.
> BP-89134156-10.18.40.69-1405003788606
> blk_1073849564_108794{blockUCState=UNDER_CONSTRUCTION, 
> primaryNodeIndex=-1, 
> replicas=[ReplicaUnderConstruction[[DISK]DS-de56b04c-75dc-4884-aede-22
> 12c1a1305e:NORMAL|RBW], 
> ReplicaUnderConstruction[[DISK]DS-6a01309e-a4f2-4ad2-a100-38ae0c947427
> :NORMAL|RBW]]}
> 2014-07-11 08:23:58,826 INFO BlockStateChange: BLOCK* addToInvalidates:
> blk_1073848423_107652 10.18.40.89:50076 10.18.40.69:50076
> 2014-07-11 08:23:58,832 INFO org.apache.hadoop.ipc.Server: IPC Server 
> handler 8 on 65110, call 
> org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from
> 10.18.40.69:37208 Call#589066 Retry#0:
> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease 
> on
> /hbase/data/default/2025Thread0_table/7d5032128770926448aa7e02c3cf963e/recovered.edits/0000000000000000002.temp:
> File does not exist. [Lease.  Holder:
> DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29,
> pendingcreates: 11]
> 2014-07-11 08:23:58,833 INFO org.apache.hadoop.ipc.Server: IPC Server 
> handler 5 on 65110, call 
> org.apache.hadoop.hdfs.protocol.ClientProtocol.create from
> 10.18.40.69:37208 Call#589067 Retry#0: java.io.FileNotFoundException:
> Parent directory doesn't exist:
> /hbase/data/default/2025Thread0_table/df2dae34231829adc6ac10b43f5decb2
> /recovered.edits
> 2014-07-11 08:23:58,834 INFO org.apache.hadoop.ipc.Server: IPC Server 
> handler 6 on 65110, call 
> org.apache.hadoop.hdfs.protocol.ClientProtocol.create from
> 10.18.40.69:37208 Call#589068 Retry#0: java.io.FileNotFoundException:
> Parent directory doesn't exist:
> /hbase/data/default/2025Thread0_table/f17021778ea258ffb5edde2d86b0237f
> /recovered.edits
> 2014-07-11 08:23:58,834 INFO org.apache.hadoop.ipc.Server: IPC Server 
> handler 1 on 65110, call 
> org.apache.hadoop.hdfs.protocol.ClientProtocol.create from
> 10.18.40.69:37208 Call#589069 Retry#0: java.io.FileNotFoundException:
> Parent directory doesn't exist:
> /hbase/data/default/2025Thread0_table/046da18c6ddeae7df459c42daf711c78
> /recovered.edits
> 2014-07-11 08:23:58,844 INFO org.apache.hadoop.ipc.Server: IPC Server 
> handler 5 on 65110, call 
> org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from
> 10.18.40.69:37208 Call#589072 Retry#0:
> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease 
> on
> /hbase/data/default/2025Thread0_table/995a56512784d109352d9dcbbeeebfbc/recovered.edits/0000000000000000002.temp:
> File does not exist. [Lease.  Holder:
> DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29,
> pendingcreates: 11]
> 2014-07-11 08:23:58,844 INFO org.apache.hadoop.ipc.Server: IPC Server 
> handler 8 on 65110, call 
> org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from
> 10.18.40.69:37208 Call#589070 Retry#0:
> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease 
> on
> /hbase/data/default/2025Thread0_table/247b5b359ab81ed5deb5379e0f07ba56/recovered.edits/0000000000000000002.temp:
> File does not exist. [Lease.  Holder:
> DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29,
> pendingcreates: 11]
> 2014-07-11 08:23:58,844 INFO org.apache.hadoop.ipc.Server: IPC Server 
> handler 6 on 65110, call 
> org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from
> 10.18.40.69:37208 Call#589071 Retry#0:
> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease 
> on
> /hbase/data/default/2025Thread0_table/6b8b92d73b51e1fd245854e62111fc8c/recovered.edits/0000000000000000002.temp:
> File does not exist. [Lease.  Holder:
> DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29,
> pendingcreates: 11]
> 2014-07-11 08:23:58,857 INFO org.apache.hadoop.ipc.Server: IPC Server 
> handler 1 on 65110, call 
> org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from
> 10.18.40.69:37208 Call#589073 Retry#0:
> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease 
> on
> /hbase/data/default/2025Thread0_table/591d084674854bb913f419046e7f7d0c/recovered.edits/0000000000000000002.temp:
> File does not exist. [Lease.  Holder:
> DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29,
> pendingcreates: 11]
> 2014-07-11 08:23:58,960 INFO org.apache.hadoop.ipc.Server: IPC Server 
> handler 8 on 65110, call 
> org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from
> 10.18.40.69:37208 Call#589074 Retry#0:
> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease 
> on
> /hbase/data/default/2025Thread0_table/5c514b93ed15b38cdc210abed58907df/recovered.edits/0000000000000000002.temp:
> File does not exist. [Lease.  Holder:
> DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29,
> pendingcreates: 11]
> 2014-07-11 08:23:58,966 INFO org.apache.hadoop.ipc.Server: IPC Server 
> handler 7 on 65110, call 
> org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from
> 10.18.40.69:37208 Call#589076 Retry#0:
> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease 
> on
> /hbase/data/default/2025Thread0_table/b6d129691540af012a388590bf1e8629/recovered.edits/0000000000000000002.temp:
> File does not exist. [Lease.  Holder:
> DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29,
> pendingcreates: 11]
> 2014-07-11 08:23:58,977 INFO org.apache.hadoop.ipc.Server: IPC Server 
> handler 3 on 65110, call 
> org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from
> 10.18.40.69:37208 Call#589077 Retry#0:
> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease 
> on
> /hbase/data/default/2025Thread0_table/190414760587ee6adbbb518e7e0c719f/recovered.edits/0000000000000000002.temp:
> File does not exist. [Lease.  Holder:
> DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29,
> pendingcreates: 11]
> 2014-07-11 08:23:58,996 INFO BlockStateChange: BLOCK* addToInvalidates:
> blk_1073849517_108746 10.18.40.69:50076 10.18.40.89:50076
> 2014-07-11 08:23:59,014 INFO BlockStateChange: BLOCK* addToInvalidates:
> blk_1073848446_107675 10.18.40.89:50076 10.18.40.69:50076
>
>
> -----Original Message-----
> From: Ted Yu [mailto:yuzhihong@gmail.com]
> Sent: 25 August 2014 AM 11:03
> To: user@hbase.apache.org
> Subject: Re: RegionServer failed in logsplitting, wal.HLogSplitter: 
> Got while writing log entry to log
>
> What release of HBase are you using ?
>
> Do you see other exceptions in machine1 server log ?
>
> Please check namenode log as well.
>
> Cheers
>
>
> On Sun, Aug 24, 2014 at 6:37 PM, sreenivasulu y 
> <sreenivasulu.y@huawei.com
> >
> wrote:
>
> > Hi,
> >
> > I am running a 4-node cluster (machine1, machine2, machine3, and machine4)
> > with HDFS in HA mode, and I am performing the following operations in order:
> > 1. create table with precreated regions.
> > 2. insert 1000 records into the table.
> > 3. Disable the table.
> > 4. Modify the table.
> > 5. enable the table.
> > 6. disable and delete the table.
> >
> > While performing steps 3 and 4 above, the machine3 region server went down.
> > But machine1 threw the following error:
> > 2014-07-11 08:23:58,933 FATAL
> > [RS_LOG_REPLAY_OPS-HOST-10-18-40-69:60020-0-Writer-10] wal.HLogSplitter:
> > Got while writing log entry to log
> > java.io.IOException: cannot get log writer
> >         at
> >
> org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createWriter(HLog
> Factory.java:197)
> >         at
> >
> org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createRecoveredEd
> itsWriter(HLogFactory.java:182)
> >         at
> >
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.createWriter(HLo
> gSplitter.java:643)
> >         at
> >
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEdit
> sOutputSink.createWAP(HLogSplitter.java:1223)
> >         at
> >
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEdit
> sOutputSink.getWriterAndPath(HLogSplitter.java:1200)
> >         at
> >
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEdit
> sOutputSink.append(HLogSplitter.java:1243)
> >         at
> >
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.wri
> teBuffer(HLogSplitter.java:851)
> >         at
> >
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.doR
> un(HLogSplitter.java:843)
> >         at
> > org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.r
> > un
> > (HLogSplitter.java:813)
> > and machine1 also went down.
> >
> > Please help me understand why the machine1 region server went down.
> >
> >
> >
> >
> >
> >
>

Re: RegionServer failed in logsplitting, wal.HLogSplitter: Got while writing log entry to log

Posted by ramkrishna vasudevan <ra...@gmail.com>.
Parent directory doesn't exist: /hbase/data/default/2025Thread0_table/
df2dae34231829adc6ac10b43f5decb2/recovered.edits

Are the other two machines, machine2 and machine4, up and running? Were you
able to get back the table and the number of records you inserted?
It seems to me that after the recovery was successful, the recovered.edits dir
for that region was deleted by the log-splitting thread, and the other machine
tried to use that path and failed. A rough sketch of that race is below.
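
This is not the actual HLogSplitter code, just a minimal sketch of the failure mode as I
understand it. The writer thread creates the recovered-edits file with createNonRecursive,
which refuses to create missing parent directories, so if the region directory disappears
in between (table dropped, or another split worker already finished and cleaned up), the
create fails with exactly the "Parent directory doesn't exist" FileNotFoundException seen
above. The paths below are hypothetical, and it assumes fs.defaultFS points at HDFS:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FSDataOutputStream;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class RecoveredEditsRaceSketch {
    public static void main(String[] args) throws Exception {
      Configuration conf = new Configuration();
      FileSystem fs = FileSystem.get(conf);

      // Hypothetical region layout; in the real failure the path was
      // /hbase/data/default/2025Thread0_table/<region>/recovered.edits
      Path editsDir = new Path("/tmp/region-xyz/recovered.edits");
      Path editsFile = new Path(editsDir, "0000000000000000002.temp");

      fs.mkdirs(editsDir);                    // split worker prepares the directory
      fs.delete(editsDir.getParent(), true);  // meanwhile the region dir is removed (table delete / cleanup)

      try {
        // createNonRecursive does not create missing parents, so this fails the
        // same way HLogSplitter's writer did.
        FSDataOutputStream out =
            fs.createNonRecursive(editsFile, false, 4096, (short) 3, 64 * 1024 * 1024, null);
        out.close();
      } catch (java.io.FileNotFoundException e) {
        System.out.println("Writer creation failed, parent dir was removed: " + e.getMessage());
      }
      fs.close();
    }
  }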

Regards
Ram



On Mon, Aug 25, 2014 at 12:22 PM, sreenivasulu y <sr...@huawei.com>
wrote:

> Thanks for the reply, Ted.
> I am using the following versions:
> HBase 0.98.3
> HDFS 2.4.3
>
> Do you see other exceptions in machine1 server log ?
> Yes, it throws errors saying that the following files do not exist:
>
> 2014-07-11 08:23:58,921 WARN  [Thread-39475] hdfs.DFSClient: DataStreamer
> Exception
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException):
> No lease on
> /hbase/data/default/2025Thread0_table/247b5b359ab81ed5deb5379e0f07ba56/recovered.edits/0000000000000000002.temp:
> File does not exist. [Lease.  Holder:
> DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29,
> pendingcreates: 11]
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2952)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:2772)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2680)
>         at
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:584)
>         at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:440)
>         at
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
>
>         at org.apache.hadoop.ipc.Client.call(Client.java:1410)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1363)
>         at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>         at $Proxy18.addBlock(Unknown Source)
>         at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:361)
>         at sun.reflect.GeneratedMethodAccessor27.invoke(Unknown Source)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190)
>         at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
>         at $Proxy19.addBlock(Unknown Source)
>         at sun.reflect.GeneratedMethodAccessor27.invoke(Unknown Source)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at
> org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:294)
>         at $Proxy20.addBlock(Unknown Source)
>         at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1439)
>         at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1261)
>         at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:525)
> 2014-07-11 08:23:58,933 FATAL
> [RS_LOG_REPLAY_OPS-HOST-10-18-40-69:60020-0-Writer-10] wal.HLogSplitter:
> Got while writing log entry to log
> java.io.IOException: cannot get log writer
>         at
> org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createWriter(HLogFactory.java:197)
>         at
> org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createRecoveredEditsWriter(HLogFactory.java:182)
>         at
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.createWriter(HLogSplitter.java:643)
>         at
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.createWAP(HLogSplitter.java:1223)
>         at
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.getWriterAndPath(HLogSplitter.java:1200)
>         at
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.append(HLogSplitter.java:1243)
>         at
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.writeBuffer(HLogSplitter.java:851)
>         at
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.doRun(HLogSplitter.java:843)
>         at
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.run(HLogSplitter.java:813)
> Caused by: java.io.FileNotFoundException: Parent directory doesn't exist:
> /hbase/data/default/2025Thread0_table/df2dae34231829adc6ac10b43f5decb2/recovered.edits
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.verifyParentDir(FSNamesystem.java:2156)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:2289)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2237)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2190)
>         at
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:520)
>         at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:354)
>         at
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
>
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> Method)
>         at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>         at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>         at
> org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
>         at
> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
>         at
> org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1604)
>         at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1465)
>         at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1425)
>         at
> org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:437)
>         at
> org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:433)
>         at
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>         at
> org.apache.hadoop.hdfs.DistributedFileSystem.createNonRecursive(DistributedFileSystem.java:433)
>         at
> org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1110)
>         at
> org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1086)
>         at
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.init(ProtobufLogWriter.java:78)
>         at
> org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createWriter(HLogFactory.java:194)
>         ... 8 more
>
> At the same time, the NameNode log shows the file being deleted:
>
> 2014-07-11 08:23:57,983 INFO BlockStateChange: BLOCK* addToInvalidates:
> blk_1073848427_107656 10.18.40.69:50076 10.18.40.89:50076
> 2014-07-11 08:23:58,786 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
> allocateBlock:
> /hbase/data/default/2025Thread0_table/6cd0083d54dbd8a8b010c00a2770a321/recovered.edits/0000000000000000002.temp.
> BP-89134156-10.18.40.69-1405003788606
> blk_1073849564_108794{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1,
> replicas=[ReplicaUnderConstruction[[DISK]DS-de56b04c-75dc-4884-aede-2212c1a1305e:NORMAL|RBW],
> ReplicaUnderConstruction[[DISK]DS-6a01309e-a4f2-4ad2-a100-38ae0c947427:NORMAL|RBW]]}
> 2014-07-11 08:23:58,826 INFO BlockStateChange: BLOCK* addToInvalidates:
> blk_1073848423_107652 10.18.40.89:50076 10.18.40.69:50076
> 2014-07-11 08:23:58,832 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 8 on 65110, call
> org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from
> 10.18.40.69:37208 Call#589066 Retry#0:
> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on
> /hbase/data/default/2025Thread0_table/7d5032128770926448aa7e02c3cf963e/recovered.edits/0000000000000000002.temp:
> File does not exist. [Lease.  Holder:
> DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29,
> pendingcreates: 11]
> 2014-07-11 08:23:58,833 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 5 on 65110, call
> org.apache.hadoop.hdfs.protocol.ClientProtocol.create from
> 10.18.40.69:37208 Call#589067 Retry#0: java.io.FileNotFoundException:
> Parent directory doesn't exist:
> /hbase/data/default/2025Thread0_table/df2dae34231829adc6ac10b43f5decb2/recovered.edits
> 2014-07-11 08:23:58,834 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 6 on 65110, call
> org.apache.hadoop.hdfs.protocol.ClientProtocol.create from
> 10.18.40.69:37208 Call#589068 Retry#0: java.io.FileNotFoundException:
> Parent directory doesn't exist:
> /hbase/data/default/2025Thread0_table/f17021778ea258ffb5edde2d86b0237f/recovered.edits
> 2014-07-11 08:23:58,834 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 1 on 65110, call
> org.apache.hadoop.hdfs.protocol.ClientProtocol.create from
> 10.18.40.69:37208 Call#589069 Retry#0: java.io.FileNotFoundException:
> Parent directory doesn't exist:
> /hbase/data/default/2025Thread0_table/046da18c6ddeae7df459c42daf711c78/recovered.edits
> 2014-07-11 08:23:58,844 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 5 on 65110, call
> org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from
> 10.18.40.69:37208 Call#589072 Retry#0:
> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on
> /hbase/data/default/2025Thread0_table/995a56512784d109352d9dcbbeeebfbc/recovered.edits/0000000000000000002.temp:
> File does not exist. [Lease.  Holder:
> DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29,
> pendingcreates: 11]
> 2014-07-11 08:23:58,844 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 8 on 65110, call
> org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from
> 10.18.40.69:37208 Call#589070 Retry#0:
> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on
> /hbase/data/default/2025Thread0_table/247b5b359ab81ed5deb5379e0f07ba56/recovered.edits/0000000000000000002.temp:
> File does not exist. [Lease.  Holder:
> DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29,
> pendingcreates: 11]
> 2014-07-11 08:23:58,844 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 6 on 65110, call
> org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from
> 10.18.40.69:37208 Call#589071 Retry#0:
> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on
> /hbase/data/default/2025Thread0_table/6b8b92d73b51e1fd245854e62111fc8c/recovered.edits/0000000000000000002.temp:
> File does not exist. [Lease.  Holder:
> DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29,
> pendingcreates: 11]
> 2014-07-11 08:23:58,857 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 1 on 65110, call
> org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from
> 10.18.40.69:37208 Call#589073 Retry#0:
> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on
> /hbase/data/default/2025Thread0_table/591d084674854bb913f419046e7f7d0c/recovered.edits/0000000000000000002.temp:
> File does not exist. [Lease.  Holder:
> DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29,
> pendingcreates: 11]
> 2014-07-11 08:23:58,960 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 8 on 65110, call
> org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from
> 10.18.40.69:37208 Call#589074 Retry#0:
> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on
> /hbase/data/default/2025Thread0_table/5c514b93ed15b38cdc210abed58907df/recovered.edits/0000000000000000002.temp:
> File does not exist. [Lease.  Holder:
> DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29,
> pendingcreates: 11]
> 2014-07-11 08:23:58,966 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 7 on 65110, call
> org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from
> 10.18.40.69:37208 Call#589076 Retry#0:
> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on
> /hbase/data/default/2025Thread0_table/b6d129691540af012a388590bf1e8629/recovered.edits/0000000000000000002.temp:
> File does not exist. [Lease.  Holder:
> DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29,
> pendingcreates: 11]
> 2014-07-11 08:23:58,977 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 3 on 65110, call
> org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from
> 10.18.40.69:37208 Call#589077 Retry#0:
> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on
> /hbase/data/default/2025Thread0_table/190414760587ee6adbbb518e7e0c719f/recovered.edits/0000000000000000002.temp:
> File does not exist. [Lease.  Holder:
> DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29,
> pendingcreates: 11]
> 2014-07-11 08:23:58,996 INFO BlockStateChange: BLOCK* addToInvalidates:
> blk_1073849517_108746 10.18.40.69:50076 10.18.40.89:50076
> 2014-07-11 08:23:59,014 INFO BlockStateChange: BLOCK* addToInvalidates:
> blk_1073848446_107675 10.18.40.89:50076 10.18.40.69:50076

RE: RegionServer failed in logsplitting, wal.HLogSplitter: Got while writing log entry to log

Posted by sreenivasulu y <sr...@huawei.com>.
Thanks for the reply, Ted.
I am using the following versions:
HBase 0.98.3
HDFS 2.4.3

>>Do you see other exceptions in machine1 server log ?
Yes. The machine1 region server log reports that the following file does not exist and then throws this error:

2014-07-11 08:23:58,921 WARN  [Thread-39475] hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /hbase/data/default/2025Thread0_table/247b5b359ab81ed5deb5379e0f07ba56/recovered.edits/0000000000000000002.temp: File does not exist. [Lease.  Holder: DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29, pendingcreates: 11]
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2952)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:2772)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2680)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:584)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:440)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)

	at org.apache.hadoop.ipc.Client.call(Client.java:1410)
	at org.apache.hadoop.ipc.Client.call(Client.java:1363)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
	at $Proxy18.addBlock(Unknown Source)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:361)
	at sun.reflect.GeneratedMethodAccessor27.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
	at $Proxy19.addBlock(Unknown Source)
	at sun.reflect.GeneratedMethodAccessor27.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:294)
	at $Proxy20.addBlock(Unknown Source)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1439)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1261)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:525)
2014-07-11 08:23:58,933 FATAL [RS_LOG_REPLAY_OPS-HOST-10-18-40-69:60020-0-Writer-10] wal.HLogSplitter:  Got while writing log entry to log
java.io.IOException: cannot get log writer
	at org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createWriter(HLogFactory.java:197)
	at org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createRecoveredEditsWriter(HLogFactory.java:182)
	at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.createWriter(HLogSplitter.java:643)
	at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.createWAP(HLogSplitter.java:1223)
	at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.getWriterAndPath(HLogSplitter.java:1200)
	at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.append(HLogSplitter.java:1243)
	at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.writeBuffer(HLogSplitter.java:851)
	at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.doRun(HLogSplitter.java:843)
	at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.run(HLogSplitter.java:813)
Caused by: java.io.FileNotFoundException: Parent directory doesn't exist: /hbase/data/default/2025Thread0_table/df2dae34231829adc6ac10b43f5decb2/recovered.edits
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.verifyParentDir(FSNamesystem.java:2156)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:2289)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2237)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2190)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:520)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:354)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)

	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
	at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
	at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
	at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1604)
	at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1465)
	at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1425)
	at org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:437)
	at org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:433)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.createNonRecursive(DistributedFileSystem.java:433)
	at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1110)
	at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1086)
	at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.init(ProtobufLogWriter.java:78)
	at org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createWriter(HLogFactory.java:194)
	... 8 more
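
The "Caused by" section above shows that the parent directory /hbase/data/default/2025Thread0_table/df2dae34231829adc6ac10b43f5decb2/recovered.edits no longer existed when the splitter tried to create the recovered-edits writer. As a quick sanity check (a minimal sketch only, assuming the stock Hadoop FileSystem API and re-using the region path from the exception; this is not code taken from HBase itself), one can verify whether that directory is still present:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CheckRecoveredEdits {
  public static void main(String[] args) throws Exception {
    // Picks up core-site.xml / hdfs-site.xml from the classpath.
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // Region path copied from the exception above; substitute as needed.
    Path recoveredEdits = new Path(
        "/hbase/data/default/2025Thread0_table/"
        + "df2dae34231829adc6ac10b43f5decb2/recovered.edits");
    System.out.println(recoveredEdits + " exists: " + fs.exists(recoveredEdits));
    fs.close();
  }
}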

The NameNode log shows that the same files were being deleted at the same time:

2014-07-11 08:23:57,983 INFO BlockStateChange: BLOCK* addToInvalidates: blk_1073848427_107656 10.18.40.69:50076 10.18.40.89:50076 
2014-07-11 08:23:58,786 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocateBlock: /hbase/data/default/2025Thread0_table/6cd0083d54dbd8a8b010c00a2770a321/recovered.edits/0000000000000000002.temp. BP-89134156-10.18.40.69-1405003788606 blk_1073849564_108794{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-de56b04c-75dc-4884-aede-2212c1a1305e:NORMAL|RBW], ReplicaUnderConstruction[[DISK]DS-6a01309e-a4f2-4ad2-a100-38ae0c947427:NORMAL|RBW]]}
2014-07-11 08:23:58,826 INFO BlockStateChange: BLOCK* addToInvalidates: blk_1073848423_107652 10.18.40.89:50076 10.18.40.69:50076 
2014-07-11 08:23:58,832 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 65110, call org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 10.18.40.69:37208 Call#589066 Retry#0: org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /hbase/data/default/2025Thread0_table/7d5032128770926448aa7e02c3cf963e/recovered.edits/0000000000000000002.temp: File does not exist. [Lease.  Holder: DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29, pendingcreates: 11]
2014-07-11 08:23:58,833 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 65110, call org.apache.hadoop.hdfs.protocol.ClientProtocol.create from 10.18.40.69:37208 Call#589067 Retry#0: java.io.FileNotFoundException: Parent directory doesn't exist: /hbase/data/default/2025Thread0_table/df2dae34231829adc6ac10b43f5decb2/recovered.edits
2014-07-11 08:23:58,834 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 65110, call org.apache.hadoop.hdfs.protocol.ClientProtocol.create from 10.18.40.69:37208 Call#589068 Retry#0: java.io.FileNotFoundException: Parent directory doesn't exist: /hbase/data/default/2025Thread0_table/f17021778ea258ffb5edde2d86b0237f/recovered.edits
2014-07-11 08:23:58,834 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 65110, call org.apache.hadoop.hdfs.protocol.ClientProtocol.create from 10.18.40.69:37208 Call#589069 Retry#0: java.io.FileNotFoundException: Parent directory doesn't exist: /hbase/data/default/2025Thread0_table/046da18c6ddeae7df459c42daf711c78/recovered.edits
2014-07-11 08:23:58,844 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 65110, call org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 10.18.40.69:37208 Call#589072 Retry#0: org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /hbase/data/default/2025Thread0_table/995a56512784d109352d9dcbbeeebfbc/recovered.edits/0000000000000000002.temp: File does not exist. [Lease.  Holder: DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29, pendingcreates: 11]
2014-07-11 08:23:58,844 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 65110, call org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 10.18.40.69:37208 Call#589070 Retry#0: org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /hbase/data/default/2025Thread0_table/247b5b359ab81ed5deb5379e0f07ba56/recovered.edits/0000000000000000002.temp: File does not exist. [Lease.  Holder: DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29, pendingcreates: 11]
2014-07-11 08:23:58,844 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 65110, call org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 10.18.40.69:37208 Call#589071 Retry#0: org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /hbase/data/default/2025Thread0_table/6b8b92d73b51e1fd245854e62111fc8c/recovered.edits/0000000000000000002.temp: File does not exist. [Lease.  Holder: DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29, pendingcreates: 11]
2014-07-11 08:23:58,857 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 65110, call org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 10.18.40.69:37208 Call#589073 Retry#0: org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /hbase/data/default/2025Thread0_table/591d084674854bb913f419046e7f7d0c/recovered.edits/0000000000000000002.temp: File does not exist. [Lease.  Holder: DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29, pendingcreates: 11]
2014-07-11 08:23:58,960 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 65110, call org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 10.18.40.69:37208 Call#589074 Retry#0: org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /hbase/data/default/2025Thread0_table/5c514b93ed15b38cdc210abed58907df/recovered.edits/0000000000000000002.temp: File does not exist. [Lease.  Holder: DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29, pendingcreates: 11]
2014-07-11 08:23:58,966 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 65110, call org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 10.18.40.69:37208 Call#589076 Retry#0: org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /hbase/data/default/2025Thread0_table/b6d129691540af012a388590bf1e8629/recovered.edits/0000000000000000002.temp: File does not exist. [Lease.  Holder: DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29, pendingcreates: 11]
2014-07-11 08:23:58,977 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 65110, call org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 10.18.40.69:37208 Call#589077 Retry#0: org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /hbase/data/default/2025Thread0_table/190414760587ee6adbbb518e7e0c719f/recovered.edits/0000000000000000002.temp: File does not exist. [Lease.  Holder: DFSClient_hb_rs_HOST-10-18-40-69,60020,1405003853456_765233565_29, pendingcreates: 11]
2014-07-11 08:23:58,996 INFO BlockStateChange: BLOCK* addToInvalidates: blk_1073849517_108746 10.18.40.69:50076 10.18.40.89:50076 
2014-07-11 08:23:59,014 INFO BlockStateChange: BLOCK* addToInvalidates: blk_1073848446_107675 10.18.40.89:50076 10.18.40.69:50076
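
These NameNode entries line up with the splitter failures above: the create and addBlock calls fail on the 0000000000000000002.temp recovered-edits files, which is consistent with the table's region directories being removed while the split was still writing. To see whether any recovered.edits were left behind under the table directory afterwards, a rough sketch along the following lines (again assuming only the plain Hadoop FileSystem API; the glob pattern is illustrative) can list whatever remains:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListRecoveredEdits {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // Matches every recovered.edits entry under any region of the table.
    FileStatus[] leftovers = fs.globStatus(
        new Path("/hbase/data/default/2025Thread0_table/*/recovered.edits/*"));
    if (leftovers == null || leftovers.length == 0) {
      System.out.println("No recovered.edits found under the table directory.");
    } else {
      for (FileStatus st : leftovers) {
        System.out.println(st.getPath() + " (" + st.getLen() + " bytes)");
      }
    }
    fs.close();
  }
}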


-----Original Message-----
From: Ted Yu [mailto:yuzhihong@gmail.com] 
Sent: 25 August 2014 AM 11:03
To: user@hbase.apache.org
Subject: Re: RegionServer failed in logsplitting, wal.HLogSplitter: Got while writing log entry to log

What release of HBase are you using ?

Do you see other exceptions in machine1 server log ?

Please check namenode log as well.

Cheers



Re: RegionServer failed in logsplitting, wal.HLogSplitter: Got while writing log entry to log

Posted by Ted Yu <yu...@gmail.com>.
What release of HBase are you using ?

Do you see other exceptions in machine1 server log ?

Please check namenode log as well.

Cheers

