Posted to issues@hbase.apache.org by "Duo Zhang (JIRA)" <ji...@apache.org> on 2018/11/17 12:49:00 UTC
[jira] [Commented] (HBASE-21490) WALProcedure may remove proc wal files still with active procedures
[ https://issues.apache.org/jira/browse/HBASE-21490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16690533#comment-16690533 ]
Duo Zhang commented on HBASE-21490:
-----------------------------------
OK, the root cause is a bug in RecoverStandByProcedure: there is an NPE when loading it, which brings the master down. But after two restarts, the file that contains the procedures is deleted.
{noformat}
2018-11-16,20:43:37,454 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: allowed=true ugi=hbase_tst/hadoop@XIAOMI.HADOOP (auth:KERBEROS) ip=/10.132.16.33 cmd=create src=/hbase/c4tst-sync1/MasterProcWALs/pv2-00000000000000000185.log perm=hbase_tst:supergroup:rw-r----- proto=rpc
2018-11-16,21:05:58,652 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: allowed=true ugi=hbase_tst/hadoop@XIAOMI.HADOOP (auth:KERBEROS) ip=/10.132.16.34 cmd=open src=/hbase/c4tst-sync1/MasterProcWALs/pv2-00000000000000000185.log proto=rpc
2018-11-16,21:05:58,747 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: allowed=true ugi=hbase_tst/hadoop@XIAOMI.HADOOP (auth:KERBEROS) ip=/10.132.16.34 cmd=open src=/hbase/c4tst-sync1/MasterProcWALs/pv2-00000000000000000185.log proto=rpc
2018-11-16,21:06:04,196 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: allowed=true ugi=hbase_tst/hadoop@XIAOMI.HADOOP (auth:KERBEROS) ip=/10.132.16.34 cmd=open src=/hbase/c4tst-sync1/MasterProcWALs/pv2-00000000000000000185.log proto=rpc
2018-11-16,21:06:04,305 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: allowed=true ugi=hbase_tst/hadoop@XIAOMI.HADOOP (auth:KERBEROS) ip=/10.132.16.34 cmd=open src=/hbase/c4tst-sync1/MasterProcWALs/pv2-00000000000000000185.log proto=rpc
2018-11-16,21:06:04,669 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: allowed=true ugi=hbase_tst/hadoop@XIAOMI.HADOOP (auth:KERBEROS) ip=/10.132.16.34 cmd=rename src=/hbase/c4tst-sync1/MasterProcWALs/pv2-00000000000000000185.log dst=/hbase/c4tst-sync1/oldWALs/pv2-00000000000000000185.log perm=hbase_tst:supergroup:rw-r----- proto=rpc
2018-11-16,21:07:12,776 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: allowed=true ugi=hbase_tst/hadoop@XIAOMI.HADOOP (auth:KERBEROS) ip=/10.132.16.34 cmd=delete src=/hbase/c4tst-sync1/oldWALs/pv2-00000000000000000185.log
{noformat}
Let me check what is going on here...
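For context, the master log quoted in the issue description below says older state logs are removed "since all the active procedures are in the latest log". Here is a minimal sketch of that safety condition, not the actual WALProcedureStore code: the class name, method name, and log IDs are hypothetical, and the map from procedure ID to the newest log holding its state is an assumed input. A log is treated as removable only if it is older than every log an active procedure still depends on, and the newest log is never removed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; HBase's real bookkeeping is different.
final class ProcWalCleanupSketch {

    /**
     * @param logIds                IDs of all existing proc WAL files
     * @param latestLogOfActiveProc assumed input: procId -> ID of the newest
     *                              log that contains that procedure's state
     * @return IDs of logs that no active procedure depends on
     */
    static List<Long> removableLogs(List<Long> logIds,
                                    Map<Long, Long> latestLogOfActiveProc) {
        // Oldest log that some active procedure still needs.
        long oldestNeeded = Long.MAX_VALUE;
        for (long logId : latestLogOfActiveProc.values()) {
            oldestNeeded = Math.min(oldestNeeded, logId);
        }
        // The newest log is the one being written, so it is always kept.
        long newest = Long.MIN_VALUE;
        for (long logId : logIds) {
            newest = Math.max(newest, logId);
        }
        List<Long> removable = new ArrayList<>();
        for (long logId : logIds) {
            if (logId < oldestNeeded && logId != newest) {
                removable.add(logId);
            }
        }
        return removable;
    }

    public static void main(String[] args) {
        List<Long> logs = Arrays.asList(183L, 184L, 185L);
        Map<Long, Long> active = new HashMap<>();
        active.put(1L, 185L); // procedure 1 has its latest state in log 185
        System.out.println(removableLogs(logs, active)); // [183, 184]

        active.put(2L, 184L); // procedure 2 still depends on log 184
        System.out.println(removableLogs(logs, active)); // [183]
    }
}
```

In this sketch, a procedure that is missing from the map is invisible to the check, so the log holding its state would be reported as removable.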
> WALProcedure may remove proc wal files still with active procedures
> -------------------------------------------------------------------
>
> Key: HBASE-21490
> URL: https://issues.apache.org/jira/browse/HBASE-21490
> Project: HBase
> Issue Type: Sub-task
> Components: proc-v2
> Reporter: Duo Zhang
> Priority: Major
>
> This has happened to me several times. After a master restart, all the procedures are gone.
> The proc wal files had been deleted before the restart; I see this in the master's log:
> {noformat}
> 2018-11-16,20:57:40,177 INFO [WALProcedureStoreSyncThread] org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore: Remove all state logs with ID less than 184, since all the active procedures are in the latest log
> 2018-11-16,20:57:40,177 INFO [WALProcedureStoreSyncThread] org.apache.hadoop.hbase.procedure2.store.wal.ProcedureWALFile: Archiving hdfs://c4tst-xiaomi/hbase/c4tst-sync1/MasterProcWALs/pv2-00000000000000000184.log to hdfs://c4tst-xiaomi/hbase/c4tst-sync1/oldWALs/pv2-00000000000000000184.log
> {noformat}
> Let me dig...