Posted to issues@hbase.apache.org by "chaijunjie (Jira)" <ji...@apache.org> on 2023/11/02 01:38:00 UTC

[jira] [Commented] (HBASE-28150) CreateTableProcedure should sleep a while before retrying

    [ https://issues.apache.org/jira/browse/HBASE-28150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17781960#comment-17781960 ] 

chaijunjie commented on HBASE-28150:
------------------------------------

[~zhangduo] Thanks for checking. The root cause of this issue is that moving the tmp table dir to the target table dir failed: the parent dir does not exist, so the move will never succeed. Still, retrying with a sleep would be more helpful in this situation. I will raise a PR.
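
A minimal sketch of the "retry with sleep" idea, assuming a capped exponential backoff. The class RetryBackoff, the method delayMillis, and the 1 s / 60 s values are hypothetical and are not HBase APIs. It only shows how the wait between attempts could grow. In the procedure itself the delay would presumably be applied by suspending the procedure, not by blocking a PEWorker thread.

```java
// Hypothetical helper, not part of HBase: computes how long to wait
// before retry number `attempt` (0-based) using capped exponential backoff.
final class RetryBackoff {
    private RetryBackoff() {}

    static long delayMillis(int attempt, long baseMillis, long maxMillis) {
        if (attempt < 0 || baseMillis <= 0 || maxMillis < baseMillis) {
            throw new IllegalArgumentException("invalid backoff arguments");
        }
        long delay = baseMillis;
        // Double the delay once per previous attempt, stopping at the cap.
        for (int i = 0; i < attempt && delay < maxMillis; i++) {
            // Compare against max/2 first so the doubling cannot overflow.
            delay = (delay > maxMillis / 2) ? maxMillis : delay * 2;
        }
        return delay;
    }

    public static void main(String[] args) {
        // Assumed values for illustration: 1 s base delay, 60 s cap.
        for (int attempt = 0; attempt < 8; attempt++) {
            System.out.println("attempt " + attempt + " -> wait "
                + delayMillis(attempt, 1000L, 60000L) + " ms");
        }
    }
}
```

With these assumed values the waits are 1000, 2000, 4000, 8000, 16000, 32000 ms and then 60000 ms for every later attempt. A failure that can never succeed would then be retried about once a minute, where the log in the issue shows two attempts within roughly 100 ms.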

> CreateTableProcedure should sleep a while before retrying
> ---------------------------------------------------------
>
>                 Key: HBASE-28150
>                 URL: https://issues.apache.org/jira/browse/HBASE-28150
>             Project: HBase
>          Issue Type: Improvement
>          Components: master, proc-v2
>    Affects Versions: 2.4.14
>            Reporter: chaijunjie
>            Assignee: chaijunjie
>            Priority: Major
>
> Creating a table failed while executing CREATE_TABLE_WRITE_FS_LAYOUT. The procedure then retried again and again, writing too many proc records to master:store; we found more than 13000 master WAL files in oldWALs.
>  
> Q: Should we add suspend-time logic to the create table proc retry? I see that TransitRegionStateProcedure has such logic.
>  
> ---------------------------------------------------------------
> Sorry, uploading the screenshot failed, so I copied the log here:
> {code:java}
> 2023-10-12 12:34:35,360 | INFO  | RegionOpenAndInit-themis:a-pool-0 | Closing region themis:a,,1697025107991.513d3d5b4d3ad5c8f13bacea4a888d69. | org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1688)
> 2023-10-12 12:34:35,360 | INFO  | RegionOpenAndInit-themis:a-pool-0 | Closed themis:a,,1697025107991.513d3d5b4d3ad5c8f13bacea4a888d69. | org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1900)
> 2023-10-12 12:34:35,360 | INFO  | PEWorker-1 | Region directories are created at hdfs://hacluster/hbase/.tmp for table themis:a | org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.createFsLayout(CreateTableProcedure.java:346)
> 2023-10-12 12:34:35,362 | WARN  | PEWorker-1 | Retriable error trying to create table=themis:a state=CREATE_TABLE_WRITE_FS_LAYOUT | org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.executeFromState(CreateTableProcedure.java:159)
> java.io.IOException: Unable to move table from temp=hdfs://hacluster/hbase/.tmp/data/themis/a to hbase root=hdfs://hacluster/hbase/data/themis/a
>         at org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.moveTempDirectoryToHBaseRoot(CreateTableProcedure.java:391)
>         at org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.createFsLayout(CreateTableProcedure.java:350)
>         at org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.createFsLayout(CreateTableProcedure.java:318)
>         at org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.executeFromState(CreateTableProcedure.java:121)
>         at org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.executeFromState(CreateTableProcedure.java:75)
>         at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:188)
>         at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:922)
>         at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1650)
>         at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1396)
>         at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1000(ProcedureExecutor.java:75)
>         at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.runProcedure(ProcedureExecutor.java:1962)
>         at org.apache.hadoop.hbase.trace.TraceUtil.trace(TraceUtil.java:221)
>         at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1988)
> 2023-10-12 12:34:35,387 | INFO  | PEWorker-1 | pid=917, state=RUNNABLE:CREATE_TABLE_WRITE_FS_LAYOUT, locked=true; CreateTableProcedure table=themis:a execute state=CREATE_TABLE_WRITE_FS_LAYOUT | org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.executeFromState(CreateTableProcedure.java:102)
> 2023-10-12 12:34:35,414 | INFO  | RegionOpenAndInit-themis:a-pool-0 | creating {ENCODED => 513d3d5b4d3ad5c8f13bacea4a888d69, NAME => 'themis:a,,1697025107991.513d3d5b4d3ad5c8f13bacea4a888d69.', STARTKEY => '', ENDKEY => ''}, tableDescriptor='themis:a', {NAME => 'f1', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', VERSIONS => '1', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}, regionDir=hdfs://hacluster/hbase/.tmp | org.apache.hadoop.hbase.regionserver.HRegion.createHRegion(HRegion.java:7906)
> 2023-10-12 12:34:35,432 | INFO  | RegionOpenAndInit-themis:a-pool-0 | Waiting for flushes and compactions to finish for the region themis:a,,1697025107991.513d3d5b4d3ad5c8f13bacea4a888d69. | org.apache.hadoop.hbase.regionserver.HRegion.waitForFlushesAndCompactions(HRegion.java:1911)
> 2023-10-12 12:34:35,432 | INFO  | RegionOpenAndInit-themis:a-pool-0 | Total wait time for flushes and compaction for the region themis:a,,1697025107991.513d3d5b4d3ad5c8f13bacea4a888d69. is: 0ms | org.apache.hadoop.hbase.regionserver.HRegion.waitForFlushesAndCompactions(HRegion.java:1946)
> 2023-10-12 12:34:35,432 | INFO  | RegionOpenAndInit-themis:a-pool-0 | Closing region themis:a,,1697025107991.513d3d5b4d3ad5c8f13bacea4a888d69. | org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1688)
> 2023-10-12 12:34:35,432 | INFO  | RegionOpenAndInit-themis:a-pool-0 | Closed themis:a,,1697025107991.513d3d5b4d3ad5c8f13bacea4a888d69. | org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1900)
> 2023-10-12 12:34:35,432 | INFO  | PEWorker-1 | Region directories are created at hdfs://hacluster/hbase/.tmp for table themis:a | org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.createFsLayout(CreateTableProcedure.java:346)
> 2023-10-12 12:34:35,434 | WARN  | PEWorker-1 | Retriable error trying to create table=themis:a state=CREATE_TABLE_WRITE_FS_LAYOUT | org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.executeFromState(CreateTableProcedure.java:159)
> java.io.IOException: Unable to move table from temp=hdfs://hacluster/hbase/.tmp/data/themis/a to hbase root=hdfs://hacluster/hbase/data/themis/a
>         at org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.moveTempDirectoryToHBaseRoot(CreateTableProcedure.java:391)
>         at org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.createFsLayout(CreateTableProcedure.java:350)
>         at org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.createFsLayout(CreateTableProcedure.java:318)
>         at org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.executeFromState(CreateTableProcedure.java:121)
>         at org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.executeFromState(CreateTableProcedure.java:75)
>         at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:188)
>         at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:922)
>         at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1650)
>         at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1396)
>         at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1000(ProcedureExecutor.java:75)
>         at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.runProcedure(ProcedureExecutor.java:1962)
>         at org.apache.hadoop.hbase.trace.TraceUtil.trace(TraceUtil.java:221)
>         at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1988)
> 2023-10-12 12:34:35,469 | INFO  | PEWorker-1 | pid=917, state=RUNNABLE:CREATE_TABLE_WRITE_FS_LAYOUT, locked=true; CreateTableProcedure table=themis:a execute state=CREATE_TABLE_WRITE_FS_LAYOUT | org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.executeFromState(CreateTableProcedure.java:102)
>  {code}
> {code:java}
> //hdfs dfs -ls /hbase/oldWALs | grep 'masterlocal' |wc -l
> 13398
>  {code}
>  
> Analysis:
> This happened because {color:#ff0000}*I deleted the namespace dir in HDFS directly*{color} but {color:#ff0000}*did not delete the namespace from hbase:namespace*{color}, so creating a table in this namespace hangs.
> It was an operational error.
> But if some other logic fails in CreateTableProcedure, I think this issue will occur again.


