Posted to issues@hbase.apache.org by "Duo Zhang (JIRA)" <ji...@apache.org> on 2018/09/12 13:12:00 UTC

[jira] [Created] (HBASE-21187) The HBase UTs are extremely slow on some jenkins node

Duo Zhang created HBASE-21187:
---------------------------------

             Summary: The HBase UTs are extremely slow on some jenkins node
                 Key: HBASE-21187
                 URL: https://issues.apache.org/jira/browse/HBASE-21187
             Project: HBase
          Issue Type: Bug
          Components: test
            Reporter: Duo Zhang


Looking at the flaky dashboard for the master branch, the top several UTs tend to fail at the same time. One thing the failed runs of the flaky tests job have in common is that the execution time is more than one hour, whereas successful executions usually take only about half an hour.

I have also compared the output of TestRestoreSnapshotFromClientWithRegionReplicas between runs. In a successful run, the DisableTableProcedure finishes within one second; in a failed run, it can take more than half a minute.

I am not sure what the real problem is, but the failed runs seem to have time holes in the output, i.e., stretches of several seconds with no log output at all. Like this:
{noformat}
2018-09-11 21:08:08,152 INFO  [PEWorker-4] procedure2.ProcedureExecutor(1500): Finished pid=490, state=SUCCESS, hasLock=false; CreateTableProcedure table=testRestoreSnapshotAfterTruncate in 12.9380sec
2018-09-11 21:08:15,590 DEBUG [RpcServer.default.FPBQ.Fifo.handler=1,queue=0,port=33663] master.MasterRpcServices(1174): Checking to see if procedure is done pid=490
{noformat}

No log output for about 7 seconds.

And for a successful run, at the same place:
{noformat}
2018-09-12 07:47:32,488 INFO  [PEWorker-7] procedure2.ProcedureExecutor(1500): Finished pid=490, state=SUCCESS, hasLock=false; CreateTableProcedure table=testRestoreSnapshotAfterTruncate in 1.2220sec
2018-09-12 07:47:32,881 DEBUG [RpcServer.default.FPBQ.Fifo.handler=3,queue=0,port=59079] master.MasterRpcServices(1174): Checking to see if procedure is done pid=490
{noformat}

There is no such hole.
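
Holes like the one above could be located mechanically rather than by eye. The following is only a rough sketch, not part of the HBase tooling: it assumes every log line of interest starts with a timestamp in the format shown in the excerpts, and both the function name find_log_gaps and the 5 second threshold are arbitrary choices made for illustration.

```python
from datetime import datetime

# Timestamp layout taken from the excerpts above, e.g. "2018-09-11 21:08:08,152".
TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
TS_LEN = len("2018-09-11 21:08:08,152")


def find_log_gaps(lines, threshold_seconds=5.0):
    """Return (gap_seconds, previous_line, next_line) for each gap above the threshold.

    The function name and the default threshold are assumptions for this sketch.
    """
    gaps = []
    prev_ts = None
    prev_line = None
    for line in lines:
        try:
            ts = datetime.strptime(line[:TS_LEN], TS_FORMAT)
        except ValueError:
            # Lines without a leading timestamp (stack traces, wrapped text) are skipped.
            continue
        if prev_ts is not None:
            delta = (ts - prev_ts).total_seconds()
            if delta > threshold_seconds:
                gaps.append((delta, prev_line, line))
        prev_ts, prev_line = ts, line
    return gaps
```

Passing an open file object for a test's output log as the lines argument would list every silent stretch longer than the threshold, which makes it easier to check whether the holes line up across the tests that fail together.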

Maybe there is a big GC pause?


