Posted to issues@hbase.apache.org by "Mallikarjun (Jira)" <ji...@apache.org> on 2021/05/18 03:45:00 UTC

[jira] [Comment Edited] (HBASE-25888) Backup tests are categorically flakey

    [ https://issues.apache.org/jira/browse/HBASE-25888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17346564#comment-17346564 ] 

Mallikarjun edited comment on HBASE-25888 at 5/18/21, 3:44 AM:
---------------------------------------------------------------

[~ndimiduk] I see the link is no longer valid; I did not get to look at it in time.

From the logs you attached, this looks like a ZooKeeper connection failure rather than an actual backup test failure.
{code:java}
-------------------------------------------------------------------------------
Test set: org.apache.hadoop.hbase.backup.TestBackupDeleteRestore
-------------------------------------------------------------------------------
Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 559.735 s <<< FAILURE! - in org.apache.hadoop.hbase.backup.TestBackupDeleteRestore
org.apache.hadoop.hbase.backup.TestBackupDeleteRestore.testBackupDeleteRestore  Time elapsed: 33.143 s  <<< ERROR!
java.io.IOException: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /1/hbaseid
	at org.apache.hadoop.hbase.backup.TestBackupDeleteRestore.testBackupDeleteRestore(TestBackupDeleteRestore.java:60)
Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /1/hbaseid

org.apache.hadoop.hbase.backup.TestBackupDeleteRestore  Time elapsed: 66.341 s  <<< ERROR!
java.lang.NullPointerException
{code}
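
One rough way to separate environment problems from genuine test failures is to scan a surefire text report for the top-level exception of each error and tally them. The sketch below is illustrative only: the function name {{classify_errors}} and the list of "infrastructure" markers are my own assumptions and are not defined by HBase or surefire. It looks only at the first exception line of each error and ignores {{Caused by:}} lines.

```python
import re
from collections import Counter

# Assumption: exception names that indicate an environment problem
# rather than a defect in the code under test. Extend as needed.
INFRA_MARKERS = ("ConnectionLossException",)

# A report line that starts with a qualified class name ending in
# "Exception" or "Error" is treated as the top-level exception of one error.
TOP_LEVEL_EXC = re.compile(r"^[\w.$]+(?:Exception|Error)\b")

SAMPLE_REPORT = """\
Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 559.735 s <<< FAILURE! - in org.apache.hadoop.hbase.backup.TestBackupDeleteRestore
org.apache.hadoop.hbase.backup.TestBackupDeleteRestore.testBackupDeleteRestore  Time elapsed: 33.143 s  <<< ERROR!
java.io.IOException: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /1/hbaseid
    at org.apache.hadoop.hbase.backup.TestBackupDeleteRestore.testBackupDeleteRestore(TestBackupDeleteRestore.java:60)
Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /1/hbaseid
org.apache.hadoop.hbase.backup.TestBackupDeleteRestore  Time elapsed: 66.341 s  <<< ERROR!
java.lang.NullPointerException
"""


def classify_errors(report_text):
    """Count top-level exceptions, split into 'infrastructure' and 'other'."""
    counts = Counter()
    for raw_line in report_text.splitlines():
        line = raw_line.strip()
        if not TOP_LEVEL_EXC.match(line):
            continue  # summary lines, stack frames, and "Caused by:" lines
        is_infra = any(marker in line for marker in INFRA_MARKERS)
        counts["infrastructure" if is_infra else "other"] += 1
    return counts


if __name__ == "__main__":
    print(dict(classify_errors(SAMPLE_REPORT)))
```

On the excerpt above, the {{IOException}} wrapping {{ConnectionLossException}} is counted as infrastructure, while the {{NullPointerException}} is counted as other. Whether that NPE is a follow-on effect of the lost connection cannot be decided from this excerpt alone.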

I ran the tests on my machine, and all of them passed.


{code:java}
cd hbase-backup && mvn -Dhadoop-three.version=3.1.4 test
{code}
{code:java}
[INFO] -------------------------------------------------------
[INFO]  T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.apache.hadoop.hbase.backup.TestBackupCommandLineTool
[INFO] Tests run: 17, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.803 s - in org.apache.hadoop.hbase.backup.TestBackupCommandLineTool                                
[INFO] Running org.apache.hadoop.hbase.backup.TestBackupSmallTests
[INFO] Running org.apache.hadoop.hbase.backup.TestBackupHFileCleaner
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 28.655 s - in org.apache.hadoop.hbase.backup.TestBackupSmallTests                                     
[INFO] Running org.apache.hadoop.hbase.backup.TestBackupUtils
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.283 s - in org.apache.hadoop.hbase.backup.TestBackupUtils                                           
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 29.524 s - in org.apache.hadoop.hbase.backup.TestBackupHFileCleaner                                   
[INFO]
[INFO] Results:
[INFO]
[INFO] Tests run: 21, Failures: 0, Errors: 0, Skipped: 0
[INFO]
[INFO]
[INFO] --- maven-surefire-plugin:3.0.0-M4:test (secondPartTestsExecution) @ hbase-backup ---                                                                                 
[INFO]
[INFO] -------------------------------------------------------
[INFO]  T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.apache.hadoop.hbase.backup.TestBackupManager
[INFO] Running org.apache.hadoop.hbase.backup.TestBackupSystemTable
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 9.174 s - in org.apache.hadoop.hbase.backup.TestBackupManager                                         
[INFO] Running org.apache.hadoop.hbase.backup.TestBackupDeleteRestore
[INFO] Tests run: 15, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 42.617 s - in org.apache.hadoop.hbase.backup.TestBackupSystemTable                                   
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 44.334 s - in org.apache.hadoop.hbase.backup.TestBackupDeleteRestore                                  
[INFO]
[INFO] Results:
[INFO]
[INFO] Tests run: 17, Failures: 0, Errors: 0, Skipped: 0
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 02:15 min
[INFO] Finished at: 2021-05-18T08:20:25+05:30
[INFO] Final Memory: 41M/1132M
[INFO] ------------------------------------------------------------------------
{code}
Can you correct me if I have missed something?

 



> Backup tests are categorically flakey
> -------------------------------------
>
>                 Key: HBASE-25888
>                 URL: https://issues.apache.org/jira/browse/HBASE-25888
>             Project: HBase
>          Issue Type: Bug
>          Components: backup&restore, test
>            Reporter: Nick Dimiduk
>            Assignee: Mallikarjun
>            Priority: Major
>         Attachments: TEST-org.apache.hadoop.hbase.backup.TestBackupDeleteRestore.xml.gz, org.apache.hadoop.hbase.backup.TestBackupDeleteRestore.txt, org.apache.hadoop.hbase.backup.TestBackupDeleteRestore.txt.gz
>
>
> Here are some logs from a PR build vs. master that suffered a significant number of failures in the backup tests. I suspect that a single improvement could make all of these tests more robust.
> {noformat}
> Test Name	Duration	Age
>  precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestBackupDeleteRestore.testBackupDeleteRestore	6 min 23 sec	1
>  precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestBackupDeleteRestore.(?)	1 min 6 sec	1
>  precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestBackupMerge.TestIncBackupMergeRestore	5 min 3 sec	1
>  precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestBackupMerge.(?)	1 min 6 sec	1
>  precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestFullBackupSet.testFullBackupSetExist	6 min 16 sec	1
>  precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestFullBackupSet.(?)	1 min 6 sec	1
>  precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestIncrementalBackupMergeWithFailures.TestIncBackupMergeRestore	5 min 55 sec	1
>  precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestIncrementalBackupMergeWithFailures.(?)	1 min 6 sec	1
>  precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestIncrementalBackupWithBulkLoad.TestIncBackupDeleteTable	5 min 56 sec	1
>  precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestIncrementalBackupWithBulkLoad.(?)	1 min 6 sec	1
>  precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestRestoreBoundaryTests.testFullRestoreSingleEmpty	6 min 5 sec	1
>  precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestRestoreBoundaryTests.testFullRestoreMultipleEmpty	0.17 sec	1
>  precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestRestoreBoundaryTests.(?)
> {noformat}
> https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3249/4/testReport/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)