You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@bookkeeper.apache.org by GitBox <gi...@apache.org> on 2022/09/26 07:46:06 UTC

[GitHub] [bookkeeper] wenbingshen opened a new pull request, #3504: ledgerFragment check and results keep order

wenbingshen opened a new pull request, #3504:
URL: https://github.com/apache/bookkeeper/pull/3504

   ### Motivation
   When `LedgerChecker` checks Ledger's fragments, it does not keep the order, and the results of the check are not kept in order, which works correctly most of the time.
   
   I'm not sure if we should keep the order of checks and final results because I observed a flaky-test like this:
   https://github.com/apache/bookkeeper/actions/runs/3124965464/jobs/5068848112
   
   I have printed some verbose logs as follows:
   testShouldGetTwoFrgamentsIfTwoBookiesFailedInSameEnsemble In the case of success and failure:
   
   ### When the test execution is successful:
   2022-09-26T15:00:13,790 - INFO  - [Time-limited test:LedgerChecker@373] - checkLedger nowing.
   2022-09-26T15:00:13,792 - INFO  - [Time-limited test:LedgerChecker@398] - LedgerCheck fragments before[Fragment(LedgerID: 0, FirstEntryID: 0[0], LastKnownEntryID: 0[0], Host: [127.0.0.1:50466, 127.0.0.1:50474, 127.0.0.1:50476], Closed: true)]
   2022-09-26T15:00:13,793 - INFO  - [Time-limited test:LedgerChecker@430] - not execute check fragments. curEnsemble [127.0.0.1:50466, 127.0.0.1:50486, 127.0.0.1:50476], curEntryId 1
   2022-09-26T15:00:13,809 - INFO  - [BookKeeperClientWorker-OrderedExecutor-0-0:LedgerChecker$1@440] - check2Fragments [Fragment(LedgerID: 0, **FirstEntryID: 0[0]**, LastKnownEntryID: 0[0], Host: [127.0.0.1:50466, 127.0.0.1:50474, 127.0.0.1:50476], Closed: true), Fragment(LedgerID: 0, **FirstEntryID: 1[1]**, LastKnownEntryID: 1[1], Host: [127.0.0.1:50466, 127.0.0.1:50476, 127.0.0.1:50486], Closed: false)]
   2022-09-26T15:00:13,810 - INFO  - [BookKeeperClientWorker-OrderedExecutor-0-0:LedgerChecker@479] - Checking fragment Fragment(LedgerID: 0, FirstEntryID: 0[0], LastKnownEntryID: 0[0], Host: [127.0.0.1:50466, 127.0.0.1:50474, 127.0.0.1:50476], Closed: true)
   2022-09-26T15:00:13,813 - INFO  - [BookKeeperClientWorker-OrderedExecutor-0-0:LedgerChecker$FullLedgerCallback@360] - operationComplete Fragment(LedgerID: 0, FirstEntryID: 0[0], LastKnownEntryID: 0[0], Host: [127.0.0.1:50466, 127.0.0.1:50474], Closed: true), -8
   2022-09-26T15:00:13,813 - INFO  - [BookKeeperClientWorker-OrderedExecutor-0-0:LedgerChecker@479] - Checking fragment Fragment(LedgerID: 0, FirstEntryID: 1[1], LastKnownEntryID: 1[1], Host: [127.0.0.1:50466, 127.0.0.1:50476, 127.0.0.1:50486], Closed: false)
   2022-09-26T15:00:13,817 - INFO  - [BookKeeperClientWorker-OrderedExecutor-0-0:LedgerChecker$FullLedgerCallback@360] - operationComplete Fragment(LedgerID: 0, FirstEntryID: 1[-1], LastKnownEntryID: 1[-1], Host: [127.0.0.1:50466], Closed: false), -8
   2022-09-26T15:00:13,817 - INFO  - [Time-limited test:TestLedgerChecker@534] - waitAndGetResult [Fragment(LedgerID: 0, **FirstEntryID: 0[0]**, LastKnownEntryID: 0[0], Host: [127.0.0.1:50466, 127.0.0.1:50474], Closed: true), Fragment(LedgerID: 0, **FirstEntryID: 1[-1]**, LastKnownEntryID: 1[-1], Host: [127.0.0.1:50466], Closed: false)]
   2022-09-26T15:00:13,818 - INFO  - [Time-limited test:TestLedgerChecker@201] - other Ensemble {0=[127.0.0.1:50466, 127.0.0.1:50474, 127.0.0.1:50476], 1=[127.0.0.1:50466, 127.0.0.1:50486, 127.0.0.1:50476]}
   2022-09-26T15:00:13,818 - INFO  - [Time-limited test:TestLedgerChecker@206] - unreplicated fragment: Fragment(LedgerID: 0, FirstEntryID: 0[0], LastKnownEntryID: 0[0], Host: [127.0.0.1:50466, 127.0.0.1:50474], Closed: true)
   2022-09-26T15:00:13,819 - INFO  - [Time-limited test:TestLedgerChecker@206] - unreplicated fragment: Fragment(LedgerID: 0, FirstEntryID: 1[-1], LastKnownEntryID: 1[-1], Host: [127.0.0.1:50466], Closed: false)
   2022-09-26T15:00:13,819 - INFO  - [Time-limited test:TestLedgerChecker@212] - bookies fragments [127.0.0.1:50466, 127.0.0.1:50474, 127.0.0.1:50476], [0, 1]
   2022-09-26T15:00:13,819 - INFO  - [Time-limited test:BookKeeperClusterTestCase@199] - TearDown
   
   ### When test execution failed:
   2022-09-26T14:58:09,687 - INFO  - [Time-limited test:LedgerChecker@373] - checkLedger nowing.
   2022-09-26T14:58:09,689 - INFO  - [Time-limited test:LedgerChecker@398] - LedgerCheck fragments before[Fragment(LedgerID: 0, FirstEntryID: 0[0], LastKnownEntryID: 0[0], Host: [127.0.0.1:50105, 127.0.0.1:50107, 127.0.0.1:50109], Closed: true)]
   2022-09-26T14:58:09,690 - INFO  - [Time-limited test:LedgerChecker@430] - not execute check fragments. curEnsemble [127.0.0.1:50105, 127.0.0.1:50122, 127.0.0.1:50107], curEntryId 1
   2022-09-26T14:58:09,704 - INFO  - [BookKeeperClientWorker-OrderedExecutor-0-0:LedgerChecker$1@440] - check2Fragments [Fragment(LedgerID: 0, **FirstEntryID: 1[1]**, LastKnownEntryID: 1[1], Host: [127.0.0.1:50105, 127.0.0.1:50122, 127.0.0.1:50107], Closed: false), Fragment(LedgerID: 0, **FirstEntryID: 0[0]**, LastKnownEntryID: 0[0], Host: [127.0.0.1:50105, 127.0.0.1:50107, 127.0.0.1:50109], Closed: true)]
   2022-09-26T14:58:09,705 - INFO  - [BookKeeperClientWorker-OrderedExecutor-0-0:LedgerChecker@479] - Checking fragment Fragment(LedgerID: 0, FirstEntryID: 1[1], LastKnownEntryID: 1[1], Host: [127.0.0.1:50105, 127.0.0.1:50122, 127.0.0.1:50107], Closed: false)
   2022-09-26T14:58:09,707 - INFO  - [BookKeeperClientWorker-OrderedExecutor-0-0:LedgerChecker@479] - Checking fragment Fragment(LedgerID: 0, FirstEntryID: 0[0], LastKnownEntryID: 0[0], Host: [127.0.0.1:50105, 127.0.0.1:50107, 127.0.0.1:50109], Closed: true)
   2022-09-26T14:58:09,708 - INFO  - [BookKeeperClientWorker-OrderedExecutor-0-0:LedgerChecker$FullLedgerCallback@360] - operationComplete Fragment(LedgerID: 0, FirstEntryID: 0[0], LastKnownEntryID: 0[0], Host: [127.0.0.1:50105, 127.0.0.1:50109], Closed: true), -8
   2022-09-26T14:58:09,708 - INFO  - [BookKeeperClientWorker-OrderedExecutor-0-0:LedgerChecker$FullLedgerCallback@360] - operationComplete Fragment(LedgerID: 0, FirstEntryID: 1[-1], LastKnownEntryID: 1[-1], Host: [127.0.0.1:50105], Closed: false), -8
   2022-09-26T14:58:09,709 - INFO  - [Time-limited test:TestLedgerChecker@534] - waitAndGetResult [Fragment(LedgerID: 0, **FirstEntryID: 1[-1]**, LastKnownEntryID: 1[-1], Host: [127.0.0.1:50105], Closed: false), Fragment(LedgerID: 0, **FirstEntryID: 0[0]**, LastKnownEntryID: 0[0], Host: [127.0.0.1:50105, 127.0.0.1:50109], Closed: true)]
   2022-09-26T14:58:09,709 - INFO  - [Time-limited test:TestLedgerChecker@201] - other Ensemble {0=[127.0.0.1:50105, 127.0.0.1:50109, 127.0.0.1:50107], 1=[127.0.0.1:50105, 127.0.0.1:50122, 127.0.0.1:50107]}
   2022-09-26T14:58:09,709 - INFO  - [Time-limited test:TestLedgerChecker@206] - unreplicated fragment: Fragment(LedgerID: 0, FirstEntryID: 1[-1], LastKnownEntryID: 1[-1], Host: [127.0.0.1:50105], Closed: false)
   2022-09-26T14:58:09,709 - INFO  - [Time-limited test:TestLedgerChecker@206] - unreplicated fragment: Fragment(LedgerID: 0, FirstEntryID: 0[0], LastKnownEntryID: 0[0], Host: [127.0.0.1:50105, 127.0.0.1:50109], Closed: true)
   2022-09-26T14:58:09,709 - INFO  - [Time-limited test:TestLedgerChecker@212] - bookies fragments [127.0.0.1:50105, 127.0.0.1:50122, 127.0.0.1:50107], [0]
   2022-09-26T14:58:09,710 - INFO  - [Time-limited test:BookKeeperClusterTestCase@199] - TearDown
   
   ### Changes
   Replace `HashSet` with `LinkedHashSet`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] wenbingshen commented on pull request #3504: ledgerFragment check and results keep order

Posted by GitBox <gi...@apache.org>.
wenbingshen commented on PR #3504:
URL: https://github.com/apache/bookkeeper/pull/3504#issuecomment-1257634295

   ping @Shoothzj @hangc0276 @dlg99 @zymap @StevenLuMT PTAL, Thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] wenbingshen commented on pull request #3504: ledgerFragment check and results keep order

Posted by GitBox <gi...@apache.org>.
wenbingshen commented on PR #3504:
URL: https://github.com/apache/bookkeeper/pull/3504#issuecomment-1259253606

   > Do you have more evidence about that?
   
   No, I have no other affected scenes.
   I just saw a flaky-test and found during analysis that there is no guaranteed order here
   
   > After reading the code, looks like the fragments' usage is unrelated to the order.
   > 
   > If we are not percent sure about that, I think we'd better not use the LinkedHashSet. That may increase the time complexity.
   
   What time complexity are you talking about?
   You can find some time complexity comparisons of HashSet and LinkedHashSet from the following website:
   https://www.baeldung.com/java-collections-complexity
   
   The screenshot below is my local test:
   1000items:
   ![image](https://user-images.githubusercontent.com/35599757/192492817-92fce0ec-6f1c-4adc-a165-5bfae5c7913e.png)
   
   1 billion items:
   ![image](https://user-images.githubusercontent.com/35599757/192492932-f56c114c-f7ad-4283-b756-09557cbb5adb.png)
   
   @zymap PTAL. Thanks.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] wenbingshen commented on pull request #3504: ledgerFragment check and results keep order

Posted by GitBox <gi...@apache.org>.
wenbingshen commented on PR #3504:
URL: https://github.com/apache/bookkeeper/pull/3504#issuecomment-1257692852

   rerun failure checks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [bookkeeper] zymap merged pull request #3504: ledgerFragment check and results keep order

Posted by GitBox <gi...@apache.org>.
zymap merged PR #3504:
URL: https://github.com/apache/bookkeeper/pull/3504


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org