Posted to issues@bookkeeper.apache.org by GitBox <gi...@apache.org> on 2022/01/05 11:18:51 UTC

[GitHub] [bookkeeper] 1559924775 opened a new issue #2974: When the journal directory is even, the journal data is unevenly distributed

1559924775 opened a new issue #2974:
URL: https://github.com/apache/bookkeeper/issues/2974


   **BUG REPORT**
   
   ***Describe the bug***
   
   When an even number of journal directories is configured, ledgers are distributed unevenly across the journals. The root cause is that almost all ledger IDs generated by the LedgerIdGenerator class are even. The sequence number of a ZooKeeper ephemeral sequential node is taken from the cversion of its parent node, which increases by 1 whenever a child node is added or deleted. LedgerIdGenerator creates an ephemeral sequential node to obtain a ledger ID and deletes the node immediately afterwards. Each generated ID therefore advances the cversion by 2, so almost all cversions, and hence almost all ledger IDs, are even.
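
The create-then-delete pattern described above can be modeled in a few lines. This is a minimal sketch, not BookKeeper or ZooKeeper code: `ParentZnode` and `generateIds` are hypothetical names, and the model assumes a single client with no other activity under the parent node (interleaved clients could shift the parity, which is presumably why the report says "almost all").

```java
import java.util.ArrayList;
import java.util.List;

public class EvenLedgerIdDemo {
    /** Hypothetical model of a parent znode; it only tracks cversion. */
    static final class ParentZnode {
        private int cversion = 0;

        /** A sequential child takes the current cversion as its number, then cversion grows by 1. */
        int createSequentialChild() {
            return cversion++;
        }

        /** Deleting a child also grows cversion by 1. */
        void deleteChild() {
            cversion++;
        }
    }

    /** Generates IDs the way the report describes: create a node, read its number, delete it. */
    static List<Integer> generateIds(int count) {
        ParentZnode parent = new ParentZnode();
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            ids.add(parent.createSequentialChild());
            parent.deleteChild();
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(generateIds(5)); // prints [0, 2, 4, 6, 8]
    }
}
```

Every create/delete pair advances the counter by 2, so the sequence number never lands on an odd value in this model.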
   
   ***To Reproduce***
   
   Steps to reproduce the behavior:
   1. Configure in bk_server.conf: journalDirectories=/tmp/bk-txn1,/tmp/bk-txn2
   2. Produce data to the bookie
   
   ***Expected behavior***
   
   Ledgers should be spread evenly across the configured journal directories. Instead, almost all ledger IDs are assigned to directory 1.
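
To illustrate why the skew only appears with an even number of directories, here is a small sketch that assumes the journal is picked by taking the ledger ID modulo the number of journal directories. The screenshots are not reproduced in this text version, so `signSafeMod` and `countPerJournal` below are hypothetical helpers written for this example, not the bookie's actual code.

```java
import java.util.Arrays;

public class JournalPickDemo {
    /** Hypothetical sign-safe modulo: always returns a value in [0, divisor). */
    static int signSafeMod(long dividend, int divisor) {
        int mod = (int) (dividend % divisor);
        if (mod < 0) {
            mod += divisor;
        }
        return mod;
    }

    /** Counts how many of the given ledger IDs land on each journal index. */
    static int[] countPerJournal(long[] ledgerIds, int numJournals) {
        int[] counts = new int[numJournals];
        for (long id : ledgerIds) {
            counts[signSafeMod(id, numJournals)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        long[] evenIds = {0, 2, 4, 6, 8, 10};
        System.out.println(Arrays.toString(countPerJournal(evenIds, 2))); // prints [6, 0]
        System.out.println(Arrays.toString(countPerJournal(evenIds, 3))); // prints [2, 2, 2]
        System.out.println(Arrays.toString(countPerJournal(evenIds, 4))); // prints [3, 0, 3, 0]
    }
}
```

Under this assumption, even IDs can only reach even indexes when the directory count is even, so half of the directories stay idle. An odd count happens to spread the same IDs evenly.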
   
   ***Screenshots***
   
   zookeeper:  PrepRequestProcessor::pRequest2TxnCreate:
   ![image2021-11-8_16-1-33](https://user-images.githubusercontent.com/35036009/141671319-418a2c51-0c2b-4f43-849f-c9fb340f114f.png)
   
   <img width="702" alt="journal" src="https://user-images.githubusercontent.com/35036009/141671860-bd2eaa8b-9b01-41d3-a159-688ceea08554.png">
   
   <img width="503" alt="mathutil" src="https://user-images.githubusercontent.com/35036009/141671870-fca81d57-9d97-49ed-8669-b77cdb3176c5.png">
   
   bugfix:
   <img width="942" alt="mathutilnew" src="https://user-images.githubusercontent.com/35036009/141671914-469286b1-b07d-423f-9fef-664286bc2122.png">
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [bookkeeper] dlg99 commented on issue #2974: When the journal directory is even, the journal data is unevenly distributed

Posted by GitBox <gi...@apache.org>.
dlg99 commented on issue #2974:
URL: https://github.com/apache/bookkeeper/issues/2974#issuecomment-1030214283


   @Vanlightly @eolivelli I updated OrderedExecutor to better distribute data across the threads and added tests that reproduce the issue. Please take a look:
   
   https://github.com/apache/bookkeeper/pull/3023
   
   This does not address the fact that generated ledger ids are skewed towards even numbers.
   This does not fix routing of the journals/data dirs.
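
For readers unfamiliar with the idea, one cheap way to spread skewed keys across threads is to mix the key's bits before reducing it to an index. The sketch below uses multiplicative (Fibonacci) hashing purely as an illustration. It is an assumption, not the code in the linked pull request, and `naivePick`, `mixedPick`, and `histogram` are hypothetical names.

```java
import java.util.Arrays;

public class ThreadPickDemo {
    // 2^64 divided by the golden ratio, a constant commonly used for multiplicative hashing.
    private static final long GOLDEN = 0x9E3779B97F4A7C15L;

    /** Naive choice: the low bits of the key decide the bucket. */
    static int naivePick(long key, int buckets) {
        return (int) Math.floorMod(key, (long) buckets);
    }

    /** Mixed choice: multiply, keep the high 32 bits, then reduce to a bucket index. */
    static int mixedPick(long key, int buckets) {
        long high = (key * GOLDEN) >>> 32; // non-negative and below 2^32
        return (int) (high % buckets);
    }

    /** Counts bucket usage for the even keys 0, 2, 4, ... (evenKeys of them). */
    static int[] histogram(boolean mixed, int buckets, int evenKeys) {
        int[] counts = new int[buckets];
        for (long key = 0; key < 2L * evenKeys; key += 2) {
            int idx = mixed ? mixedPick(key, buckets) : naivePick(key, buckets);
            counts[idx]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println("naive: " + Arrays.toString(histogram(false, 2, 1000)));
        System.out.println("mixed: " + Arrays.toString(histogram(true, 2, 1000)));
    }
}
```

With two buckets, the naive version sends every even key to bucket 0, while the mixed version uses both. A change like this is only safe for in-memory choices such as thread selection, since it alters where a given key is routed.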





[GitHub] [bookkeeper] Vanlightly commented on issue #2974: When the journal directory is even, the journal data is unevenly distributed

Posted by GitBox <gi...@apache.org>.
Vanlightly commented on issue #2974:
URL: https://github.com/apache/bookkeeper/issues/2974#issuecomment-1028054818


   Changing the routing without some kind of migration will result in effective data loss. Plus an md5 hash is relatively expensive.
   
   The issue here is that ledger ids are all even. I am seeing this myself: only even-numbered read threads are being routed to. This problem of even ledger ids affects the selection of journals, ledger storage, read threads and write threads. It is a real problem.
   
   @eolivelli This is a pretty serious problem as we're throwing away a bunch of scaling mechanisms.

