Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2022/04/28 14:51:49 UTC

[GitHub] [pulsar] poorbarcode opened a new issue, #15370: Add batched ManagedLedger

poorbarcode opened a new issue, #15370:
URL: https://github.com/apache/pulsar/issues/15370

   ## Motivation
   
   `ManagedLedger` provides the APIs for operating on Bookies and currently has only one implementation, `ManagedLedgerImpl`. When an `Entry` contains only a single message, performance is suboptimal. We should provide a feature that accumulates requests and processes them in batches, like this:
   
   ````java
   // Buffer each incoming entry and flush once the batch reaches the threshold.
   public void addEntry(ByteBuf buffer) {
       list.add(buffer);
       if (list.size() >= 512) {
           doFlush();
       }
   }
   ````
   
   If Pulsar had this feature, the Broker could receive many requests in a short time but write to the Bookie only once, which would greatly improve performance.
   
   
   Users need this feature in some cases:
   - A user has many clients that send messages at regular intervals, for example statistics collection or device data collection.
   - In transaction mode, the Broker receives many transaction requests in a short time. These requests include `ADD_PARTITION_TO_TXN`, `ADD_SUBSCRIPTION_TO_TXN`, `ACK_RESPONSE`, and more.
   
   ## Goal
   
   1. Provide a batched implementation of `ManagedLedger` for writes.
   
   **org.apache.bookkeeper.mledger.impl.BatchManagedLedgerImpl**
   ````java
   public class BatchManagedLedgerImpl extends ManagedLedgerImpl {
   
       /** Flush the batch when the record count reaches this threshold. Default: 512. */
       private final int batchThresholdRecordCount;
       /** Flush the batch when the data size reaches this threshold. Default: 4m. */
       private final int batchThresholdByteSize;
       /** Flush the batch at this fixed interval, whether or not a threshold has been reached. */
       private final int batchIntervalMillis;
   
       @Override
       public void asyncAddEntry(final ByteBuf buffer, final AsyncCallbacks.AddEntryCallback callback,
                                 final Object context){
           // batch write
       }
   
       @Override
       public void asyncAddEntry(final ByteBuf buffer, final int numberOfMessages,
                                 final AsyncCallbacks.AddEntryCallback callback, final Object context) {
           // batch write
       }
   }
   ````
   
   2. Provide a batched implementation of `ManagedCursor` for reads.
   
   **org.apache.bookkeeper.mledger.impl.BatchManagedCursorImpl**
   ````java
   public class BatchManagedCursorImpl extends ManagedCursorImpl{
   
       @Override
       public void asyncReadEntries(int numberOfEntriesToRead, AsyncCallbacks.ReadEntriesCallback callback, Object ctx,
                                    PositionImpl maxPosition){
           // batch read
       }
   }
   ````
   
   
   ## API Changes
   
   No API change.
   
   ## Implementation
   
   **How to cache requests and trigger a flush?**
   
   - Define an array field in `ManagedLedgerImpl`.
   - Three flush strategies:
     - When the request count reaches the threshold
     - When the request byte size reaches the threshold
     - On a schedule at a fixed rate
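   
   The three flush strategies could be sketched roughly as follows. This is only an illustration: `BatchBuffer` and its members are hypothetical names, not existing Pulsar classes, and the 4 MiB byte threshold is an assumed reading of the "4m" default mentioned above.
   
   ````java
   import java.util.ArrayList;
   import java.util.List;
   import java.util.concurrent.Executors;
   import java.util.concurrent.ScheduledExecutorService;
   import java.util.concurrent.TimeUnit;
   import java.util.function.Consumer;
   
   /** Hypothetical helper (not a Pulsar class): buffers payloads and hands them over as one batch. */
   final class BatchBuffer {
       private final int maxRecords;
       private final int maxBytes;
       private final Consumer<List<byte[]>> flusher;
       private List<byte[]> pending = new ArrayList<>();
       private int pendingBytes;
   
       BatchBuffer(int maxRecords, int maxBytes, Consumer<List<byte[]>> flusher) {
           this.maxRecords = maxRecords;
           this.maxBytes = maxBytes;
           this.flusher = flusher;
       }
   
       /** Strategies 1 and 2: flush once the record count or the byte size reaches its threshold. */
       synchronized void add(byte[] payload) {
           pending.add(payload);
           pendingBytes += payload.length;
           if (pending.size() >= maxRecords || pendingBytes >= maxBytes) {
               flush();
           }
       }
   
       /** Strategy 3: meant to be called by a timer at a fixed rate; a no-op when nothing is pending. */
       synchronized void flush() {
           if (pending.isEmpty()) {
               return;
           }
           List<byte[]> batch = pending;
           pending = new ArrayList<>();
           pendingBytes = 0;
           flusher.accept(batch);
       }
   
       public static void main(String[] args) throws InterruptedException {
           BatchBuffer buffer = new BatchBuffer(512, 4 * 1024 * 1024,
                   batch -> System.out.println("flushing " + batch.size() + " record(s)"));
           ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
           // Assumed interval of 10 ms, chosen only for the demo.
           timer.scheduleAtFixedRate(buffer::flush, 10, 10, TimeUnit.MILLISECONDS);
           buffer.add(new byte[] {1, 2, 3}); // below both thresholds, so the timer flushes it
           Thread.sleep(50);
           timer.shutdown();
       }
   }
   ````
   
   In this sketch the flusher runs while the lock is held, which keeps batches ordered but would block writers during a slow flush; a real implementation would likely hand the batch off asynchronously.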
   
   **How to mark an Entry as a batch Entry?**
   
   Add a field to `BrokerEntryMetadata`, `elementEndIndexArray`. This field serves two purposes:
   
   - It marks the entry as a batch entry.
   - It marks the range (start and end) of every inner entry.
   
   Provide another implementation of `ManagedLedgerInterceptor` to set the value of `elementEndIndexArray`.
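   
   One way to picture how an end-index array lets a reader recover the inner entries is the sketch below. `BatchEntryCodec` and `Packed` are hypothetical names used only for illustration, and treating each end index as an exclusive byte offset is an assumption; the proposal does not specify the encoding.
   
   ````java
   import java.util.ArrayList;
   import java.util.Arrays;
   import java.util.List;
   
   /** Hypothetical illustration of an end-index array; none of these names exist in Pulsar. */
   final class BatchEntryCodec {
   
       /** Concatenated payload plus the (assumed exclusive) end offset of each inner entry. */
       static final class Packed {
           final byte[] data;
           final int[] endIndexes;
   
           Packed(byte[] data, int[] endIndexes) {
               this.data = data;
               this.endIndexes = endIndexes;
           }
       }
   
       /** Concatenates the entries and records where each one ends. */
       static Packed pack(List<byte[]> entries) {
           int[] ends = new int[entries.size()];
           int total = 0;
           for (int i = 0; i < entries.size(); i++) {
               total += entries.get(i).length;
               ends[i] = total;
           }
           byte[] data = new byte[total];
           int offset = 0;
           for (byte[] entry : entries) {
               System.arraycopy(entry, 0, data, offset, entry.length);
               offset += entry.length;
           }
           return new Packed(data, ends);
       }
   
       /** Splits the payload back into inner entries; each start is the previous end. */
       static List<byte[]> unpack(Packed packed) {
           List<byte[]> entries = new ArrayList<>();
           int start = 0;
           for (int end : packed.endIndexes) {
               entries.add(Arrays.copyOfRange(packed.data, start, end));
               start = end;
           }
           return entries;
       }
   
       public static void main(String[] args) {
           Packed packed = pack(Arrays.asList(new byte[] {1, 2}, new byte[] {3}));
           System.out.println(Arrays.toString(packed.endIndexes));
           System.out.println(unpack(packed).size());
       }
   }
   ````
   
   Because only end offsets are stored, the presence of the array can double as the "this is a batch entry" marker, and the start of each inner entry is implied by the previous end.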
   
   **After the batched managed ledger is implemented, what else needs to be done?**
   - Make `MLTransactionLogImpl` submit in batches.
   - Make `MLPendingAckStore` submit in batches.
   - Make `Producer.publishMessage()` submit in batches (optional for the user).
   
   
   ## Rejected Alternatives
   
   None


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] poorbarcode commented on issue #15370: [PIP-160] Make transactions work more efficiently by aggregation operation for transaction log and pending ack store

Posted by GitBox <gi...@apache.org>.
poorbarcode commented on issue #15370:
URL: https://github.com/apache/pulsar/issues/15370#issuecomment-1162601065

   @eolivelli @gaoran10 @gaozhangmin @315157973 @lhotari @michaeljmarshall @congbobo184 @Technoboy- @hangc0276 @merlimat @jiazhai @lhotari I have started a vote and sent an email. Could you take a look?
   
   Discuss Link: https://lists.apache.org/thread/lsmn0hg9np97qrzzh2wovxq1yhxj9qhy
   
   Vote Link: https://lists.apache.org/thread/hykz4fpz6dnz1skzks3h4ht0t3twf70b




[GitHub] [pulsar] poorbarcode commented on issue #15370: [PIP-160] Make transactions work more efficiently by aggregation operation for transaction log and pending ack store

Posted by GitBox <gi...@apache.org>.
poorbarcode commented on issue #15370:
URL: https://github.com/apache/pulsar/issues/15370#issuecomment-1159977334

   > We should use plural names for the repeated fields such as entries or records.
   
   OK 




[GitHub] [pulsar] github-actions[bot] commented on issue #15370: [PIP-160] Make transactions work more efficiently by aggregation operation for transaction log and pending ack store

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #15370:
URL: https://github.com/apache/pulsar/issues/15370#issuecomment-1200332612

   The issue had no activity for 30 days, mark with Stale label.




[GitHub] [pulsar] eolivelli commented on issue #15370: [PIP-160] Make transactions work more efficiently by aggregation operation for transaction log and pending ack store

Posted by GitBox <gi...@apache.org>.
eolivelli commented on issue #15370:
URL: https://github.com/apache/pulsar/issues/15370#issuecomment-1158592444

   @nicoloboschi FYI




[GitHub] [pulsar] poorbarcode commented on issue #15370: [PIP-160] ManagedLedger decorator for batch append entries

Posted by GitBox <gi...@apache.org>.
poorbarcode commented on issue #15370:
URL: https://github.com/apache/pulsar/issues/15370#issuecomment-1122084334

   @eolivelli @hangc0276 @codelipenghui @merlimat @rdhabalia @315157973 @congbobo184 @gaoran10 
   
   I have sent an email. Could you take a look? Thanks.
   
   https://lists.apache.org/thread/mfgctq88oocj30h3yh69hh8vqps4rgdb




[GitHub] [pulsar] poorbarcode closed issue #15370: [PIP-160] Make transactions work more efficiently by aggregation operation for transaction log and pending ack store

Posted by GitBox <gi...@apache.org>.
poorbarcode closed issue #15370: [PIP-160] Make transactions work more efficiently by aggregation operation for transaction log and pending ack store
URL: https://github.com/apache/pulsar/issues/15370




[GitHub] [pulsar] poorbarcode commented on issue #15370: [PIP-160] Batch writing ledger for transaction operation

Posted by GitBox <gi...@apache.org>.
poorbarcode commented on issue #15370:
URL: https://github.com/apache/pulsar/issues/15370#issuecomment-1152174215

   @eolivelli @gaoran10 @gaozhangmin @315157973 @lhotari @michaeljmarshall @congbobo184 @Technoboy- I have started a vote and sent an email. Could you take a look?
   
   Discuss Link: https://lists.apache.org/thread/lsmn0hg9np97qrzzh2wovxq1yhxj9qhy




[GitHub] [pulsar] codelipenghui commented on issue #15370: [PIP-160] Make transactions work more efficiently by aggregation operation for transaction log and pending ack store

Posted by GitBox <gi...@apache.org>.
codelipenghui commented on issue #15370:
URL: https://github.com/apache/pulsar/issues/15370#issuecomment-1159870170

   > message BatchedTransactionMetadataEntry{
     // Array for buffer transaction log data.
     repeated TransactionMetadataEntry transaction_log = 1;
   }
   
   > message BatchedPendingAckMetadataEntry{
     // Array for buffer pending ack data.
     repeated PendingAckMetadataEntry pending_ack_log=1;
   }
   
   We should use plural names for the repeated fields, such as `entries` or `records`.

