You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2022/04/07 15:09:53 UTC

[GitHub] [pulsar] liangyepianzhou opened a new issue, #15073: Optimize metadataPositions in MLPendingAckStore

liangyepianzhou opened a new issue, #15073:
URL: https://github.com/apache/pulsar/issues/15073

   ## Motivation
        `This optimization is to optimize a map in MLPendingAckStore to save memory and avoid OOM.`
   Now, there is a map (metadataPositions) in `MLPendingAckStore`, which value is the max poistion acked by an operation and key is the position which is persistent in pendingAck.And the map is used to clear useless  data in PendingAck.
   It will judge the position with the max sub cursor position whether smaller than the `subCursor markdeleteposition`.If the max position is smaller than the subCursor mark delete position, the log cursor will mark delete the position.
   The current implementation is shown below:
   ![image](https://user-images.githubusercontent.com/55571188/162201941-1b2be89e-a3d8-45ec-8768-7f0dbaab08fa.png)
   In normal times, this map will store all transaction ack operations in this map once. This is a waste of memory and CPU. And if a transaction that has not been committed for a long time acks a message in a later position, the map will not be cleaned up, and finally lead to the problem of OOM.
   
   ## Goal
   Optimize the `metadataPositions` in MLPendingAckStore to take up less memory and not cause OOM.
   ## Implementation
   The preferred solution now is to change the value stored in the map. The key is the `lastConfirmEntry` of `pendingack`, and the value is the largest ack position that has appeared.
   The specific implementation is shown in the figure:
   ![image](https://user-images.githubusercontent.com/55571188/162225925-447e09d4-3a53-412d-9a65-2988d25116a4.png)
   Variables we will add: 
   1. maxAckPosition: the largest ack position that has appeared.
   2.  lastConfirmEntry:  PendingAck lastConfirmEntry
   3.  maxAckPositionChangeTimes: change times of maxAckPosition
   When maxAck `PositionChanegTimes` reaches a certain value, put `maxAckPosition` and `lastConfirmEntry` into the map once. And limit the number of key-value pairs in the map.
   When the `PersistentMaxDeletePosition` of the subscription cursor of the original topic is greater than the `maxAckPosition` in the map, the position before the corresponding `lastConfirmEntry` can be cleared in the pendingack
   ## Reject Alternatives
   1. Does not change the values ​​stored in the map, but changes the order in which they are stored. Sort by value. 
   Advantages: OOM can be avoided.
    Disadvantages: Can't reduce memory usage during normal use
   2. Use timer to store the above lastConfirmEntry and maxAckPosition regularly
   Advantages: Can reduce memory usage during normal use
    Disadvantages: OOM can`t be avoided.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] liangyepianzhou commented on issue #15073: [PIP-153] Optimize metadataPositions in MLPendingAckStore

Posted by GitBox <gi...@apache.org>.
liangyepianzhou commented on issue #15073:
URL: https://github.com/apache/pulsar/issues/15073#issuecomment-1094987103

   `thisLastConfirmEntry` is a variable, it means the last confirm entry. It may be a little out of sync with the real LAC, but semantically the same. So I name it `thisLastConfirmEntry`.
   >upperLimitOfMaxAckPositionChangeTimes = metadataPositions.size * 500
   
   when maxAckPosition change times reaches `upperLimitOfMaxAckPositionChangeTimes`, will store` (maxAckPosition, thisLastConfirmEntry)` into `metadataPositions(pendingAckLogReverseIndex)`.
   So it is almost impossible to have indexes that are too close.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] codelipenghui commented on issue #15073: [PIP-153] Optimize metadataPositions in MLPendingAckStore

Posted by GitBox <gi...@apache.org>.
codelipenghui commented on issue #15073:
URL: https://github.com/apache/pulsar/issues/15073#issuecomment-1094955483

   The new introduced map `ConcurrentSkipListMap<PositionImpl, PositionImpl> metadataPositions` should be named `pendingAckLogReverseIndex`, I think this should be a more precise name.
   
   > upperLimitOfMaxAckPositionChangeTimes = metadataPositions.size * 500
   
   I'm not fully understanding here, why need * 500 here,
   I think we can just add a configuration `transactionPendingAckLogReverseMaxIndexSize`?
   If reaching the max limitation, we only update the last element of the index, after the old element is deleted,
   we can continue to add a new element to the index map.
   
   And to avoid the `too close` indexes(to avoid creating an index for pending ack log positions 1 and 2), we can add a `transactionPendingAckLogMinLag`.
   
   > thisLastConfirmEntry: the position persistent by pendingAck log, when need to store a maxAckPosition.
   
   It's better not to use the LAC here(because LAC is a mutable value in Pulsar Concepts), it should be the position in the pendingAck log? 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] codelipenghui closed issue #15073: [PIP-153] Optimize metadataPositions in MLPendingAckStore

Posted by GitBox <gi...@apache.org>.
codelipenghui closed issue #15073: [PIP-153] Optimize metadataPositions in MLPendingAckStore
URL: https://github.com/apache/pulsar/issues/15073


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] liangyepianzhou commented on issue #15073: [PIP-153] Optimize metadataPositions in MLPendingAckStore

Posted by GitBox <gi...@apache.org>.
liangyepianzhou commented on issue #15073:
URL: https://github.com/apache/pulsar/issues/15073#issuecomment-1096915534

   I think we should let maxAckPosition as key, and logPosition as value, that is because:
   1. They are all incremental
   2. maxAckPosition can be same, but logPosition always is different.
   3. if maxAckPosition is same, we should update logPosition


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org