You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2022/06/14 07:27:19 UTC

[GitHub] [pulsar] lhotari opened a new pull request, #16049: [ML] Fix thread safety issues in accessing ManagedCursorContainer.heap ArrayList

lhotari opened a new pull request, #16049:
URL: https://github.com/apache/pulsar/pull/16049

   ### Motivation
   
   - heap is an ArrayList which isn't thread safe and it's currently accessed concurrently. 
   
   This could lead to thread safety issues. Here's an example: (stacktraces are [from a fork of branch-2.10](https://github.com/datastax/pulsar/tree/ls210_0.5))
   ```
   2022-06-13T13:57:27,163+0000 [bookkeeper-ml-cache-eviction-19-1] WARN  org.apache.bookkeeper.mledger.impl.ManagedLedgerFactoryImpl - Exception while performing cache eviction: Index 0 out of bounds for length 0
   java.lang.IndexOutOfBoundsException: Index 0 out of bounds for length 0
   	at jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:64) ~[?:?]
   	at jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:70) ~[?:?]
   	at jdk.internal.util.Preconditions.checkIndex(Preconditions.java:248) ~[?:?]
   	at java.util.Objects.checkIndex(Objects.java:372) ~[?:?]
   	at java.util.ArrayList.get(ArrayList.java:459) ~[?:?]
   	at org.apache.bookkeeper.mledger.impl.ManagedCursorContainer.getSlowestReadPositionForActiveCursors(ManagedCursorContainer.java:108) ~[com.datastax.oss-managed-ledger-2.10.0.5.jar:2.10.0.5]
   	at org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl.getEarlierReadPositionForActiveCursors(ManagedLedgerImpl.java:2185) ~[com.datastax.oss-managed-ledger-2.10.0.5.jar:2.10.0.5]
   	at org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl.doCacheEviction(ManagedLedgerImpl.java:2175) ~[com.datastax.oss-managed-ledger-2.10.0.5.jar:2.10.0.5]
   	at org.apache.bookkeeper.mledger.impl.ManagedLedgerFactoryImpl.lambda$doCacheEviction$4(ManagedLedgerFactoryImpl.java:282) ~[com.datastax.oss-managed-ledger-2.10.0.5.jar:2.10.0.5]
   	at java.util.concurrent.ConcurrentHashMap$ValuesView.forEach(ConcurrentHashMap.java:4772) ~[?:?]
   	at org.apache.bookkeeper.mledger.impl.ManagedLedgerFactoryImpl.doCacheEviction(ManagedLedgerFactoryImpl.java:278) ~[com.datastax.oss-managed-ledger-2.10.0.5.jar:2.10.0.5]
   	at org.apache.bookkeeper.mledger.impl.ManagedLedgerFactoryImpl.cacheEvictionTask(ManagedLedgerFactoryImpl.java:263) ~[com.datastax.oss-managed-ledger-2.10.0.5.jar:2.10.0.5]
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
   	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.77.Final.jar:4.1.77.Final]
   	at java.lang.Thread.run(Thread.java:829) [?:?]
   ```
   
   
   
   ### Modifications
   
   - protect access using the existing rwLock


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] lhotari commented on pull request #16049: [ML] Fix thread safety issues in accessing ManagedCursorContainer.heap ArrayList

Posted by GitBox <gi...@apache.org>.
lhotari commented on PR #16049:
URL: https://github.com/apache/pulsar/pull/16049#issuecomment-1154895104

   The issue about the broker cache regression is #16054


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] lhotari commented on pull request #16049: [ML] Fix thread safety issues in accessing ManagedCursorContainer.heap ArrayList

Posted by GitBox <gi...@apache.org>.
lhotari commented on PR #16049:
URL: https://github.com/apache/pulsar/pull/16049#issuecomment-1154826487

   The thread safety issue was introduced in #12045 . A similar pattern was used in #14985 changes. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] lhotari commented on pull request #16049: [ML] Fix thread safety issues in accessing ManagedCursorContainer.heap ArrayList

Posted by GitBox <gi...@apache.org>.
lhotari commented on PR #16049:
URL: https://github.com/apache/pulsar/pull/16049#issuecomment-1154827633

   There's a chance that this fix causes a performance regression. If that's the case, we would have to revisit the solution for tracking the slowest positions. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] michaeljmarshall commented on pull request #16049: [ML] Fix thread safety issues in accessing ManagedCursorContainer.heap ArrayList

Posted by GitBox <gi...@apache.org>.
michaeljmarshall commented on PR #16049:
URL: https://github.com/apache/pulsar/pull/16049#issuecomment-1155500019

   > There's a chance that this fix causes a performance regression. If that's the case, we would have to revisit the solution for tracking the slowest positions.
   
   @lhotari - is there a specific test scenario that we should run to proactively check for a performance regression? The changes looks good to me, but it'd be helpful to diagnose a performance regression sooner than later, if possible.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] lhotari commented on pull request #16049: [ML] Fix thread safety issues in accessing ManagedCursorContainer.heap ArrayList

Posted by GitBox <gi...@apache.org>.
lhotari commented on PR #16049:
URL: https://github.com/apache/pulsar/pull/16049#issuecomment-1154880849

   When evaluating the #12045 and #14985 changes, it seems that the solution breaks broker caching. I'll create an issue for tracking this.
   
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] 315157973 commented on pull request #16049: [ML] Fix thread safety issues in accessing ManagedCursorContainer.heap ArrayList

Posted by GitBox <gi...@apache.org>.
315157973 commented on PR #16049:
URL: https://github.com/apache/pulsar/pull/16049#issuecomment-1154902012

   The previous PR is used to solve the memory problem caused by frequent traversal.
   I don't think  add a read lock for cache evictions have any impact on the Broker's read and write performance.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] eolivelli merged pull request #16049: [ML] Fix thread safety issues in accessing ManagedCursorContainer.heap ArrayList

Posted by GitBox <gi...@apache.org>.
eolivelli merged PR #16049:
URL: https://github.com/apache/pulsar/pull/16049


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org