Posted to users@pulsar.apache.org by Apache Pulsar Slack <ap...@gmail.com> on 2020/10/19 09:11:03 UTC

Slack digest for #general - 2020-10-19

2020-10-18 13:13:16 UTC - Yuval Kovler: @Yuval Kovler has joined the channel
----
2020-10-18 18:59:50 UTC - Rattanjot Singh: Is there a way to list proxies like we list brokers?
```pulsar-admin brokers list use```
----
2020-10-19 06:20:54 UTC - Lari Hotari: @hangc the change looks simple and effectively prevents the infinite loop. It will take some time for me to confirm in the real environment.

I started looking more into the reason why the state becomes invalid in the first place. There seem to be quite a few past issues where a race condition in updating readPosition was the problem. For example <https://github.com/apache/pulsar/pull/1478> , <https://github.com/apache/pulsar/pull/3015> & <https://github.com/apache/pulsar/pull/287> .
I also noticed <https://github.com/apache/pulsar/pull/6606> which adds READ_POSITION_UPDATER for readPosition in ManagedCursorImpl .

It seems that ManagedCursorImpl.readPosition could only get out of sync with OpReadEntry.readPosition if ManagedCursorImpl.readPosition gets updated after the OpReadEntry has been created, since OpReadEntry's readPosition gets initialized from ManagedCursorImpl.readPosition.
The race condition seems to happen in this code in the setAcknowledgePosition method:
<https://github.com/apache/pulsar/blob/825fdd4222dd65ef3099f1a975a1555226297379/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedCursorImpl.java#L1512-L1523>
In other locations, whenever the readPosition field is modified, the modification happens under a lock. In this location, there is no lock.
However, <https://github.com/apache/pulsar/pull/6606> introduced another method for handling race conditions. So there are 2 ways to handle race conditions for the readPosition field: ManagedCursorImpl.lock.writeLock() and ManagedCursorImpl.READ_POSITION_UPDATER .
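To make the two mechanisms concrete, here is a minimal Java sketch of guarding a "only move forward" update of a position field, once with a write lock and once with an AtomicReferenceFieldUpdater compare-and-set loop. This is not the actual ManagedCursorImpl code: SketchPosition, SketchCursor, advanceWithLock and advanceWithCas are hypothetical names invented for illustration, and the "never move backwards" rule is an assumption chosen only to give the check-then-set something to check.

```java
import java.util.concurrent.atomic.AtomicReferenceFieldUpdater;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical, simplified position: (ledgerId, entryId), ordered lexicographically.
final class SketchPosition implements Comparable<SketchPosition> {
    final long ledgerId;
    final long entryId;

    SketchPosition(long ledgerId, long entryId) {
        this.ledgerId = ledgerId;
        this.entryId = entryId;
    }

    @Override
    public int compareTo(SketchPosition other) {
        int byLedger = Long.compare(ledgerId, other.ledgerId);
        return byLedger != 0 ? byLedger : Long.compare(entryId, other.entryId);
    }
}

// Hypothetical stand-in for a cursor; NOT the real ManagedCursorImpl.
class SketchCursor {
    // Must be volatile and a reference type for AtomicReferenceFieldUpdater to work.
    private volatile SketchPosition readPosition = new SketchPosition(0, 0);

    private static final AtomicReferenceFieldUpdater<SketchCursor, SketchPosition> READ_POSITION_UPDATER =
            AtomicReferenceFieldUpdater.newUpdater(SketchCursor.class, SketchPosition.class, "readPosition");

    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Way 1: the check and the write happen together while holding the write lock.
    void advanceWithLock(SketchPosition candidate) {
        lock.writeLock().lock();
        try {
            if (candidate.compareTo(readPosition) > 0) {
                readPosition = candidate;
            }
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Way 2: lock-free retry loop; the write only succeeds if the field still
    // holds the exact reference that was inspected.
    void advanceWithCas(SketchPosition candidate) {
        while (true) {
            SketchPosition current = readPosition;
            if (candidate.compareTo(current) <= 0) {
                return; // never move backwards (assumed rule for this sketch)
            }
            if (READ_POSITION_UPDATER.compareAndSet(this, current, candidate)) {
                return;
            }
            // Another thread changed the field in between; re-read and retry.
        }
    }

    SketchPosition getReadPosition() {
        return readPosition;
    }
}
```

One general point the sketch illustrates: the two approaches do not exclude each other. A compare-and-set does not wait for the write lock, so a thread in advanceWithCas can still change the field between the check and the write inside advanceWithLock. Each approach is only safe if every writer of the field uses the same one.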

What would be the way to fix the root cause, the race condition in updating the readPosition field?
----
2020-10-19 06:40:56 UTC - Lari Hotari: I created a separate issue to handle the root cause: <https://github.com/apache/pulsar/issues/8293>
----
2020-10-19 06:56:03 UTC - Johannes Wienke: Thanks for the replies. Getting that WIP into production would definitely help us. We'd still like to use validation if possible.
----