Posted to dev@pulsar.apache.org by Alexandre DUVAL <ka...@gmail.com> on 2023/03/06 14:32:34 UTC

[DISCUSS] new idea: reverse reading a topic

Hi,

I'm wondering whether it would be possible to introduce a new Pulsar
feature that lets users read a topic backwards, from a given MessageId
through the previous messages to the beginning of the topic.

I tried Pulsar SQL, but it requires a lot of RAM even for small
queries (due to Presto's design).

Currently, every read in Pulsar is expected to go forward, so it
might be a bit tricky to avoid unexpected behavior when introducing
this feature.

I'm currently trying to build an MVP/POC by introducing a readReverse
field in CommandSubscribe, which is used by the Reader API, and I'm
looking at adding a getFirstMessageId() to ManagedLedger
(https://github.com/CleverCloud/pulsar/pull/3). I also removed the
startPosition < endPosition sanity checks in BookKeeper locally
(https://github.com/CleverCloud/bookkeeper/pull/2).

We would definitely prefer readPrevious() and
hasPreviousMessageAvailable() methods in the Reader API.

I'm not familiar with internals such as NonDurableCursor,
RangeEntryCache, and ManagedCursor, so it's a bit tricky.

So I'm looking for someone to help or guide me, or even to take over
the subject (or the discussion) directly.

Regards,

Kannar



Re: [DISCUSS] new idea: reverse reading a topic

Posted by Andrey Yegorov <an...@datastax.com>.
A BookKeeper ledger is a distributed WAL with random-read support.
ledger.readAsync(long firstEntry, long lastEntry) works if firstEntry ==
lastEntry.
With that, one can pipeline reads of as many entries in parallel as needed;
no BK API changes are needed for such a niche case.
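
The pipelining idea above can be sketched roughly as follows. This is a minimal Python model, not BookKeeper client code: `read_entry` is a hypothetical stand-in for a single-entry read (firstEntry == lastEntry), the ledger is an in-memory list, and `parallelism` is an illustrative parameter.

```python
from concurrent.futures import ThreadPoolExecutor


def read_entry(ledger, entry_id):
    """Hypothetical stand-in for a read where firstEntry == lastEntry."""
    return ledger[entry_id]


def read_backwards(ledger, start_entry, count, parallelism=4):
    """Return up to `count` entries ending at `start_entry`, newest first."""
    first = max(0, start_entry - count + 1)
    entry_ids = range(start_entry, first - 1, -1)  # descending entry ids
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        # map() returns results in the order of entry_ids,
        # even though the individual reads may overlap in time.
        return list(pool.map(lambda e: read_entry(ledger, e), entry_ids))


ledger = [f"entry-{i}" for i in range(10)]
print(read_backwards(ledger, start_entry=7, count=3))
```

Each read is independent, so the caller decides how many to keep in flight; nothing here requires a range read with firstEntry > lastEntry.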

Reading backwards may backfire in performance, as there is a caching layer
in BK, and the OS pages data into memory in anticipation of sequential reads.
You can experiment with OS tuning and some BK parameters; with modern SSD
drives, backward reads should show minimal performance differences.

The majority of your changes are in Pulsar. I haven't looked closely, but my
best guess is that they will affect topic truncation, which relies on
subscriptions moving forward.

FWIW, the consumer has a seek() API:
https://github.com/apache/pulsar/blob/e5a833a2dcb7ce13ada4ca94714cc045a02de276/pulsar-client-api/src/main/java/org/apache/pulsar/client/api/Consumer.java#L482
You can seek to N messages back from the last offset, read the messages into
memory sequentially forward, reverse them, handle them, and repeat.
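
The seek, read forward, reverse, repeat loop can be sketched as follows. This is a minimal Python model under stated assumptions, not Pulsar client code: `read_forward` is a hypothetical stand-in for a seek() followed by forward reads, positions are plain list indexes, and `batch_size` plays the role of N.

```python
def read_forward(topic, start, end):
    """Hypothetical stand-in for seek(start) plus forward reads up to `end` (exclusive)."""
    return topic[start:end]


def iter_reverse(topic, last_index, batch_size):
    """Yield messages from `last_index` down to 0, buffering one batch at a time."""
    end = last_index + 1
    while end > 0:
        start = max(0, end - batch_size)
        batch = read_forward(topic, start, end)  # sequential forward read
        for message in reversed(batch):          # reverse the batch in memory
            yield message
        end = start                              # move to the previous window


topic = list(range(10))
seen = []
for message in iter_reverse(topic, last_index=8, batch_size=4):
    if message < 3:  # the caller can stop on any condition it likes
        break
    seen.append(message)
print(seen)
```

Only one batch is held in memory at a time, so the batch size bounds RAM usage, at the cost of possibly reading a few messages past the stop point in the final batch.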

As for the PR and the idea overall, I'd suggest joining one of the Pulsar
community meetings to get faster feedback.

-- 
Andrey Yegorov

Re: [DISCUSS] new idea: reverse reading a topic

Posted by Alexandre DUVAL <ka...@gmail.com>.
If we treat a topic as a time-series store, this would let us read stored
messages forward and backward within bounded since/until windows.
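
As an illustration of what a since/until window could look like, here is a minimal Python sketch. It is not Pulsar code: it assumes publish timestamps are non-decreasing along the topic (which is an assumption, not something this thread establishes), and it uses list indexes as hypothetical positions.

```python
from bisect import bisect_left, bisect_right


def window(publish_times, since, until):
    """Return (first, last) indexes with since <= time <= until, or None if empty."""
    first = bisect_left(publish_times, since)
    last = bisect_right(publish_times, until) - 1
    return (first, last) if first <= last else None


publish_times = [100, 105, 110, 120, 130]
bounds = window(publish_times, since=105, until=120)
if bounds is not None:
    first, last = bounds
    forward = list(range(first, last + 1))        # read the window oldest first
    backward = list(range(last, first - 1, -1))   # or newest first
    print(forward, backward)
```

Once both bounds are known, the direction of the read is only a matter of iteration order over the window.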

On 3/15/23 05:34, Haiting Jiang wrote:
>> Here, the goal is to read a topic from a known messageId to previous messages
>> until a constraint depending on read messages.
> I get your point, but can you give a more detailed use case?
>
> Thanks,
> Haiting
>
-- 
Best, Kannar

Re: [DISCUSS] new idea: reverse reading a topic

Posted by Haiting Jiang <ji...@gmail.com>.
> Here, the goal is to read a topic from a known messageId to previous messages
> until a constraint depending on read messages.

I get your point, but can you give a more detailed use case?

Thanks,
Haiting

Re: [DISCUSS] new idea: reverse reading a topic

Posted by Alexandre DUVAL <al...@clever-cloud.com.INVALID>.
Hi,

The approach you suggest works when we know how many messages we want to
read. It could be a nice feature too, but that is not my case here. Here,
the goal is to read a topic backwards from a known messageId until a
condition that depends on the messages read is met. It looks like
hasPreviousMessage() and receivePrevious() are the only way to implement it.
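
To make the data-dependent stop condition concrete, here is a minimal in-memory Python model of the two methods named above. It is only a sketch of the proposal: `ReverseReader`, `has_previous_message`, `receive_previous`, and `read_back_until` are hypothetical names and do not exist in the Pulsar client.

```python
class ReverseReader:
    """In-memory model of the reader proposed in this thread (not a Pulsar API)."""

    def __init__(self, messages, start_index):
        self._messages = messages
        self._next = start_index  # index of the next message to hand out

    def has_previous_message(self):
        return self._next >= 0

    def receive_previous(self):
        if not self.has_previous_message():
            raise IndexError("beginning of topic reached")
        message = self._messages[self._next]
        self._next -= 1
        return message


def read_back_until(reader, should_stop):
    """Collect messages newest first until `should_stop(message)` is true."""
    collected = []
    while reader.has_previous_message():
        message = reader.receive_previous()
        if should_stop(message):
            break
        collected.append(message)
    return collected


messages = ["boot", "a", "b", "boot", "c", "d"]
reader = ReverseReader(messages, start_index=5)
print(read_back_until(reader, lambda m: m == "boot"))
```

The stop position is only discovered while reading, which is why it cannot be computed up front from a message count.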

For the record, I already implemented the approach you suggest using the
Pulsar API, by fetching the topic's internal information and computing the
messageId from the ledger metadata recursively.
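
A recursion of that kind might look like the sketch below. It is a Python model under stated assumptions, not the actual implementation: `ledgers` is a hypothetical list of (ledger_id, entry_count) pairs, oldest first, and each entry is assumed to hold exactly one message (batching is ignored).

```python
def step_back(ledgers, ledger_index, entry_id, n):
    """Return the (ledger_id, entry_id) located `n` entries before a position.

    `ledgers` is a list of (ledger_id, entry_count) pairs, oldest first.
    If fewer than `n` entries precede the position, the first position
    of the topic is returned.
    """
    if n <= entry_id:
        # The target is inside the current ledger.
        return ledgers[ledger_index][0], entry_id - n
    if ledger_index == 0:
        # Ran out of ledgers: clamp to the beginning of the topic.
        return ledgers[0][0], 0
    previous_count = ledgers[ledger_index - 1][1]
    # entry_id + 1 steps are spent reaching the last entry of the previous ledger.
    return step_back(ledgers, ledger_index - 1, previous_count - 1, n - entry_id - 1)


ledgers = [(10, 5), (11, 3), (12, 4)]
print(step_back(ledgers, ledger_index=2, entry_id=1, n=3))
```

Only the per-ledger entry counts are needed, so no message bodies have to be read to find the starting position.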

Best,

Kannar


Re: [DISCUSS] new idea: reverse reading a topic

Posted by Haiting Jiang <ji...@gmail.com>.
Hi Kannar,

+1 to finding the position first and then reading as normal, as mentioned
by Yong and Michael.

Another problem with reading in reverse is that it would defeat all the
read-ahead techniques in the storage layer and result in very poor
performance.

> This would work but it will need something to store every messages read
> to reverse them before answer which can be heavy in RAM usages.

Finding the position doesn't require reading all the message bodies.
Using the ledger metadata, and perhaps some message headers in the
last ledger, would be enough.

Thanks,
Haiting

Re: [DISCUSS] new idea: reverse reading a topic

Posted by Zike Yang <zi...@apache.org>.
> Have you looked at the seek implementation to see if it would be
> feasible to extend the implementation and add a method to "seekBefore"
> a message id in the way you described?

I think it's not very feasible for this case. Seeking before can lead
to consumer reconnection, which can cause significant performance
issues and overhead.


Zike Yang

Re: [DISCUSS] new idea: reverse reading a topic

Posted by Yong Zhang <zh...@gmail.com>.
Kannar,

Why not find the stop position first, then read the messages
up to that position?
Does the stop position change dynamically? Do you only know
it once you reach it?

Yong



On Wed, 8 Mar 2023 at 21:37, Alexandre DUVAL
<al...@clever-cloud.com.invalid> wrote:

> Hi Michael,
>
> This would work but it will need something to store every messages read
> to reverse them before answer which can be heavy in RAM usages. The key
> point of the future is to read message by message from a MessageId to
> past with stop read possible conditions.
>
> Best,
>
> Kannar
>
> On 3/7/23 22:10, Michael Marshall wrote:
> >> The goal is to start from a known MessageId and read the N message
> >> before this MessageId.
> > Have you looked at the seek implementation to see if it would be
> > feasible to extend the implementation and add a method to "seekBefore"
> > a message id in the way you described? I haven't considered all of the
> > implications, but if the main goal is to move the cursor, I think the
> > solution should be about moving the cursor, not about reading a topic
> > in reverse.
> >
> > Thanks,
> > Michael
> >
> > On Tue, Mar 7, 2023 at 6:50 AM Alexandre DUVAL <ka...@gmail.com>
> wrote:
> >> Hi Yong,
> >>
> >> The goal is to start from a known MessageId and read the N message
> >> before this MessageId.
> >>
> >> Best,
> >>
> >> Kannar
> >>
> >>
> >> On 3/7/23 01:53, Yong Zhang wrote:
> >>> Hi Kannar,
> >>>
> >>> Just interested in what exactly your case.
> >>>
> >>> Why do you need to read messages in a reversed order? What is your
> case?
> >>>
> >>> Best,
> >>> Yong
> >>>
> >>> On Mon, 6 Mar 2023 at 23:37, Alexandre DUVAL<ka...@gmail.com>
> wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> I'm wondering if it is possible to introduce a new feature on Pulsar
> >>>> which will enable users to read topic from a defined MessageId to
> >>>> previous messages until the begin of the topic.
> >>>>
> >>>> I tried to use Pulsar SQL but it requires so much RAM even for little
> >>>> queries (due to Presto design).
> >>>>
> >>>> Currently, every read in Pulsar are expected to be going forward. So
> it
> >>>> might be a bit tricky to prevent every weird behavior by introducing
> the
> >>>> feature.
> >>>>
> >>>> I'm currently tried to make an MVP/POC by introducting a readReverse
> >>>> field in the CommandSubscribe that is used by ReaderAPI and currently
> >>>> looking for to create a getFirstMessageId() on ManagedLedger
> >>>> (https://github.com/CleverCloud/pulsar/pull/3). I also removed
> >>>> startPosition < endPosition sanity checks in BookKeeper locally
> >>>> (https://github.com/CleverCloud/bookkeeper/pull/2).
> >>>>
> >>>> We definitely prefer a readPrevious(), hasPreviousMessageAvailable()
> in
> >>>> the ReaderAPI.
> >>>>
> >>>> I'm not familiar with these internals such as NonDurableCursor,
> >>>> RangeEntryCache, ManagedCursor so it's a bit tricky.
> >>>>
> >>>> So I wondering someone to help/guide me or even directly handle the
> >>>> subject (or the discuss).
> >>>>
> >>>> Regards,
> >>>>
> >>>> Kannar
> >>>>
> >>>>
> >>>>
> >>>>
>

Re: [DISCUSS] new idea: reverse reading a topic

Posted by Alexandre DUVAL <al...@clever-cloud.com.INVALID>.
Hi Michael,

This would work, but it would need something to store every message read
in order to reverse them before answering, which can be heavy on RAM. The
key point of the feature is to read message by message from a MessageId
back into the past, with configurable stop conditions.
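
To make the intended behaviour concrete, here is a minimal sketch of such a backward walk. It is not Pulsar or BookKeeper code: the topic is modelled as a plain list of ledgers, a position is a hypothetical (ledger_index, entry_index) pair that only loosely mirrors a MessageId, and `read_backwards` and `should_stop` are names invented for this illustration.

```python
from typing import Callable, Iterator, List, Tuple

# Hypothetical stand-in for a topic: a list of ledgers, each a list of payloads.
# A position is (ledger_index, entry_index), loosely mirroring a MessageId.
Position = Tuple[int, int]


def read_backwards(
    ledgers: List[List[str]],
    start: Position,
    should_stop: Callable[[str], bool],
) -> Iterator[Tuple[Position, str]]:
    """Yield entries strictly before `start`, newest first, one at a time."""
    ledger_idx, entry_idx = start
    while True:
        entry_idx -= 1
        # Step into the previous non-empty ledger when this one is exhausted.
        while entry_idx < 0:
            ledger_idx -= 1
            if ledger_idx < 0:
                return  # reached the beginning of the topic
            entry_idx = len(ledgers[ledger_idx]) - 1
        payload = ledgers[ledger_idx][entry_idx]
        if should_stop(payload):
            return  # caller-supplied stop condition met
        yield (ledger_idx, entry_idx), payload
```

Because this is a generator, only one entry is held at a time, and the caller can stop iterating whenever it has seen enough. Nothing is buffered and reversed afterwards.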

Best,

Kannar

On 3/7/23 22:10, Michael Marshall wrote:
>> The goal is to start from a known MessageId and read the N message
>> before this MessageId.
> Have you looked at the seek implementation to see if it would be
> feasible to extend the implementation and add a method to "seekBefore"
> a message id in the way you described? I haven't considered all of the
> implications, but if the main goal is to move the cursor, I think the
> solution should be about moving the cursor, not about reading a topic
> in reverse.
>
> Thanks,
> Michael
>
> On Tue, Mar 7, 2023 at 6:50 AM Alexandre DUVAL <ka...@gmail.com> wrote:
>> Hi Yong,
>>
>> The goal is to start from a known MessageId and read the N message
>> before this MessageId.
>>
>> Best,
>>
>> Kannar
>>
>>
>> On 3/7/23 01:53, Yong Zhang wrote:
>>> Hi Kannar,
>>>
>>> Just interested in what exactly your case.
>>>
>>> Why do you need to read messages in a reversed order? What is your case?
>>>
>>> Best,
>>> Yong
>>>
>>> On Mon, 6 Mar 2023 at 23:37, Alexandre DUVAL<ka...@gmail.com>  wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm wondering if it is possible to introduce a new feature on Pulsar
>>>> which will enable users to read topic from a defined MessageId to
>>>> previous messages until the begin of the topic.
>>>>
>>>> I tried to use Pulsar SQL but it requires so much RAM even for little
>>>> queries (due to Presto design).
>>>>
>>>> Currently, every read in Pulsar are expected to be going forward. So it
>>>> might be a bit tricky to prevent every weird behavior by introducing the
>>>> feature.
>>>>
>>>> I'm currently tried to make an MVP/POC by introducting a readReverse
>>>> field in the CommandSubscribe that is used by ReaderAPI and currently
>>>> looking for to create a getFirstMessageId() on ManagedLedger
>>>> (https://github.com/CleverCloud/pulsar/pull/3). I also removed
>>>> startPosition < endPosition sanity checks in BookKeeper locally
>>>> (https://github.com/CleverCloud/bookkeeper/pull/2).
>>>>
>>>> We definitely prefer a readPrevious(), hasPreviousMessageAvailable() in
>>>> the ReaderAPI.
>>>>
>>>> I'm not familiar with these internals such as NonDurableCursor,
>>>> RangeEntryCache, ManagedCursor so it's a bit tricky.
>>>>
>>>> So I wondering someone to help/guide me or even directly handle the
>>>> subject (or the discuss).
>>>>
>>>> Regards,
>>>>
>>>> Kannar
>>>>
>>>>
>>>>
>>>>

Re: [DISCUSS] new idea: reverse reading a topic

Posted by Michael Marshall <mm...@apache.org>.
> The goal is to start from a known MessageId and read the N message
> before this MessageId.

Have you looked at the seek implementation to see if it would be
feasible to extend the implementation and add a method to "seekBefore"
a message id in the way you described? I haven't considered all of the
implications, but if the main goal is to move the cursor, I think the
solution should be about moving the cursor, not about reading a topic
in reverse.
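
As a rough illustration of what a "seekBefore" would have to compute, the sketch below finds the position N entries before a given one across ledger boundaries. It is not the Pulsar seek implementation: `seek_before`, the `ledger_sizes` list, and the (ledger_index, entry_index) position are assumptions made for the example. A real version would presumably take ledger sizes from managed-ledger metadata.

```python
from typing import List, Tuple

# Hypothetical position: (ledger_index, entry_index), loosely like a MessageId.
Position = Tuple[int, int]


def seek_before(ledger_sizes: List[int], position: Position, n: int) -> Position:
    """Return the position `n` entries before `position`, clamped to the start."""
    ledger_idx, entry_idx = position
    if not 0 <= ledger_idx < len(ledger_sizes):
        raise ValueError("ledger index is outside the topic")
    if not 0 <= entry_idx < ledger_sizes[ledger_idx]:
        raise ValueError("entry index is outside the ledger")
    if n < 0:
        raise ValueError("n must be non-negative")

    # Flatten the position into a global offset, step back, then clamp at 0.
    offset = sum(ledger_sizes[:ledger_idx]) + entry_idx
    target = max(0, offset - n)

    # Map the global offset back to a (ledger, entry) pair, skipping empty ledgers.
    for idx, size in enumerate(ledger_sizes):
        if target < size:
            return (idx, target)
        target -= size
    raise AssertionError("unreachable for a valid position")
```

Once the cursor sits at the returned position, the N messages can be read forward with the existing read path, so only the seek needs ledger metadata.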

Thanks,
Michael

On Tue, Mar 7, 2023 at 6:50 AM Alexandre DUVAL <ka...@gmail.com> wrote:
>
> Hi Yong,
>
> The goal is to start from a known MessageId and read the N message
> before this MessageId.
>
> Best,
>
> Kannar
>
>
> On 3/7/23 01:53, Yong Zhang wrote:
> > Hi Kannar,
> >
> > Just interested in what exactly your case.
> >
> > Why do you need to read messages in a reversed order? What is your case?
> >
> > Best,
> > Yong
> >
> > On Mon, 6 Mar 2023 at 23:37, Alexandre DUVAL<ka...@gmail.com>  wrote:
> >
> >> Hi,
> >>
> >> I'm wondering if it is possible to introduce a new feature on Pulsar
> >> which will enable users to read topic from a defined MessageId to
> >> previous messages until the begin of the topic.
> >>
> >> I tried to use Pulsar SQL but it requires so much RAM even for little
> >> queries (due to Presto design).
> >>
> >> Currently, every read in Pulsar are expected to be going forward. So it
> >> might be a bit tricky to prevent every weird behavior by introducing the
> >> feature.
> >>
> >> I'm currently tried to make an MVP/POC by introducting a readReverse
> >> field in the CommandSubscribe that is used by ReaderAPI and currently
> >> looking for to create a getFirstMessageId() on ManagedLedger
> >> (https://github.com/CleverCloud/pulsar/pull/3). I also removed
> >> startPosition < endPosition sanity checks in BookKeeper locally
> >> (https://github.com/CleverCloud/bookkeeper/pull/2).
> >>
> >> We definitely prefer a readPrevious(), hasPreviousMessageAvailable() in
> >> the ReaderAPI.
> >>
> >> I'm not familiar with these internals such as NonDurableCursor,
> >> RangeEntryCache, ManagedCursor so it's a bit tricky.
> >>
> >> So I wondering someone to help/guide me or even directly handle the
> >> subject (or the discuss).
> >>
> >> Regards,
> >>
> >> Kannar
> >>
> >>
> >>
> --
> *Alexandre DUVAL*
> @KannarFR
> /+33 6 12 97 19 70/

Re: [DISCUSS] new idea: reverse reading a topic

Posted by Alexandre DUVAL <ka...@gmail.com>.
Hi Yong,

The goal is to start from a known MessageId and read the N messages
before this MessageId.
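
For comparison, this is roughly what the same goal costs with forward-only reads. The sketch is a simulation, not client code: messages are hypothetical (id, payload) pairs with increasing integer ids, and `last_n_before` is a name invented for the example.

```python
from collections import deque
from typing import Iterable, List, Tuple


def last_n_before(
    messages: Iterable[Tuple[int, str]],
    target_id: int,
    n: int,
) -> List[Tuple[int, str]]:
    """Read forward, keep the last `n` messages before `target_id`, newest first."""
    # A bounded deque drops the oldest item once it holds `n` entries,
    # so memory stays proportional to n rather than to the topic size.
    window = deque(maxlen=n)
    for message_id, payload in messages:
        if message_id >= target_id:
            break  # reached the known MessageId, stop reading
        window.append((message_id, payload))
    return list(reversed(window))
```

Memory is bounded by N, but every message from the start of the topic up to the target still has to be read. That read cost is what a backward read would avoid.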

Best,

Kannar


On 3/7/23 01:53, Yong Zhang wrote:
> Hi Kannar,
>
> Just interested in what exactly your case.
>
> Why do you need to read messages in a reversed order? What is your case?
>
> Best,
> Yong
>
> On Mon, 6 Mar 2023 at 23:37, Alexandre DUVAL<ka...@gmail.com>  wrote:
>
>> Hi,
>>
>> I'm wondering if it is possible to introduce a new feature on Pulsar
>> which will enable users to read topic from a defined MessageId to
>> previous messages until the begin of the topic.
>>
>> I tried to use Pulsar SQL but it requires so much RAM even for little
>> queries (due to Presto design).
>>
>> Currently, every read in Pulsar are expected to be going forward. So it
>> might be a bit tricky to prevent every weird behavior by introducing the
>> feature.
>>
>> I'm currently tried to make an MVP/POC by introducting a readReverse
>> field in the CommandSubscribe that is used by ReaderAPI and currently
>> looking for to create a getFirstMessageId() on ManagedLedger
>> (https://github.com/CleverCloud/pulsar/pull/3). I also removed
>> startPosition < endPosition sanity checks in BookKeeper locally
>> (https://github.com/CleverCloud/bookkeeper/pull/2).
>>
>> We definitely prefer a readPrevious(), hasPreviousMessageAvailable() in
>> the ReaderAPI.
>>
>> I'm not familiar with these internals such as NonDurableCursor,
>> RangeEntryCache, ManagedCursor so it's a bit tricky.
>>
>> So I wondering someone to help/guide me or even directly handle the
>> subject (or the discuss).
>>
>> Regards,
>>
>> Kannar
>>
>>
>>
-- 
*Alexandre DUVAL*
@KannarFR
/+33 6 12 97 19 70/

Re: [DISCUSS] new idea: reverse reading a topic

Posted by Yong Zhang <zh...@gmail.com>.
Hi Kannar,

I'm just interested in what exactly your case is.

Why do you need to read messages in reverse order? What is your use case?

Best,
Yong

On Mon, 6 Mar 2023 at 23:37, Alexandre DUVAL <ka...@gmail.com> wrote:

> Hi,
>
> I'm wondering if it is possible to introduce a new feature on Pulsar
> which will enable users to read topic from a defined MessageId to
> previous messages until the begin of the topic.
>
> I tried to use Pulsar SQL but it requires so much RAM even for little
> queries (due to Presto design).
>
> Currently, every read in Pulsar are expected to be going forward. So it
> might be a bit tricky to prevent every weird behavior by introducing the
> feature.
>
> I'm currently tried to make an MVP/POC by introducting a readReverse
> field in the CommandSubscribe that is used by ReaderAPI and currently
> looking for to create a getFirstMessageId() on ManagedLedger
> (https://github.com/CleverCloud/pulsar/pull/3). I also removed
> startPosition < endPosition sanity checks in BookKeeper locally
> (https://github.com/CleverCloud/bookkeeper/pull/2).
>
> We definitely prefer a readPrevious(), hasPreviousMessageAvailable() in
> the ReaderAPI.
>
> I'm not familiar with these internals such as NonDurableCursor,
> RangeEntryCache, ManagedCursor so it's a bit tricky.
>
> So I wondering someone to help/guide me or even directly handle the
> subject (or the discuss).
>
> Regards,
>
> Kannar
>
>
>
