Posted to dev@pulsar.apache.org by Alexandre DUVAL <ka...@gmail.com> on 2023/03/06 14:32:34 UTC
[DISCUSS] new idea: reverse reading a topic
Hi,
I'm wondering whether it would be possible to introduce a new Pulsar feature
that lets users read a topic backwards: from a given MessageId, through the
previous messages, to the beginning of the topic.
I tried Pulsar SQL, but it requires a lot of RAM even for small
queries (due to Presto's design).
Currently, every read in Pulsar is expected to go forward, so it
might be tricky to introduce the feature without causing unexpected
behavior.
I'm currently trying to build an MVP/POC by introducing a readReverse
field in CommandSubscribe, which is used by the Reader API, and I'm
looking to add a getFirstMessageId() on ManagedLedger
(https://github.com/CleverCloud/pulsar/pull/3). I also removed the
startPosition < endPosition sanity checks in BookKeeper locally
(https://github.com/CleverCloud/bookkeeper/pull/2).
We would definitely prefer readPrevious() and hasPreviousMessageAvailable()
methods in the Reader API.
I'm not familiar with internals such as NonDurableCursor,
RangeEntryCache, and ManagedCursor, so it's a bit tricky.
So I'm looking for someone to help or guide me, or even to take over the
subject (or the discussion) directly.
Regards,
Kannar
Re: [DISCUSS] new idea: reverse reading a topic
Posted by Andrey Yegorov <an...@datastax.com>.
A BookKeeper ledger is a distributed WAL with support for random reads.
ledger.readAsync(long firstEntry, long lastEntry) works if firstEntry ==
lastEntry.
With that, one can pipeline reads of as many entries in parallel as needed;
no BK API changes are needed for such a niche case.
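To make the pipelining idea concrete, here is a minimal sketch in Java. It does not call BookKeeper: readOne is a hypothetical stand-in for a single-entry read such as ledger.readAsync(id, id), and the class and method names are made up for illustration. All reads are issued before any of them is awaited, and the results come back newest entry first.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.LongFunction;

final class ReversePipelinedRead {

    /**
     * Issues up to `count` single-entry reads in parallel, starting at
     * `fromEntry` and moving towards entry 0, then returns the results in
     * descending entry-id order.
     *
     * `readOne` is a hypothetical stand-in for a single-entry read
     * (firstEntry == lastEntry); it is not a real BookKeeper type.
     */
    static <T> List<T> readBackward(LongFunction<CompletableFuture<T>> readOne,
                                    long fromEntry, int count) {
        List<CompletableFuture<T>> inFlight = new ArrayList<>();
        // Entry ids start at 0, so the walk stops there at the latest.
        for (long id = fromEntry; id >= 0 && inFlight.size() < count; id--) {
            inFlight.add(readOne.apply(id)); // issued, not yet awaited
        }
        List<T> result = new ArrayList<>(inFlight.size());
        for (CompletableFuture<T> read : inFlight) {
            result.add(read.join()); // awaited in issue order: newest first
        }
        return result;
    }
}
```

With an in-memory stand-in such as id -> CompletableFuture.completedFuture("e" + id), calling readBackward(fake, 5, 3) yields e5, e4, e3. The count parameter bounds how many entries are held in memory at once.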
Reading backwards may hurt performance, because the caching layer in BK
and the OS both page data into memory in anticipation of sequential reads.
You can experiment with OS tuning and some BK parameters; with modern SSD
drives, backward reads should show minimal performance differences.
The majority of your changes are in Pulsar. I haven't looked closely, but my
best guess is that they will affect topic truncation, which relies on
subscriptions moving forward.
FWIW, the consumer has a seek() API:
https://github.com/apache/pulsar/blob/e5a833a2dcb7ce13ada4ca94714cc045a02de276/pulsar-client-api/src/main/java/org/apache/pulsar/client/api/Consumer.java#L482
You can seek N messages back from the last offset, read that batch forward
into memory, reverse it, handle it, and repeat.
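The seek/read/reverse/handle/repeat loop can be sketched as follows. This is only an illustration under stated assumptions: the topic is modelled as an in-memory List, "seeking" is plain index arithmetic, and ChunkedReverseScan and scanBackward are hypothetical names, not Pulsar APIs. The handler returns false to stop, which covers a stop condition that is only known once a message has been read.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

final class ChunkedReverseScan {

    /**
     * Walks `log` backwards from index `from` (inclusive) in chunks of at
     * most `chunkSize` messages. Each chunk is read forward, reversed in
     * memory and passed to `handle` newest-first. The scan stops as soon as
     * `handle` returns false. Returns the number of messages handled.
     */
    static <T> int scanBackward(List<T> log, int from, int chunkSize,
                                Predicate<T> handle) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int handled = 0;
        int end = Math.min(from, log.size() - 1) + 1; // exclusive end of next chunk
        while (end > 0) {
            int start = Math.max(0, end - chunkSize);                 // "seek N back"
            List<T> chunk = new ArrayList<>(log.subList(start, end)); // forward read
            Collections.reverse(chunk);                               // reverse in memory
            for (T message : chunk) {                                 // handle newest-first
                handled++;
                if (!handle.test(message)) {
                    return handled;                                   // stop condition met
                }
            }
            end = start;                                              // repeat on older chunk
        }
        return handled;
    }
}
```

Only one chunk is buffered at a time, so memory use is bounded by chunkSize rather than by the number of messages scanned.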
As for the PR and idea overall, I'd suggest calling into one of the Pulsar
community meetings to get faster feedback.
On Mon, Mar 6, 2023 at 7:37 AM Alexandre DUVAL <ka...@gmail.com> wrote:
> Hi,
>
> I'm wondering if it is possible to introduce a new feature on Pulsar
> which will enable users to read topic from a defined MessageId to
> previous messages until the begin of the topic.
>
> I tried to use Pulsar SQL but it requires so much RAM even for little
> queries (due to Presto design).
>
> Currently, every read in Pulsar are expected to be going forward. So it
> might be a bit tricky to prevent every weird behavior by introducing the
> feature.
>
> I'm currently tried to make an MVP/POC by introducting a readReverse
> field in the CommandSubscribe that is used by ReaderAPI and currently
> looking for to create a getFirstMessageId() on ManagedLedger
> (https://github.com/CleverCloud/pulsar/pull/3). I also removed
> startPosition < endPosition sanity checks in BookKeeper locally
> (https://github.com/CleverCloud/bookkeeper/pull/2).
>
> We definitely prefer a readPrevious(), hasPreviousMessageAvailable() in
> the ReaderAPI.
>
> I'm not familiar with these internals such as NonDurableCursor,
> RangeEntryCache, ManagedCursor so it's a bit tricky.
>
> So I wondering someone to help/guide me or even directly handle the
> subject (or the discuss).
>
> Regards,
>
> Kannar
>
>
>
--
Andrey Yegorov
Re: [DISCUSS] new idea: reverse reading a topic
Posted by Alexandre DUVAL <ka...@gmail.com>.
If we consider a topic as a time-series store, this would let us read stored
messages forward and backward within bounded since/until windows.
On 3/15/23 05:34, Haiting Jiang wrote:
>> Here, the goal is to read a topic from a known messageId to previous messages
>> until a constraint depending on read messages.
> I get your point, but can you give a more detailed use case?
>
> Thanks,
> Haiting
>
> On Tue, Mar 14, 2023 at 6:40 PM Alexandre DUVAL
> <al...@clever-cloud.com.invalid> wrote:
>> Hi,
>>
>> The way you suggest works when we know how many messages we want to
>> read. It can be a nice feature too, but this is not my case here. Here,
>> the goal is to read a topic from a known messageId to previous messages
>> until a constraint depending on read messages. It looks
>> hasPreviousMessage() & receivePrevious are the only way to implement it.
>>
>> For the record, I already implemented the way you suggest using Pulsar
>> API by fetching topic's internal informations and compute the messageId
>> from ledgers metadata with a recursion.
>>
>> Best,
>>
>> Kannar
>>
>>
>> On 3/9/23 07:26, Haiting Jiang wrote:
>>> Hi Kannar,
>>>
>>> +1 to find the position first and then read like normal as mentioned
>>> by Yong and Michael.
>>>
>>> Another problem of reading reverse is that it would break all the
>>> read ahead techniques in the storage and result in very poor
>>> performance.
>>>
>>>> This would work but it will need something to store every messages read
>>>> to reverse them before answer which can be heavy in RAM usages.
>>> Finding the position doesn't require reading all the messages body.
>>> Just use the ledger metadata info and maybe some message heads in the
>>> last ledger would be enough.
>>>
>>> Thanks,
>>> Haiting
>>>
>>> On Thu, Mar 9, 2023 at 11:09 AM Zike Yang <zi...@apache.org> wrote:
>>>>> Have you looked at the seek implementation to see if it would be
>>>> feasible to extend the implementation and add a method to "seekBefore"
>>>> a message id in the way you described?
>>>>
>>>> I think it's not very feasible for this case. Seeking before can lead
>>>> to consumer reconnection, which can cause significant performance
>>>> issues and overhead.
>>>>
>>>>
>>>> Zike Yang
>>>>
>>>> On Thu, Mar 9, 2023 at 10:22 AM Yong Zhang <zh...@gmail.com> wrote:
>>>>> Kannar,
>>>>>
>>>>> Why not find the stop position first, then read the message
>>>>> until a given position?
>>>>> Does the stop position change dynamically? You only know
>>>>> it once you meet it?
>>>>>
>>>>> Yong
>>>>>
>>>>>
>>>>>
>>>>> On Wed, 8 Mar 2023 at 21:37, Alexandre DUVAL
>>>>> <al...@clever-cloud.com.invalid> wrote:
>>>>>
>>>>>> Hi Michael,
>>>>>>
>>>>>> This would work but it will need something to store every messages read
>>>>>> to reverse them before answer which can be heavy in RAM usages. The key
>>>>>> point of the future is to read message by message from a MessageId to
>>>>>> past with stop read possible conditions.
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Kannar
>>>>>>
>>>>>> On 3/7/23 22:10, Michael Marshall wrote:
>>>>>>>> The goal is to start from a known MessageId and read the N message
>>>>>>>> before this MessageId.
>>>>>>> Have you looked at the seek implementation to see if it would be
>>>>>>> feasible to extend the implementation and add a method to "seekBefore"
>>>>>>> a message id in the way you described? I haven't considered all of the
>>>>>>> implications, but if the main goal is to move the cursor, I think the
>>>>>>> solution should be about moving the cursor, not about reading a topic
>>>>>>> in reverse.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Michael
>>>>>>>
>>>>>>> On Tue, Mar 7, 2023 at 6:50 AM Alexandre DUVAL <ka...@gmail.com>
>>>>>> wrote:
>>>>>>>> Hi Yong,
>>>>>>>>
>>>>>>>> The goal is to start from a known MessageId and read the N message
>>>>>>>> before this MessageId.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>>
>>>>>>>> Kannar
>>>>>>>>
>>>>>>>>
>>>>>>>> On 3/7/23 01:53, Yong Zhang wrote:
>>>>>>>>> Hi Kannar,
>>>>>>>>>
>>>>>>>>> Just interested in what exactly your case.
>>>>>>>>>
>>>>>>>>> Why do you need to read messages in a reversed order? What is your
>>>>>> case?
>>>>>>>>> Best,
>>>>>>>>> Yong
>>>>>>>>>
>>>>>>>>> On Mon, 6 Mar 2023 at 23:37, Alexandre DUVAL<ka...@gmail.com>
>>>>>> wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I'm wondering if it is possible to introduce a new feature on Pulsar
>>>>>>>>>> which will enable users to read topic from a defined MessageId to
>>>>>>>>>> previous messages until the begin of the topic.
>>>>>>>>>>
>>>>>>>>>> I tried to use Pulsar SQL but it requires so much RAM even for little
>>>>>>>>>> queries (due to Presto design).
>>>>>>>>>>
>>>>>>>>>> Currently, every read in Pulsar are expected to be going forward. So
>>>>>> it
>>>>>>>>>> might be a bit tricky to prevent every weird behavior by introducing
>>>>>> the
>>>>>>>>>> feature.
>>>>>>>>>>
>>>>>>>>>> I'm currently tried to make an MVP/POC by introducting a readReverse
>>>>>>>>>> field in the CommandSubscribe that is used by ReaderAPI and currently
>>>>>>>>>> looking for to create a getFirstMessageId() on ManagedLedger
>>>>>>>>>> (https://github.com/CleverCloud/pulsar/pull/3). I also removed
>>>>>>>>>> startPosition < endPosition sanity checks in BookKeeper locally
>>>>>>>>>> (https://github.com/CleverCloud/bookkeeper/pull/2).
>>>>>>>>>>
>>>>>>>>>> We definitely prefer a readPrevious(), hasPreviousMessageAvailable()
>>>>>> in
>>>>>>>>>> the ReaderAPI.
>>>>>>>>>>
>>>>>>>>>> I'm not familiar with these internals such as NonDurableCursor,
>>>>>>>>>> RangeEntryCache, ManagedCursor so it's a bit tricky.
>>>>>>>>>>
>>>>>>>>>> So I wondering someone to help/guide me or even directly handle the
>>>>>>>>>> subject (or the discuss).
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>>
>>>>>>>>>> Kannar
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
--
Best, Kannar
Re: [DISCUSS] new idea: reverse reading a topic
Posted by Haiting Jiang <ji...@gmail.com>.
> Here, the goal is to read a topic from a known messageId to previous messages
> until a constraint depending on read messages.
I get your point, but can you give a more detailed use case?
Thanks,
Haiting
On Tue, Mar 14, 2023 at 6:40 PM Alexandre DUVAL
<al...@clever-cloud.com.invalid> wrote:
>
> Hi,
>
> The way you suggest works when we know how many messages we want to
> read. It can be a nice feature too, but this is not my case here. Here,
> the goal is to read a topic from a known messageId to previous messages
> until a constraint depending on read messages. It looks
> hasPreviousMessage() & receivePrevious are the only way to implement it.
>
> For the record, I already implemented the way you suggest using Pulsar
> API by fetching topic's internal informations and compute the messageId
> from ledgers metadata with a recursion.
>
> Best,
>
> Kannar
>
>
> On 3/9/23 07:26, Haiting Jiang wrote:
> > Hi Kannar,
> >
> > +1 to find the position first and then read like normal as mentioned
> > by Yong and Michael.
> >
> > Another problem of reading reverse is that it would break all the
> > read ahead techniques in the storage and result in very poor
> > performance.
> >
> >> This would work but it will need something to store every messages read
> >> to reverse them before answer which can be heavy in RAM usages.
> > Finding the position doesn't require reading all the messages body.
> > Just use the ledger metadata info and maybe some message heads in the
> > last ledger would be enough.
> >
> > Thanks,
> > Haiting
> >
> > On Thu, Mar 9, 2023 at 11:09 AM Zike Yang <zi...@apache.org> wrote:
> >>> Have you looked at the seek implementation to see if it would be
> >> feasible to extend the implementation and add a method to "seekBefore"
> >> a message id in the way you described?
> >>
> >> I think it's not very feasible for this case. Seeking before can lead
> >> to consumer reconnection, which can cause significant performance
> >> issues and overhead.
> >>
> >>
> >> Zike Yang
> >>
> >> On Thu, Mar 9, 2023 at 10:22 AM Yong Zhang <zh...@gmail.com> wrote:
> >>> Kannar,
> >>>
> >>> Why not find the stop position first, then read the message
> >>> until a given position?
> >>> Does the stop position change dynamically? You only know
> >>> it once you meet it?
> >>>
> >>> Yong
> >>>
> >>>
> >>>
> >>> On Wed, 8 Mar 2023 at 21:37, Alexandre DUVAL
> >>> <al...@clever-cloud.com.invalid> wrote:
> >>>
> >>>> Hi Michael,
> >>>>
> >>>> This would work but it will need something to store every messages read
> >>>> to reverse them before answer which can be heavy in RAM usages. The key
> >>>> point of the future is to read message by message from a MessageId to
> >>>> past with stop read possible conditions.
> >>>>
> >>>> Best,
> >>>>
> >>>> Kannar
> >>>>
> >>>> On 3/7/23 22:10, Michael Marshall wrote:
> >>>>>> The goal is to start from a known MessageId and read the N message
> >>>>>> before this MessageId.
> >>>>> Have you looked at the seek implementation to see if it would be
> >>>>> feasible to extend the implementation and add a method to "seekBefore"
> >>>>> a message id in the way you described? I haven't considered all of the
> >>>>> implications, but if the main goal is to move the cursor, I think the
> >>>>> solution should be about moving the cursor, not about reading a topic
> >>>>> in reverse.
> >>>>>
> >>>>> Thanks,
> >>>>> Michael
> >>>>>
> >>>>> On Tue, Mar 7, 2023 at 6:50 AM Alexandre DUVAL <ka...@gmail.com>
> >>>> wrote:
> >>>>>> Hi Yong,
> >>>>>>
> >>>>>> The goal is to start from a known MessageId and read the N message
> >>>>>> before this MessageId.
> >>>>>>
> >>>>>> Best,
> >>>>>>
> >>>>>> Kannar
> >>>>>>
> >>>>>>
> >>>>>> On 3/7/23 01:53, Yong Zhang wrote:
> >>>>>>> Hi Kannar,
> >>>>>>>
> >>>>>>> Just interested in what exactly your case.
> >>>>>>>
> >>>>>>> Why do you need to read messages in a reversed order? What is your
> >>>> case?
> >>>>>>> Best,
> >>>>>>> Yong
> >>>>>>>
> >>>>>>> On Mon, 6 Mar 2023 at 23:37, Alexandre DUVAL<ka...@gmail.com>
> >>>> wrote:
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> I'm wondering if it is possible to introduce a new feature on Pulsar
> >>>>>>>> which will enable users to read topic from a defined MessageId to
> >>>>>>>> previous messages until the begin of the topic.
> >>>>>>>>
> >>>>>>>> I tried to use Pulsar SQL but it requires so much RAM even for little
> >>>>>>>> queries (due to Presto design).
> >>>>>>>>
> >>>>>>>> Currently, every read in Pulsar are expected to be going forward. So
> >>>> it
> >>>>>>>> might be a bit tricky to prevent every weird behavior by introducing
> >>>> the
> >>>>>>>> feature.
> >>>>>>>>
> >>>>>>>> I'm currently tried to make an MVP/POC by introducting a readReverse
> >>>>>>>> field in the CommandSubscribe that is used by ReaderAPI and currently
> >>>>>>>> looking for to create a getFirstMessageId() on ManagedLedger
> >>>>>>>> (https://github.com/CleverCloud/pulsar/pull/3). I also removed
> >>>>>>>> startPosition < endPosition sanity checks in BookKeeper locally
> >>>>>>>> (https://github.com/CleverCloud/bookkeeper/pull/2).
> >>>>>>>>
> >>>>>>>> We definitely prefer a readPrevious(), hasPreviousMessageAvailable()
> >>>> in
> >>>>>>>> the ReaderAPI.
> >>>>>>>>
> >>>>>>>> I'm not familiar with these internals such as NonDurableCursor,
> >>>>>>>> RangeEntryCache, ManagedCursor so it's a bit tricky.
> >>>>>>>>
> >>>>>>>> So I wondering someone to help/guide me or even directly handle the
> >>>>>>>> subject (or the discuss).
> >>>>>>>>
> >>>>>>>> Regards,
> >>>>>>>>
> >>>>>>>> Kannar
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
Re: [DISCUSS] new idea: reverse reading a topic
Posted by Alexandre DUVAL <al...@clever-cloud.com.INVALID>.
Hi,
The approach you suggest works when we know how many messages we want to
read. That could be a nice feature too, but it is not my case here. Here,
the goal is to read a topic from a known messageId back through previous
messages until a condition that depends on the messages read is met. It looks
like hasPreviousMessage() and receivePrevious() are the only way to implement it.
For the record, I already implemented the approach you suggest using the Pulsar
API, by fetching the topic's internal information and computing the messageId
from the ledger metadata recursively.
Best,
Kannar
On 3/9/23 07:26, Haiting Jiang wrote:
> Hi Kannar,
>
> +1 to find the position first and then read like normal as mentioned
> by Yong and Michael.
>
> Another problem of reading reverse is that it would break all the
> read ahead techniques in the storage and result in very poor
> performance.
>
>> This would work but it will need something to store every messages read
>> to reverse them before answer which can be heavy in RAM usages.
> Finding the position doesn't require reading all the messages body.
> Just use the ledger metadata info and maybe some message heads in the
> last ledger would be enough.
>
> Thanks,
> Haiting
>
> On Thu, Mar 9, 2023 at 11:09 AM Zike Yang <zi...@apache.org> wrote:
>>> Have you looked at the seek implementation to see if it would be
>> feasible to extend the implementation and add a method to "seekBefore"
>> a message id in the way you described?
>>
>> I think it's not very feasible for this case. Seeking before can lead
>> to consumer reconnection, which can cause significant performance
>> issues and overhead.
>>
>>
>> Zike Yang
>>
>> On Thu, Mar 9, 2023 at 10:22 AM Yong Zhang <zh...@gmail.com> wrote:
>>> Kannar,
>>>
>>> Why not find the stop position first, then read the message
>>> until a given position?
>>> Does the stop position change dynamically? You only know
>>> it once you meet it?
>>>
>>> Yong
>>>
>>>
>>>
>>> On Wed, 8 Mar 2023 at 21:37, Alexandre DUVAL
>>> <al...@clever-cloud.com.invalid> wrote:
>>>
>>>> Hi Michael,
>>>>
>>>> This would work but it will need something to store every messages read
>>>> to reverse them before answer which can be heavy in RAM usages. The key
>>>> point of the future is to read message by message from a MessageId to
>>>> past with stop read possible conditions.
>>>>
>>>> Best,
>>>>
>>>> Kannar
>>>>
>>>> On 3/7/23 22:10, Michael Marshall wrote:
>>>>>> The goal is to start from a known MessageId and read the N message
>>>>>> before this MessageId.
>>>>> Have you looked at the seek implementation to see if it would be
>>>>> feasible to extend the implementation and add a method to "seekBefore"
>>>>> a message id in the way you described? I haven't considered all of the
>>>>> implications, but if the main goal is to move the cursor, I think the
>>>>> solution should be about moving the cursor, not about reading a topic
>>>>> in reverse.
>>>>>
>>>>> Thanks,
>>>>> Michael
>>>>>
>>>>> On Tue, Mar 7, 2023 at 6:50 AM Alexandre DUVAL <ka...@gmail.com>
>>>> wrote:
>>>>>> Hi Yong,
>>>>>>
>>>>>> The goal is to start from a known MessageId and read the N message
>>>>>> before this MessageId.
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Kannar
>>>>>>
>>>>>>
>>>>>> On 3/7/23 01:53, Yong Zhang wrote:
>>>>>>> Hi Kannar,
>>>>>>>
>>>>>>> Just interested in what exactly your case.
>>>>>>>
>>>>>>> Why do you need to read messages in a reversed order? What is your
>>>> case?
>>>>>>> Best,
>>>>>>> Yong
>>>>>>>
>>>>>>> On Mon, 6 Mar 2023 at 23:37, Alexandre DUVAL<ka...@gmail.com>
>>>> wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I'm wondering if it is possible to introduce a new feature on Pulsar
>>>>>>>> which will enable users to read topic from a defined MessageId to
>>>>>>>> previous messages until the begin of the topic.
>>>>>>>>
>>>>>>>> I tried to use Pulsar SQL but it requires so much RAM even for little
>>>>>>>> queries (due to Presto design).
>>>>>>>>
>>>>>>>> Currently, every read in Pulsar are expected to be going forward. So
>>>> it
>>>>>>>> might be a bit tricky to prevent every weird behavior by introducing
>>>> the
>>>>>>>> feature.
>>>>>>>>
>>>>>>>> I'm currently tried to make an MVP/POC by introducting a readReverse
>>>>>>>> field in the CommandSubscribe that is used by ReaderAPI and currently
>>>>>>>> looking for to create a getFirstMessageId() on ManagedLedger
>>>>>>>> (https://github.com/CleverCloud/pulsar/pull/3). I also removed
>>>>>>>> startPosition < endPosition sanity checks in BookKeeper locally
>>>>>>>> (https://github.com/CleverCloud/bookkeeper/pull/2).
>>>>>>>>
>>>>>>>> We definitely prefer a readPrevious(), hasPreviousMessageAvailable()
>>>> in
>>>>>>>> the ReaderAPI.
>>>>>>>>
>>>>>>>> I'm not familiar with these internals such as NonDurableCursor,
>>>>>>>> RangeEntryCache, ManagedCursor so it's a bit tricky.
>>>>>>>>
>>>>>>>> So I wondering someone to help/guide me or even directly handle the
>>>>>>>> subject (or the discuss).
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> Kannar
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
Re: [DISCUSS] new idea: reverse reading a topic
Posted by Haiting Jiang <ji...@gmail.com>.
Hi Kannar,
+1 to finding the position first and then reading as normal, as mentioned
by Yong and Michael.
Another problem with reading in reverse is that it would defeat all the
read-ahead techniques in the storage layer and result in very poor
performance.
> This would work but it will need something to store every messages read
> to reverse them before answer which can be heavy in RAM usages.
Finding the position doesn't require reading all the message bodies.
Using the ledger metadata, and maybe some message headers in the
last ledger, would be enough.
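As a rough illustration of finding a position from ledger metadata alone, the sketch below walks back n entries from a (ledgerId, entryId) pair using only per-ledger entry counts, similar in spirit to the ledger list reported by a topic's internal stats. LedgerInfo, Position and back are hypothetical names for this example, and it assumes one message per entry; with batching, an entry may hold several messages, which is where peeking at message headers would come in. It needs Java 16 or later for records.

```java
import java.util.List;

final class PositionFinder {

    /** Minimal stand-in for per-ledger metadata: id and number of entries. */
    record LedgerInfo(long ledgerId, long entries) {}

    /** A (ledgerId, entryId) pair identifying one entry. */
    record Position(long ledgerId, long entryId) {}

    /**
     * Returns the position `n` entries before `from`, clamped to the oldest
     * available entry. `ledgers` must be ordered oldest to newest and must
     * contain from.ledgerId(). No entry payload is read.
     */
    static Position back(List<LedgerInfo> ledgers, Position from, long n) {
        int idx = -1;
        for (int i = 0; i < ledgers.size(); i++) {
            if (ledgers.get(i).ledgerId() == from.ledgerId()) {
                idx = i;
                break;
            }
        }
        if (idx < 0) {
            throw new IllegalArgumentException("unknown ledger " + from.ledgerId());
        }
        long entryId = from.entryId();
        long remaining = n;
        while (remaining > entryId) {
            // Skip empty ledgers when looking for the previous one.
            int prev = idx - 1;
            while (prev >= 0 && ledgers.get(prev).entries() == 0) {
                prev--;
            }
            if (prev < 0) {
                return new Position(ledgers.get(idx).ledgerId(), 0); // clamp to start
            }
            remaining -= entryId + 1; // steps needed to reach the last entry of `prev`
            idx = prev;
            entryId = ledgers.get(idx).entries() - 1;
        }
        return new Position(ledgers.get(idx).ledgerId(), entryId - remaining);
    }
}
```

For ledgers (1, 3 entries), (2, 0 entries), (5, 4 entries), going back 2 entries from position (5, 1) lands on (1, 2). A normal forward read can then start from the returned position.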
Thanks,
Haiting
On Thu, Mar 9, 2023 at 11:09 AM Zike Yang <zi...@apache.org> wrote:
>
> > Have you looked at the seek implementation to see if it would be
> feasible to extend the implementation and add a method to "seekBefore"
> a message id in the way you described?
>
> I think it's not very feasible for this case. Seeking before can lead
> to consumer reconnection, which can cause significant performance
> issues and overhead.
>
>
> Zike Yang
>
> On Thu, Mar 9, 2023 at 10:22 AM Yong Zhang <zh...@gmail.com> wrote:
> >
> > Kannar,
> >
> > Why not find the stop position first, then read the message
> > until a given position?
> > Does the stop position change dynamically? You only know
> > it once you meet it?
> >
> > Yong
> >
> >
> >
> > On Wed, 8 Mar 2023 at 21:37, Alexandre DUVAL
> > <al...@clever-cloud.com.invalid> wrote:
> >
> > > Hi Michael,
> > >
> > > This would work but it will need something to store every messages read
> > > to reverse them before answer which can be heavy in RAM usages. The key
> > > point of the future is to read message by message from a MessageId to
> > > past with stop read possible conditions.
> > >
> > > Best,
> > >
> > > Kannar
> > >
> > > On 3/7/23 22:10, Michael Marshall wrote:
> > > >> The goal is to start from a known MessageId and read the N message
> > > >> before this MessageId.
> > > > Have you looked at the seek implementation to see if it would be
> > > > feasible to extend the implementation and add a method to "seekBefore"
> > > > a message id in the way you described? I haven't considered all of the
> > > > implications, but if the main goal is to move the cursor, I think the
> > > > solution should be about moving the cursor, not about reading a topic
> > > > in reverse.
> > > >
> > > > Thanks,
> > > > Michael
> > > >
> > > > On Tue, Mar 7, 2023 at 6:50 AM Alexandre DUVAL <ka...@gmail.com>
> > > wrote:
> > > >> Hi Yong,
> > > >>
> > > >> The goal is to start from a known MessageId and read the N message
> > > >> before this MessageId.
> > > >>
> > > >> Best,
> > > >>
> > > >> Kannar
> > > >>
> > > >>
> > > >> On 3/7/23 01:53, Yong Zhang wrote:
> > > >>> Hi Kannar,
> > > >>>
> > > >>> Just interested in what exactly your case.
> > > >>>
> > > >>> Why do you need to read messages in a reversed order? What is your
> > > case?
> > > >>>
> > > >>> Best,
> > > >>> Yong
> > > >>>
> > > >>> On Mon, 6 Mar 2023 at 23:37, Alexandre DUVAL<ka...@gmail.com>
> > > wrote:
> > > >>>
> > > >>>> Hi,
> > > >>>>
> > > >>>> I'm wondering if it is possible to introduce a new feature on Pulsar
> > > >>>> which will enable users to read topic from a defined MessageId to
> > > >>>> previous messages until the begin of the topic.
> > > >>>>
> > > >>>> I tried to use Pulsar SQL but it requires so much RAM even for little
> > > >>>> queries (due to Presto design).
> > > >>>>
> > > >>>> Currently, every read in Pulsar are expected to be going forward. So
> > > it
> > > >>>> might be a bit tricky to prevent every weird behavior by introducing
> > > the
> > > >>>> feature.
> > > >>>>
> > > >>>> I'm currently tried to make an MVP/POC by introducting a readReverse
> > > >>>> field in the CommandSubscribe that is used by ReaderAPI and currently
> > > >>>> looking for to create a getFirstMessageId() on ManagedLedger
> > > >>>> (https://github.com/CleverCloud/pulsar/pull/3). I also removed
> > > >>>> startPosition < endPosition sanity checks in BookKeeper locally
> > > >>>> (https://github.com/CleverCloud/bookkeeper/pull/2).
> > > >>>>
> > > >>>> We definitely prefer a readPrevious(), hasPreviousMessageAvailable()
> > > in
> > > >>>> the ReaderAPI.
> > > >>>>
> > > >>>> I'm not familiar with these internals such as NonDurableCursor,
> > > >>>> RangeEntryCache, ManagedCursor so it's a bit tricky.
> > > >>>>
> > > >>>> So I wondering someone to help/guide me or even directly handle the
> > > >>>> subject (or the discuss).
> > > >>>>
> > > >>>> Regards,
> > > >>>>
> > > >>>> Kannar
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>>>
> > >
Re: [DISCUSS] new idea: reverse reading a topic
Posted by Zike Yang <zi...@apache.org>.
> Have you looked at the seek implementation to see if it would be
> feasible to extend the implementation and add a method to "seekBefore"
> a message id in the way you described?
I think it's not very feasible for this case. Seeking before can lead
to consumer reconnection, which can cause significant performance
issues and overhead.
Zike Yang
On Thu, Mar 9, 2023 at 10:22 AM Yong Zhang <zh...@gmail.com> wrote:
>
> Kannar,
>
> Why not find the stop position first, then read the message
> until a given position?
> Does the stop position change dynamically? You only know
> it once you meet it?
>
> Yong
>
>
>
> On Wed, 8 Mar 2023 at 21:37, Alexandre DUVAL
> <al...@clever-cloud.com.invalid> wrote:
>
> > Hi Michael,
> >
> > This would work but it will need something to store every messages read
> > to reverse them before answer which can be heavy in RAM usages. The key
> > point of the future is to read message by message from a MessageId to
> > past with stop read possible conditions.
> >
> > Best,
> >
> > Kannar
> >
> > On 3/7/23 22:10, Michael Marshall wrote:
> > >> The goal is to start from a known MessageId and read the N message
> > >> before this MessageId.
> > > Have you looked at the seek implementation to see if it would be
> > > feasible to extend the implementation and add a method to "seekBefore"
> > > a message id in the way you described? I haven't considered all of the
> > > implications, but if the main goal is to move the cursor, I think the
> > > solution should be about moving the cursor, not about reading a topic
> > > in reverse.
> > >
> > > Thanks,
> > > Michael
> > >
> > > On Tue, Mar 7, 2023 at 6:50 AM Alexandre DUVAL <ka...@gmail.com>
> > wrote:
> > >> Hi Yong,
> > >>
> > >> The goal is to start from a known MessageId and read the N message
> > >> before this MessageId.
> > >>
> > >> Best,
> > >>
> > >> Kannar
> > >>
> > >>
> > >> On 3/7/23 01:53, Yong Zhang wrote:
> > >>> Hi Kannar,
> > >>>
> > >>> Just interested in what exactly your case.
> > >>>
> > >>> Why do you need to read messages in a reversed order? What is your
> > case?
> > >>>
> > >>> Best,
> > >>> Yong
> > >>>
> > >>> On Mon, 6 Mar 2023 at 23:37, Alexandre DUVAL<ka...@gmail.com>
> > wrote:
> > >>>
> > >>>> Hi,
> > >>>>
> > >>>> I'm wondering if it is possible to introduce a new feature on Pulsar
> > >>>> which will enable users to read topic from a defined MessageId to
> > >>>> previous messages until the begin of the topic.
> > >>>>
> > >>>> I tried to use Pulsar SQL but it requires so much RAM even for little
> > >>>> queries (due to Presto design).
> > >>>>
> > >>>> Currently, every read in Pulsar are expected to be going forward. So
> > it
> > >>>> might be a bit tricky to prevent every weird behavior by introducing
> > the
> > >>>> feature.
> > >>>>
> > >>>> I'm currently tried to make an MVP/POC by introducting a readReverse
> > >>>> field in the CommandSubscribe that is used by ReaderAPI and currently
> > >>>> looking for to create a getFirstMessageId() on ManagedLedger
> > >>>> (https://github.com/CleverCloud/pulsar/pull/3). I also removed
> > >>>> startPosition < endPosition sanity checks in BookKeeper locally
> > >>>> (https://github.com/CleverCloud/bookkeeper/pull/2).
> > >>>>
> > >>>> We definitely prefer a readPrevious(), hasPreviousMessageAvailable()
> > in
> > >>>> the ReaderAPI.
> > >>>>
> > >>>> I'm not familiar with these internals such as NonDurableCursor,
> > >>>> RangeEntryCache, ManagedCursor so it's a bit tricky.
> > >>>>
> > >>>> So I wondering someone to help/guide me or even directly handle the
> > >>>> subject (or the discuss).
> > >>>>
> > >>>> Regards,
> > >>>>
> > >>>> Kannar
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> >
Re: [DISCUSS] new idea: reverse reading a topic
Posted by Yong Zhang <zh...@gmail.com>.
Kannar,
Why not find the stop position first, then read messages
up to that position?
Does the stop position change dynamically? Do you only know
it once you reach it?
Yong
On Wed, 8 Mar 2023 at 21:37, Alexandre DUVAL
<al...@clever-cloud.com.invalid> wrote:
> Hi Michael,
>
> This would work but it will need something to store every messages read
> to reverse them before answer which can be heavy in RAM usages. The key
> point of the future is to read message by message from a MessageId to
> past with stop read possible conditions.
>
> Best,
>
> Kannar
>
> On 3/7/23 22:10, Michael Marshall wrote:
> >> The goal is to start from a known MessageId and read the N message
> >> before this MessageId.
> > Have you looked at the seek implementation to see if it would be
> > feasible to extend the implementation and add a method to "seekBefore"
> > a message id in the way you described? I haven't considered all of the
> > implications, but if the main goal is to move the cursor, I think the
> > solution should be about moving the cursor, not about reading a topic
> > in reverse.
> >
> > Thanks,
> > Michael
> >
> > On Tue, Mar 7, 2023 at 6:50 AM Alexandre DUVAL <ka...@gmail.com> wrote:
> >> Hi Yong,
> >>
> >> The goal is to start from a known MessageId and read the N message
> >> before this MessageId.
> >>
> >> Best,
> >>
> >> Kannar
> >>
> >>
> >> On 3/7/23 01:53, Yong Zhang wrote:
> >>> Hi Kannar,
> >>>
> >>> Just interested in what exactly your case.
> >>>
> >>> Why do you need to read messages in a reversed order? What is your case?
> >>>
> >>> Best,
> >>> Yong
> >>>
> >>> On Mon, 6 Mar 2023 at 23:37, Alexandre DUVAL <ka...@gmail.com> wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> I'm wondering if it is possible to introduce a new feature on Pulsar
> >>>> which will enable users to read topic from a defined MessageId to
> >>>> previous messages until the begin of the topic.
> >>>>
> >>>> I tried to use Pulsar SQL but it requires so much RAM even for little
> >>>> queries (due to Presto design).
> >>>>
> >>>> Currently, every read in Pulsar are expected to be going forward. So it
> >>>> might be a bit tricky to prevent every weird behavior by introducing the
> >>>> feature.
> >>>>
> >>>> I'm currently tried to make an MVP/POC by introducting a readReverse
> >>>> field in the CommandSubscribe that is used by ReaderAPI and currently
> >>>> looking for to create a getFirstMessageId() on ManagedLedger
> >>>> (https://github.com/CleverCloud/pulsar/pull/3). I also removed
> >>>> startPosition < endPosition sanity checks in BookKeeper locally
> >>>> (https://github.com/CleverCloud/bookkeeper/pull/2).
> >>>>
> >>>> We definitely prefer a readPrevious(), hasPreviousMessageAvailable() in
> >>>> the ReaderAPI.
> >>>>
> >>>> I'm not familiar with these internals such as NonDurableCursor,
> >>>> RangeEntryCache, ManagedCursor so it's a bit tricky.
> >>>>
> >>>> So I wondering someone to help/guide me or even directly handle the
> >>>> subject (or the discuss).
> >>>>
> >>>> Regards,
> >>>>
> >>>> Kannar
> >>>>
> >>>>
> >>>>
> >>>>
>
Re: [DISCUSS] new idea: reverse reading a topic
Posted by Alexandre DUVAL <al...@clever-cloud.com.INVALID>.
Hi Michael,
This would work, but it would need to buffer every message read in order
to reverse them before answering, which can be heavy on RAM. The key
point of the feature is to read message by message from a MessageId
toward the past, with the possibility of stopping the read on a condition.
Best,
Kannar
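A rough sketch of the behavior described above, assuming entries can be fetched individually by position. The topic is modeled as a Python list and the MessageId as an integer index; read_reverse and should_stop are hypothetical names and do not exist in the Pulsar or BookKeeper APIs. Only one message is held at a time, so nothing has to be buffered and reversed.

```python
from typing import Callable, Iterator, List


def read_reverse(topic: List[str], start_pos: int,
                 should_stop: Callable[[str], bool]) -> Iterator[str]:
    """Yield messages from start_pos toward the beginning, one at a time.

    Iteration ends at the first message for which should_stop returns True
    (that message is not yielded) or at the beginning of the topic.
    """
    for pos in range(start_pos, -1, -1):
        message = topic[pos]  # stands in for a single-entry random read
        if should_stop(message):
            return
        yield message


# Hypothetical topic holding m0..m9.
topic = [f"m{i}" for i in range(10)]
newest_first = list(read_reverse(topic, start_pos=7,
                                 should_stop=lambda m: m == "m3"))
```

The stop condition is evaluated while walking backward, so the caller does not need to know the stop position in advance.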
On 3/7/23 22:10, Michael Marshall wrote:
>> The goal is to start from a known MessageId and read the N message
>> before this MessageId.
> Have you looked at the seek implementation to see if it would be
> feasible to extend the implementation and add a method to "seekBefore"
> a message id in the way you described? I haven't considered all of the
> implications, but if the main goal is to move the cursor, I think the
> solution should be about moving the cursor, not about reading a topic
> in reverse.
>
> Thanks,
> Michael
>
> On Tue, Mar 7, 2023 at 6:50 AM Alexandre DUVAL <ka...@gmail.com> wrote:
>> Hi Yong,
>>
>> The goal is to start from a known MessageId and read the N message
>> before this MessageId.
>>
>> Best,
>>
>> Kannar
>>
>>
>> On 3/7/23 01:53, Yong Zhang wrote:
>>> Hi Kannar,
>>>
>>> Just interested in what exactly your case.
>>>
>>> Why do you need to read messages in a reversed order? What is your case?
>>>
>>> Best,
>>> Yong
>>>
>>> On Mon, 6 Mar 2023 at 23:37, Alexandre DUVAL<ka...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm wondering if it is possible to introduce a new feature on Pulsar
>>>> which will enable users to read topic from a defined MessageId to
>>>> previous messages until the begin of the topic.
>>>>
>>>> I tried to use Pulsar SQL but it requires so much RAM even for little
>>>> queries (due to Presto design).
>>>>
>>>> Currently, every read in Pulsar are expected to be going forward. So it
>>>> might be a bit tricky to prevent every weird behavior by introducing the
>>>> feature.
>>>>
>>>> I'm currently tried to make an MVP/POC by introducting a readReverse
>>>> field in the CommandSubscribe that is used by ReaderAPI and currently
>>>> looking for to create a getFirstMessageId() on ManagedLedger
>>>> (https://github.com/CleverCloud/pulsar/pull/3). I also removed
>>>> startPosition < endPosition sanity checks in BookKeeper locally
>>>> (https://github.com/CleverCloud/bookkeeper/pull/2).
>>>>
>>>> We definitely prefer a readPrevious(), hasPreviousMessageAvailable() in
>>>> the ReaderAPI.
>>>>
>>>> I'm not familiar with these internals such as NonDurableCursor,
>>>> RangeEntryCache, ManagedCursor so it's a bit tricky.
>>>>
>>>> So I wondering someone to help/guide me or even directly handle the
>>>> subject (or the discuss).
>>>>
>>>> Regards,
>>>>
>>>> Kannar
>>>>
>>>>
>>>>
>>>>
Re: [DISCUSS] new idea: reverse reading a topic
Posted by Michael Marshall <mm...@apache.org>.
> The goal is to start from a known MessageId and read the N message
> before this MessageId.
Have you looked at the seek implementation to see if it would be
feasible to extend the implementation and add a method to "seekBefore"
a message id in the way you described? I haven't considered all of the
implications, but if the main goal is to move the cursor, I think the
solution should be about moving the cursor, not about reading a topic
in reverse.
Thanks,
Michael
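One way to picture the "seekBefore" idea, as a sketch only: move the cursor N positions back, then read forward as usual. It assumes positions are contiguous integers, which is a simplification (real MessageIds span ledgers, entries and batch indexes). seek_before and read_n_before are hypothetical helpers, not existing Pulsar methods.

```python
from typing import List


def seek_before(message_pos: int, n: int) -> int:
    """Return the position n messages before message_pos, clamped to the topic start."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return max(0, message_pos - n)


def read_n_before(topic: List[str], message_pos: int, n: int) -> List[str]:
    """Move the cursor back with seek_before, then read forward up to
    (but not including) message_pos."""
    start = seek_before(message_pos, n)
    return topic[start:message_pos]


# Hypothetical topic holding m0..m9.
topic = [f"m{i}" for i in range(10)]
previous_three = read_n_before(topic, message_pos=7, n=3)
```

With this approach the messages come back oldest first, and at most N of them are held at once. It fits when N is known up front, but not when reading must stop on a condition discovered along the way.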
On Tue, Mar 7, 2023 at 6:50 AM Alexandre DUVAL <ka...@gmail.com> wrote:
>
> Hi Yong,
>
> The goal is to start from a known MessageId and read the N message
> before this MessageId.
>
> Best,
>
> Kannar
>
>
> On 3/7/23 01:53, Yong Zhang wrote:
> > Hi Kannar,
> >
> > Just interested in what exactly your case.
> >
> > Why do you need to read messages in a reversed order? What is your case?
> >
> > Best,
> > Yong
> >
> > On Mon, 6 Mar 2023 at 23:37, Alexandre DUVAL<ka...@gmail.com> wrote:
> >
> >> Hi,
> >>
> >> I'm wondering if it is possible to introduce a new feature on Pulsar
> >> which will enable users to read topic from a defined MessageId to
> >> previous messages until the begin of the topic.
> >>
> >> I tried to use Pulsar SQL but it requires so much RAM even for little
> >> queries (due to Presto design).
> >>
> >> Currently, every read in Pulsar are expected to be going forward. So it
> >> might be a bit tricky to prevent every weird behavior by introducing the
> >> feature.
> >>
> >> I'm currently tried to make an MVP/POC by introducting a readReverse
> >> field in the CommandSubscribe that is used by ReaderAPI and currently
> >> looking for to create a getFirstMessageId() on ManagedLedger
> >> (https://github.com/CleverCloud/pulsar/pull/3). I also removed
> >> startPosition < endPosition sanity checks in BookKeeper locally
> >> (https://github.com/CleverCloud/bookkeeper/pull/2).
> >>
> >> We definitely prefer a readPrevious(), hasPreviousMessageAvailable() in
> >> the ReaderAPI.
> >>
> >> I'm not familiar with these internals such as NonDurableCursor,
> >> RangeEntryCache, ManagedCursor so it's a bit tricky.
> >>
> >> So I wondering someone to help/guide me or even directly handle the
> >> subject (or the discuss).
> >>
> >> Regards,
> >>
> >> Kannar
> >>
> >>
> >>
> --
> *Alexandre DUVAL*
> @KannarFR
> /+33 6 12 97 19 70/
Re: [DISCUSS] new idea: reverse reading a topic
Posted by Alexandre DUVAL <ka...@gmail.com>.
Hi Yong,
The goal is to start from a known MessageId and read the N messages
before this MessageId.
Best,
Kannar
On 3/7/23 01:53, Yong Zhang wrote:
> Hi Kannar,
>
> Just interested in what exactly your case.
>
> Why do you need to read messages in a reversed order? What is your case?
>
> Best,
> Yong
>
> On Mon, 6 Mar 2023 at 23:37, Alexandre DUVAL<ka...@gmail.com> wrote:
>
>> Hi,
>>
>> I'm wondering if it is possible to introduce a new feature on Pulsar
>> which will enable users to read topic from a defined MessageId to
>> previous messages until the begin of the topic.
>>
>> I tried to use Pulsar SQL but it requires so much RAM even for little
>> queries (due to Presto design).
>>
>> Currently, every read in Pulsar are expected to be going forward. So it
>> might be a bit tricky to prevent every weird behavior by introducing the
>> feature.
>>
>> I'm currently tried to make an MVP/POC by introducting a readReverse
>> field in the CommandSubscribe that is used by ReaderAPI and currently
>> looking for to create a getFirstMessageId() on ManagedLedger
>> (https://github.com/CleverCloud/pulsar/pull/3). I also removed
>> startPosition < endPosition sanity checks in BookKeeper locally
>> (https://github.com/CleverCloud/bookkeeper/pull/2).
>>
>> We definitely prefer a readPrevious(), hasPreviousMessageAvailable() in
>> the ReaderAPI.
>>
>> I'm not familiar with these internals such as NonDurableCursor,
>> RangeEntryCache, ManagedCursor so it's a bit tricky.
>>
>> So I wondering someone to help/guide me or even directly handle the
>> subject (or the discuss).
>>
>> Regards,
>>
>> Kannar
>>
>>
>>
--
*Alexandre DUVAL*
@KannarFR
/+33 6 12 97 19 70/
Re: [DISCUSS] new idea: reverse reading a topic
Posted by Yong Zhang <zh...@gmail.com>.
Hi Kannar,
I'm just interested in what exactly your use case is.
Why do you need to read messages in reverse order?
Best,
Yong
On Mon, 6 Mar 2023 at 23:37, Alexandre DUVAL <ka...@gmail.com> wrote:
> Hi,
>
> I'm wondering if it is possible to introduce a new feature on Pulsar
> which will enable users to read topic from a defined MessageId to
> previous messages until the begin of the topic.
>
> I tried to use Pulsar SQL but it requires so much RAM even for little
> queries (due to Presto design).
>
> Currently, every read in Pulsar are expected to be going forward. So it
> might be a bit tricky to prevent every weird behavior by introducing the
> feature.
>
> I'm currently tried to make an MVP/POC by introducting a readReverse
> field in the CommandSubscribe that is used by ReaderAPI and currently
> looking for to create a getFirstMessageId() on ManagedLedger
> (https://github.com/CleverCloud/pulsar/pull/3). I also removed
> startPosition < endPosition sanity checks in BookKeeper locally
> (https://github.com/CleverCloud/bookkeeper/pull/2).
>
> We definitely prefer a readPrevious(), hasPreviousMessageAvailable() in
> the ReaderAPI.
>
> I'm not familiar with these internals such as NonDurableCursor,
> RangeEntryCache, ManagedCursor so it's a bit tricky.
>
> So I wondering someone to help/guide me or even directly handle the
> subject (or the discuss).
>
> Regards,
>
> Kannar
>
>
>