Posted to dev@pulsar.apache.org by Sijie Guo <gu...@gmail.com> on 2021/02/03 05:20:56 UTC

Re: [E] Re: [PIP-78] Split the individual acknowledgments into multiple entries

@Rajan Dhabalia <rd...@apache.org>

We discussed this PIP in the community meeting and agreed that we need to
evaluate it one more time to see whether we can combine it with the bitmap
proposal as the final solution.

We should think about more use cases before pushing the bitmap approach.
Hence let's spend more time reviewing Lin's proposal and see if we can find
an approach there.

If there is no way to find a solution that solves all the problems, I would
still suggest having an interface that allows different algorithms to be
implemented. We can always have a basic algorithm that works for most use
cases, while allowing advanced users to plug in a solution that works for
the others.
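
As a rough illustration of the pluggable approach described above, here is a
minimal sketch. It is not taken from the PIP or from the Pulsar codebase:
the names AckTracker, SetAckTracker and BitmapAckTracker are hypothetical,
and the two implementations only stand in for "a basic algorithm" and "a
bitmap-style alternative" behind one shared interface.

```python
from abc import ABC, abstractmethod


class AckTracker(ABC):
    """Hypothetical interface: records individually acknowledged entry ids."""

    @abstractmethod
    def ack(self, entry_id: int) -> None:
        """Mark one entry as acknowledged."""

    @abstractmethod
    def is_acked(self, entry_id: int) -> bool:
        """Return True if the entry was acknowledged."""


class SetAckTracker(AckTracker):
    """Basic implementation: one set element per acknowledged entry."""

    def __init__(self) -> None:
        self._acked = set()

    def ack(self, entry_id: int) -> None:
        self._acked.add(entry_id)

    def is_acked(self, entry_id: int) -> bool:
        return entry_id in self._acked


class BitmapAckTracker(AckTracker):
    """Alternative implementation: one bit per entry id, packed into an int."""

    def __init__(self) -> None:
        self._bits = 0

    def ack(self, entry_id: int) -> None:
        self._bits |= 1 << entry_id

    def is_acked(self, entry_id: int) -> bool:
        return bool((self._bits >> entry_id) & 1)


def make_tracker(kind: str) -> AckTracker:
    """Pick an implementation by name, as a configuration setting might."""
    trackers = {"set": SetAckTracker, "bitmap": BitmapAckTracker}
    return trackers[kind]()
```

Callers depend only on AckTracker, so either implementation can be swapped in
without touching the acknowledgment path that uses it.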

https://docs.google.com/document/d/19dXkVXeU2q_nHmkG8zURjKnYlvD96TbKf5KjYyASsOE/edit#bookmark=id.ekbmsiqz25jm

Thanks,
Sijie

On Fri, Jan 29, 2021 at 1:59 AM Sijie Guo <gu...@gmail.com> wrote:

> Rajan - I understand your concern about memory. Lin, Penghui, and I also
> acknowledged that none of the implementations solves every use case. Each
> implementation has its own limitations and concerns.
>
> I am trying to find a way for both parties (you and Lin) to explore
> different implementations. An abstraction/interface sounds like a
> reasonable approach to take, no? We have a similar situation in the delayed
> message scheduler. The current queue-based implementation works for certain
> workloads but doesn't work for delayed messages spanning a long time range.
> But the delayed message scheduler is an interface that allows people to
> implement a better solution (e.g. a hashed-wheel-based implementation).
> Does that make sense?
>
> Thanks,
> Sijie
>
>
> On Thu, Jan 28, 2021 at 11:54 AM Rajan Dhabalia <rd...@apache.org>
> wrote:
>
>> My only point was that if the broker tries to manage more than 10M unacked
>> message ranges per subscription, it will face many negative consequences
>> in terms of memory and CPU, and we will then have to take on additional
>> debt to solve those problems. I therefore consider 10M unacked message
>> ranges more than enough for any application. If they are not, we should
>> prevent those applications from going beyond such thresholds, so that
>> back-pressure on unacked messages is enforced and applications cannot
>> build a cache on the broker side.
>>
>> We have seen multiple issues in the past due to large numbers of unacked
>> messages in memory. To handle such abuse, brokers have configurations that
>> cap unacked messages per consumer (maxUnackedMessagesPerConsumer: default
>> 50K) and per subscription (maxUnackedMessagesPerSubscription: default
>> 200K). So brokers already have large enough limits for storing unacked
>> messages, and going beyond those limits causes scaling issues in brokers,
>> so it is better to keep the system stable and safe.
>>
>> Thanks,
>> Rajan
>>
>> On Thu, Jan 28, 2021 at 9:50 AM Sijie Guo <gu...@gmail.com> wrote:
>>
>> > Agreed with Lin. I think we should try to abstract this into an
>> > interface and allow different implementations.
>> >
>> > Rajan - what is your real concern with making it abstract?
>> >
>> > - Sijie
>> >
>> > On Wed, Jan 27, 2021 at 7:37 PM Lin Lin <li...@apache.org> wrote:
>> >
>> > > Hi Rajan,
>> > > Thank you for your PR.
>> > > The main differences are whether 10MB is enough and the memory-doubling
>> > > problem, and they stem from different business scenarios.
>> > > In some business scenarios, a QPS of 20k/s is considered very low, and
>> > > request rates exceeding this order of magnitude are common.
>> > > If the limit is only increased to 10MB, the time it takes to exceed the
>> > > threshold only changes from 30 seconds to 60 seconds, and the problems
>> > > described in the PIP are still not solved.
>> > > "Large enough" may be based on your scenario; in some scenarios it is
>> > > not enough in most cases...
>> > > Because the problem has not been solved, I suggest abstracting this so
>> > > that different people can choose.
>> > > Your PR is an improvement to the current performance; there is no
>> > > conflict between the two.
>> > >
>> > > Thanks
>> > >
>> > > On 2021/01/27 03:50:07, Rajan Dhabalia <rd...@apache.org> wrote:
>> > > > I have created a PR which should allow brokers to store up to 10M
>> > > > unacked message ranges. I think that should be large enough for any
>> > > > use case, so we probably no longer need to introduce an abstraction
>> > > > for ack management, which also avoids adding complexity to the
>> > > > message acknowledgement path.
>> > > > https://github.com/apache/pulsar/pull/9292
>> > > >
>> > > > Thanks,
>> > > > Rajan
>> > >
>> > >
>> >
>>
>

Re: [E] Re: [PIP-78] Split the individual acknowledgments into multiple entries

Posted by Sijie Guo <gu...@gmail.com>.

Matteo, Rajan, Joe, Addison, Penghui, and Jia met today to discuss this
issue. We came up with the following next steps:

In general, we have agreed to proceed with the PIP.

1. Penghui is going to update the PIP to clarify its terminology and
diagrams. He will also write more details about how the LRU cache works.
2. Matteo suggested adding the entry index to the existing PositionInfo. If
the number of unacked ranges is below a certain threshold, use the existing
approach, and only move to the split approach once it goes beyond the
threshold.
3. We agreed to make sure we still have a cap on the maximum number of
unacked ranges.
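
To make items 2 and 3 above concrete, here is a minimal sketch of the
decision they describe. It is only an illustration under stated assumptions:
the function name choose_persistence, the Persistence enum, and the
parameters split_threshold and max_ranges are hypothetical, and the notes do
not say what happens exactly at the threshold or once the cap is exceeded
(this sketch switches to the split approach at the threshold and rejects
above the cap).

```python
from enum import Enum


class Persistence(Enum):
    """Hypothetical outcomes for persisting a cursor's unacked ranges."""

    SINGLE_ENTRY = "single-entry"  # existing approach
    SPLIT = "split"                # split across multiple entries
    REJECT = "reject"              # over the cap on unacked ranges


def choose_persistence(
    unacked_ranges: int, split_threshold: int, max_ranges: int
) -> Persistence:
    """Pick a persistence approach from the current number of unacked ranges."""
    if unacked_ranges > max_ranges:
        # Item 3: a cap still applies regardless of the storage approach.
        return Persistence.REJECT
    if unacked_ranges < split_threshold:
        # Item 2: below the threshold, keep the existing approach.
        return Persistence.SINGLE_ENTRY
    # Item 2: at or beyond the threshold (an assumption), use the split approach.
    return Persistence.SPLIT
```

With placeholder values such as split_threshold=100 and max_ranges=1000, a
cursor with 10 unacked ranges keeps the existing approach, one with 500 moves
to the split approach, and one with 1001 hits the cap.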

If I missed anything, feel free to add it here.

The discussion recording is also available at
https://github.com/apache/pulsar/wiki/Community-Meetings#pip-discussions

Thanks,
Sijie

