Posted to dev@pulsar.apache.org by Lari Hotari <lh...@apache.org> on 2023/11/24 10:03:40 UTC

Pulsar Community Meeting minutes 2023/11/23

Notice: Draft minutes pending review - please suggest any corrections or
additions by replying to this email thread.

-  Attendees:
   -  Girish Sharma
   -  YuWei Sung
   -  Apurva T
   -  Asaf Mesika
   -  Lari Hotari
   -  Chris Bono

-  Agenda

   -  PIP-310 and rate limiting improvements

      -  Pulsar Rate Limiter requirements by Girish -
            https://docs.google.com/document/d/1-y5nBaC9QuAUHKUGMVVe4By-SmMZIL4w09U1byJBbMc

      -  Lari to present a summary of views on PIP-310. These are
            documented in the blog post “Apache Pulsar service level
            objectives and rate limiting”
            <https://codingthestreams.com/pulsar/2023/11/22/pulsar-slos-and-rate-limiting.html>.
            Please read the blog post before the meeting as preparation.

-  Meeting Minutes:

   -  Girish presented the background and the problem with the current
         rate limiters by going over the Pulsar Rate Limiter document
         <https://docs.google.com/document/d/1-y5nBaC9QuAUHKUGMVVe4By-SmMZIL4w09U1byJBbMc>.
         The conclusion is that there’s a need for supporting bursting
         while keeping the allowed bursting on a single broker under the
         limit of what the broker can do.

   -  Related to how the combined bursting of all topics in a broker
         could be kept under the limits of a broker, Lari added that in
         Confluent Kora, there's a concept called dynamic quota
         management that is described in the Kora paper section 5.2.2
         <http://vldb.org/pvldb/vol16/p3822-povzner.pdf#page=11>: "Kora
         addresses this issue by using a dynamic quota mechanism that
         adjusts bandwidth distribution based on a tenant’s bandwidth
         consumption."

      -  While bursting, the remaining available capacity on the broker
            could be proportionally split based on the configured topic
            rates.

      -  Girish added that in their case, the topics that should be
            prioritized in bursting aren’t the ones with the highest
            throughput.

      -  In the future, topics would need SLA/SLO (Service Level
            Agreement / Service Level Objective) metadata that would
            help Pulsar make proper prioritization decisions in these
            types of scenarios.

   -  Girish continued explaining the details of the rate limiting
         bursting requirements by going over the document. It contains
         valuable findings and observations that will be very helpful in
         improving the Pulsar rate limiting solution. The document goes
         beyond PIP-310 and explains the requirements from the
         perspective of Girish’s organization.

   -  After going over Girish's Pulsar Rate Limiter document, there was
         a discussion about the next steps.

   -  There was a consensus that the default (“polling”) rate limiter
         option in Pulsar is unusable in practice and this needs to be
         addressed in the Pulsar core (see Girish’s analysis in the
         document section “4.1 Existing pulsar rate limiter”
         <https://docs.google.com/document/d/1-y5nBaC9QuAUHKUGMVVe4By-SmMZIL4w09U1byJBbMc/edit#heading=h.nx692qsf70id>).

   -  The group discussed the next steps in order to make progress.
         There are two separate areas of work: addressing the issues
         with the Pulsar default rate limiters, and addressing the
         requirements that Girish brought up in his presentation of the
         Pulsar Rate Limiter document
         <https://docs.google.com/document/d/1-y5nBaC9QuAUHKUGMVVe4By-SmMZIL4w09U1byJBbMc>.

   -  Lari presented his view on how to address the issues in the Pulsar
         default rate limiters, based on the section “Problems to
         address as the next step” of his blog post “Apache Pulsar
         service level objectives and rate limiting”
         <https://codingthestreams.com/pulsar/2023/11/22/pulsar-slos-and-rate-limiting.html#problems-to-address-as-the-next-step>.

      -  The first goal is to reach feature parity with the current rate
            limiters in Pulsar without introducing breaking changes.

      -  Instead of adding more feature flags to clutter the code base
            and add more complexity, this would be handled as a
            refactoring where the existing internal solution in the
            Pulsar code base is replaced with the new solution that
            addresses the problems explained in the blog post.

      -  The replacement solution for the refactoring has already been
            sufficiently validated (explained in the blog post
            <https://codingthestreams.com/pulsar/2023/11/22/pulsar-slos-and-rate-limiting.html#problems-to-address-as-the-next-step>)
            so that there’s confidence to move forward.

   -  There was a question whether this change could be implemented
         behind a feature flag instead of handling it as a refactoring
         where the old code gets deleted.

      -  Lari thinks that this would be a bad idea in this case since it
            would increase complexity in the code base and make it even
            harder to maintain in the future. He would rather solve this
            with a single, minimal refactoring PR that reaches feature
            parity with the existing solution.

      -  It was discussed that such a PR would be hard to review
            because it could be a large change: the current rate
            limiting touches many parts of the code base.

   -  It was then discussed whether a PIP should be written before
         making further changes in this direction.

      -  There was a discussion about the PIP process. Lari said that
            the process could be adjusted when needed. In this case,
            Lari is planning to proceed by first creating a PR in draft
            mode before writing a PIP. Lari’s opinion is that PIPs could
            also be created in a different order when it makes sense. As
            in other Apache projects, the Pulsar dev mailing list is the
            place where decisions are ultimately made. There was a long
            discussion about the tradeoffs of PIPs and the process.
            (I’m sorry that I couldn’t capture that in the meeting
            notes. Someone also mentioned that Lari’s blog post is
            already almost a PIP.)

      -  Lari explained that creating the draft PR would also show the
            extent of the required changes. Analyzing the required
            changes without actually making them is not practical in
            this case.

      -  **Conclusion 1:** Lari will attempt to create a PR for the
            Pulsar rate limiting refactoring changes in draft mode, and
            then proceed to create a PIP that covers the refactoring.
            The main reason a PIP is needed for this change is that it
            is a large code change touching multiple components, for
            which the PIP process guidelines require a PIP (PIP process
            <https://github.com/apache/pulsar/blob/master/pip/README.md>).

   -  **Conclusion 2:** For Girish’s requirements for rate limiting, it
         was agreed that Girish would start a “parent PIP” which focuses
         on describing the Pulsar rate limiter requirements (outcomes)
         and the problem instead of the solution. Child PIPs could
         follow.
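
As an illustration of the proportional split mentioned in the minutes
(spare broker capacity divided according to the configured topic rates),
here is a minimal Python sketch. It is not taken from Pulsar or from the
Kora paper; the function name, the topic names, and the MB/s figures are
all hypothetical.

```python
def split_spare_capacity(broker_capacity, current_usage, weights):
    """Divide the broker's unused capacity among topics in proportion to weights.

    broker_capacity: what the broker can sustain in total (e.g. MB/s).
    current_usage:   topic -> throughput currently consumed.
    weights:         topic -> weight; here the configured topic rate (an assumption).
    Returns topic -> extra burst allowance on top of the configured rate.
    """
    # Capacity left after current traffic; never negative.
    spare = max(0.0, broker_capacity - sum(current_usage.values()))
    total_weight = sum(weights.values())
    if total_weight <= 0:
        return {topic: 0.0 for topic in weights}
    return {topic: spare * w / total_weight for topic, w in weights.items()}


# Hypothetical broker: 100 MB/s capacity, two topics running at their configured rates.
configured_rates = {"topic-a": 10.0, "topic-b": 30.0}
current_usage = {"topic-a": 10.0, "topic-b": 30.0}

burst = split_spare_capacity(100.0, current_usage, configured_rates)
print(burst)  # {'topic-a': 15.0, 'topic-b': 45.0}
```

Using the configured rates as weights favors high-throughput topics.
Girish's point was that priority does not necessarily follow throughput,
so the weights could instead come from SLO metadata once that exists.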

The next meeting will be held on December 7th, 2023. Everyone is welcome
to join. Here is the Pulsar Community Meeting calendar, which includes
the Zoom link: https://github.com/apache/pulsar/wiki/Community-Meetings.
Please add your agenda proposals to the meeting minutes document. You
can find the link to this document on the community meetings page.

Re: Pulsar Community Meeting minutes 2023/11/23

Posted by Enrico Olivelli <eo...@gmail.com>.
Thanks Lari for sharing

Enrico
