Posted to dev@pulsar.apache.org by "Kramer, Andre" <An...@softwareag.com> on 2020/11/12 13:16:38 UTC

Proposal for Consumer Filtering in Pulsar brokers

Hello everyone,

We at Software AG have prototyped filtering on Consumer subscriptions in the Pulsar broker and are submitting our changes for consideration under the Apache 2.0 license. Please see pull request [Consumer Filtering #8544 https://github.com/apache/pulsar/pull/8544] and the attached write-up. Comments welcome!

Thanks,
Andre

andre.kramer@softwareag.com
This communication contains information which is confidential and may also be privileged. It is for the exclusive use of the intended recipient(s). If you are not the intended recipient(s), please note that any distribution, copying, or use of this communication or the information in it, is strictly prohibited. If you have received this communication in error please notify us by e-mail and then delete the e-mail and any copies of it.
Software AG (UK) Limited Registered in England & Wales 1310740 - http://www.softwareag.com/uk

RE: Proposal for Consumer Filtering in Pulsar brokers

Posted by "Kramer, Andre" <An...@softwareag.com>.
PIP 70 seems to address a different use case, based on "seek"ing rather than filtering. Possibly it could also be used for some sort of basic "tag" to filter on, but I am not sure.

Andre

-----Original Message-----
From: Jia Zhai <zh...@gmail.com>
Sent: 15 November 2020 14:53
To: Dev <de...@pulsar.apache.org>
Subject: Re: Proposal for Consumer Filtering in Pulsar brokers

Hi Andre,
Thanks for this proposal. Besides Sijie's comments, there is also PIP 70:
https://github.com/apache/pulsar/wiki/PIP-70%3A-Introduce-lightweight-raw-Message-metadata
Do you think it could help with this proposal? We have discussed this consumer filter before, and the performance penalty on the broker is also a big concern.


Best Regards.


Jia Zhai

Beijing, China

Mobile: +86 15810491983




On Sat, Nov 14, 2020 at 3:08 AM Sijie Guo <gu...@gmail.com> wrote:

> Andre,
>
> Is it possible to put it in a Google Doc (or similar collaboration
> tool) that allows other people to make comments? Also, it would be
> easier for the committers to copy the PIP to Pulsar wiki pages.
>
> Thanks,
> Sijie
>
> On Fri, Nov 13, 2020 at 2:44 AM Kramer, Andre <Andre.Kramer@softwareag.com> wrote:
>
> > Hi Sijie,
> >
> > I had added a PIP style document to the pull request:
> > https://github.com/andrekramer1/pulsar/blob/consumer-filter2-7-0/PIP-XX%20-%20Consumer-filtering.pdf
> > Hopefully that could be used to start the discussion?
> >
> > Regards,
> > Andre
> >
> > -----Original Message-----
> > From: Sijie Guo <gu...@gmail.com>
> > Sent: 12 November 2020 18:32
> > To: Dev <de...@pulsar.apache.org>
> > Subject: Re: Proposal for Consumer Filtering in Pulsar brokers
> >
> > Hi Andre,
> >
> > I didn't see the attached writeup. Can you write a PIP for this feature?
> > Given it is a big feature, it would be good to discuss it through a PIP.
> >
> > - Sijie
> >

Re: Proposal for Consumer Filtering in Pulsar brokers

Posted by Jia Zhai <zh...@gmail.com>.
Hi Andre,
Thanks for this proposal. Besides Sijie's comments, there is also PIP 70:
https://github.com/apache/pulsar/wiki/PIP-70%3A-Introduce-lightweight-raw-Message-metadata
Do you think it could help with this proposal? We have discussed this
consumer filter before, and the performance penalty on the broker is
also a big concern.


Best Regards.


Jia Zhai

Beijing, China

Mobile: +86 15810491983





Re: Proposal for Consumer Filtering in Pulsar brokers

Posted by Rajan Dhabalia <rd...@apache.org>.
I also have a few concerns with the PIP introducing server-side filtering
and processing in the dispatch path. Filtering on an unbounded tag list will
make computation unpredictable. That will definitely be a problem when
defining the capacity model, and it can create a serious noisy-neighbor
problem in a multi-tenant system.

I think we should add more details about handling batch messages as well.
It will be tricky because filtering is per message, so we have to
deserialize batch messages to route individual messages. Deserializing
messages on the server side will be very expensive, and that will
eventually be a showstopper for a multi-tenant system.

So server-side filtering and processing should be avoided in the Pulsar
codebase. However, I like the idea of making the dispatcher pluggable so it
can be enhanced independently as per user requirements, or we can think
about client-side filtering and auto-acknowledgement to avoid server-side
complexity.

Thanks,
Rajan
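
The client-side alternative mentioned above can be sketched roughly as follows. This is only an illustration under stated assumptions: `Message`, `filter_and_auto_ack`, and the `ack` callback are hypothetical stand-ins, not the Pulsar client API. The consumer receives every message, keeps those whose properties match, and immediately acknowledges the rest so that they are not redelivered.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, Iterator


@dataclass
class Message:
    """Hypothetical stand-in for a received message (not the Pulsar client class)."""
    message_id: int
    properties: Dict[str, str] = field(default_factory=dict)
    payload: bytes = b""


def filter_and_auto_ack(
    messages: Iterable[Message],
    wanted: Dict[str, str],
    ack: Callable[[int], None],
) -> Iterator[Message]:
    """Yield messages whose properties contain every wanted key/value pair.

    Non-matching messages are acknowledged right away; matching messages
    are left for the caller to acknowledge after processing.
    """
    for msg in messages:
        if all(msg.properties.get(key) == value for key, value in wanted.items()):
            yield msg
        else:
            ack(msg.message_id)


# Example run with an in-memory list standing in for the subscription.
acked = []
incoming = [
    Message(1, {"region": "eu"}),
    Message(2, {"region": "us"}),
    Message(3, {"region": "eu", "type": "order"}),
]
delivered = [m.message_id for m in filter_and_auto_ack(incoming, {"region": "eu"}, acked.append)]
```

The trade-off is the one raised later in the thread: the broker does no extra work, but every message still crosses the network to the client before it is discarded.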

On Mon, Nov 16, 2020 at 3:35 PM Sijie Guo <gu...@gmail.com> wrote:

> Joe - Very comprehensive writeup!
>
> > So my vote is not to allow this (and any other server side logic
> > implementations) into the base dispatcher, but permit these kinds of
> > changes as configurable dispatchers. I hope I have explained the reasons
> > for that vote clearly.
>
> +1. We can do it in a way like what we did for protocol handlers.
>
> Thanks,
> Sijie
>
> On Mon, Nov 16, 2020 at 10:40 AM Joe F <jo...@apache.org> wrote:
>
> > We have had discussions on the community list about server side logic
> > previously. I would like to set the specific proposal in this PIP aside
> > and address what this PIP is implicitly changing in the core Pulsar
> > design. I want to have an explicit discussion on that topic: what is the
> > path for server-side business logic in Pulsar?
> >
> > Pulsar has been designed to do a few things very well. It is designed to
> > be run as a hosted service, meaning it can be scaled horizontally by
> > adding storage or compute hardware as traffic or tenants on the service
> > grow. It is optimized for data streaming at throughput and scale, and
> > does multi-tenancy extremely well. Part of that design is that there is
> > no business logic in the data flow path. Since business logic lives
> > outside of the core data flow path in Pulsar, the core is optimized for
> > data flow. Do plain byte movement - no ser/de, no byte copy, no
> > computations - and do it extremely well. Other systems, like Kafka and
> > Kinesis, have taken the same approach: no to server side business logic.
> >
> > This particular PIP may be expensive on the server, or not. The next PIP
> > could be, and there is no rationale to stop adding any kind of business
> > logic into the broker once this concept is allowed.
> >
> > Selective consumers are an anti-pattern for data flow systems. There are
> > systems out there that support implementation of business logic in the
> > data flow path, and they don't scale. Take the example of AMQ. AMQ allows
> > JMS/SQL-92 expressions server side. Once the door to this anti-pattern is
> > opened, there is no rhyme or reason to deny anything, up to and including
> > a full-blown SQL query evaluation in the dispatch path.
> >
> > So why not allow that? Why not allow a full-blown expression evaluation
> > in the data flow path?
> >
> > Unfortunately there is no way to answer this without bringing up the
> > conflict of interest between small users and large scale users running
> > multi-tenant hosted Pulsar at huge traffic volumes.
> >
> > For low scale, single (or few) tenant installations, efficiency of flow,
> > latency and throughput are not the driving concern. In a small cluster,
> > the implications for cost and scale are minimal in absolute terms when
> > server side business logic is executed.
> >
> > For large scale users (like me) this is a no go. There are many problems
> > with this that make it very difficult to run a hosted platform with
> > predictable SLAs once users can introduce business logic into the broker.
> > These are on top of the performance and cost implications.
> >
> > First, broker throughput and performance become unpredictable. The
> > current Pulsar load model (which is used in the load manager for load
> > balancing) becomes unusable. Not only that, there will be no pre-computed
> > model that can be used in the load manager. Since the producer and
> > consumer randomly decide what the business logic is, and the computation
> > can change based on the data, the model itself becomes dynamic and the
> > load manager has to rebuild the model any time a user updates the
> > business logic. That is a tall order, worth years of work to implement.
> >
> > Second, this introduces the noisy neighbor issue. Two tenants will
> > happily run on the same broker until one of them decides to change the
> > logic on the subscription, and suddenly the quality for the other tenant
> > is degraded because the broker is impacted. The system operator of the
> > cluster now has to get involved out of the blue because one tenant made
> > a change. Basically, any tenant can disrupt the system by triggering
> > additional business logic in the server, or by specific data patterns
> > that can make the business logic expensive on the server.
> >
> > Third, this makes provisioning capacity impossible. Today Pulsar users
> > can be provisioned on flow: bandwidth in/out and messages in/out. With
> > server side business logic, there is some random overhead that needs to
> > be accounted for in the capacity calculation.
> >
> > We, who run Pulsar as a hosted service, do not want any of our tenants to
> > introduce server side logic into the service, because doing it well
> > requires a load balancer that can continuously and dynamically adjust its
> > load model and capacity model (based on ML on the traffic, maybe). The
> > scope of building such a system will convert Pulsar from a data streaming
> > project to a load balancer/resource manager project. The only viable
> > solution will be to give each tenant their own dedicated servers, at
> > which point all claims to multi-tenancy in Pulsar should be dropped.
> >
> > So large multi-tenant clusters will have big problems with the addition
> > of business logic into the broker.
> >
> > But this problem - Pulsar users attempting to add server side logic into
> > Pulsar - is not going to go away. There will always be yet another new
> > user who will ask for adding 'one more simple implementation' of server
> > side business logic into the broker.
> >
> > My suggestion here is simple. Make the dispatcher a configurable module.
> > Let users who want to do server side logic configure their own
> > computational logic in custom dispatchers and use it for their needs.
> > Allow users to implement custom dispatchers as a loadable module. Users
> > can then implement whatever logic they need without depending on Pulsar,
> > and the code and module will remain in user-land rather than Pulsar land.
> > No one will be required to contribute their dispatchers to Pulsar, but if
> > there are specific dispatchers which can have widespread use, they can be
> > contributed back into Pulsar (like connectors).
> >
> > If this seems suspiciously similar to functions, then yes, it is.
> > Functions were meant to fulfill this need, but without messing with the
> > dispatcher. Functions were meant to do business logic outside the hosted
> > service, so that the service itself is not impacted by random users
> > injecting business logic into the platform.
> >
> > But if functions are not acceptable, and users still want to mess with
> > the dispatcher, what I am proposing is a way to let users do that without
> > breaking the design goals of Pulsar. That will avoid impacting the core
> > data flow path for large system / hosted service / multi-tenant use cases.
> >
> > So my vote is not to allow this (and any other server side logic
> > implementations) into the base dispatcher, but permit these kinds of
> > changes as configurable dispatchers. I hope I have explained the reasons
> > for that vote clearly.
> >
> >
> > Joe
> >
> >
> > On Mon, Nov 16, 2020 at 10:03 AM Sijie Guo <gu...@gmail.com> wrote:
> >
> > > Andre,
> > >
> > > I left a comment on the pull request, but I will copy it here as
> > > well.
> > >
> > > I have a couple of comments and one suggestion.
> > >
> > > 1. What are the performance and GC implications of this change? I think
> > > most of the questions on this pull request are about the performance
> > > and GC implications. It would be good to show your benchmarking/testing
> > > methodology and the benchmark results to the community.
> > >
> > > 2. How are you going to handle topics with end-to-end encryption
> > > enabled?
> > >
> > > 3. How do you handle acknowledgment for the messages that have been
> > > filtered out and never sent to the consumers? I don't see it discussed
> > > in the PIP. In particular, how does it relate to the different
> > > subscription types?
> > >
> > > One suggestion: if this PIP is approved, my recommendation is to use
> > > the NAR classloader to load the class. You can check how Pulsar uses
> > > the NAR classloader for other interfaces.
> > >
> > > Thanks,
> > > Sijie
> > >
> > > On Mon, Nov 16, 2020 at 2:53 AM Kramer, Andre <Andre.Kramer@softwareag.com> wrote:
> > >
> > > > Sure, please feel free to copy the doc to wiki pages. It's mainly
> > > > text so can be converted easily.
> > > >
> > > > Cheers,
> > > > Andre
> > > >

RE: Proposal for Consumer Filtering in Pulsar brokers

Posted by "Kramer, Andre" <An...@softwareag.com>.
Hi,

The specific Consumer Filtering proposal is not to allow server extension but to provide a simple way to filter messages by property values as specified by any consumer. The pull request does include a way to plug in an alternative implementation, but that is mainly for experimenting with alternatives and for easily stubbing out the feature; it is configured per broker/server and is not user-definable.
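
As a rough illustration of such a per-broker switch (not the mechanism in the pull request, whose details are not shown in this thread), a broker could select one filter implementation from its own configuration and fall back to a pass-through stub. The setting name `consumerFilterImplementation`, the `FILTERS` registry, and both filter functions are hypothetical.

```python
from typing import Callable, Dict, FrozenSet, Mapping

# A filter decides whether a message, represented here only by its
# properties, is dispatched to a consumer that subscribed with `tags`.
MessageFilter = Callable[[Mapping[str, str], FrozenSet[str]], bool]


def pass_through(properties: Mapping[str, str], tags: FrozenSet[str]) -> bool:
    """Stub that disables filtering: every message is dispatched."""
    return True


def value_match(properties: Mapping[str, str], tags: FrozenSet[str]) -> bool:
    """Dispatch when no tags were given or any property value equals a tag."""
    return not tags or any(value in tags for value in properties.values())


# Hypothetical registry of implementations known to the broker.
FILTERS: Dict[str, MessageFilter] = {
    "none": pass_through,
    "value-match": value_match,
}


def load_filter(broker_config: Mapping[str, str]) -> MessageFilter:
    """Pick the filter named in the broker configuration (operator-controlled)."""
    name = broker_config.get("consumerFilterImplementation", "none")
    if name not in FILTERS:
        raise ValueError(f"unknown filter implementation: {name}")
    return FILTERS[name]


default_filter = load_filter({})
tag_filter = load_filter({"consumerFilterImplementation": "value-match"})
```

Because the choice is read from broker configuration, only the operator can change it; consumers supply tags but cannot load code.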

The filter-by-tags feature itself obviously has server costs, but these should be bounded by the worst case of checking all tags and passing through all messages, and they grow linearly with the size of the metadata (tags in the consumer metadata and properties in the message metadata, neither of which is currently bounded).

When evaluating Pulsar, a lot of projects will, like ours, probably list the ability to filter messages by simple string value matching as a requirement. Functions are not really an alternative if you need a lot of topics to be filtered or don't know what tags the consumers will be looking for in the future. Client-side filtering is not an option if the client is a low-bandwidth, higher-latency hop away.

It would be great if this could be achieved by a more general mechanism, but our requirement is definitely simple filtering by property values, matched against consumer-specified tags, on any topic (unless properties are encrypted or otherwise obscured).
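
To make the cost argument concrete, here is a minimal sketch of value matching. It assumes, purely for illustration, that a message passes when any consumer tag equals one of its property values; the exact matching rule in the pull request may differ. The names `matches` and `dispatch` are hypothetical. The work per message is one pass over its properties plus at most one lookup per tag, which is the linear worst case described above.

```python
from typing import Dict, Iterable, List, Tuple


def matches(tags: Iterable[str], properties: Dict[str, str]) -> Tuple[bool, int]:
    """Return (matched, tag_checks) for one message.

    Assumed rule: the message matches when any tag equals a property value.
    """
    values = set(properties.values())  # one pass over the message properties
    checks = 0
    for tag in tags:
        checks += 1
        if tag in values:
            return True, checks
    return False, checks  # worst case: every tag was checked


def dispatch(messages: Iterable[Dict[str, str]], tags: Iterable[str]) -> List[Dict[str, str]]:
    """Keep only the messages (given as property maps) that match the tags."""
    tag_list = list(tags)
    return [props for props in messages if matches(tag_list, props)[0]]


selected = dispatch(
    [{"colour": "red"}, {"colour": "blue"}, {"colour": "green", "size": "xl"}],
    ["blue", "xl"],
)
```

Here `matches(["x", "y", "z"], {"k": "b"})` checks all three tags before rejecting the message, so an unbounded tag list translates directly into unbounded per-message work on the broker.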

Regards,
Andre

-----Original Message-----
From: Sijie Guo <gu...@gmail.com>
Sent: 16 November 2020 23:36
To: Dev <de...@pulsar.apache.org>
Subject: Re: Proposal for Consumer Filtering in Pulsar brokers

Joe - Very comprehensive writeup!

> So my vote is not to allow this (and any other server side logic
implementations) into the base dispatcher, but permit these kinds of changes as configurable dispatchers. I hope I have explained the reasons for that vote clearly.

+1. We can do it in a way like what we did for protocol handlers.

Thanks,
Sijie

On Mon, Nov 16, 2020 at 10:40 AM Joe F <jo...@apache.org> wrote:

> We have had discussions in the community list on server side logic
> previously. I would like to keep the specific proposal in this PIP
> aside, and address what this PIP is  implicitly changing in core
> Pulsar design.  I want to have an explicit discussion on that topic:
> what is the path for server-side business logic in Pulsar?
>
> Pulsar has been designed to do a few things very well.  It is designed
> to be run as a hosted service, meaning it can be scaled horizontally
> by adding storage or compute hardware, as traffic or tenants on the
> service grows. It is optimized for data streaming at  throughput and
> scale,  and does multi-tenancy extremely well.  Part of that design is
> that there is no business logic that is in the data flow path. Since
> business logic lives outside of the core data flow path in Pulsar, the
> core is optimized for data flow. Do plain byte movement - no ser/de,
> no byte copy, no computations - and do it extremely well. Other
> systems, like Kafka and Kinesis have taken the same approach;  no to server side business logic.
>
> This particular PIP  may be  expensive on the server, or not. The next
> PIP could be, and there is no rationale to stop adding any kind of
> business logic into the broker, once this concept is allowed.
>
> Selective consumers are an anti-pattern for data flow systems. There
> are systems out there that support implementation of business logic in the data
> flow path, and they don't scale.   Take the example of AMQ.   AMQ allows
> JMS/SQL-92 expressions server side. Once the door to this anti-pattern
> is opened, there is no rhyme or reason to deny anything, upto
> including a full-blown SQL query evaluation in the dispatch path.
>
> So why not allow that? Why not allow a full blown expression
> evaluation in the data flow path?
>
> Unfortunately there  is no way to answer this without bringing up the
> conflict of interest between small users vs. large scale users running
> multi-tenant hosted Pulsar, at huge traffic volumes.
>
> For low scale, single (or few) tenant installations, efficiency of
> flow, latency and throughput are not the driving concern. In a small
> cluster, the implications of cost and scale, is minimal in absolute
> terms,  when server side business logic is executed.
>
> For large scale users (like me) this is a no go. There are many
> problems with this,  that makes it very difficult to run a hosted
> platform with predictable  SLAs, once users can introduce business logic into the broker.
> These are on top of the performance and cost  implications
>
> First, broker throughput and performance becomes unpredictable.  The
> current Pulsar load model (and it is used in the load manager for load
> balancing) becomes unusable. Not only that, there will be no
> pre-computed model that can be used in the load manager. Since  the
> producer and consumer randomly decide on what is the business
> logic,and the computation can change based on the data,  the model
> itself becomes dynamic and the load manager has to rebuild the model
> anytime an user updates the business logic. That is a tall order, worth years of work to implement.
>
> Second, this introduces the noisy neighbor issue. Two tenants will
> happily run on the same broker, till one of them decides to change the
> logic on the subscription, and suddenly the  quality for the other
> tenant is degraded because the broker is impacted.  The system
> operator of the cluster has now to get involved out of the blue, because one tenant did a change.
> Basically  any tenant can disrupt the system by triggering additional
> business logic in the server, or by specific data patterns that can
> make the business logic expensive on the server
>
> Third, this makes provisioning capacity impossible. Today Pulsar users
> can be provisioned on flow - bw in/out. Msgs in/out.  With server side
> business logic, there is some random overhead that needs to be
> accounted in the capacity calculation.
>
> We, who run Pulsar as a hosted service, do not want any of our tenants
> to introduce server side logic into the service.  Because,  to do it
> well requires a load balancer that can continuously and dynamically
> adjust its load model and capacity model (based on ML on the traffic
> maybe).  The scope of building such a system will convert Pulsar  from
> a  data streaming project  to a load balancer/resource manager
> project. The only viable solution will be to give each tenant their
> own dedicated servers - at which point all claims to multi-tenancy in Pulsar  should be dropped.
>
>
> So large multi-tenant clusters will have big problems with the
> addition of business logic into the broker.
>
> But this problem - Pulsar users attempting to add server side logic
> into Pulsar - is not going to go away. There will always be yet
> another new user who will ask for adding ‘one more simple
> implementation' of server side business logic into the broker.
>
> My suggestion here is simple. Make the dispatcher a configurable module.
> Let users who want to do server side logic configure their own
> computational logic in custom dispatchers and   use it to their needs.
> Allow users  to implement custom dispatchers as a loadable module.
> Users can then implement whatever logic they need to, without
> depending on Pulsar, and the code and module will remain in user-land
> rather than Pulsar land.  No one will be required to  contribute their
> dispatchers to Pulsar, but if there are specific dispatchers which can
> have widespread use, they can contribute it back into Pulsar (like
> connectors)
>
> If this seems suspiciously similar to functions, then yes, it is.
> Functions were meant to fulfill this need, but without messing with the dispatcher.
> Functions were meant to do business logic outside the hosted service,
> so that the service itself is not impacted by random users injecting
> business logic into the platform.
>
> But if functions are not acceptable, and users still want to mess with
> the dispatcher, what I am proposing is a way to let users  do that
> without breaking the design goals of Pulsar.  That will avoid
> impacting the core data flow path,  for large system/ hosted service/multi-tenant use cases.
>
> So my vote is not to allow this (and any other server side logic
> implementations) into the base dispatcher, but permit these kinds of
> changes as configurable dispatchers. I hope I have explained the
> reasons for that vote clearly.
>
>
> Joe
>
>
> On Mon, Nov 16, 2020 at 10:03 AM Sijie Guo <gu...@gmail.com> wrote:
>
> > Andre,
> >
> > I left a comment on the pull request. But I will just copy them here
> > as well.
> >
> > I have a couple of comments and one suggestion.
> >
> > 1. What is the performance & GC implication with this change? I
> > think
> most
> > of the questions on this pull request is about the performance & GC
> > implication. It would be good to show your benchmarking/testing
> methodology
> > and the benchmark results to the community.
> >
> > 2. How are you going to handle topics with end-to-end encryption enabled?
> >
> > 3. How do you handle acknowledgment for the messages that have been
> > filtered out and never sent to the consumers? I don't see it
> > discussed in the PIP. In particular, how does it relate to the
> > different subscription types?
> >
> > One suggestion - If this PIP is approved, my recommendation is to
> > use the NAR classloader to load the class. You can check how Pulsar
> > uses NAR classloader for other interfaces.
> >
> > Thanks,
> > Sijie
> >
> > On Mon, Nov 16, 2020 at 2:53 AM Kramer, Andre
> > <Andre.Kramer@softwareag.com> wrote:
> >
> > > Sure, please feel free to copy the doc to wiki pages. It's mainly
> > > text so can be converted easily.
> > >
> > > Cheers,
> > > Andre
> > >
> > > -----Original Message-----
> > > From: Sijie Guo <gu...@gmail.com>
> > > Sent: 13 November 2020 19:08
> > > To: Dev <de...@pulsar.apache.org>
> > > Subject: Re: Proposal for Consumer Filtering in Pulsar brokers
> > >
> > > Andre,
> > >
> > > Is it possible to put it in a Google Doc (or similar collaboration
> > > tool) that allows other people to make comments? Also, it would be
> > > easier for the committers to copy the PIP to Pulsar wiki pages.
> > >
> > > Thanks,
> > > Sijie
> > >
> > > On Fri, Nov 13, 2020 at 2:44 AM Kramer, Andre
> > > <Andre.Kramer@softwareag.com> wrote:
> > >
> > > > Hi Sijie,
> > > >
> > > > I had added a PIP style document to the pull request:
> > > > https://github.com/andrekramer1/pulsar/blob/consumer-filter2-7-0/PIP-XX%20-%20Consumer-filtering.pdf
> > > > Hopefully that could be used to start the discussion?
> > > >
> > > > Regards,
> > > > Andre
> > > >
> > > > -----Original Message-----
> > > > From: Sijie Guo <gu...@gmail.com>
> > > > Sent: 12 November 2020 18:32
> > > > To: Dev <de...@pulsar.apache.org>
> > > > Subject: Re: Proposal for Consumer Filtering in Pulsar brokers
> > > >
> > > > Hi Andre,
> > > >
> > > > I didn't see the attached writeup. Can you write a PIP for this
> > > > feature? Given it is a big feature, it would be good to discuss
> > > > it through a PIP.
> > > >
> > > > - Sijie
> > > >
> > > > On Thu, Nov 12, 2020 at 6:17 AM Kramer, Andre
> > > > <Andre.Kramer@softwareag.com> wrote:
> > > >
> > > > > Hello everyone,
> > > > >
> > > > >
> > > > >
> > > > > We at Software AG have prototyped adding filtering on Consumer
> > > > > subscriptions in the Pulsar broker and are submitting our
> > > > > changes for consideration under Apache 2.0 license. Please see
> > > > > pull request [Consumer Filtering #8544
> > > > > https://github.com/apache/pulsar/pull/8544]
> > > > > and attached write up. Comments welcome!
> > > > >
> > > > >
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Andre
> > > > >
> > > > >
> > > > >
> > > > > andre.kramer@softwareag.com

Re: Proposal for Consumer Filtering in Pulsar brokers

Posted by Sijie Guo <gu...@gmail.com>.
Joe - Very comprehensive writeup!

> So my vote is not to allow this (and any other server side logic
> implementations) into the base dispatcher, but permit these kinds of
> changes as configurable dispatchers. I hope I have explained the reasons
> for that vote clearly.

+1. We can do it the same way we did for protocol handlers.
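
A minimal sketch of the configurable-dispatcher idea, assuming a broker that
instantiates a dispatcher class named in its configuration. Every name here
(MessageDispatcher, PassThroughDispatcher, PrefixFilterDispatcher, load) is
hypothetical and is not Pulsar's actual dispatcher or protocol handler API.
A real implementation would presumably load the class from a NAR archive
rather than from the application classpath.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch only: none of these types are part of Pulsar.
final class PluggableDispatcherSketch {

    /** Hypothetical extension point: picks the entries to send to a consumer. */
    interface MessageDispatcher {
        List<String> dispatch(List<String> entries);
    }

    /** Base behaviour: plain pass-through, no per-message logic. */
    public static final class PassThroughDispatcher implements MessageDispatcher {
        @Override
        public List<String> dispatch(List<String> entries) {
            return entries;
        }
    }

    /** User-land dispatcher: forwards only entries with an "order:" prefix. */
    public static final class PrefixFilterDispatcher implements MessageDispatcher {
        @Override
        public List<String> dispatch(List<String> entries) {
            List<String> selected = new ArrayList<>();
            for (String entry : entries) {
                if (entry.startsWith("order:")) {
                    selected.add(entry);
                }
            }
            return selected;
        }
    }

    /** Instantiates the dispatcher class named in (assumed) broker configuration. */
    static MessageDispatcher load(String className) throws ReflectiveOperationException {
        Class<?> cls = Class.forName(className);
        return (MessageDispatcher) cls.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        List<String> entries = Arrays.asList("order:1", "audit:2", "order:3");
        // The class name would come from a config file; here it is hard-wired.
        MessageDispatcher dispatcher = load(PrefixFilterDispatcher.class.getName());
        System.out.println(dispatcher.dispatch(entries));
    }
}
```

The point of the design is that the base dispatcher stays a pass-through,
and any filtering cost is opted into by whoever configures the custom class.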

Thanks,
Sijie


Re: Proposal for Consumer Filtering in Pulsar brokers

Posted by Dave Fisher <wa...@apache.org>.
Sorry for the top post.

I was reading what you wrote and knew where you were heading.

Couldn’t these filtering functions / dispatchers be a special configuration of functions?

(1) Topic
(2) Filtered function
(3) Filtered topic

Isn’t the whole point of Functions to compute, filter, and distribute at any load needed?

And aren’t there other projects that implement solutions for these patterns, like Apache Storm or Heron (Incubating)?
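
The topic -> filter function -> filtered topic arrangement listed above can
be sketched as follows. This is a self-contained model, not code against the
Pulsar Functions API: the in-memory lists stand in for topics, and run() is a
hypothetical stand-in for the function runtime. It relies on the convention
(which, as far as I know, Pulsar Functions also follows) that a function
returning null publishes nothing for that input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical model of "topic -> filter function -> filtered topic".
final class FilterFunctionSketch {

    /** Filter logic: return the message to forward it, or null to drop it. */
    static final Function<String, String> ERRORS_ONLY =
            msg -> msg.contains("ERROR") ? msg : null;

    /**
     * Stand-in for the function runtime: reads every message from the source
     * topic, applies the function, and publishes non-null results to the
     * filtered topic. The broker dispatch path is not involved.
     */
    static List<String> run(List<String> sourceTopic, Function<String, String> fn) {
        List<String> filteredTopic = new ArrayList<>();
        for (String msg : sourceTopic) {
            String out = fn.apply(msg);
            if (out != null) {
                filteredTopic.add(out);
            }
        }
        return filteredTopic;
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList("INFO started", "ERROR disk full", "INFO done");
        // Consumers that only want errors subscribe to the filtered topic.
        System.out.println(run(source, ERRORS_ONLY));
    }
}
```

The trade-off is an extra topic and an extra hop, in exchange for keeping the
filter's compute cost outside the broker.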

Regards,
Dave

PS. You remind me of why I never liked massive procedures (particularly Java) in Oracle RDBMS.




Re: Proposal for Consumer Filtering in Pulsar brokers

Posted by Joe F <jo...@apache.org>.
We have had discussions on the community list about server-side logic
previously. I would like to set the specific proposal in this PIP aside
and address what this PIP implicitly changes in the core Pulsar design. I
want to have an explicit discussion on that topic: what is the path for
server-side business logic in Pulsar?

Pulsar has been designed to do a few things very well. It is designed to
be run as a hosted service, meaning it can be scaled horizontally by adding
storage or compute hardware as traffic or tenants on the service grow. It
is optimized for data streaming at high throughput and scale, and does
multi-tenancy extremely well. Part of that design is that there is no
business logic in the data flow path. Since business logic lives
outside of the core data flow path in Pulsar, the core is optimized for
data flow: do plain byte movement (no ser/de, no byte copy, no
computations) and do it extremely well. Other systems, like Kafka and
Kinesis, have taken the same approach: no to server-side business logic.

This particular PIP may be expensive on the server, or not. The next PIP
could be, and once this concept is allowed there is no rationale for
refusing any other kind of business logic in the broker.

Selective consumers are an anti-pattern for data flow systems. There are
systems out there that support implementation of business logic in the data
flow path, and they don't scale. Take the example of AMQ. AMQ allows
JMS/SQL-92 expressions server side. Once the door to this anti-pattern is
opened, there is no rhyme or reason to deny anything, up to and including a
full-blown SQL query evaluation in the dispatch path.

So why not allow that? Why not allow a full-blown expression evaluation in
the data flow path?

Unfortunately there is no way to answer this without bringing up the
conflict of interest between small users and large-scale users running
multi-tenant hosted Pulsar at huge traffic volumes.

For low-scale, single (or few) tenant installations, efficiency of flow,
latency and throughput are not the driving concerns. In a small cluster,
the cost and scale implications of executing server-side business logic
are minimal in absolute terms.

For large-scale users (like me) this is a no-go. There are many problems
with this that make it very difficult to run a hosted platform with
predictable SLAs once users can introduce business logic into the broker.
These are on top of the performance and cost implications.

First, broker throughput and performance become unpredictable. The
current Pulsar load model (which the load manager uses for load
balancing) becomes unusable. Not only that, there will be no pre-computed
model that can be used in the load manager. Since the producer and
consumer decide arbitrarily what the business logic is, and the computation
can change based on the data, the model itself becomes dynamic and the
load manager has to rebuild the model any time a user updates the business
logic. That is a tall order, worth years of work to implement.

Second, this introduces the noisy neighbor issue. Two tenants will happily
run on the same broker until one of them decides to change the logic on the
subscription, and suddenly the quality for the other tenant is degraded
because the broker is impacted. The system operator of the cluster now has
to get involved out of the blue, because one tenant made a change.
Basically, any tenant can disrupt the system by triggering additional
business logic in the server, or by specific data patterns that can make
the business logic expensive on the server.
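
To make the noisy neighbor effect concrete, here is a toy model, not broker
code. It assumes two tenants share a single dispatch thread that alternates
between them; the class name, the per-message costs and the round-robin
scheduling are all hypothetical and chosen only for illustration.

```java
// Toy model only: a single dispatch thread alternating between two tenants.
// Costs are made-up microsecond values, not measurements of any broker.
final class NoisyNeighborSketch {

    /**
     * Returns the time (in microseconds) at which tenant A's last message
     * has been dispatched, when A and B each have msgsPerTenant messages
     * and the thread serves them one message at a time, A first.
     */
    static long completionMicrosForA(int msgsPerTenant, long costA, long costB) {
        long clock = 0;
        long lastA = 0;
        for (int i = 0; i < msgsPerTenant; i++) {
            clock += costA;   // dispatch one message for tenant A
            lastA = clock;
            clock += costB;   // then one message for tenant B
        }
        return lastA;
    }

    public static void main(String[] args) {
        // Both tenants do plain byte movement at 2 us per message.
        System.out.println(completionMicrosForA(1000, 2, 2));
        // Tenant B adds a filter costing 50 us per message; A changed nothing.
        System.out.println(completionMicrosForA(1000, 2, 52));
    }
}
```

In this model tenant A's completion time grows by more than an order of
magnitude even though only tenant B changed its subscription logic.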

Third, this makes provisioning capacity impossible. Today Pulsar users can
be provisioned on flow: bandwidth in/out and messages in/out. With
server-side business logic, there is some random overhead that needs to be
accounted for in the capacity calculation.
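
As a rough worked example of that capacity calculation, the sketch below
sizes a cluster from message rate and per-message CPU cost. The formula,
the method name and every number are assumptions for illustration; they
are not taken from Pulsar's load manager.

```java
// Hypothetical sizing arithmetic; not Pulsar's load model.
final class CapacitySketch {

    /**
     * Brokers needed to dispatch msgsPerSec messages when each broker has
     * brokerCpuMicrosPerSec of CPU time per second, each message costs
     * baseMicrosPerMsg to move, and server-side logic adds
     * filterMicrosPerMsg on top (0 for plain data flow).
     */
    static int brokersNeeded(double msgsPerSec, double brokerCpuMicrosPerSec,
                             double baseMicrosPerMsg, double filterMicrosPerMsg) {
        double demandMicrosPerSec = msgsPerSec * (baseMicrosPerMsg + filterMicrosPerMsg);
        return (int) Math.ceil(demandMicrosPerSec / brokerCpuMicrosPerSec);
    }

    public static void main(String[] args) {
        double rate = 1_000_000;        // messages per second (assumed)
        double cpuBudget = 4_000_000;   // 4 cores' worth of microseconds per second (assumed)
        System.out.println(brokersNeeded(rate, cpuBudget, 2, 0));   // flow only
        System.out.println(brokersNeeded(rate, cpuBudget, 2, 10));  // cheap filter
        System.out.println(brokersNeeded(rate, cpuBudget, 2, 50));  // expensive filter
    }
}
```

The flow-only case depends on one stable number per message. Once a filter
is involved, the last argument is set by each tenant's expression and data,
so the operator cannot fix it in advance.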

We, who run Pulsar as a hosted service, do not want any of our tenants to
introduce server-side logic into the service. Doing it well
requires a load balancer that can continuously and dynamically adjust its
load model and capacity model (based on ML on the traffic, maybe). The
scope of building such a system will convert Pulsar from a data streaming
project to a load balancer/resource manager project. The only viable
solution will be to give each tenant their own dedicated servers, at which
point all claims to multi-tenancy in Pulsar should be dropped.


So large multi-tenant clusters will have big problems with the addition of
business logic into the broker.

But this problem - Pulsar users attempting to add server-side logic into
Pulsar - is not going to go away. There will always be yet another new user
who will ask for adding 'one more simple implementation' of server-side
business logic into the broker.

My suggestion here is simple. Make the dispatcher a configurable module.
Let users who want to do server-side logic configure their own
computational logic in custom dispatchers and use it to their needs.
Allow users to implement custom dispatchers as a loadable module. Users
can then implement whatever logic they need to, without depending on
Pulsar, and the code and module will remain in user-land rather than Pulsar
land. No one will be required to contribute their dispatchers to Pulsar,
but if specific dispatchers turn out to have widespread use, their authors
can contribute them back into Pulsar (like connectors).
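
A minimal sketch of what such a loadable dispatcher could look like is
below. Everything in it is hypothetical: MessageDispatcher, Entry,
DispatcherLoader and the "region" property are invented for illustration
and are not existing Pulsar types or configuration. It loads the class by
name with plain Java reflection; a real broker would more likely use its
own plugin classloading.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical extension point: none of these types exist in Pulsar.
interface MessageDispatcher {
    /** Returns the entries that should be delivered to the consumer. */
    List<Entry> selectForDelivery(List<Entry> entries);
}

/** Minimal stand-in for a stored message: properties only. */
final class Entry {
    final Map<String, String> properties;

    Entry(Map<String, String> properties) {
        this.properties = properties;
    }
}

/** Default behaviour: plain data flow, nothing is inspected. */
final class PassThroughDispatcher implements MessageDispatcher {
    @Override
    public List<Entry> selectForDelivery(List<Entry> entries) {
        return entries;
    }
}

/** User-land dispatcher: keeps entries whose "region" property is "eu". */
final class RegionFilterDispatcher implements MessageDispatcher {
    @Override
    public List<Entry> selectForDelivery(List<Entry> entries) {
        List<Entry> selected = new ArrayList<>();
        for (Entry e : entries) {
            if ("eu".equals(e.properties.get("region"))) {
                selected.add(e);
            }
        }
        return selected;
    }
}

final class DispatcherLoader {
    /** Instantiates the configured dispatcher class via its no-arg constructor. */
    static MessageDispatcher load(String className) {
        try {
            return Class.forName(className)
                    .asSubclass(MessageDispatcher.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot load dispatcher " + className, e);
        }
    }

    public static void main(String[] args) {
        List<Entry> batch = new ArrayList<>();
        batch.add(new Entry(Collections.singletonMap("region", "eu")));
        batch.add(new Entry(Collections.singletonMap("region", "us")));

        // The class name would come from broker configuration in practice.
        MessageDispatcher custom = load(RegionFilterDispatcher.class.getName());
        System.out.println(custom.selectForDelivery(batch).size());
    }
}
```

The point of the design is that the base dispatcher stays pass-through, and
an operator opts in to a custom one per deployment, so hosted multi-tenant
clusters can simply leave the default in place.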

If this seems suspiciously similar to functions, then yes, it is. Functions
were meant to fulfill this need, but without messing with the dispatcher.
Functions were meant to do business logic outside the hosted service, so
that the service itself is not impacted by random users injecting business
logic into the platform.

But if functions are not acceptable, and users still want to mess with the
dispatcher, what I am proposing is a way to let users do that without
breaking the design goals of Pulsar. That will avoid impacting the core
data flow path for large system/hosted service/multi-tenant use cases.

So my vote is not to allow this (and any other server side logic
implementations) into the base dispatcher, but permit these kinds of
changes as configurable dispatchers. I hope I have explained the reasons
for that vote clearly.


Joe


On Mon, Nov 16, 2020 at 10:03 AM Sijie Guo <gu...@gmail.com> wrote:

> Andre,
>
> I left a comment on the pull request. But I will just copy them here as
> well.
>
> I have a couple of comments and one suggestion.
>
> 1. What are the performance & GC implications of this change? I think most
> of the questions on this pull request are about the performance & GC
> implications. It would be good to show your benchmarking/testing methodology
> and the benchmark results to the community.
>
> 2. How are you going to handle topics with end-to-end encryption enabled?
>
> 3. How do you handle acknowledgment for the messages that have been
> filtered out and never sent to the consumers? I don't see this discussed
> in the PIP. In particular, how does it interact with the different
> subscription types?
>
> One suggestion - If this PIP is approved, my recommendation is to use the
> NAR classloader to load the class. You can check how Pulsar uses NAR
> classloader for other interfaces.
>
> Thanks,
> Sijie
>
> On Mon, Nov 16, 2020 at 2:53 AM Kramer, Andre <Andre.Kramer@softwareag.com>
> wrote:
>
> > Sure, please feel free to copy the doc to wiki pages. It's mainly text so
> > can be converted easily.
> >
> > Cheers,
> > Andre
> >
> > -----Original Message-----
> > From: Sijie Guo <gu...@gmail.com>
> > Sent: 13 November 2020 19:08
> > To: Dev <de...@pulsar.apache.org>
> > Subject: Re: Proposal for Consumer Filtering in Pulsar brokers
> >
> > Andre,
> >
> > Is it possible to put it in a Google Doc (or similar collaboration tool)
> > that allows other people to make comments? Also, it would be easier for
> > the committers to copy the PIP to Pulsar wiki pages.
> >
> > Thanks,
> > Sijie
> >
> > On Fri, Nov 13, 2020 at 2:44 AM Kramer, Andre <Andre.Kramer@softwareag.com>
> > wrote:
> >
> > > Hi Sijie,
> > >
> > > I had added a PIP style document to the pull request:
> > > https://github.com/andrekramer1/pulsar/blob/consumer-filter2-7-0/PIP-XX%20-%20Consumer-filtering.pdf
> > > Hopefully that could be used to start the discussion?
> > >
> > > Regards,
> > > Andre
> > >
> > > -----Original Message-----
> > > From: Sijie Guo <gu...@gmail.com>
> > > Sent: 12 November 2020 18:32
> > > To: Dev <de...@pulsar.apache.org>
> > > Subject: Re: Proposal for Consumer Filtering in Pulsar brokers
> > >
> > > Hi Andre,
> > >
> > > I didn't see the attached writeup. Can you write a PIP for this
> > > feature? Given it is a big feature, it would be good to discuss it
> > > through a PIP.
> > >
> > > - Sijie
> > >
> > > On Thu, Nov 12, 2020 at 6:17 AM Kramer, Andre <Andre.Kramer@softwareag.com>
> > > wrote:
> > >
> > > > Hello everyone,
> > > >
> > > >
> > > >
> > > > We at Software AG have prototyped adding filtering on Consumer
> > > > subscriptions in the Pulsar broker and are submitting our changes
> > > > for consideration under Apache 2.0 license. Please see pull request
> > > > [Consumer Filtering #8544
> > > > https://github.com/apache/pulsar/pull/8544]
> > > > and attached write up. Comments welcome!
> > > >
> > > >
> > > >
> > > > Thanks,
> > > >
> > > > Andre
> > > >
> > > >
> > > >
> > > > andre.kramer@softwareag.com

Re: Proposal for Consumer Filtering in Pulsar brokers

Posted by Sijie Guo <gu...@gmail.com>.
Andre,

I left a comment on the pull request. But I will just copy them here as
well.

I have a couple of comments and one suggestion.

1. What are the performance & GC implications of this change? I think most
of the questions on this pull request are about the performance & GC
implications. It would be good to show your benchmarking/testing methodology
and the benchmark results to the community.

2. How are you going to handle topics with end-to-end encryption enabled?

3. How do you handle acknowledgment for the messages that have been
filtered out and never sent to the consumers? I don't see this discussed
in the PIP. In particular, how does it interact with the different
subscription types?

One suggestion - If this PIP is approved, my recommendation is to use the
NAR classloader to load the class. You can check how Pulsar uses NAR
classloader for other interfaces.

Thanks,
Sijie

On Mon, Nov 16, 2020 at 2:53 AM Kramer, Andre <An...@softwareag.com>
wrote:

> Sure, please feel free to copy the doc to wiki pages. It's mainly text so
> can be converted easily.
>
> Cheers,
> Andre
>
> -----Original Message-----
> From: Sijie Guo <gu...@gmail.com>
> Sent: 13 November 2020 19:08
> To: Dev <de...@pulsar.apache.org>
> Subject: Re: Proposal for Consumer Filtering in Pulsar brokers
>
> Andre,
>
> Is it possible to put it in a Google Doc (or similar collaboration tool)
> that allows other people to make comments? Also, it would be easier for the
> committers to copy the PIP to Pulsar wiki pages.
>
> Thanks,
> Sijie
>
> On Fri, Nov 13, 2020 at 2:44 AM Kramer, Andre <Andre.Kramer@softwareag.com>
> wrote:
>
> > Hi Sijie,
> >
> > I had added a PIP style document to the pull request:
> > https://github.com/andrekramer1/pulsar/blob/consumer-filter2-7-0/PIP-XX%20-%20Consumer-filtering.pdf
> > Hopefully that could be used to start the discussion?
> >
> > Regards,
> > Andre
> >
> > -----Original Message-----
> > From: Sijie Guo <gu...@gmail.com>
> > Sent: 12 November 2020 18:32
> > To: Dev <de...@pulsar.apache.org>
> > Subject: Re: Proposal for Consumer Filtering in Pulsar brokers
> >
> > Hi Andre,
> >
> > I didn't see the attached writeup. Can you write a PIP for this feature?
> > Given it is a big feature, it would be good to discuss it through a PIP.
> >
> > - Sijie
> >
> > On Thu, Nov 12, 2020 at 6:17 AM Kramer, Andre <Andre.Kramer@softwareag.com>
> > wrote:
> >
> > > Hello everyone,
> > >
> > >
> > >
> > > We at Software AG have prototyped adding filtering on Consumer
> > > subscriptions in the Pulsar broker and are submitting our changes
> > > for consideration under Apache 2.0 license. Please see pull request
> > > [Consumer Filtering #8544
> > > https://github.com/apache/pulsar/pull/8544]
> > > and attached write up. Comments welcome!
> > >
> > >
> > >
> > > Thanks,
> > >
> > > Andre
> > >
> > >
> > >
> > > andre.kramer@softwareag.com

RE: Proposal for Consumer Filtering in Pulsar brokers

Posted by "Kramer, Andre" <An...@softwareag.com>.
Sure, please feel free to copy the doc to wiki pages. It's mainly text so can be converted easily.

Cheers,
Andre

-----Original Message-----
From: Sijie Guo <gu...@gmail.com>
Sent: 13 November 2020 19:08
To: Dev <de...@pulsar.apache.org>
Subject: Re: Proposal for Consumer Filtering in Pulsar brokers

Andre,

Is it possible to put it in a Google Doc (or similar collaboration tool) that allows other people to make comments? Also, it would be easier for the committers to copy the PIP to Pulsar wiki pages.

Thanks,
Sijie

On Fri, Nov 13, 2020 at 2:44 AM Kramer, Andre <An...@softwareag.com>
wrote:

> Hi Sijie,
>
> I had added a PIP style document to the pull request:
> https://github.com/andrekramer1/pulsar/blob/consumer-filter2-7-0/PIP-XX%20-%20Consumer-filtering.pdf
> Hopefully that could be used to start the discussion?
>
> Regards,
> Andre
>
> -----Original Message-----
> From: Sijie Guo <gu...@gmail.com>
> Sent: 12 November 2020 18:32
> To: Dev <de...@pulsar.apache.org>
> Subject: Re: Proposal for Consumer Filtering in Pulsar brokers
>
> Hi Andre,
>
> I didn't see the attached writeup. Can you write a PIP for this feature?
> Given it is a big feature, it would be good to discuss it through a PIP.
>
> - Sijie
>
> On Thu, Nov 12, 2020 at 6:17 AM Kramer, Andre <Andre.Kramer@softwareag.com>
> wrote:
>
> > Hello everyone,
> >
> >
> >
> > We at Software AG have prototyped adding filtering on Consumer
> > subscriptions in the Pulsar broker and are submitting our changes
> > for consideration under Apache 2.0 license. Please see pull request
> > [Consumer Filtering #8544
> > https://github.com/apache/pulsar/pull/8544]
> > and attached write up. Comments welcome!
> >
> >
> >
> > Thanks,
> >
> > Andre
> >
> >
> >
> > andre.kramer@softwareag.com

Re: Proposal for Consumer Filtering in Pulsar brokers

Posted by Sijie Guo <gu...@gmail.com>.
Andre,

Is it possible to put it in a Google Doc (or similar collaboration tool)
that allows other people to make comments? Also, it would be easier for the
committers to copy the PIP to Pulsar wiki pages.

Thanks,
Sijie

On Fri, Nov 13, 2020 at 2:44 AM Kramer, Andre <An...@softwareag.com>
wrote:

> Hi Sijie,
>
> I had added a PIP style document to the pull request:
> https://github.com/andrekramer1/pulsar/blob/consumer-filter2-7-0/PIP-XX%20-%20Consumer-filtering.pdf
> Hopefully that could be used to start the discussion?
>
> Regards,
> Andre
>
> -----Original Message-----
> From: Sijie Guo <gu...@gmail.com>
> Sent: 12 November 2020 18:32
> To: Dev <de...@pulsar.apache.org>
> Subject: Re: Proposal for Consumer Filtering in Pulsar brokers
>
> Hi Andre,
>
> I didn't see the attached writeup. Can you write a PIP for this feature?
> Given it is a big feature, it would be good to discuss it through a PIP.
>
> - Sijie
>
> On Thu, Nov 12, 2020 at 6:17 AM Kramer, Andre <Andre.Kramer@softwareag.com>
> wrote:
>
> > Hello everyone,
> >
> >
> >
> > We at Software AG have prototyped adding filtering on Consumer
> > subscriptions in the Pulsar broker and are submitting our changes for
> > consideration under Apache 2.0 license. Please see pull request
> > [Consumer Filtering #8544 https://github.com/apache/pulsar/pull/8544]
> > and attached write up. Comments welcome!
> >
> >
> >
> > Thanks,
> >
> > Andre
> >
> >
> >
> > andre.kramer@softwareag.com

RE: Proposal for Consumer Filtering in Pulsar brokers

Posted by "Kramer, Andre" <An...@softwareag.com>.
Hi Sijie,

I had added a PIP style document to the pull request: https://github.com/andrekramer1/pulsar/blob/consumer-filter2-7-0/PIP-XX%20-%20Consumer-filtering.pdf Hopefully that could be used to start the discussion?

Regards,
Andre

-----Original Message-----
From: Sijie Guo <gu...@gmail.com>
Sent: 12 November 2020 18:32
To: Dev <de...@pulsar.apache.org>
Subject: Re: Proposal for Consumer Filtering in Pulsar brokers

Hi Andre,

I didn't see the attached writeup. Can you write a PIP for this feature?
Given it is a big feature, it would be good to discuss it through a PIP.

- Sijie

On Thu, Nov 12, 2020 at 6:17 AM Kramer, Andre <An...@softwareag.com>
wrote:

> Hello everyone,
>
>
>
> We at Software AG have prototyped adding filtering on Consumer
> subscriptions in the Pulsar broker and are submitting our changes for
> consideration under Apache 2.0 license. Please see pull request
> [Consumer Filtering #8544 https://github.com/apache/pulsar/pull/8544]
> and attached write up. Comments welcome!
>
>
>
> Thanks,
>
> Andre
>
>
>
> andre.kramer@softwareag.com

Re: Proposal for Consumer Filtering in Pulsar brokers

Posted by Sijie Guo <gu...@gmail.com>.
Hi Andre,

I didn't see the attached writeup. Can you write a PIP for this feature?
Given it is a big feature, it would be good to discuss it through a PIP.

- Sijie

On Thu, Nov 12, 2020 at 6:17 AM Kramer, Andre <An...@softwareag.com>
wrote:

> Hello everyone,
>
>
>
> We at Software AG have prototyped adding filtering on Consumer
> subscriptions in the Pulsar broker and are submitting our changes for
> consideration under Apache 2.0 license. Please see pull request [Consumer
> Filtering #8544 https://github.com/apache/pulsar/pull/8544] and attached
> write up. Comments welcome!
>
>
>
> Thanks,
>
> Andre
>
>
>
> andre.kramer@softwareag.com