Posted to dev@pulsar.apache.org by Enrico Olivelli <eo...@gmail.com> on 2021/08/30 08:47:00 UTC

PIP-93 Pulsar Proxy Protocol Handlers

Hello Pulsar fellows,

I have prepared a PIP about adding support for Protocol Handlers in the Pulsar Proxy.

This is the GDoc

https://docs.google.com/document/d/1Hlc_BOpQTkWX8FgrvWSfk6h5xTQKMXnTcSuil0Nznrg/edit?usp=sharing


This is the PR for the implementation
https://github.com/apache/pulsar/pull/11838/files

I am pretty sure that this PIP will make life much nicer for developers of
Protocol Handlers and for administrators who deploy them.

We are still working on the formal PIP process; for the moment I am sharing
the document with you.
My understanding is that after the discussion I will start a VOTE thread,
and if the VOTE passes we can move forward with reviewing the PR and
hopefully merge this feature for Pulsar 2.9.0.

Enrico

Re: PIP-93 Pulsar Proxy Protocol Handlers

Posted by Yunze Xu <yz...@streamnative.io.INVALID>.
Thanks for your explanation. I’m looking forward to the prototype implementation.

Thanks,
Yunze



Re: PIP-93 Pulsar Proxy Protocol Handlers

Posted by Enrico Olivelli <eo...@gmail.com>.
Yunze,

On Mon, Aug 30, 2021 at 18:48, Yunze Xu <yz...@streamnative.io.invalid> wrote:

> If I understand correctly, we’re going to use both a broker version and a
> proxy version of KoP:
> - The proxy version is responsible for lookup/auth related requests like
> METADATA and SASL_XXX requests
> - The broker version is responsible for other requests that require broker
> to be the topic owner, like PRODUCE and FETCH requests
> Right?
>

You are on the right track.
It is probably better to discuss KOP in a separate thread.

Enrico


Re: PIP-93 Pulsar Proxy Protocol Handlers

Posted by Yunze Xu <yz...@streamnative.io.INVALID>.
If I understand correctly, we’re going to use both a broker version and a proxy version of KoP:
- The proxy version is responsible for lookup/auth-related requests, like METADATA and SASL_XXX requests
- The broker version is responsible for the other requests, which require the broker to be the topic owner, like PRODUCE and FETCH requests
Right?
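
To make the split concrete, here is a minimal sketch in Python of how requests might be divided between the two components. The function name `route` and the returned labels are hypothetical and exist only for illustration; they are not part of KoP or Pulsar, and the grouping only restates the question above.

```python
# Hypothetical illustration of the proxy/broker split described above.
# Nothing here is a KoP or Pulsar API.

BROKER_SIDE = {"PRODUCE", "FETCH"}  # need the broker that owns the topic


def route(request_type: str) -> str:
    """Return which component would answer a request in this split."""
    # Lookup and authentication requests stay on the proxy.
    if request_type == "METADATA" or request_type.startswith("SASL_"):
        return "proxy"
    # Data-path requests must reach the broker that owns the topic.
    if request_type in BROKER_SIDE:
        return "broker"
    # Request types not mentioned in the discussion are left undecided.
    return "undecided"


if __name__ == "__main__":
    for name in ("METADATA", "SASL_XXX", "PRODUCE", "FETCH"):
        print(name, "->", route(name))
```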

Thanks,
Yunze



Re: PIP-93 Pulsar Proxy Protocol Handlers

Posted by Enrico Olivelli <eo...@gmail.com>.
On Mon, Aug 30, 2021 at 18:03, Lan Liang <li...@163.com> wrote:

> +1. Thanks for your work.
>
>
> I want to know whether we can support a Pulsar proxy PH and a Pulsar broker
> PH at the same time for one protocol.
>
>
> For KoP/MoP, should we implement them again?


I have a prototype for KOP, and the MOP story may be similar.

Enrico



Re: PIP-93 Pulsar Proxy Protocol Handlers

Posted by Lan Liang <li...@163.com>.
+1. Thanks for your work.


I want to know whether we can support a Pulsar proxy PH and a Pulsar broker PH at the same time for one protocol.

For KoP/MoP, should we implement them again?

In MoP, some places use BrokerService. So do we need to change MoP, or implement MoP again as a Pulsar proxy PH, if we want to use MoP on the Pulsar proxy?

- lan.liang



Re: PIP-93 Pulsar Proxy Protocol Handlers

Posted by Enrico Olivelli <eo...@gmail.com>.
On Mon, Aug 30, 2021 at 17:22, Yunze Xu <yz...@streamnative.io.invalid> wrote:

> +1. Great idea.
>
> I’m not familiar with Pulsar Proxy and have a question. How can a proxy
> protocol handler reuse the existing code of a protocol handler?
>

The code that runs on the proxy will be quite different from the code you have
in the Broker Protocol Handler.

Basically, Proxy protocol handlers do these things:
- run the custom wire protocol (by starting custom Netty endpoints)
- use the discovery service to proxy requests to the Broker that is the
owner of the topic
- run authentication and forward the user identity (if needed) to the Broker
- perform authorization

The Proxy protocol handler does not access the BrokerService and cannot
access Pulsar broker internals.
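
The responsibilities listed above can be sketched roughly as follows. This is a toy model in Python, not the Pulsar API: the class `ProxyHandlerSketch`, its fields and the status strings are all hypothetical, and plain dictionaries stand in for the authentication, authorization and discovery services. The custom Netty endpoint is left out; `handle` represents what would happen once a request has been decoded.

```python
from dataclasses import dataclass, field


@dataclass
class ProxyHandlerSketch:
    """Hypothetical proxy-side handler; none of these names are Pulsar APIs."""

    credentials: dict   # stand-in for authentication: token -> role
    permissions: dict   # stand-in for authorization: role -> set of topics
    topic_owners: dict  # stand-in for the discovery service: topic -> broker
    forwarded: list = field(default_factory=list)

    def handle(self, token, topic, payload):
        # 1. Authenticate the client on the proxy.
        role = self.credentials.get(token)
        if role is None:
            return "AUTHENTICATION_FAILED"
        # 2. Authorize the authenticated role for the requested topic.
        if topic not in self.permissions.get(role, set()):
            return "AUTHORIZATION_FAILED"
        # 3. Ask discovery which broker owns the topic.
        broker = self.topic_owners.get(topic)
        if broker is None:
            return "TOPIC_NOT_FOUND"
        # 4. Forward the request, together with the user identity, to that broker.
        self.forwarded.append((broker, role, topic, payload))
        return "FORWARDED"
```

Note that the sketch never touches broker internals: it only needs to know who the client is, what the client may do, and which broker owns the topic.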

Enrico




Re: PIP-93 Pulsar Proxy Protocol Handlers

Posted by Yunze Xu <yz...@streamnative.io.INVALID>.
+1. Great idea.

I’m not familiar with Pulsar Proxy and have a question. How can a proxy protocol handler
reuse the existing code of a protocol handler?

Thanks,
Yunze



Re: PIP-93 Pulsar Proxy Protocol Handlers

Posted by Sijie Guo <gu...@gmail.com>.
On Fri, Sep 3, 2021 at 5:07 AM Enrico Olivelli <eo...@gmail.com> wrote:

> Sijie,
> Thanks for your questions, answers inline below.
>
> On Thu, Sep 2, 2021 at 02:23, Sijie Guo <gu...@gmail.com> wrote:
>
> > I would like to see the clarification between the broker protocol
> > handlers and proxy protocol handlers before moving it to a vote thread.
> >
>
> A PH in the broker is very useful as it allows you to directly access the
> ManagedLedger and implement high performance adapters for
> other wire protocols.
> The biggest limitation is that you can efficiently access only the topics
> owned by the local broker.
> If you try to forward/proxy the request to another broker (you can do it,
> and this was Matteo's suggestion at the latest Video Community meeting),
> the downside is that the broker has to waste resources doing the
> "proxy work",
> and you generally want a broker machine to deal only with its
> local traffic.
>
> The load balancing mechanism of the brokers is not meant to deal with
> additional work due to proxying requests related to the topics for which
> the broker is not owner.
>
> A PH in the proxy is useful for adding new protocols that run in front
> of the whole cluster, not only in front of a single broker.
> This is a very different use case from having the PH in the broker.
>
> The work of the proxy is usually to forward requests to the internal
> services of the cluster, and for new protocols in the proxy
> you need some logic to fill in the gaps in the original wire protocol.
>
> System architects expect a different kind of load on the proxy and other
> kinds of load on the brokers.
> For instance you usually can run very few proxies to cover a big cluster
> with many brokers.
> So adding a PH on all the brokers is sometimes overkill.
>

Why not use a mature "proxy" solution like Nginx, Envoy, etc.? If the
"proxy" is doing smart routing, that software provides ways of doing
this job very efficiently compared to a JVM-based solution.


>
>
> >
> > I can see how it will cause confusion for protocol developers.
> >
>
> Protocol developers are very advanced users that do need to understand
> clearly the internals of Pulsar.
> In fact this request of having PHs in the Proxy layer came from myself and
> from other colleagues of mine who are working heavily in implementing
> new protocol handlers in Pulsar.
>
> And we faced the limitation of the need to create a new proxy service for
> each new protocol, but all of these "proxy services" have in common
> most of the features of the Pulsar proxy.
> When we also came to deal with System Architects, the requirement was clear:
> have one single "place" for all of the "cluster level" interactions
> with Pulsar.
>
> I think this is a good picture of what I mean:
> - PH in the Broker -> add protocols inside the Broker, work for owned
> topics
> - PH in the Proxy -> add protocols in front of the whole Cluster
>

Why NOT just add an ingress service for the broker StatefulSet?

The fundamental difference between messaging protocols is whether there is
a redirection protocol in them.

If there is a redirection protocol (like Pulsar and Kafka), you redirect
the requests. For such protocols, you need an additional "proxy" solution.
This can be addressed by using mature solutions like Envoy. For example,
in the Kafka world, there are already very mature solutions employed by
the Strimzi operator and the Banzai Cloud operator. I don't see value in
re-inventing another solution.

If there is no redirection protocol, a service for brokers would be
sufficient. But if you want to introduce a proxy, you can still use
solutions like Envoy rather than building another solution.
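
For readers less familiar with the term, the following Python sketch shows what a redirection protocol means in practice: a broker that does not own a topic tells the client where to go, and the client reconnects there. The ownership table, broker names and function names are hypothetical and only illustrate the idea; they do not model the actual Pulsar or Kafka wire protocols.

```python
# Hypothetical ownership table: topic -> broker that owns it.
OWNERS = {"topic-a": "broker-1", "topic-b": "broker-2"}


def broker_handle(broker, topic):
    """A broker serves the topics it owns and redirects for all others."""
    owner = OWNERS[topic]
    if owner == broker:
        return ("OK", broker)
    return ("REDIRECT", owner)


def connect(first_broker, topic):
    """Client side of a redirection protocol: follow redirects to the owner.

    Returns the broker that finally served the request and the number of
    brokers contacted.
    """
    broker, hops = first_broker, 1
    status, target = broker_handle(broker, topic)
    while status == "REDIRECT":
        broker, hops = target, hops + 1
        status, target = broker_handle(broker, topic)
    return target, hops
```

With this table, `connect("broker-1", "topic-b")` contacts two brokers before being served, while `connect("broker-1", "topic-a")` is served by the first one.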


>
>
> > Yunze brought a good idea on KoP.
>
>
> I also have good ideas and working solutions for a Pulsar-proxy like KOP
> Proxy.
> I will be happy to discuss this in a separate thread or at a separate table
> with Yunze.
>
> A smart KOP proxy can work if you run inside the Pulsar proxy process or
> you can copy/paste the Pulsar Proxy code and create another service.
>

I have provided my feedback on the KOP proxy. Please see my comment:
https://github.com/streamnative/kop/issues/717#issuecomment-915063387


>
>
> > But I don't think that's the right
> > direction. If you can give an example of the usage of a proxy handler and
> > how it is different from using a broker handler, that would help me
> > understand this PIP.
> >
>
> For some protocols you have to do some non-trivial work to map
> the wire protocol and the concepts of the protocol to the Pulsar model.
> For instance some protocols do not have the concept of "lookup", and the
> proxy does the lookup and forwards the request to the internal broker.
>
> For some protocols you can just use the PulsarClient to connect to the
> internal brokers, you do not need and you do not want to access the
> ManagedLedgers:
> in this case adding the execution inside the broker is only complicating
> the overall design of the system and putting load on the brokers.
>
> There is a good amount of processing that should be executed on the proxy,
> and it is not good to run it on a broker.
> If you do not put the "custom code" in the Proxy and you can only write a
> Broker PH you end up in adding it to the Broker.
>
> If you directly expose (with some LoadBalancer or whatever) brokers that
> run the PH code you would otherwise put in the proxy,
> you end up putting unexpected load on the broker:
> - the broker will have to work even for topics for which it is not the
> owner
> - the broker will have to do things that cannot be handled correctly by the
> Pulsar load balancer (because it expects the load to be proportional to
> the owned bundles)
>

Why not build a filter in Envoy? Envoy is the de facto "proxy" for
Kubernetes.


>
>
> >
> > The reason why Pulsar proxy is built is to have a "smart" proxy that is
> > aware of Pulsar protocol. The Pulsar proxy can be replaced with other
> > mature proxy software with SNI routing or multiple advertised listeners
> > now. Hence I am afraid that we are taking the wrong direction here. Here
> > are various reasons.
> >
> > 1) The ProxyService is essentially a Pulsar admin client. Broker service
> > also provides a Pulsar admin client. I am not sure how Proxy PH will
> > simplify the protocol handler development. Please use an example to
> > demonstrate it.
> >
>
> In the cases I am highlighting, *the Broker is simply not the right place
> to run the code*.
>
> So the problem here is not whether PulsarAdmin is in the Broker or in the
> Proxy.
> It is that if you want to write a smart proxy for another protocol:
> - you end up copy/pasting the Proxy code
> - you use the internal Pulsar classes to have a consistent behaviour with
> the Pulsar Proxy
> - you add more components to the "picture" of the Pulsar cluster
>
>
> > 2) The Authorization & Authentication services in ProxyService are only
> > used when proxies are configured to use zookeeper for broker discovery.
> > However, this option is not recommended when running Pulsar proxies in
> > Kubernetes. Instead, using a broker discovery service is recommended. In
> > order to make PH work, you are forcing the proxy to be tightly coupled
> > to ZooKeeper.
> >
>
> This is not needed for all of the Proxy PH handlers.
> But Authorization & Authentication  are a core part of this story.
> If you implement your "smart proxy" somewhere else and not as a Plugin to
> the Pulsar Proxy (or Broker)
> you cannot leverage the same services, the same way.
> It leads to having more chances of having a behaviour different from
> standard Pulsar.
>
> PH developers are Pulsar experts, and you know that copy pasting code from
> Pulsar, leads to unpredictable behaviour
> when you run your plugin in another version of Pulsar.
> But if you use an API that is going to be maintained by Pulsar you are
> safer and you can think that your code is going to work.
>

The point here is that the proxy shouldn't care about Authorization and
Authentication.

The proxy should just forward the requests to the destination service.
Envoy or similar software has already provided such capabilities.

Why do you need to re-invent the solution here?


>
>
> >
> > 3) Configuring authentication and authorization in proxy is already
> > challenging. There are a few different combinations. A typical Pulsar
> > setup is to forward the authentication credentials to the brokers to
> > authenticate and authorize. If you don't do this correctly, it will
> > introduce security holes because a connection can potentially grab the
> > superuser credential configured in proxy and use superuser credentials to
> > access brokers. From this perspective, I think proxy protocol handler
> > doesn't make things simpler; instead it makes things complicated when it
> > comes to authentication and authorization.
> >
>
> Yes, this is a very complex problem indeed.
>
> We can help developers by providing a standard framework to access these
> services.
>
> It is very important from my point of view, that we do not encourage
> developers to create
> their own versions of a Pulsar proxy.
>

I don't think people are creating new proxies. Kubernetes already has a
very successful toolchain for "proxies". It already supports a fair
number of protocols. We should just use it instead of creating Pulsar's own
"proxies".


>
> My recent experience is that we can add many new wire protocols to Pulsar
> and this will help a lot with the adoption of Pulsar.
>
> As we are doing in many other places on Pulsar we should provide tools to
> write extensions
> and do not let people be too creative.
>
>
> >
> > I would like to see these questions answered before moving to a vote.
> >
>
> I hope that we can reach consensus on the need for this API,
> because I see that there is a real need for making this happen.
>

To me, the solution can be achieved with existing tools. I don't
see a strong reason for us to reinvent the wheel.


>
> Pulsar has momentum now; there are so many opportunities to reach out
> to users of other systems,
> let's not waste these opportunities.
>

Why not develop an Envoy filter for Pulsar and other message protocols?
This helps getting Pulsar exposed to a broader ecosystem.


>
>
> Enrico
>
>
>
> >
> > - Sijie
> >
> >
> >
> >
> > On Wed, Sep 1, 2021 at 12:55 PM Enrico Olivelli <eo...@gmail.com>
> > wrote:
> >
> > > Any other comment?
> > >
> > > I would like to start a VOTE, but I feel we saw too few comments here
> > >
> > > Please take a look.
> > > I believe it will be a good fit for the 2.9.0 release, which is going to be
> > > released at the end of September
> > >
> > >
> > > Enrico
> > >
> > > On Tue, Aug 31, 2021 at 18:14, Michael Marshall <mi...@gmail.com> wrote:
> > >
> > > > +1, just read through the PIP. Looks good to me.
> > > >
> > > > - Michael
> > > >
> > > > On Mon, Aug 30, 2021 at 3:47 AM Enrico Olivelli <eolivelli@gmail.com>
> > > > wrote:
> > > >
> > > > > Hello Pulsar fellows,
> > > > >
> > > > > I have prepared a PIP about adding support for Protocol Handlers
> > > > >
> > > > > This is the GDoc
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/1Hlc_BOpQTkWX8FgrvWSfk6h5xTQKMXnTcSuil0Nznrg/edit?usp=sharing
> > > > >
> > > > >
> > > > > This is the PR for the implementation
> > > > > https://github.com/apache/pulsar/pull/11838/files
> > > > >
> > > > > I am pretty sure that this PIP will make life of developers of
> > Protocol
> > > > > Handlers and of Administrators who deploy Protocol Handlers very
> > nicer
> > > > >
> > > > > We are still working on the formal PIP process, at the moment I am
> > > > sharing
> > > > > with you the document.
> > > > > My understanding is that after the discussion, I will start a VOTE
> > > > thread,
> > > > > and if the VOTE passes we can move forward with reviewing the PR,
> and
> > > > > hopefully merge this feature for Pulsar 2.9.0
> > > > >
> > > > > Enrico
> > > > >
> > > >
> > >
> >
>

PIP-99 Pulsar Proxy Extensions

Posted by Enrico Olivelli <eo...@gmail.com>.
(renaming to PIP-99 and acquiring a lock on the ID)

Please take a final look. The scope is now clearer: we do not want to add a
new kind of "Protocol Handler", but a way to add extensions to the
existing Pulsar proxy service.

I know that some people (namely Sijie and JoeF) have concerns about the
proposal, but on the other hand I did not see many other opinions.

I would like to run a VOTE early next week

Enrico

On Thu, Sep 23, 2021 at 14:09, Enrico Olivelli <eolivelli@gmail.com> wrote:

> Hello everyone,
> I have created a new version of the old PIP-93 ("Proxy Protocol Handlers");
> it is now "PIP-95 Pulsar Proxy Extensions".
>
> The name "Protocol Handlers" was too confusing, as the kind of extensions
> I want to build are very different from Broker Protocol Handlers.
>
> The idea behind PIP-95 is very simple:
> 1. You can add "extensions" to the Proxy service
> 2. Such extensions live in the Proxy service, use conf/proxy.conf,
> "bin/pulsar proxy"
> 3. They work out of the box with the Helm Chart; there is no need to add new
> services/deployments/pods
> 4. An extension can access Pulsar Authentication, Authorization and
> BrokerDiscovery services
>
> This is the PIP-95
> https://github.com/apache/pulsar/issues/12157
>
> This is the PR for the implementation
> https://github.com/apache/pulsar/pull/11838
>
> I hope that this helps to better understand the use cases I have presented
> during the discussion
> and that this allows the community to reach a consensus in adopting this
> feature.
>
> It would be nice to port the "Websocket proxy" to be a "proxy extension"
> one day, but this is a separate discussion, not part of PIP-95
>
> Best regards
> Enrico
>
> On Fri, Sep 17, 2021 at 08:18, Sijie Guo <gu...@gmail.com> wrote:
>
>> > I totally understand this point. I wasn't there when the proxy was born
>> > but currently
>> > my experience is that the Proxy is perceived as the primary endpoint in
>> > front of the Pulsar cluster
>> > especially when you run in k8s.
>>
>> The Pulsar Proxy was born because there was no great solution at that
>> point.
>> However, the Kubernetes stack has evolved beyond what it was before. So
>> has Pulsar.
>>
>> For example,
>>
>> https://github.com/apache/pulsar/wiki/PIP-60%3A-Support-Proxy-server-with-SNI-routing
>> was introduced to allow the use of other mature proxy software with SNI routing.
>>
>> Multiple broker listeners have been introduced to allow better
>> integrations with proxy and service mesh solutions. Hence I don't think
>> "proxy" is the primary endpoint in front of a Pulsar cluster anymore.
>>
>> Hence I don't think proxy PH is the right solution for the problems you
>> are trying to solve. I would avoid introducing PH to proxy.
>>
>> - Sijie
>>
>> On Tue, Sep 14, 2021 at 8:02 AM Enrico Olivelli <eo...@gmail.com>
>> wrote:
>>
>> > Other comments?
>> >
>> > Enrico
>> >
>> > On Thu, Sep 9, 2021 at 09:15, Enrico Olivelli <eolivelli@gmail.com> wrote:
>> >
>> > > Joe,
>> > >
>> > > On Thu, Sep 9, 2021 at 04:31, Joe F <jo...@gmail.com> wrote:
>> > >
>> > >> Enrico, my initial comment when you brought up PH was in relation to
>> > >> the larger question about proxying, rather than looking at this in a
>> > >> limited fashion on how to make it easy to add new PH in the proxy.
>> > >>
>> > >> But specifically with this, here are my comments. Two very
>> > >> distinct abstractions are being mixed up here, and I'm not sure
>> > >> whether that is a good idea or not.
>> > >>
>> > >
>> > > One way of seeing this PIP is to simply complete the work initiated
>> > > with PIP-41 (Introduction of Broker PHs,
>> > > https://github.com/apache/pulsar/wiki/PIP-41%3A-Pluggable-Protocol-Handler
>> > > ).
>> > >
>> > >
>> > >
>> > >>
>> > >> The proxy was designed to move bits and bytes without interpretation,
>> > >> from one network to another. The issue with Pulsar is that it requires
>> > >> some interpretation of the data to find which server a client should
>> > >> connect to. Protocol translation crept into the proxy just to be able
>> > >> to ask this question. Since auth is required to answer this question,
>> > >> auth also crept in. Essentially the proxy was built as a TCP proxy, not
>> > >> as a wire protocol translator. Some additional hacky things needed to
>> > >> be done to make it work as a TCP proxy, and in my opinion those things
>> > >> should die away to the fullest extent possible.
>> > >>
>> > >
>> > > I totally understand this point. I wasn't there when the proxy was
>> born
>> > > but currently
>> > > my experience is that the Proxy is perceived as the primary endpoint
>> in
>> > > front of the Pulsar cluster
>> > > especially when you run in k8s.
>> > >
>> > >
>> > >
>> > >>
>> > >> Because of all this, the current implementation is not ideal. Its
>> > >> usage is highly restricted in actual deployments, because of potential
>> > >> security risks if the proxy is misconfigured. One needs to be strict
>> > >> about setting up the proxy to meet security standards in highly
>> > >> regulated environments.
>> > >>
>> > >>
>> > >>
>> > >> >And we faced the limitation of the need to create a new proxy
>> service
>> > for
>> > >> >each new protocol, but all of these "proxy services" have in common
>> > >> >most of the features of the Pulsar proxy.
>> > >> >When we also came to deal with System Architects it was clear the
>> > >> >requirement to have only one single "place" to put all of the
>> > >> interactions
>> > >> >at "cluster level" with Pulsar.
>> > >>
>> > >> Good idea, a single place seems right. Can the proxy answer the
>> traffic
>> > >> routing question without interpreting the data? Essentially, move
>> what
>> > is
>> > >> done within the proxy now,  to a well known service within the
>> cluster,
>> > >> and
>> > >> use that ?
>> > >>
>> > >
>> > > In the usecases I know, simply routing PDUs to internal brokers is not
>> > > enough
>> > > but you often need to add complex mapping logic from the External
>> > Protocol
>> > > Concepts to Pulsar concepts on the Proxy component.
>> > >
>> > > So you have two ways:
>> > > 1. create your own service and deploy it separately: this was the
>> > > beginning of my work and the same did some colleagues of mine
>> > > 2. deploy your code inside the Pulsar Proxy, and leverage current
>> > > packaging, configuration, tools, security APIs, helm chart.....
>> > >
>> > > I started this discussion because I found option 1 very awkward for
>> Proxy
>> > > Component developers, for System Administrators and for System
>> > Architects.
>> > >
>> > > Developers:
>> > > - you have to copy/paste some Pulsar Proxy code, import Proxy jars,
>> use
>> > > internal Pulsar classes to implement Authentication, Authorization,
>> > Service
>> > > Discovery., Configuration...
>> > >
>> > > System Administrators:
>> > > - you have a new set of configuration files and tools to manage the
>> > > settings (and in k8s you have to modify the Helm Chart significantly)
>> > >
>> > > System Architects:
>> > > - you have multiple new components in the pictures, to explain, to
>> > > justify.....
>> > >
>> > > With this proposal:
>> > >
>> > > Developers:
>> > > - use a framework, do not reinvent the wheel, be able to ensure that
>> > > you are compatible with a given Pulsar version, ensure that the
>> > > behaviour is consistent with other Pulsar components (like using
>> > > ProxyConfiguration, or the same service lifecycle, same libs), and
>> > > evolve more easily
>> > >
>> > > System Administrator:
>> > > - you use proxy.conf/broker.conf, you use Pulsar CLI tools, no need to
>> > > change the Helm Charts
>> > >
>> > > System Architects:
>> > > - nothing new on the table, all the Pulsar docs apply, you have the
>> > > Proxy that deals with external clients, but it is able to speak Pulsar,
>> > > Kafka, RabbitMQ, MQTT, ActiveMQ
>> > >
>> > >
>> > >
>> > >
>> > >>
>> > >> >I think this is a good picture of what I mean:
>> > >> >- PH in the Broker -> add protocols inside the Broker, work for
>> owned
>> > >> topics
>> > >> >- PH in the Proxy -> add protocols in front of the whole Cluster
>> > >> >There is a good amount of processing that should be executed on the
>> > >> proxy,
>> > >> >and it is not good to run it on a broker.
>> > >>
>> > >>  Is a TCP proxy a good place to do wire protocol translation
>> > >> (computation)?
>> > >> Especially if that translation is a good amount of processing?  if
>> it's
>> > >> not
>> > >> good to run this much processing on the broker, then it's even worse
>> to
>> > >> run
>> > >> it on a network proxy. I can foresee this as a path that will lead to
>> > >> cluster and load management creeping into the proxy, as soon as you
>> move
>> > >> beyond what a single proxy can handle.
>> > >>
>> > >> But I think these issues (of n/w vs protocol translation) are moot
>> when
>> > >> you
>> > >> look at the larger needs of  generic proxy that will support ingress,
>> > >> configurable protocol handlers, load balancing etc for use with
>> Pulsar.
>> > >> You
>> > >> can run a bunch of Pulsar's  proxies today, and there is no means to
>> > >> manage
>> > >> them properly. eg: load balance between them/ manage them as a
>> cluster/
>> > >> have affinity of proxies to topics/tenants. etc. This applies even
>> > before
>> > >> this PIP (and more so once you add more processing into the proxy).
>> > >>
>> > >> The Pulsar proxy, as it is,  is not amenable to creating anything
>> like a
>> > >> service mesh. It would demand a lot of work in the proxy. Hence my
>> > >> initial comment about the proxy eventually becoming a mudball, and
>> why
>> > we
>> > >> should rethink this entire proxy.
>> > >>
>> > >>  It is tempting to evolve the Pulsar proxy into a service that
>> supports
>> > >> everything.. ingress, transformation chains, cluster management  etc
>> .
>> > >> This  will eventually end up  duplicating something which already
>> exists
>> > >> elsewhere.  My take is that this is better done by building on top of
>> > >> something like envoy ( or similar) which has built in and mature
>> > >> features,
>> > >> and supported by a wide user base.
>> > >>
>> > >
>> > > Unfortunately general purpose proxies or proxies specific to some
>> > protocol
>> > > will not be able to
>> > > do efficiently what we can do using Pulsar APIs, because they cannot
>> > "map"
>> > > directly External Concepts to the Pulsar model.
>> > >
>> > > I cannot imagine the cost of developing and maintaining a plugin for
>> > Envoy
>> > > that is able to deal
>> > > with Pulsar concepts. For instance it is not written in Java and you
>> > > cannot use Java Bindings for Pulsar, that are feature complete and
>> always
>> > > up-to-date with latest features.
>> > > Also developers that work on PHs are specialized in Pulsar code and in
>> > > Java (at very high levels), and so for them it is harder to write
>> super
>> > > efficient and high quality plugins using non-Java languages.
>> > >
>> > > So I see a huge value in adding this ability to the Pulsar Proxy.
>> > >
>> > > The only alternative to this PIP is to create a new framework for
>> > creating
>> > > such "Smart Proxies" in Java and using some official/maintained Pulsar
>> > API.
>> > >
>> > > So we will end up discussing the value of adding such a brand new
>> module,
>> > > and how to deploy/manage it.
>> > >
>> > > It is a huge cost and it will take so much time:
>> > > - design,
>> > > - adding new concepts to the architecture,
>> > > - adding a new service (new management tools),
>> > > - lot of new code (probably cut/paste from Pulsar Proxy)
>> > > - helm chart
>> > > - new configuration files
>> > > - docs
>> > >
>> > > I believe that we should spend our time in adding more
>> bindings/protocol
>> > > handlers instead of doing that.
>> > >
>> > > By the way I will be happy to drive this new effort if this is REALLY
>> > what
>> > > we want.
>> > >
>> > > So I am convinced that for the short/mid term this PIP is the best
>> choice
>> > > to help Pulsar adoption.
>> > >
>> > > This PIP will unlock some great potential that otherwise will be
>> > > available only to users of custom tools, not officially maintained
>> > > inside the Pulsar project.
>> > > I will be very sad about the outcome
>> > >
>> > >
>> > >
>> > > Enrico
>> > >
>> > >
>> > >
>> > >>
>> > >> -j
>> > >>
>> > >> On Tue, Sep 7, 2021 at 11:11 PM Enrico Olivelli <eolivelli@gmail.com
>> >
>> > >> wrote:
>> > >>
>> > >> > (ping)
>> > >> >
>> > >> >
>> > >> > On Fri, Sep 3, 2021 at 14:06, Enrico Olivelli <eolivelli@gmail.com> wrote:
>> > >> >
>> > >> > > Sijie,
>> > >> > > Thanks for your questions, answers inline below.
>> > >> > >
>> > >> > > On Thu, Sep 2, 2021 at 02:23, Sijie Guo <guosijie@gmail.com> wrote:
>> > >> > >
>> > >> > >> I would like to see the clarification between the broker
>> protocol
>> > >> > handlers
>> > >> > >> and proxy protocol handlers before moving it to a vote thread.
>> > >> > >>
>> > >> > >
>> > >> > > A PH in the broker is very useful as it allows you to directly
>> > access
>> > >> the
>> > >> > > ManagedLedger and implement high performance adapters for
>> > >> > > other wire protocols.
>> > >> > > The bigger limitation is that you can access efficiently only the
>> > >> topics
>> > >> > > owned by the local broker.
>> > >> > > If you try to forward/proxy the request to another broker (you
>> can
>> > do
>> > >> it,
>> > >> > > and this was Matteo's suggestion at the latest Video Community
>> > >> meeting)
>> > >> > > you have the downside that the broker has to waste resources to
>> do
>> > the
>> > >> > > "proxy work"
>> > >> > > and you generally want a broker machine to be used only to deal
>> with
>> > >> the
>> > >> > > local traffic.
>> > >> > >
>> > >> > > The load balancing mechanism of the brokers is not meant to deal
>> > with
>> > >> > > additional work due to proxying requests related to the topics
>> for
>> > >> which
>> > >> > > the broker is not owner.
>> > >> > >
>> > >> > > A PH in the proxy is useful to add new protocols that are
>> running in
>> > >> > front
>> > >> > > of the whole cluster and not only of one single broker.
>> > >> > > This is a very different use case in respect to having the PH in
>> > >> broker.
>> > >> > >
>> > >> > > The work of the proxy usually is to forward requests to the
>> internal
>> > >> > > services of the cluster, and in case of new protocols in the
>> proxy
>> > >> > > you need some logic to fill in the gaps in the original
>> > wireprotocol.
>> > >> > >
>> > >> > > System architects expect a different kind of load on the proxy
>> and
>> > >> other
>> > >> > > kinds of load on the brokers.
>> > >> > > For instance you can usually run very few proxies to cover a big
>> > >> > > cluster with many brokers.
>> > >> > > So adding a PH on all the brokers is sometimes overkill.
>> > >> > >
>> > >> > >
>> > >> > >>
>> > >> > >> I can see how it will cause confusion for protocol developers.
>> > >> > >>
>> > >> > >
>> > >> > > Protocol developers are very advanced users that do need to
>> > understand
>> > >> > > clearly the internals of Pulsar.
>> > >> > > In fact this request of having PHs in the Proxy layer came from
>> > myself
>> > >> > and
>> > >> > > from other colleagues of mine who are working heavily in
>> > implementing
>> > >> > > new protocol handlers in Pulsar.
>> > >> > >
>> > >> > > And we faced the limitation of the need to create a new proxy
>> > service
>> > >> for
>> > >> > > each new protocol, but all of these "proxy services" have in
>> common
>> > >> > > most of the features of the Pulsar proxy.
>> > >> > > When we also came to deal with System Architects it was clear the
>> > >> > > requirement to have only one single "place" to put all of the
>> > >> > interactions
>> > >> > > at "cluster level" with Pulsar.
>> > >> > >
>> > >> > > I think this is a good picture of what I mean:
>> > >> > > - PH in the Broker -> add protocols inside the Broker, work for
>> > owned
>> > >> > > topics
>> > >> > > - PH in the Proxy -> add protocols in front of the whole Cluster
>> > >> > >
>> > >> > >
>> > >> > >> Yunze brought a good idea on KoP.
>> > >> > >
>> > >> > >
>> > >> > > I also have good ideas and working solutions for a Pulsar-proxy
>> like
>> > >> KOP
>> > >> > > Proxy.
>> > >> > > I will be happy to discuss this in a separate thread or at a
>> > separate
>> > >> > > table with Yunze.
>> > >> > >
>> > >> > > A smart KOP proxy can work if you run inside the Pulsar proxy
>> > process
>> > >> or
>> > >> > > you can copy/paste the Pulsar Proxy code and create another
>> service.
>> > >> > >
>> > >> > >
>> > >> > >> But I don't think that's the right
>> > >> > >> direction. If you can give an example of the usage of a proxy
>> > handler
>> > >> > and
>> > >> > >> how it is different from using a broker handler, that would
>> help me
>> > >> > >> understand this PIP.
>> > >> > >>
>> > >> > >
>> > >> > > For some protocols you have to execute some non trivial work for
>> > >> mapping
>> > >> > > the wireprotocol and the concepts of the protocol to the Pulsar
>> > model.
>> > >> > > For instance some protocols do not have the concept of "lookup",
>> and
>> > >> the
>> > >> > > proxy does the lookup and forwards the request to the internal
>> > broker.
>> > >> > >
>> > >> > > For some protocols you can just use the PulsarClient to connect
>> to
>> > the
>> > >> > > internal brokers, you do not need and you do not want to access
>> the
>> > >> > > ManagedLedgers:
>> > >> > > in this case adding the execution inside the broker is only
>> > >> complicating
>> > >> > > the overall design of the system and putting load on the brokers.
>> > >> > >
>> > >> > > There is a good amount of processing that should be executed on
>> the
>> > >> > proxy,
>> > >> > > and it is not good to run it on a broker.
>> > >> > > If you do not put the "custom code" in the Proxy and you can only
>> > >> write a
>> > >> > > Broker PH you end up in adding it to the Broker.
>> > >> > >
>> > >> > > If you expose directly (with some LoadBalancer or whatever) your
>> > >> brokers
>> > >> > > in which you run the PH code that you would put in the proxy
>> > >> > > you end up in putting on the broker some load that is not
>> expected:
>> > >> > > - the broker will have to work even for topics for which it is
>> not
>> > the
>> > >> > > owner
>> > >> > > - the broker will have to do things that cannot be handled correctly
>> > >> > > by the Pulsar load balancer (because it expects the load to be
>> > >> > > proportional to the owned bundles)
>> > >> > >
>> > >> > >
>> > >> > >>
>> > >> > >> The reason why Pulsar proxy is built is to have a "smart" proxy
>> > that
>> > >> is
>> > >> > >> aware of Pulsar protocol. The Pulsar proxy can be replaced with
>> > other
>> > >> > >> mature proxy software with SNI routing or multiple advertised
>> > >> listeners
>> > >> > >> now. Hence I am afraid that we are taking the wrong direction
>> here.
>> > >> Here
>> > >> > >> are various reasons.
>> > >> > >>
>> > >> > >> 1) The ProxyService is essentially a Pulsar admin client. Broker
>> > >> service
>> > >> > >> also provides a Pulsar admin client. I am not sure how Proxy PH
>> > will
>> > >> > >> simplify the protocol handler development. Please use an
>> example to
>> > >> > >> demonstrate it.
>> > >> > >>
>> > >> > >
>> > >> > > In the cases I am highlighting, *the Broker is simply not the
>> right
>> > >> place
>> > >> > > to run the code*.
>> > >> > >
>> > >> > > So the problem here is not whether to have PulsarAdmin in the Broker
>> > >> > > or in the Proxy.
>> > >> > > It is that if you want to write a smart proxy for another protocol:
>> > >> > > - you end up in copy/pasting the Proxy code
>> > >> > > - you use the internal Pulsar classes to have a consistent
>> behaviour
>> > >> with
>> > >> > > the Pulsar Proxy
>> > >> > > - you add more components to the "picture" of the Pulsar cluster
>> > >> > >
>> > >> > >
>> > >> > >> 2) The Authorization & Authentication services in ProxyService
>> are
>> > >> only
>> > >> > >> used when proxies are configured to use zookeeper for broker
>> > >> discovery.
>> > >> > >> However, this option is not recommended when running Pulsar
>> proxies
>> > >> in
>> > >> > >> Kubernetes. Instead, using a broker discovery service is
>> > >> recommended. In
>> > >> > >> order to make PH work, you are forcing proxy to be tight with
>> the
>> > >> > >> zookeeper.
>> > >> > >>
>> > >> > >
>> > >> > > This is not needed for all of the Proxy PH handlers.
>> > >> > > But Authorization & Authentication  are a core part of this
>> story.
>> > >> > > If you implement your "smart proxy" somewhere else and not as a
>> > >> Plugin to
>> > >> > > the Pulsar Proxy (or Broker)
>> > >> > > you cannot leverage the same services, the same way.
>> > >> > > It leads to having more chances of having a behaviour different
>> from
>> > >> > > standard Pulsar.
>> > >> > >
>> > >> > > PH developers are Pulsar experts, and you know that copy pasting
>> > code
>> > >> > from
>> > >> > > Pulsar, leads to unpredictable behaviour
>> > >> > > when you run your plugin in another version of Pulsar.
>> > >> > > But if you use an API that is going to be maintained by Pulsar
>> you
>> > are
>> > >> > > safer and you can think that your code is going to work.
>> > >> > >
>> > >> > >
>> > >> > >>
>> > >> > >> 3) Configuring authentication and authorization in proxy is
>> already
>> > >> > >> challenging. There are a few different combinations. A typical
>> > Pulsar
>> > >> > >> setup
>> > >> > >> is to forward the authentication credentials to the brokers to
>> > >> > >> authenticate
>> > >> > >> and authorize. If you don't do this correctly, it will introduce
>> > >> > security
>> > >> > >> holes because a connection can potentially grab the superuser
>> > >> credential
>> > >> > >> configured in proxy and use superuser credentials to access
>> > brokers.
>> > >> > From
>> > >> > >> this perspective, I think proxy protocol handler doesn't make
>> > things
>> > >> > >> simpler instead it makes things complicated when it comes to
>> > >> > >> authentication
>> > >> > >> and authorization.
>> > >> > >>
>> > >> > >
>> > >> > > Yes, this is a very complex problem indeed.
>> > >> > >
>> > >> > > We can help developers by providing a standard framework to
>> access
>> > >> these
>> > >> > > services.
>> > >> > >
>> > >> > > It is very important from my point of view, that we do not
>> encourage
>> > >> > > developers to create
>> > >> > > their own versions of a Pulsar proxy.
>> > >> > >
>> > >> > > My recent experience is that we can add many new wire protocols
>> to
>> > >> Pulsar
>> > >> > > and this will help a lot with the adoption of Pulsar.
>> > >> > >
>> > >> > > As we are doing in many other places on Pulsar we should provide
>> > >> tools to
>> > >> > > write extensions
>> > >> > > and do not let people be too creative.
>> > >> > >
>> > >> > >
>> > >> > >>
>> > >> > >> I would like to see these questions are answered before moving
>> to a
>> > >> > vote.
>> > >> > >>
>> > >> > >
>> > >> > > I hope that we can reach consensus on the need of this API.
>> > >> > > because I see that there is a real need for making this happen.
>> > >> > >
>> > >> > > It is the Pulsar momentum now, there are so many opportunities to
>> > >> reach
>> > >> > > out to users of other systems,
>> > >> > > let's not waste these opportunities.
>> > >> > >
>> > >> > >
>> > >> > > Enrico
>> > >> > >
>> > >> > >
>> > >> > >
>> > >> > >>
>> > >> > >> - Sijie
>> > >> > >>
>> > >> > >>
>> > >> > >>
>> > >> > >>
>> > >> > >> On Wed, Sep 1, 2021 at 12:55 PM Enrico Olivelli <
>> > eolivelli@gmail.com
>> > >> >
>> > >> > >> wrote:
>> > >> > >>
>> > >> > >> > Any other comment?
>> > >> > >> >
>> > >> > >> > I would like to start a VOTE, but I feel we saw too few
>> comments
>> > >> here
>> > >> > >> >
>> > >> > >> > Please take a look.
>> > >> > >> > I believe it will be a good fit for the 2.9.0 release, which is
>> > >> > >> > going to be released at the end of September
>> > >> > >> >
>> > >> > >> >
>> > >> > >> > Enrico
>> > >> > >> >
>> > >> > >> > On Tue, Aug 31, 2021 at 18:14, Michael Marshall <mikemarsh17@gmail.com> wrote:
>> > >> > >> >
>> > >> > >> > > +1, just read through the PIP. Looks good to me.
>> > >> > >> > >
>> > >> > >> > > - Michael
>> > >> > >> > >
>> > >> > >> > > On Mon, Aug 30, 2021 at 3:47 AM Enrico Olivelli <
>> > >> > eolivelli@gmail.com>
>> > >> > >> > > wrote:
>> > >> > >> > >
>> > >> > >> > > > Hello Pulsar fellows,
>> > >> > >> > > >
>> > >> > >> > > > I have prepared a PIP about adding support for Protocol
>> > >> Handlers
>> > >> > >> > > >
>> > >> > >> > > > This is the GDoc
>> > >> > >> > > >
>> > >> > >> > > > https://docs.google.com/document/d/1Hlc_BOpQTkWX8FgrvWSfk6h5xTQKMXnTcSuil0Nznrg/edit?usp=sharing
>> > >> > >> > > >
>> > >> > >> > > >
>> > >> > >> > > > This is the PR for the implementation
>> > >> > >> > > > https://github.com/apache/pulsar/pull/11838/files
>> > >> > >> > > >
>> > >> > >> > > > I am pretty sure that this PIP will make the life of developers of
>> > >> > >> > > > Protocol Handlers, and of Administrators who deploy Protocol
>> > >> > >> > > > Handlers, much nicer
>> > >> > >> > > >
>> > >> > >> > > > We are still working on the formal PIP process, at the
>> moment
>> > >> I am
>> > >> > >> > > sharing
>> > >> > >> > > > with you the document.
>> > >> > >> > > > My understanding is that after the discussion, I will
>> start a
>> > >> VOTE
>> > >> > >> > > thread,
>> > >> > >> > > > and if the VOTE passes we can move forward with reviewing
>> the
>> > >> PR,
>> > >> > >> and
>> > >> > >> > > > hopefully merge this feature for Pulsar 2.9.0
>> > >> > >> > > >
>> > >> > >> > > > Enrico
>> > >> > >> > > >
>> > >> > >> > >
>> > >> > >> >
>> > >> > >>
>> > >> > >
>> > >> >
>> > >>
>> > >
>> >
>>
>

PIP-95 Pulsar Proxy Extensions

Posted by Enrico Olivelli <eo...@gmail.com>.
Hello everyone,
I have created a new version of the old PIP-93 ("Proxy Protocol Handlers");
it is now "PIP-95 Pulsar Proxy Extensions".

The name "Protocol Handlers" was too confusing, as the kind of extensions I
want to build are very different from Broker Protocol Handlers.

The idea behind PIP-95 is very simple:
1. You can add "extensions" to the Proxy service
2. Such extensions live in the Proxy service, use conf/proxy.conf,
"bin/pulsar proxy"
3. They work out of the box with the Helm Chart, no need to add new
services/deployments/pods
4. An extension can access Pulsar Authentication, Authorization and
BrokerDiscovery services
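
As a rough illustration of points 1-4, here is a minimal Java sketch of an
extension that lives inside the Proxy process and reuses its shared services.
Everything in it is an assumption made for the example: the interface and
class names (SharedProxyServices, SketchProxyExtension, ToyProtocolExtension,
SketchProxy, FakeServices) are hypothetical and are not the API defined in
PIP-95 or in the PR.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical view of the services the Proxy would share with an extension. */
interface SharedProxyServices {
    /** Returns the role for the given credentials, or null if authentication fails. */
    String authenticate(String credentials);

    boolean isAuthorized(String role, String topic);

    /** Returns the address of the broker that should serve the topic. */
    String discoverBroker(String topic);
}

/** Hypothetical extension contract; NOT the interface defined by PIP-95. */
interface SketchProxyExtension {
    String name();

    void start(SharedProxyServices services);

    /** Handles one request of the external protocol and reports the routing decision. */
    String route(String credentials, String topic);
}

/** Toy extension for an imaginary protocol that has no "lookup" concept of its own. */
final class ToyProtocolExtension implements SketchProxyExtension {
    private SharedProxyServices services;

    @Override public String name() { return "toy"; }

    @Override public void start(SharedProxyServices services) { this.services = services; }

    @Override public String route(String credentials, String topic) {
        String role = services.authenticate(credentials);
        if (role == null) {
            return "DENIED: authentication failed";
        }
        if (!services.isAuthorized(role, topic)) {
            return "DENIED: not authorized";
        }
        // The extension performs the lookup on behalf of the external client.
        return "FORWARD to " + services.discoverBroker(topic);
    }
}

/** Stand-in for the Proxy process: one set of services shared by all extensions. */
final class SketchProxy {
    private final Map<String, SketchProxyExtension> extensions = new LinkedHashMap<>();
    private final SharedProxyServices services;

    SketchProxy(SharedProxyServices services) { this.services = services; }

    void load(SketchProxyExtension extension) {
        extension.start(services);
        extensions.put(extension.name(), extension);
    }

    String dispatch(String extensionName, String credentials, String topic) {
        SketchProxyExtension extension = extensions.get(extensionName);
        if (extension == null) {
            throw new IllegalArgumentException("no such extension: " + extensionName);
        }
        return extension.route(credentials, topic);
    }
}

/** In-memory fake services, only for the demo; a real proxy would back these. */
final class FakeServices implements SharedProxyServices {
    @Override public String authenticate(String credentials) {
        return "token-alice".equals(credentials) ? "alice" : null;
    }

    @Override public boolean isAuthorized(String role, String topic) {
        return "alice".equals(role) && topic.startsWith("public/");
    }

    @Override public String discoverBroker(String topic) {
        return "broker-0.internal:6650";
    }
}

final class ProxyExtensionSketch {
    public static void main(String[] args) {
        SketchProxy proxy = new SketchProxy(new FakeServices());
        proxy.load(new ToyProtocolExtension());
        System.out.println(proxy.dispatch("toy", "token-alice", "public/default/t1"));
        System.out.println(proxy.dispatch("toy", "bad-token", "public/default/t1"));
    }
}
```

The point of the sketch is only that the extension never re-implements
authentication, authorization or discovery: it receives them from the host
process, which is what makes a separate "proxy service" per protocol
unnecessary.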

This is the PIP-95
https://github.com/apache/pulsar/issues/12157

This is the PR for the implementation
https://github.com/apache/pulsar/pull/11838

I hope that this helps to better understand the use cases I have presented
during the discussion, and that it allows the community to reach a consensus
on adopting this feature.

It would be nice to port the "WebSocket proxy" to a "proxy extension"
one day, but this is a separate discussion, not part of PIP-95.

Best regards
Enrico

On Fri, Sep 17, 2021 at 08:18, Sijie Guo <gu...@gmail.com> wrote:

> > I totally understand this point. I wasn't there when the proxy was born
> > but currently my experience is that the Proxy is perceived as the primary
> > endpoint in front of the Pulsar cluster, especially when you run in k8s.
>
> The Pulsar Proxy was born because there was no great solution at that time.
> However, the Kubernetes stack has evolved beyond what it was then, and so
> has Pulsar.
>
> For example,
>
> https://github.com/apache/pulsar/wiki/PIP-60%3A-Support-Proxy-server-with-SNI-routing
> was introduced to allow using other mature proxy software with SNI routing.
>
> Multiple broker listeners have been introduced to allow better integration
> with proxy and service mesh solutions. Hence I don't think "proxy" is the
> primary endpoint in front of a Pulsar cluster anymore.
>
> Hence I don't think a proxy PH is the right solution for the problems you
> are trying to solve. I would avoid introducing PHs to the proxy.
>
> - Sijie
>
> On Tue, Sep 14, 2021 at 8:02 AM Enrico Olivelli <eo...@gmail.com>
> wrote:
>
> > other comments ?
> >
> > Enrico
> >
> > On Thu, Sep 9, 2021 at 09:15, Enrico Olivelli <eolivelli@gmail.com> wrote:
> >
> > > Joe,
> > >
> > > On Thu, Sep 9, 2021 at 04:31, Joe F <jo...@gmail.com> wrote:
> > >
> > >> Enrico, my initial comment  when you brought up PH was in relation to
> > the
> > >> larger question about proxying, rather than looking at this in a
> limited
> > >> fashion on how to  make it easy to add new PH in the proxy.
> > >>
> > >> But specifically with this, here are my comments. Two very
> > >> distinct abstractions are being mixed up here, and I'm not sure
> > >> whether that is a good idea or not.
> > >>
> > >
> > > One way of seeing this PIP is to simply complete the work initiated
> > > with PIP-41 (Introduction of Broker PHs,
> > > https://github.com/apache/pulsar/wiki/PIP-41%3A-Pluggable-Protocol-Handler
> > > ).
> > >
> > >
> > >
> > >>
> > >> The proxy was designed to move bits and bytes without interpretation,
> > >> from one network to another. The issue with Pulsar is that it requires
> > >> some interpretation of the data to find which server a client should
> > >> connect to. Protocol translation crept into the proxy just to be able
> > >> to ask this question. Since auth is required to answer this question,
> > >> auth also crept in. Essentially the proxy was built as a TCP proxy, not
> > >> as a wire protocol translator. Some additional hacky things needed to
> > >> be done to make it work as a TCP proxy, and in my opinion those things
> > >> should die away to the fullest extent possible.
> > >>
> > >
> > > I totally understand this point. I wasn't there when the proxy was born
> > > but currently
> > > my experience is that the Proxy is perceived as the primary endpoint in
> > > front of the Pulsar cluster
> > > especially when you run in k8s.
> > >
> > >
> > >
> > >>
> > >> Because of all this, the current implementation is not ideal. Its
> > >> usage is highly restricted in actual deployments, because of potential
> > >> security risks if the proxy is misconfigured. One needs to be strict
> > >> about setting up the proxy to meet security standards in highly
> > >> regulated environments.
> > >>
> > >>
> > >>
> > >> >And we faced the limitation of the need to create a new proxy service
> > for
> > >> >each new protocol, but all of these "proxy services" have in common
> > >> >most of the features of the Pulsar proxy.
> > >> >When we also came to deal with System Architects it was clear the
> > >> >requirement to have only one single "place" to put all of the
> > >> interactions
> > >> >at "cluster level" with Pulsar.
> > >>
> > >> Good idea, a single place seems right. Can the proxy answer the
> traffic
> > >> routing question without interpreting the data? Essentially, move what
> > is
> > >> done within the proxy now,  to a well known service within the
> cluster,
> > >> and
> > >> use that ?
> > >>
> > >
> > > In the usecases I know, simply routing PDUs to internal brokers is not
> > > enough
> > > but you often need to add complex mapping logic from the External
> > Protocol
> > > Concepts to Pulsar concepts on the Proxy component.
> > >
> > > So you have two ways:
> > > 1. create your own service and deploy it separately: this was the
> > > beginning of my work and the same did some colleagues of mine
> > > 2. deploy your code inside the Pulsar Proxy, and leverage current
> > > packaging, configuration, tools, security APIs, helm chart.....
> > >
> > > I started this discussion because I found option 1 very awkward for
> Proxy
> > > Component developers, for System Administrators and for System
> > Architects.
> > >
> > > Developers:
> > > - you have to copy/paste some Pulsar Proxy code, import Proxy jars, use
> > > internal Pulsar classes to implement Authentication, Authorization,
> > Service
> > > Discovery., Configuration...
> > >
> > > System Administrators:
> > > - you have a new set of configuration files and tools to manage the
> > > settings (and in k8s you have to modify the Helm Chart significantly)
> > >
> > > System Architects:
> > > - you have multiple new components in the pictures, to explain, to
> > > justify.....
> > >
> > > With this proposal:
> > >
> > > Developers:
> > > - use a framework, do not reinvent the wheel, be able to ensure that
> > > you are compatible with a given Pulsar version, ensure that the
> > > behaviour is consistent with other Pulsar components (like using
> > > ProxyConfiguration, or the same service lifecycle, same libs), and
> > > evolve more easily
> > >
> > > System Administrator:
> > > - you use proxy.conf/broker.conf, you use Pulsar CLI tools, no need to
> > > change the Helm Charts
> > >
> > > System Architects:
> > > - nothing new on the table, all the Pulsar docs apply, you have the
> > > Proxy that deals with external clients, but it is able to speak Pulsar,
> > > Kafka, RabbitMQ, MQTT, ActiveMQ
> > >
> > >
> > >
> > >
> > >>
> > >> >I think this is a good picture of what I mean:
> > >> >- PH in the Broker -> add protocols inside the Broker, work for owned
> > >> topics
> > >> >- PH in the Proxy -> add protocols in front of the whole Cluster
> > >> >There is a good amount of processing that should be executed on the
> > >> proxy,
> > >> >and it is not good to run it on a broker.
> > >>
> > >>  Is a TCP proxy a good place to do wire protocol translation
> > >> (computation)?
> > >> Especially if that translation is a good amount of processing?  if
> it's
> > >> not
> > >> good to run this much processing on the broker, then it's even worse
> to
> > >> run
> > >> it on a network proxy. I can foresee this as a path that will lead to
> > >> cluster and load management creeping into the proxy, as soon as you
> move
> > >> beyond what a single proxy can handle.
> > >>
> > >> But I think these issues (of n/w vs protocol translation) are moot
> when
> > >> you
> > >> look at the larger needs of  generic proxy that will support ingress,
> > >> configurable protocol handlers, load balancing etc for use with
> Pulsar.
> > >> You
> > >> can run a bunch of Pulsar's  proxies today, and there is no means to
> > >> manage
> > >> them properly. eg: load balance between them/ manage them as a
> cluster/
> > >> have affinity of proxies to topics/tenants. etc. This applies even
> > before
> > >> this PIP (and more so once you add more processing into the proxy).
> > >>
> > >> The Pulsar proxy, as it is,  is not amenable to creating anything
> like a
> > >> service mesh. It would demand a lot of work in the proxy. Hence my
> > >> initial comment about the proxy eventually becoming a mudball, and why
> > we
> > >> should rethink this entire proxy.
> > >>
> > >>  It is tempting to evolve the Pulsar proxy into a service that
> supports
> > >> everything.. ingress, transformation chains, cluster management  etc .
> > >> This  will eventually end up  duplicating something which already
> exists
> > >> elsewhere.  My take is that this is better done by building on top of
> > >> something like envoy ( or similar) which has built in and mature
> > >> features,
> > >> and supported by a wide user base.
> > >>
> > >
> > > Unfortunately general purpose proxies or proxies specific to some
> > protocol
> > > will not be able to
> > > do efficiently what we can do using Pulsar APIs, because they cannot
> > "map"
> > > directly External Concepts to the Pulsar model.
> > >
> > > I cannot imagine the cost of developing and maintaining a plugin for
> > Envoy
> > > that is able to deal
> > > with Pulsar concepts. For instance it is not written in Java and you
> > > cannot use Java Bindings for Pulsar, that are feature complete and
> always
> > > up-to-date with latest features.
> > > Also developers that work on PHs are specialized in Pulsar code and in
> > > Java (at very high levels), and so for them it is harder to write super
> > > efficient and high quality plugins using non-Java languages.
> > >
> > > So I see a huge value in adding this ability to the Pulsar Proxy.
> > >
> > > The only alternative to this PIP is to create a new framework for
> > creating
> > > such "Smart Proxies" in Java and using some official/maintained Pulsar
> > API.
> > >
> > > So we will end up discussing the value of adding such a brand new
> module,
> > > and how to deploy/manage it.
> > >
> > > It is a huge cost and it will take so much time:
> > > - design,
> > > - adding new concepts to the architecture,
> > > - adding a new service (new management tools),
> > > - lot of new code (probably cut/paste from Pulsar Proxy)
> > > - helm chart
> > > - new configuration files
> > > - docs
> > >
> > > I believe that we should spend our time in adding more
> bindings/protocol
> > > handlers instead of doing that.
> > >
> > > By the way I will be happy to drive this new effort if this is REALLY
> > what
> > > we want.
> > >
> > > So I am convinced that for the short/mid term this PIP is the best
> choice
> > > to help Pulsar adoption.
> > >
> > > This PIP will unlock some great potential that otherwise will be
> > > available only to users of custom tools, not officially maintained
> > > inside the Pulsar project.
> > > I will be very sad about the outcome
> > >
> > >
> > >
> > > Enrico
> > >
> > >
> > >
> > >>
> > >> -j
> > >>
> > >> On Tue, Sep 7, 2021 at 11:11 PM Enrico Olivelli <eo...@gmail.com>
> > >> wrote:
> > >>
> > >> > (ping)
> > >> >
> > >> >
> > >> > Il giorno ven 3 set 2021 alle ore 14:06 Enrico Olivelli <
> > >> > eolivelli@gmail.com>
> > >> > ha scritto:
> > >> >
> > >> > > Sijie,
> > >> > > Thanks for your questions, answers inline below.
> > >> > >
> > >> > > Il giorno gio 2 set 2021 alle ore 02:23 Sijie Guo <
> > guosijie@gmail.com
> > >> >
> > >> > ha
> > >> > > scritto:
> > >> > >
> > >> > >> I would like to see the clarification between the broker protocol
> > >> > handlers
> > >> > >> and proxy protocol handlers before moving it to a vote thread.
> > >> > >>
> > >> > >
> > >> > > A PH in the broker is very useful as it allows you to directly
> > access
> > >> the
> > >> > > ManagedLedger and implement high performance adapters for
> > >> > > other wire protocols.
> > >> > > The bigger limitation is that you can access efficiently only the
> > >> topics
> > >> > > owned by the local broker.
> > >> > > If you try to forward/proxy the request to another broker (you can
> > do
> > >> it,
> > >> > > and this was Matteo's suggestion at the latest Video Community
> > >> meeting)
> > >> > > you have the downside that the broker has to waste resources to do
> > the
> > >> > > "proxy work"
> > >> > > and you generally want a broker machine to be used only to deal
> with
> > >> the
> > >> > > local traffic.
> > >> > >
> > >> > > The load balancing mechanism of the brokers is not meant to deal
> > with
> > >> > > additional work due to proxying requests related to the topics for
> > >> which
> > >> > > the broker is not owner.
> > >> > >
> > >> > > A PH in the proxy is useful to add new protocols that are running
> in
> > >> > front
> > >> > > of the whole cluster and not only of one single broker.
> > >> > > This is a very different use case in respect to having the PH in
> > >> broker.
> > >> > >
> > >> > > The work of the proxy usually is to forward requests to the
> internal
> > >> > > services of the cluster, and in case of new protocols in the proxy
> > >> > > you need some logic to fill in the gaps in the original
> > wireprotocol.
> > >> > >
> > >> > > System architects expect a different kind of load on the proxy and
> > >> other
> > >> > > kinds of load on the brokers.
> > >> > > For instance you usually can run very few proxies to cover a big
> > >> cluster
> > >> > > with many brokers.
> > >> > > So adding a PH on all the brokers is sometimes overkilling.
> > >> > >
> > >> > >
> > >> > >>
> > >> > >> I can see how it will cause confusion for protocol developers.
> > >> > >>
> > >> > >
> > >> > > Protocol developers are very advanced users that do need to
> > understand
> > >> > > clearly the internals of Pulsar.
> > >> > > In fact this request of having PHs in the Proxy layer came from
> > myself
> > >> > and
> > >> > > from other colleagues of mine who are working heavily in
> > implementing
> > >> > > new protocol handlers in Pulsar.
> > >> > >
> > >> > > And we faced the limitation of the need to create a new proxy
> > service
> > >> for
> > >> > > each new protocol, but all of these "proxy services" have in
> common
> > >> > > most of the features of the Pulsar proxy.
> > >> > > When we also came to deal with System Architects it was clear the
> > >> > > requirement to have only one single "place" to put all of the
> > >> > interactions
> > >> > > at "cluster level" with Pulsar.
> > >> > >
> > >> > > I think this is a good picture of what I mean:
> > >> > > - PH in the Broker -> add protocols inside the Broker, work for
> > owned
> > >> > > topics
> > >> > > - PH in the Proxy -> add protocols in front of the whole Cluster
> > >> > >
> > >> > >
> > >> > >> Yunze brought a good idea on KoP.
> > >> > >
> > >> > >
> > >> > > I also have good ideas and working solutions for a Pulsar-proxy
> like
> > >> KOP
> > >> > > Proxy.
> > >> > > I will be happy to discuss this in a separate thread or at a
> > separate
> > >> > > table with Yunze.
> > >> > >
> > >> > > A smart KOP proxy can work if you run inside the Pulsar proxy
> > process
> > >> or
> > >> > > you can copy/paste the Pulsar Proxy code and create another
> service.
> > >> > >
> > >> > >
> > >> > >> But I don't think that's the right
> > >> > >> direction. If you can give an example of the usage of a proxy
> > handler
> > >> > and
> > >> > >> how it is different from using a broker handler, that would help
> me
> > >> > >> understand this PIP.
> > >> > >>
> > >> > >
> > >> > > For some protocols you have to execute some non trivial work for
> > >> mapping
> > >> > > the wireprotocol and the concepts of the protocol to the Pulsar
> > model.
> > >> > > For instance some protocols do not have the concept of "lookup",
> and
> > >> the
> > >> > > proxy does the lookup and forwards the request to the internal
> > broker.
> > >> > >
> > >> > > For some protocols you can just use the PulsarClient to connect to
> > the
> > >> > > internal brokers, you do not need and you do not want to access
> the
> > >> > > ManagedLedgers:
> > >> > > in this case adding the execution inside the broker is only
> > >> complicating
> > >> > > the overall design of the system and putting load on the brokers.
> > >> > >
> > >> > > There is a good amount of processing that should be executed on
> the
> > >> > proxy,
> > >> > > and it is not good to run it on a broker.
> > >> > > If you do not put the "custom code" in the Proxy and you can only
> > >> write a
> > >> > > Broker PH you end up in adding it to the Broker.
> > >> > >
> > >> > > If you expose directly (with some LoadBalancer or whatever) your
> > >> brokers
> > >> > > in which you run the PH code that you would put in the proxy
> > >> > > you end up in putting on the broker some load that is not
> expected:
> > >> > > - the broker will have to work even for topics for which it is not
> > the
> > >> > > owner
> > >> > > - the broker will have to do things that cannot be dealt correctly
> > by
> > >> the
> > >> > > Pulsar load balancer (because it expects that the load it
> > >> proportional to
> > >> > > the owned bundles)
> > >> > >
> > >> > >
> > >> > >>
> > >> > >> The reason why Pulsar proxy is built is to have a "smart" proxy
> > that
> > >> is
> > >> > >> aware of Pulsar protocol. The Pulsar proxy can be replaced with
> > other
> > >> > >> mature proxy software with SNI routing or multiple advertised
> > >> listeners
> > >> > >> now. Hence I am afraid that we are taking the wrong direction
> here.
> > >> Here
> > >> > >> are various reasons.
> > >> > >>
> > >> > >> 1) The ProxyService is essentially a Pulsar admin client. Broker
> > >> service
> > >> > >> also provides a Pulsar admin client. I am not sure how Proxy PH
> > will
> > >> > >> simplify the protocol handler development. Please use an example
> to
> > >> > >> demonstrate it.
> > >> > >>
> > >> > >
> > >> > > In the cases I am highlighting, *the Broker is simply not the
> right
> > >> place
> > >> > > to run the code*.
> > >> > >
> > >> > > So the problem here is not to have PulsarAdmin in the Broker on in
> > the
> > >> > > Proxy.
> > >> > > Is that if you want to write a smart proxy for another protocol:
> > >> > > - you end up in copy/pasting the Proxy code
> > >> > > - you use the internal Pulsar classes to have a consistent
> behaviour
> > >> with
> > >> > > the Pulsar Proxy
> > >> > > - you add more components to the "picture" of the Pulsar cluster
> > >> > >
> > >> > >
> > >> > >> 2) The Authorization & Authentication services in ProxyService
> are
> > >> only
> > >> > >> used when proxies are configured to use zookeeper for broker
> > >> discovery.
> > >> > >> However, this option is not recommended when running Pulsar
> proxies
> > >> in
> > >> > >> Kubernetes. Instead, using a broker discovery service is
> > >> recommended. In
> > >> > >> order to make PH work, you are forcing proxy to be tight with the
> > >> > >> zookeeper.
> > >> > >>
> > >> > >
> > >> > > This is not needed for all of the Proxy PH handlers.
> > >> > > But Authorization & Authentication  are a core part of this story.
> > >> > > If you implement your "smart proxy" somewhere else and not as a
> > >> Plugin to
> > >> > > the Pulsar Proxy (or Broker)
> > >> > > you cannot leverage the same services, the same way.
> > >> > > It leads to having more chances of having a behaviour different
> from
> > >> > > standard Pulsar.
> > >> > >
> > >> > > PH developers are Pulsar experts, and you know that copy pasting
> > code
> > >> > from
> > >> > > Pulsar, leads to unpredictable behaviour
> > >> > > when you run your plugin in another version of Pulsar.
> > >> > > But if you use an API that is going to be maintained by Pulsar you
> > are
> > >> > > safer and you can think that your code is going to work.
> > >> > >
> > >> > >
> > >> > >>
> > >> > >> 3) Configuring authentication and authorization in proxy is
> already
> > >> > >> challenging. There are a few different combinations. A typical
> > Pulsar
> > >> > >> setup
> > >> > >> is to forward the authentication credentials to the brokers to
> > >> > >> authenticate
> > >> > >> and authorize. If you don't do this correctly, it will introduce
> > >> > security
> > >> > >> holes because a connection can potentially grab the superuser
> > >> credential
> > >> > >> configured in proxy and use superuser credentials to access
> > brokers.
> > >> > From
> > >> > >> this perspective, I think proxy protocol handler doesn't make
> > things
> > >> > >> simpler instead it makes things complicated when it comes to
> > >> > >> authentication
> > >> > >> and authorization.
> > >> > >>
> > >> > >
> > >> > > Yes, this is a very complex problem indeed.
> > >> > >
> > >> > > We can help developers by providing a standard framework to access
> > >> these
> > >> > > services.
> > >> > >
> > >> > > It is very important from my point of view, that we do not
> encourage
> > >> > > developers to create
> > >> > > their own versions of a Pulsar proxy.
> > >> > >
> > >> > > My recent experience is that we can add many new wire protocols to
> > >> Pulsar
> > >> > > and this will help a lot with the adoption of Pulsar.
> > >> > >
> > >> > > As we are doing in many other places on Pulsar we should provide
> > >> tools to
> > >> > > write extensions
> > >> > > and do not let people be too creative.
> > >> > >
> > >> > >
> > >> > >>
> > >> > >> I would like to see these questions are answered before moving
> to a
> > >> > vote.
> > >> > >>
> > >> > >
> > >> > > I hope that we can reach consensus on the need of this API.
> > >> > > because I see that there is a real need for making this happen.
> > >> > >
> > >> > > It is the Pulsar momentum now, there are so many opportunities to
> > >> reach
> > >> > > out to users of other systems,
> > >> > > let's not waste these opportunities.
> > >> > >
> > >> > >
> > >> > > Enrico
> > >> > >
> > >> > >
> > >> > >
> > >> > >>
> > >> > >> - Sijie
> > >> > >>
> > >> > >>
> > >> > >>
> > >> > >>
> > >> > >> On Wed, Sep 1, 2021 at 12:55 PM Enrico Olivelli <
> > eolivelli@gmail.com
> > >> >
> > >> > >> wrote:
> > >> > >>
> > >> > >> > Any other comment?
> > >> > >> >
> > >> > >> > I would like to start a VOTE, but I feel we saw too few
> comments
> > >> here
> > >> > >> >
> > >> > >> > Please take a look.
> > >> > >> > I believe it will be a good fit for 2.9.0 release, that is
> going
> > >> to be
> > >> > >> > released in the end of September
> > >> > >> >
> > >> > >> >
> > >> > >> > Enrico
> > >> > >> >
> > >> > >> > Il Mar 31 Ago 2021, 18:14 Michael Marshall <
> > mikemarsh17@gmail.com>
> > >> ha
> > >> > >> > scritto:
> > >> > >> >
> > >> > >> > > +1, just read through the PIP. Looks good to me.
> > >> > >> > >
> > >> > >> > > - Michael
> > >> > >> > >
> > >> > >> > > On Mon, Aug 30, 2021 at 3:47 AM Enrico Olivelli <
> > >> > eolivelli@gmail.com>
> > >> > >> > > wrote:
> > >> > >> > >
> > >> > >> > > > Hello Pulsar fellows,
> > >> > >> > > >
> > >> > >> > > > I have prepared a PIP about adding support for Protocol
> > >> Handlers
> > >> > >> > > >
> > >> > >> > > > This is the GDoc
> > >> > >> > > >
> > >> > >> > > >
> > >> > >> > > >
> > >> > >> > >
> > >> > >> >
> > >> > >>
> > >> >
> > >>
> >
> https://docs.google.com/document/d/1Hlc_BOpQTkWX8FgrvWSfk6h5xTQKMXnTcSuil0Nznrg/edit?usp=sharing
> > >> > >> > > >
> > >> > >> > > >
> > >> > >> > > > This is the PR for the implementation
> > >> > >> > > > https://github.com/apache/pulsar/pull/11838/files
> > >> > >> > > >
> > >> > >> > > > I am pretty sure that this PIP will make life of developers
> > of
> > >> > >> Protocol
> > >> > >> > > > Handlers and of Administrators who deploy Protocol Handlers
> > >> very
> > >> > >> nicer
> > >> > >> > > >
> > >> > >> > > > We are still working on the formal PIP process, at the
> moment
> > >> I am
> > >> > >> > > sharing
> > >> > >> > > > with you the document.
> > >> > >> > > > My understanding is that after the discussion, I will
> start a
> > >> VOTE
> > >> > >> > > thread,
> > >> > >> > > > and if the VOTE passes we can move forward with reviewing
> the
> > >> PR,
> > >> > >> and
> > >> > >> > > > hopefully merge this feature for Pulsar 2.9.0
> > >> > >> > > >
> > >> > >> > > > Enrico
> > >> > >> > > >
> > >> > >> > >
> > >> > >> >
> > >> > >>
> > >> > >
> > >> >
> > >>
> > >
> >
>

Re: PIP-93 Pulsar Proxy Protocol Handlers

Posted by Sijie Guo <gu...@gmail.com>.
> I totally understand this point. I wasn't there when the proxy was born
> but currently
> my experience is that the Proxy is perceived as the primary endpoint in
> front of the Pulsar cluster
> especially when you run in k8s.

The Pulsar Proxy was born because there was no great solution at that point.
However, the Kubernetes stack has evolved beyond what it was before, and so
has Pulsar.

For example,
https://github.com/apache/pulsar/wiki/PIP-60%3A-Support-Proxy-server-with-SNI-routing
was introduced to allow the use of other mature proxy software with SNI routing.
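
As a rough illustration of the SNI routing idea, the sketch below shows a
proxy that picks the backend only from the TLS Server Name Indication host
name sent by the client, without parsing any Pulsar frames. It is a minimal
sketch under stated assumptions: the class name SniRouter, the host names and
the addresses are all hypothetical and do not come from Pulsar or from PIP-60.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: SniRouter is an illustration only, not a Pulsar class.
final class SniRouter {
    // Lower-cased SNI host name -> internal broker address ("host:port").
    private final Map<String, String> brokerBySniHost;

    SniRouter(Map<String, String> brokerBySniHost) {
        // Normalize keys once, because DNS host names are case-insensitive.
        Map<String, String> normalized = new HashMap<>();
        for (Map.Entry<String, String> e : brokerBySniHost.entrySet()) {
            normalized.put(e.getKey().toLowerCase(Locale.ROOT), e.getValue());
        }
        this.brokerBySniHost = Map.copyOf(normalized);
    }

    // Returns the backend for the given SNI host name, or empty if the
    // host name is missing or unknown. No payload inspection is involved.
    Optional<String> route(String sniHostName) {
        if (sniHostName == null) {
            return Optional.empty();
        }
        String key = sniHostName.toLowerCase(Locale.ROOT);
        return Optional.ofNullable(brokerBySniHost.get(key));
    }
}
```

The routing decision here depends only on a static host name table, which is
the kind of work a general-purpose proxy can do. Mapping the concepts of
another wire protocol onto Pulsar topics would need logic beyond this table.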

Multiple broker listeners have been introduced to allow better integrations
with proxy and service mesh solutions. Hence I don't think "proxy" is the
primary endpoint in front of a Pulsar cluster anymore.

Hence I don't think proxy PH is the right solution for the problems you are
trying to solve. I would avoid introducing PH to proxy.

- Sijie

On Tue, Sep 14, 2021 at 8:02 AM Enrico Olivelli <eo...@gmail.com> wrote:

> other comments ?
>
> Enrico
>
> Il giorno gio 9 set 2021 alle ore 09:15 Enrico Olivelli <
> eolivelli@gmail.com>
> ha scritto:
>
> > Joe,
> >
> > Il giorno gio 9 set 2021 alle ore 04:31 Joe F <jo...@gmail.com> ha
> > scritto:
> >
> >> Enrico, my initial comment when you brought up PH was in relation to the
> >> larger question about proxying, rather than looking at this in a limited
> >> fashion on how to make it easy to add new PH in the proxy.
> >>
> >> But specifically with this, here are my comments. Two very
> >> distinct abstractions are being mixed up here, and I'm not sure
> >> whether that is a good idea or not.
> >>
> >
> > One way of seeing this PIP is to simply complete the work initiated with
> > PIP-41 (Introduction of Broker PHs,
> >
> https://github.com/apache/pulsar/wiki/PIP-41%3A-Pluggable-Protocol-Handler
> > ).
> >
> >
> >
> >>
> >> The proxy was designed to move bits and bytes without interpretation, from
> >> one network to another. The issue with Pulsar is that it requires some
> >> interpretation of the data to find which server a client should connect
> >> to. Protocol translation crept into the proxy, just to be able to ask
> >> this question. Since auth is required to answer this question, auth also
> >> crept in. Essentially the proxy was built as a TCP proxy, not as a wire
> >> protocol translator. Some additional hacky things needed to be done to
> >> make it work as a TCP proxy, and in my opinion those things should die
> >> away to the fullest extent possible.
> >>
> >
> > I totally understand this point. I wasn't there when the proxy was born
> > but currently
> > my experience is that the Proxy is perceived as the primary endpoint in
> > front of the Pulsar cluster
> > especially when you run in k8s.
> >
> >
> >
> >>
> >> Because of all this, the current implementation is not ideal. Its usage
> >> is highly restricted in actual deployments, because of potential security
> >> risks if the proxy is misconfigured. One needs to be strict about setting
> >> up the proxy to meet security standards in highly regulated environments.
> >>
> >>
> >>
> >> >And we faced the limitation of the need to create a new proxy service
> for
> >> >each new protocol, but all of these "proxy services" have in common
> >> >most of the features of the Pulsar proxy.
> >> >When we also came to deal with System Architects it was clear the
> >> >requirement to have only one single "place" to put all of the
> >> interactions
> >> >at "cluster level" with Pulsar.
> >>
> >> Good idea, a single place seems right. Can the proxy answer the traffic
> >> routing question without interpreting the data? Essentially, move what
> is
> >> done within the proxy now,  to a well known service within the cluster,
> >> and
> >> use that ?
> >>
> >
> > In the usecases I know, simply routing PDUs to internal brokers is not
> > enough
> > but you often need to add complex mapping logic from the External
> Protocol
> > Concepts to Pulsar concepts on the Proxy component.
> >
> > So you have two ways:
> > 1. create your own service and deploy it separately: this was the
> > beginning of my work and the same did some colleagues of mine
> > 2. deploy your code inside the Pulsar Proxy, and leverage current
> > packaging, configuration, tools, security APIs, helm chart.....
> >
> > I started this discussion because I found option 1 very awkward for Proxy
> > Component developers, for System Administrators and for System
> Architects.
> >
> > Developers:
> > - you have to copy/paste some Pulsar Proxy code, import Proxy jars, use
> > internal Pulsar classes to implement Authentication, Authorization,
> Service
> > Discovery., Configuration...
> >
> > System Administrators:
> > - you have a new set of configuration files and tools to manage the
> > settings (and in k8s you have to modify the Helm Chart significantly)
> >
> > System Architects:
> > - you have multiple new components in the pictures, to explain, to
> > justify.....
> >
> > With this proposal:
> >
> > Developers:
> > - use a framework, do not reinvent the wheel, be able to ensure that you
> > are compatible with a give Pulsar version, ensure that the behaviour is
> > consistent with other Pulsar components (like using ProxyConfiguration,
> or
> > the same service lifecycle, same libs) you can evolve more easily
> >
> > System Administrator:
> > - you use proxy.conf/broker.conf, you use Pulsar CLI tools, no need to
> > change the Helm Charts
> >
> > System Architects:
> > - nothing new in the table, every Pulsar docs applies, you have the Proxy
> > that deals with external clients, but it is able to speak Pulsar, Kafka,
> > RabbitMQ, MQTT, ActiveMQ
> >
> >
> >
> >
> >>
> >> >I think this is a good picture of what I mean:
> >> >- PH in the Broker -> add protocols inside the Broker, work for owned
> >> topics
> >> >- PH in the Proxy -> add protocols in front of the whole Cluster
> >> >There is a good amount of processing that should be executed on the
> >> proxy,
> >> >and it is not good to run it on a broker.
> >>
> >>  Is a TCP proxy a good place to do wire protocol translation
> >> (computation)?
> >> Especially if that translation is a good amount of processing?  if it's
> >> not
> >> good to run this much processing on the broker, then it's even worse to
> >> run
> >> it on a network proxy. I can foresee this as a path that will lead to
> >> cluster and load management creeping into the proxy, as soon as you move
> >> beyond what a single proxy can handle.
> >>
> >> But I think these issues (of n/w vs protocol translation) are moot when
> >> you
> >> look at the larger needs of  generic proxy that will support ingress,
> >> configurable protocol handlers, load balancing etc for use with Pulsar.
> >> You
> >> can run a bunch of Pulsar's  proxies today, and there is no means to
> >> manage
> >> them properly. eg: load balance between them/ manage them as a cluster/
> >> have affinity of proxies to topics/tenants. etc. This applies even
> before
> >> this PIP (and more so once you add more processing into the proxy).
> >>
> >> The Pulsar proxy, as it is,  is not amenable to creating anything like a
> >> service mesh. It would demand a lot of work in the proxy. Hence my
> >> initial comment about the proxy eventually becoming a mudball, and why
> we
> >> should rethink this entire proxy.
> >>
> >>  It is tempting to evolve the Pulsar proxy into a service that supports
> >> everything.. ingress, transformation chains, cluster management  etc .
> >> This  will eventually end up  duplicating something which already exists
> >> elsewhere.  My take is that this is better done by building on top of
> >> something like envoy ( or similar) which has built in and mature
> >> features,
> >> and supported by a wide user base.
> >>
> >
> > Unfortunately general purpose proxies or proxies specific to some
> protocol
> > will not be able to
> > do efficiently what we can do using Pulsar APIs, because they cannot
> "map"
> > directly External Concepts to the Pulsar model.
> >
> > I cannot imagine the cost of developing and maintaining a plugin for
> Envoy
> > that is able to deal
> > with Pulsar concepts. For instance it is not written in Java and you
> > cannot use Java Bindings for Pulsar, that are feature complete and always
> > up-to-date with latest features.
> > Also developers that work on PHs are specialized in Pulsar code and in
> > Java (at very high levels), and so for them it is harder to write super
> > efficient and high quality plugins using non-Java languages.
> >
> > So I see a huge value in adding this ability to the Pulsar Proxy.
> >
> > The only alternative to this PIP is to create a new framework for
> creating
> > such "Smart Proxies" in Java and using some official/maintained Pulsar
> API.
> >
> > So we will end up discussing the value of adding such a brand new module,
> > and how to deploy/manage it.
> >
> > It is a huge cost and it will take so much time:
> > - design,
> > - adding new concepts to the architecture,
> > - adding a new service (new management tools),
> > - lot of new code (probably cut/paste from Pulsar Proxy)
> > - helm chart
> > - new configuration files
> > - docs
> >
> > I believe that we should spend our time in adding more bindings/protocol
> > handlers instead of doing that.
> >
> > By the way I will be happy to drive this new effort if this is REALLY
> what
> > we want.
> >
> > So I am convinced that for the short/mid term this PIP is the best choice
> > to help Pulsar adoption.
> >
> > This PIP will unlock some great potential that otherwise will be
> > available only to users of custom tools, not officially maintained
> > inside the Pulsar project.
> > I will be very sad about the outcome
> >
> >
> >
> > Enrico
> >
> >
> >
> >>
> >> -j
> >>
> >> On Tue, Sep 7, 2021 at 11:11 PM Enrico Olivelli <eo...@gmail.com>
> >> wrote:
> >>
> >> > (ping)
> >> >
> >> >
> >> > Il giorno ven 3 set 2021 alle ore 14:06 Enrico Olivelli <
> >> > eolivelli@gmail.com>
> >> > ha scritto:
> >> >
> >> > > Sijie,
> >> > > Thanks for your questions, answers inline below.
> >> > >
> >> > > Il giorno gio 2 set 2021 alle ore 02:23 Sijie Guo <
> guosijie@gmail.com
> >> >
> >> > ha
> >> > > scritto:
> >> > >
> >> > >> I would like to see the clarification between the broker protocol
> >> > handlers
> >> > >> and proxy protocol handlers before moving it to a vote thread.
> >> > >>
> >> > >
> >> > > A PH in the broker is very useful as it allows you to directly
> access
> >> the
> >> > > ManagedLedger and implement high performance adapters for
> >> > > other wire protocols.
> >> > > The bigger limitation is that you can access efficiently only the
> >> topics
> >> > > owned by the local broker.
> >> > > If you try to forward/proxy the request to another broker (you can
> do
> >> it,
> >> > > and this was Matteo's suggestion at the latest Video Community
> >> meeting)
> >> > > you have the downside that the broker has to waste resources to do
> the
> >> > > "proxy work"
> >> > > and you generally want a broker machine to be used only to deal with
> >> the
> >> > > local traffic.
> >> > >
> >> > > The load balancing mechanism of the brokers is not meant to deal
> with
> >> > > additional work due to proxying requests related to the topics for
> >> which
> >> > > the broker is not owner.
> >> > >
> >> > > A PH in the proxy is useful to add new protocols that are running in
> >> > front
> >> > > of the whole cluster and not only of one single broker.
> >> > > This is a very different use case in respect to having the PH in
> >> broker.
> >> > >
> >> > > The work of the proxy usually is to forward requests to the internal
> >> > > services of the cluster, and in case of new protocols in the proxy
> >> > > you need some logic to fill in the gaps in the original
> wireprotocol.
> >> > >
> >> > > System architects expect a different kind of load on the proxy and
> >> other
> >> > > kinds of load on the brokers.
> >> > > For instance you usually can run very few proxies to cover a big
> >> cluster
> >> > > with many brokers.
> >> > > So adding a PH on all the brokers is sometimes overkilling.
> >> > >
> >> > >
> >> > >>
> >> > >> I can see how it will cause confusion for protocol developers.
> >> > >>
> >> > >
> >> > > Protocol developers are very advanced users that do need to
> understand
> >> > > clearly the internals of Pulsar.
> >> > > In fact this request of having PHs in the Proxy layer came from
> myself
> >> > and
> >> > > from other colleagues of mine who are working heavily in
> implementing
> >> > > new protocol handlers in Pulsar.
> >> > >
> >> > > And we faced the limitation of the need to create a new proxy
> service
> >> for
> >> > > each new protocol, but all of these "proxy services" have in common
> >> > > most of the features of the Pulsar proxy.
> >> > > When we also came to deal with System Architects it was clear the
> >> > > requirement to have only one single "place" to put all of the
> >> > interactions
> >> > > at "cluster level" with Pulsar.
> >> > >
> >> > > I think this is a good picture of what I mean:
> >> > > - PH in the Broker -> add protocols inside the Broker, work for
> owned
> >> > > topics
> >> > > - PH in the Proxy -> add protocols in front of the whole Cluster
> >> > >
> >> > >
> >> > >> Yunze brought a good idea on KoP.
> >> > >
> >> > >
> >> > > I also have good ideas and working solutions for a Pulsar-proxy like
> >> KOP
> >> > > Proxy.
> >> > > I will be happy to discuss this in a separate thread or at a
> separate
> >> > > table with Yunze.
> >> > >
> >> > > A smart KOP proxy can work if you run inside the Pulsar proxy
> process
> >> or
> >> > > you can copy/paste the Pulsar Proxy code and create another service.
> >> > >
> >> > >
> >> > >> But I don't think that's the right
> >> > >> direction. If you can give an example of the usage of a proxy
> handler
> >> > and
> >> > >> how it is different from using a broker handler, that would help me
> >> > >> understand this PIP.
> >> > >>
> >> > >
> >> > > For some protocols you have to execute some non trivial work for
> >> mapping
> >> > > the wireprotocol and the concepts of the protocol to the Pulsar
> model.

Re: PIP-93 Pulsar Proxy Protocol Handlers

Posted by Enrico Olivelli <eo...@gmail.com>.
Any other comments?

Enrico


Re: PIP-93 Pulsar Proxy Protocol Handlers

Posted by Enrico Olivelli <eo...@gmail.com>.
Joe,

On Thu, Sep 9, 2021 at 04:31, Joe F <jo...@gmail.com> wrote:

> Enrico, my initial comment when you brought up PH was in relation to the
> larger question about proxying, rather than looking at this in a limited
> fashion on how to make it easy to add new PH in the proxy.
>
> But specifically with this, here are my comments. Two very
> distinct abstractions are being mixed up here, and I'm not sure
> whether that is a good idea or not.
>

One way to look at this PIP is that it simply completes the work started
with PIP-41 (Introduction of Broker PHs,
https://github.com/apache/pulsar/wiki/PIP-41%3A-Pluggable-Protocol-Handler).



>
> The proxy was designed to move bits and bytes without interpretation, from
> one network to another. The issue with Pulsar is that it requires some
> interpretation of the data to find which server a client should connect
> to. Protocol translation crept into the proxy, just to be able to ask this
> question. Since auth is required to answer this question, auth also crept
> in. Essentially the proxy was built as a TCP proxy, not as a wire protocol
> translator. Some additional hacky things needed to be done to make it work
> as a TCP proxy, and in my opinion those things should die away to the
> fullest extent possible.
>

I totally understand this point. I wasn't there when the proxy was born,
but in my current experience the Proxy is perceived as the primary endpoint
in front of the Pulsar cluster, especially when you run in k8s.



>
> Because of all this, the current implementation is not ideal. Its usage
> is highly restricted in actual deployments, because of potential security
> risks if the proxy is misconfigured. One needs to be strict about setting
> up the proxy to meet security standards in highly regulated environments.
>
>
>
> >And we faced the limitation of the need to create a new proxy service for
> >each new protocol, but all of these "proxy services" have in common
> >most of the features of the Pulsar proxy.
> >When we also came to deal with System Architects it was clear the
> >requirement to have only one single "place" to put all of the interactions
> >at "cluster level" with Pulsar.
>
> Good idea, a single place seems right. Can the proxy answer the traffic
> routing question without interpreting the data? Essentially, move what is
> done within the proxy now to a well-known service within the cluster, and
> use that?
>

In the use cases I know, simply routing PDUs to internal brokers is not
enough: you often need to add complex logic on the Proxy component that
maps the concepts of the external protocol to Pulsar concepts.
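
To make "mapping logic" concrete, here is a minimal Java sketch; it is not
taken from the PIP or from the PR. It assumes an MQTT-style external topic
such as sensors/room1/temp and a fixed tenant and namespace. The class
ExternalTopicMapper and the choice of URL-encoding the slashes are
hypothetical; only the persistent://tenant/namespace/topic shape is the
usual Pulsar topic name format.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper, for illustration only; not part of Pulsar or of the PIP. */
final class ExternalTopicMapper {
    private final String tenant;
    private final String namespace;

    ExternalTopicMapper(String tenant, String namespace) {
        this.tenant = tenant;
        this.namespace = namespace;
    }

    /** Maps an external topic such as "sensors/room1/temp" to a Pulsar topic name. */
    String toPulsarTopic(String externalTopic) {
        if (externalTopic == null || externalTopic.isEmpty()) {
            throw new IllegalArgumentException("external topic must not be empty");
        }
        // Slashes separate tenant/namespace/topic in Pulsar names, so the external
        // name is URL-encoded into a single segment (an assumption of this sketch).
        String localName = URLEncoder.encode(externalTopic, StandardCharsets.UTF_8);
        return "persistent://" + tenant + "/" + namespace + "/" + localName;
    }
}
```

A real handler would need many more rules than this (authentication
identities, subscriptions, offsets and so on), which is why this kind of
code wants to live next to the Pulsar Proxy services.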

So you have two options:
1. create your own service and deploy it separately: this is how my work
started, and some colleagues of mine did the same
2. deploy your code inside the Pulsar Proxy, and leverage the current
packaging, configuration, tools, security APIs, Helm chart...

I started this discussion because I found option 1 very awkward for Proxy
Component developers, for System Administrators and for System Architects.

Developers:
- you have to copy/paste some Pulsar Proxy code, import Proxy jars, and use
internal Pulsar classes to implement Authentication, Authorization, Service
Discovery, Configuration...

System Administrators:
- you have a new set of configuration files and tools to manage the
settings (and in k8s you have to modify the Helm Chart significantly)

System Architects:
- you have multiple new components in the picture to explain and to
justify

With this proposal:

Developers:
- you use a framework and do not reinvent the wheel; you can ensure that
you are compatible with a given Pulsar version and that the behaviour is
consistent with other Pulsar components (same ProxyConfiguration, same
service lifecycle, same libs), so you can evolve more easily

System Administrators:
- you use proxy.conf/broker.conf and the Pulsar CLI tools, with no need to
change the Helm Charts

System Architects:
- nothing new on the table: all of the Pulsar docs still apply, and you have
the Proxy that deals with external clients, but it is able to speak Pulsar,
Kafka, RabbitMQ, MQTT, ActiveMQ
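
As a rough illustration of the "framework" idea, here is a minimal Java
sketch of a proxy that hosts pluggable handlers and drives a shared
lifecycle from a single configuration. Every name in it (SketchProxyHandler,
SketchProxyHost, RecordingHandler, the sketchProtocols key) is made up for
this example; the actual interfaces and configuration keys are the ones
defined in the PIP document and in the PR.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

/** Hypothetical plugin contract, for illustration only; not the API from the PR. */
interface SketchProxyHandler {
    String protocolName();
    void initialize(Properties proxyConf);
    void start();
    void close();
}

/** Hypothetical host: one process, one configuration, many protocols. */
final class SketchProxyHost {
    private final Map<String, SketchProxyHandler> handlers = new LinkedHashMap<>();
    private final List<SketchProxyHandler> started = new ArrayList<>();

    void register(SketchProxyHandler handler) {
        if (handlers.putIfAbsent(handler.protocolName(), handler) != null) {
            throw new IllegalStateException("duplicate protocol: " + handler.protocolName());
        }
    }

    /** Starts only the protocols listed under the made-up "sketchProtocols" key. */
    void start(Properties proxyConf) {
        String enabled = proxyConf.getProperty("sketchProtocols", "");
        for (String name : enabled.split(",")) {
            String protocol = name.trim();
            if (protocol.isEmpty()) {
                continue;
            }
            SketchProxyHandler handler = handlers.get(protocol);
            if (handler == null) {
                throw new IllegalArgumentException("unknown protocol: " + protocol);
            }
            handler.initialize(proxyConf); // every handler sees the same configuration
            handler.start();
            started.add(handler);
        }
    }

    /** Stops the started handlers in reverse start order. */
    void close() {
        for (int i = started.size() - 1; i >= 0; i--) {
            started.get(i).close();
        }
        started.clear();
    }
}

/** Trivial handler that only records its lifecycle events. */
final class RecordingHandler implements SketchProxyHandler {
    final List<String> events = new ArrayList<>();
    private final String name;

    RecordingHandler(String name) {
        this.name = name;
    }

    public String protocolName() { return name; }
    public void initialize(Properties proxyConf) { events.add("initialize"); }
    public void start() { events.add("start"); }
    public void close() { events.add("close"); }
}
```

The point of the sketch is only the shape: the handler author writes the
protocol-specific part, while configuration loading and lifecycle stay in
one place that is maintained together with the proxy.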




>
> >I think this is a good picture of what I mean:
> >- PH in the Broker -> add protocols inside the Broker, work for owned
> topics
> >- PH in the Proxy -> add protocols in front of the whole Cluster
> >There is a good amount of processing that should be executed on the proxy,
> >and it is not good to run it on a broker.
>
> Is a TCP proxy a good place to do wire protocol translation (computation)?
> Especially if that translation is a good amount of processing? If it's not
> good to run this much processing on the broker, then it's even worse to run
> it on a network proxy. I can foresee this as a path that will lead to
> cluster and load management creeping into the proxy, as soon as you move
> beyond what a single proxy can handle.
>
> But I think these issues (of n/w vs protocol translation) are moot when you
> look at the larger needs of a generic proxy that will support ingress,
> configurable protocol handlers, load balancing etc. for use with Pulsar. You
> can run a bunch of Pulsar's proxies today, and there is no means to manage
> them properly, e.g. load balance between them, manage them as a cluster,
> have affinity of proxies to topics/tenants, etc. This applies even before
> this PIP (and more so once you add more processing into the proxy).
>
> The Pulsar proxy, as it is, is not amenable to creating anything like a
> service mesh. It would demand a lot of work in the proxy. Hence my
> initial comment about the proxy eventually becoming a mudball, and why we
> should rethink this entire proxy.
>
>  It is tempting to evolve the Pulsar proxy into a service that supports
> everything.. ingress, transformation chains, cluster management  etc .
> This  will eventually end up  duplicating something which already exists
> elsewhere.  My take is that this is better done by building on top of
> something like envoy ( or similar) which has built in and mature  features,
> and supported by a wide user base.
>

Unfortunately, general-purpose proxies, or proxies specific to some
protocol, will not be able to do efficiently what we can do using Pulsar
APIs, because they cannot directly "map" external concepts to the Pulsar
model.
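
To make the "mapping" point concrete, here is a minimal sketch in Java of
the kind of translation a protocol-aware proxy has to do: turning a
Kafka-style (topic, partition) pair into a Pulsar topic name. The class
name ExternalTopicMapper and the choice of a single default
tenant/namespace are assumptions made for illustration only; they are not
part of the PIP or of any Pulsar API. The sketch relies only on the usual
Pulsar naming scheme (persistent://tenant/namespace/topic, with partitions
named <topic>-partition-<n>).

```java
import java.util.Objects;

/** Hypothetical mapper from a Kafka-style (topic, partition) pair to a Pulsar topic name. */
final class ExternalTopicMapper {
    private final String tenant;
    private final String namespace;

    ExternalTopicMapper(String tenant, String namespace) {
        this.tenant = Objects.requireNonNull(tenant, "tenant");
        this.namespace = Objects.requireNonNull(namespace, "namespace");
    }

    /** Qualifies a short external topic name as persistent://tenant/namespace/topic. */
    String toPulsarTopic(String externalTopic) {
        Objects.requireNonNull(externalTopic, "externalTopic");
        if (externalTopic.contains("://")) {
            return externalTopic; // already a fully qualified Pulsar topic name
        }
        return "persistent://" + tenant + "/" + namespace + "/" + externalTopic;
    }

    /** Partitions of a partitioned topic are addressed as <topic>-partition-<n>. */
    String toPulsarPartition(String externalTopic, int partition) {
        if (partition < 0) {
            throw new IllegalArgumentException("partition must be >= 0");
        }
        return toPulsarTopic(externalTopic) + "-partition-" + partition;
    }

    public static void main(String[] args) {
        ExternalTopicMapper mapper = new ExternalTopicMapper("public", "default");
        // prints persistent://public/default/orders-partition-2
        System.out.println(mapper.toPulsarPartition("orders", 2));
    }
}
```

Topic naming is only the simplest case; a real handler would presumably
also need to map things like authentication and subscriptions, which is
where direct access to the Pulsar APIs matters.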

I cannot imagine the cost of developing and maintaining a plugin for Envoy
that is able to deal with Pulsar concepts. For instance, Envoy is not
written in Java, so you cannot use the Java bindings for Pulsar, which are
feature complete and always up to date with the latest features.
Also, developers who work on PHs are specialized in Pulsar code and in Java
(at a very high level), so it is harder for them to write super-efficient,
high-quality plugins in non-Java languages.

So I see a huge value in adding this ability to the Pulsar Proxy.

The only alternative to this PIP is to create a new framework for writing
such "Smart Proxies" in Java, using some official/maintained Pulsar API.

So we will end up discussing the value of adding such a brand new module,
and how to deploy/manage it.

It has a huge cost and it will take a lot of time:
- design
- adding new concepts to the architecture
- adding a new service (new management tools)
- a lot of new code (probably cut/pasted from the Pulsar Proxy)
- helm chart
- new configuration files
- docs

I believe that we should spend our time adding more bindings/protocol
handlers instead of doing that.

By the way I will be happy to drive this new effort if this is REALLY what
we want.

So I am convinced that for the short/mid term this PIP is the best choice
to help Pulsar adoption.

This PIP will unlock some great potential that would otherwise be
available only to users of custom tools that are not officially maintained
inside the Pulsar project.
I would be very sad about that outcome.



Enrico




Re: PIP-93 Pulsar Proxy Protocol Handlers

Posted by Joe F <jo...@gmail.com>.
Enrico, my initial comment when you brought up PHs was about the larger
question of proxying, rather than the narrower one of how to make it easy
to add new PHs in the proxy.

But specifically with this, here are my comments. Two very
distinct abstractions are being mixed up here, and I'm not sure
whether that is a good idea or not.

The proxy was designed to move bits and bytes without interpretation, from
one network to another. The issue with Pulsar is that it requires some
interpretation of the data to find out which server a client should
connect to. Protocol translation crept into the proxy just to be able to
ask this question. Since auth is required to answer this question, auth
also crept in. Essentially, the proxy was built as a TCP proxy, not as a
wire protocol translator. Some additional hacky things needed to be done
to make it work as a TCP proxy, and in my opinion those things should
die away to the fullest extent possible.
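
The routing question described above can be sketched as follows. This is a
minimal, self-contained Java illustration, not the proxy's actual code:
OwnershipLookup and LookupRouter are hypothetical names, the broker
addresses are made up, and the in-memory map stands in for whatever lookup
service the cluster provides. The only part of a request the router
interprets is the topic name.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical lookup table: which broker currently owns which topic. */
final class OwnershipLookup {
    private final Map<String, String> ownerByTopic;

    OwnershipLookup(Map<String, String> ownerByTopic) {
        // Defensive copy so later changes to the caller's map do not leak in.
        this.ownerByTopic = new HashMap<>(ownerByTopic);
    }

    Optional<String> ownerOf(String topic) {
        return Optional.ofNullable(ownerByTopic.get(topic));
    }
}

/** Hypothetical proxy-side router: reads just the topic name to pick a broker. */
final class LookupRouter {
    private final OwnershipLookup lookup;

    LookupRouter(OwnershipLookup lookup) {
        this.lookup = lookup;
    }

    /** Returns the address of the broker the connection should be forwarded to. */
    String route(String topic) {
        return lookup.ownerOf(topic)
                .orElseThrow(() -> new IllegalStateException("no owner known for " + topic));
    }

    public static void main(String[] args) {
        Map<String, String> owners = new HashMap<>();
        owners.put("persistent://public/default/orders", "broker-1:6650");
        owners.put("persistent://public/default/payments", "broker-2:6650");

        LookupRouter router = new LookupRouter(new OwnershipLookup(owners));
        // prints broker-2:6650
        System.out.println(router.route("persistent://public/default/payments"));
    }
}
```

Everything after route() returns can, in principle, be plain byte
forwarding; the lookup is the one step that needs knowledge of the
cluster.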

Because of all this, the current implementation is not ideal. Its usage
is highly restricted in actual deployments, because of potential security
risks if the proxy is misconfigured. One needs to be strict about setting
up the proxy to meet security standards in highly regulated environments.



>And we faced the limitation of the need to create a new proxy service for
>each new protocol, but all of these "proxy services" have in common
>most of the features of the Pulsar proxy.
>When we also came to deal with System Architects it was clear the
>requirement to have only one single "place" to put all of the interactions
>at "cluster level" with Pulsar.

Good idea, a single place seems right. Can the proxy answer the traffic
routing question without interpreting the data? Essentially, move what is
done within the proxy now to a well-known service within the cluster, and
use that?

>I think this is a good picture of what I mean:
>- PH in the Broker -> add protocols inside the Broker, work for owned topics
>- PH in the Proxy -> add protocols in front of the whole Cluster
>There is a good amount of processing that should be executed on the proxy,
>and it is not good to run it on a broker.

Is a TCP proxy a good place to do wire protocol translation (computation)?
Especially if that translation is a good amount of processing? If it's not
good to run this much processing on the broker, then it's even worse to run
it on a network proxy. I can foresee this as a path that will lead to
cluster and load management creeping into the proxy, as soon as you move
beyond what a single proxy can handle.

But I think these issues (of network vs protocol translation) are moot when
you look at the larger needs of a generic proxy that will support ingress,
configurable protocol handlers, load balancing, etc. for use with Pulsar.
You can run a bunch of Pulsar proxies today, and there is no means to manage
them properly, e.g. load balance between them, manage them as a cluster, or
have affinity of proxies to topics/tenants. This applies even before
this PIP (and more so once you add more processing into the proxy).

The Pulsar proxy, as it is, is not amenable to creating anything like a
service mesh. It would demand a lot of work in the proxy. Hence my
initial comment about the proxy eventually becoming a mudball, and why we
should rethink this entire proxy.

It is tempting to evolve the Pulsar proxy into a service that supports
everything: ingress, transformation chains, cluster management, etc.
This will eventually end up duplicating something that already exists
elsewhere. My take is that this is better done by building on top of
something like Envoy (or similar), which has built-in, mature features
and is supported by a wide user base.

-j

On Tue, Sep 7, 2021 at 11:11 PM Enrico Olivelli <eo...@gmail.com> wrote:

> (ping)

Re: PIP-93 Pulsar Proxy Protocol Handlers

Posted by Enrico Olivelli <eo...@gmail.com>.
(ping)


On Fri, Sep 3, 2021 at 14:06 Enrico Olivelli <eo...@gmail.com> wrote:

> Sijie,
> Thanks for your questions, answers inline below.
>
> On Thu, Sep 2, 2021 at 02:23 Sijie Guo <gu...@gmail.com> wrote:
>
>> I would like to see the clarification between the broker protocol handlers
>> and proxy protocol handlers before moving it to a vote thread.
>>
>
> A PH in the broker is very useful as it allows you to directly access the
> ManagedLedger and implement high performance adapters for
> other wire protocols.
> The bigger limitation is that you can access efficiently only the topics
> owned by the local broker.
> If you try to forward/proxy the request to another broker (you can do it,
> and this was Matteo's suggestion at the latest Video Community meeting)
> you have the downside that the broker has to waste resources to do the
> "proxy work"
> and you generally want a broker machine to be used only to deal with the
> local traffic.
>
> The load balancing mechanism of the brokers is not meant to deal with
> additional work due to proxying requests related to the topics for which
> the broker is not owner.
>
> A PH in the proxy is useful to add new protocols that are running in front
> of the whole cluster and not only of one single broker.
> This is a very different use case from having the PH in the broker.
>
> The work of the proxy usually is to forward requests to the internal
> services of the cluster, and in case of new protocols in the proxy
> you need some logic to fill in the gaps in the original wire protocol.
>
> System architects expect one kind of load on the proxy and a different
> kind of load on the brokers.
> For instance you usually can run very few proxies to cover a big cluster
> with many brokers.
> So adding a PH on all the brokers is sometimes overkill.
>
>
>>
>> I can see how it will cause confusion for protocol developers.
>>
>
> Protocol developers are very advanced users that do need to understand
> clearly the internals of Pulsar.
> In fact this request of having PHs in the Proxy layer came from myself and
> from other colleagues of mine who are working heavily in implementing
> new protocol handlers in Pulsar.
>
> And we faced the limitation of the need to create a new proxy service for
> each new protocol, but all of these "proxy services" have in common
> most of the features of the Pulsar proxy.
> When we also came to deal with System Architects it was clear the
> requirement to have only one single "place" to put all of the interactions
> at "cluster level" with Pulsar.
>
> I think this is a good picture of what I mean:
> - PH in the Broker -> add protocols inside the Broker, work for owned
> topics
> - PH in the Proxy -> add protocols in front of the whole Cluster
>
>
>> Yunze brought a good idea on KoP.
>
>
> I also have good ideas and working solutions for a Pulsar-proxy-like KOP
> Proxy.
> I will be happy to discuss this in a separate thread or at a separate
> table with Yunze.
>
> A smart KOP proxy can work if you run inside the Pulsar proxy process or
> you can copy/paste the Pulsar Proxy code and create another service.
>
>
>> But I don't think that's the right
>> direction. If you can give an example of the usage of a proxy handler and
>> how it is different from using a broker handler, that would help me
>> understand this PIP.
>>
>
> For some protocols you have to execute some non-trivial work to map
> the wire protocol and the concepts of the protocol to the Pulsar model.
> For instance some protocols do not have the concept of "lookup", so the
> proxy does the lookup and forwards the request to the internal broker.
>
> For some protocols you can just use the PulsarClient to connect to the
> internal brokers, you do not need and you do not want to access the
> ManagedLedgers:
> in this case adding the execution inside the broker is only complicating
> the overall design of the system and putting load on the brokers.
>
> There is a good amount of processing that should be executed on the proxy,
> and it is not good to run it on a broker.
> If you do not put the "custom code" in the Proxy and you can only write a
> Broker PH you end up in adding it to the Broker.
>
> If you expose directly (with some LoadBalancer or whatever) your brokers
> in which you run the PH code that you would put in the proxy
> you end up in putting on the broker some load that is not expected:
> - the broker will have to work even for topics for which it is not the
> owner
> - the broker will have to do things that the Pulsar load balancer cannot
> handle correctly (because it expects the load to be proportional to
> the owned bundles)
>
>
>>
>> The reason why Pulsar proxy is built is to have a "smart" proxy that is
>> aware of Pulsar protocol. The Pulsar proxy can be replaced with other
>> mature proxy software with SNI routing or multiple advertised listeners
>> now. Hence I am afraid that we are taking the wrong direction here. Here
>> are various reasons.
>>
>> 1) The ProxyService is essentially a Pulsar admin client. Broker service
>> also provides a Pulsar admin client. I am not sure how Proxy PH will
>> simplify the protocol handler development. Please use an example to
>> demonstrate it.
>>
>
> In the cases I am highlighting, *the Broker is simply not the right place
> to run the code*.
>
> So the problem here is not whether PulsarAdmin is available in the Broker
> or in the Proxy.
> It is that if you want to write a smart proxy for another protocol:
> - you end up in copy/pasting the Proxy code
> - you use the internal Pulsar classes to have a consistent behaviour with
> the Pulsar Proxy
> - you add more components to the "picture" of the Pulsar cluster
>
>
>> 2) The Authorization & Authentication services in ProxyService are only
>> used when proxies are configured to use zookeeper for broker discovery.
>> However, this option is not recommended when running Pulsar proxies in
>> Kubernetes. Instead, using a broker discovery service is recommended. In
>> order to make PH work, you are forcing proxy to be tight with the
>> zookeeper.
>>
>
> This is not needed for all of the Proxy PH handlers.
> But Authorization & Authentication  are a core part of this story.
> If you implement your "smart proxy" somewhere else and not as a Plugin to
> the Pulsar Proxy (or Broker)
> you cannot leverage the same services, the same way.
> It leads to having more chances of having a behaviour different from
> standard Pulsar.
>
> PH developers are Pulsar experts, and you know that copy pasting code from
> Pulsar, leads to unpredictable behaviour
> when you run your plugin in another version of Pulsar.
> But if you use an API that is going to be maintained by Pulsar you are
> safer and you can think that your code is going to work.
>
>
>>
>> 3) Configuring authentication and authorization in proxy is already
>> challenging. There are a few different combinations. A typical Pulsar
>> setup
>> is to forward the authentication credentials to the brokers to
>> authenticate
>> and authorize. If you don't do this correctly, it will introduce security
>> holes because a connection can potentially grab the superuser credential
>> configured in proxy and use superuser credentials to access brokers. From
>> this perspective, I think proxy protocol handler doesn't make things
>> simpler instead it makes things complicated when it comes to
>> authentication
>> and authorization.
>>
>
> Yes, this is a very complex problem indeed.
>
> We can help developers by providing a standard framework to access these
> services.
>
> It is very important from my point of view, that we do not encourage
> developers to create
> their own versions of a Pulsar proxy.
>
> My recent experience is that we can add many new wire protocols to Pulsar
> and this will help a lot with the adoption of Pulsar.
>
> As we are doing in many other places on Pulsar we should provide tools to
> write extensions
> and do not let people be too creative.
>
>
>>
>> I would like to see these questions are answered before moving to a vote.
>>
>
> I hope that we can reach consensus on the need of this API.
> because I see that there is a real need for making this happen.
>
> It is the Pulsar momentum now, there are so many opportunities to reach
> out to users of other systems,
> let's not waste these opportunities.
>
>
> Enrico
>
>
>
>>
>> - Sijie
>>
>>
>>
>>
>> On Wed, Sep 1, 2021 at 12:55 PM Enrico Olivelli <eo...@gmail.com>
>> wrote:
>>
>> > Any other comment?
>> >
>> > I would like to start a VOTE, but I feel we saw too few comments here
>> >
>> > Please take a look.
>> > I believe it will be a good fit for 2.9.0 release, that is going to be
>> > released in the end of September
>> >
>> >
>> > Enrico
>> >
>> > On Tue, Aug 31, 2021, 18:14 Michael Marshall <mi...@gmail.com>
>> > wrote:
>> >
>> > > +1, just read through the PIP. Looks good to me.
>> > >
>> > > - Michael
>> > >
>> > > On Mon, Aug 30, 2021 at 3:47 AM Enrico Olivelli <eo...@gmail.com>
>> > > wrote:
>> > >
>> > > > Hello Pulsar fellows,
>> > > >
>> > > > I have prepared a PIP about adding support for Protocol Handlers
>> > > >
>> > > > This is the GDoc
>> > > >
>> > > >
>> > > >
>> > >
>> >
>> https://docs.google.com/document/d/1Hlc_BOpQTkWX8FgrvWSfk6h5xTQKMXnTcSuil0Nznrg/edit?usp=sharing
>> > > >
>> > > >
>> > > > This is the PR for the implementation
>> > > > https://github.com/apache/pulsar/pull/11838/files
>> > > >
>> > > > I am pretty sure that this PIP will make life of developers of
>> Protocol
>> > > > Handlers and of Administrators who deploy Protocol Handlers very
>> nicer
>> > > >
>> > > > We are still working on the formal PIP process, at the moment I am
>> > > sharing
>> > > > with you the document.
>> > > > My understanding is that after the discussion, I will start a VOTE
>> > > thread,
>> > > > and if the VOTE passes we can move forward with reviewing the PR,
>> and
>> > > > hopefully merge this feature for Pulsar 2.9.0
>> > > >
>> > > > Enrico
>> > > >
>> > >
>> >
>>
>

Re: PIP-93 Pulsar Proxy Protocol Handlers

Posted by Enrico Olivelli <eo...@gmail.com>.
Sijie,
Thanks for your questions, answers inline below.

On Thu, Sep 2, 2021 at 02:23, Sijie Guo <gu...@gmail.com>
wrote:

> I would like to see the clarification between the broker protocol handlers
> and proxy protocol handlers before moving it to a vote thread.
>

A PH (protocol handler) in the broker is very useful because it lets you
access the ManagedLedger directly and implement high-performance adapters
for other wire protocols.
The biggest limitation is that you can efficiently access only the topics
owned by the local broker.
If you try to forward/proxy the request to another broker (you can do it,
and this was Matteo's suggestion at the latest video community meeting),
the downside is that the broker has to spend resources on the
"proxy work",
whereas you generally want a broker machine to deal only with its
local traffic.

The load balancing mechanism of the brokers is not meant to deal with
the additional work of proxying requests for topics that
the broker does not own.

A PH in the proxy is useful for adding new protocols that run in front
of the whole cluster, not only in front of one single broker.
This is a very different use case from having the PH in the broker.

The work of the proxy is usually to forward requests to the internal
services of the cluster, and for a new protocol in the proxy
you need some logic to fill in the gaps in the original wire protocol.

System architects expect one kind of load on the proxy and a different
kind of load on the brokers.
For instance, you can usually run very few proxies to cover a big cluster
with many brokers.
So adding a PH on all the brokers is sometimes overkill.


>
> I can see how it will cause confusion for protocol developers.
>

Protocol developers are very advanced users who do need a clear
understanding of the internals of Pulsar.
In fact, this request to have PHs in the Proxy layer came from me and
from colleagues of mine who are working heavily on implementing
new protocol handlers in Pulsar.

We ran into the limitation of having to create a new proxy service for
each new protocol, even though all of these "proxy services" share
most of the features of the Pulsar proxy.
When we also started working with system architects, the requirement
became clear: a single "place" for all of the "cluster level"
interactions with Pulsar.

I think this is a good picture of what I mean:
- PH in the Broker -> add protocols inside the Broker, work for owned topics
- PH in the Proxy -> add protocols in front of the whole Cluster


> Yunze brought a good idea on KoP.


I also have good ideas and working solutions for a KoP proxy that works
like the Pulsar proxy.
I will be happy to discuss this in a separate thread or at a separate table
with Yunze.

A smart KoP proxy can work either by running inside the Pulsar proxy
process, or by copy/pasting the Pulsar Proxy code into another service.


> But I don't think that's the right
> direction. If you can give an example of the usage of a proxy handler and
> how it is different from using a broker handler, that would help me
> understand this PIP.
>

For some protocols you have to do non-trivial work to map
the wire protocol and the concepts of the protocol to the Pulsar model.
For instance, some protocols do not have the concept of "lookup", so the
proxy does the lookup and forwards the request to the internal broker.
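
As a rough illustration of that lookup-and-forward step, here is a minimal
Python sketch. It does not use any Pulsar API: OWNERS, lookup_owner and
ProxyHandler are hypothetical names, and the ownership table is hard-coded,
whereas a real proxy would ask the cluster.

```python
from dataclasses import dataclass, field

# Hypothetical ownership table: topic -> broker that currently owns it.
# In a real cluster this would come from a lookup request, not a static dict.
OWNERS = {
    "persistent://public/default/orders": "broker-1:6650",
    "persistent://public/default/payments": "broker-2:6650",
}


def lookup_owner(topic: str) -> str:
    """Resolve the broker that owns a topic (stand-in for a real lookup)."""
    try:
        return OWNERS[topic]
    except KeyError:
        raise LookupError(f"no owner known for {topic}") from None


@dataclass
class ProxyHandler:
    """Handles a foreign protocol that has no notion of 'lookup'."""

    # Records (broker, topic, payload) instead of opening real connections.
    forwarded: list = field(default_factory=list)

    def handle_publish(self, topic: str, payload: bytes) -> str:
        # The client only names a topic; the proxy fills the gap by doing
        # the lookup itself and forwarding to the owning broker.
        broker = lookup_owner(topic)
        self.forwarded.append((broker, topic, payload))
        return broker


handler = ProxyHandler()
print(handler.handle_publish("persistent://public/default/orders", b"hello"))
```

The client never learns which broker owns the topic; that knowledge stays in
the proxy layer.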

For some protocols you can just use the PulsarClient to connect to the
internal brokers; you neither need nor want to access the
ManagedLedgers.
In this case, running the code inside the broker only complicates
the overall design of the system and puts load on the brokers.
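
To make the point concrete, here is a small Python sketch of a handler that
depends only on a client-style "send" capability and is never handed any
storage-level object. All names (MessageClient, InMemoryClient,
LineProtocolHandler) and the toy "<topic> <payload>" wire format are
assumptions made for illustration; this is not the Pulsar client API.

```python
from typing import Protocol


class MessageClient(Protocol):
    """The only capability the handler needs: publish via a client."""

    def send(self, topic: str, payload: bytes) -> None: ...


class InMemoryClient:
    """Test double standing in for a real client connection to the brokers."""

    def __init__(self) -> None:
        self.sent = []

    def send(self, topic: str, payload: bytes) -> None:
        self.sent.append((topic, payload))


class LineProtocolHandler:
    """Toy protocol: each request is b'<topic> <payload>'."""

    def __init__(self, client: MessageClient) -> None:
        # No storage-layer handle is passed in, so the handler cannot
        # depend on broker internals such as managed ledgers.
        self._client = client

    def on_request(self, raw: bytes) -> None:
        # Split at the first space: topic before it, payload after it.
        topic, _, payload = raw.partition(b" ")
        if not topic or not payload:
            raise ValueError("expected b'<topic> <payload>'")
        self._client.send(topic.decode("utf-8"), payload)


client = InMemoryClient()
LineProtocolHandler(client).on_request(b"orders hello world")
print(client.sent)
```

Because the handler only needs what a client offers, nothing forces it to run
inside a broker process.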

There is a good amount of processing that should be executed on the proxy,
and it is not good to run it on a broker.
If the "custom code" cannot go in the Proxy and a Broker PH is the only
option, you end up adding it to the Broker.

If you directly expose (with some LoadBalancer or whatever) brokers
that run the PH code you would otherwise put in the proxy,
you end up putting unexpected load on the broker:
- the broker will have to work even for topics for which it is not the owner
- the broker will have to do things that the Pulsar load balancer cannot
handle correctly (because it expects the load to be proportional to
the owned bundles)
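
A back-of-the-envelope sketch of that mismatch, in Python. It assumes (this is
an assumption for illustration, not a measurement) that an external load
balancer spreads requests uniformly over N brokers, independently of topic
ownership, and that each request targets a topic owned by exactly one broker.
The function names are made up.

```python
from fractions import Fraction


def non_owner_fraction(num_brokers: int) -> Fraction:
    """Fraction of requests that land on a broker that does not own the topic.

    Assumes the external load balancer picks a broker uniformly,
    independently of topic ownership.
    """
    if num_brokers < 1:
        raise ValueError("num_brokers must be >= 1")
    return Fraction(num_brokers - 1, num_brokers)


def proxy_work_per_broker(total_requests: int, num_brokers: int) -> Fraction:
    """Requests each broker must forward on behalf of other brokers.

    This work is invisible to a load balancer that only accounts for
    owned bundles.
    """
    per_broker = Fraction(total_requests, num_brokers)
    return per_broker * non_owner_fraction(num_brokers)


for n in (1, 3, 10):
    print(n, non_owner_fraction(n), proxy_work_per_broker(9000, n))
```

Under these assumptions the share of requests that need a second hop grows
as (N - 1) / N, so it approaches 100% as the cluster gets larger.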


>
> The reason why Pulsar proxy is built is to have a "smart" proxy that is
> aware of Pulsar protocol. The Pulsar proxy can be replaced with other
> mature proxy software with SNI routing or multiple advertised listeners
> now. Hence I am afraid that we are taking the wrong direction here. Here
> are various reasons.
>
> 1) The ProxyService is essentially a Pulsar admin client. Broker service
> also provides a Pulsar admin client. I am not sure how Proxy PH will
> simplify the protocol handler development. Please use an example to
> demonstrate it.
>

In the cases I am highlighting, *the Broker is simply not the right place
to run the code*.

So the problem here is not whether PulsarAdmin is available in the Broker
or in the Proxy.
It is that if you want to write a smart proxy for another protocol:
- you end up copy/pasting the Proxy code
- you use the internal Pulsar classes to get behaviour consistent with
the Pulsar Proxy
- you add more components to the "picture" of the Pulsar cluster


> 2) The Authorization & Authentication services in ProxyService are only
> used when proxies are configured to use ZooKeeper for broker discovery.
> However, this option is not recommended when running Pulsar proxies in
> Kubernetes. Instead, using a broker discovery service is recommended. In
> order to make PH work, you are forcing the proxy to be tightly coupled to
> ZooKeeper.
>

This is not needed for all Proxy PHs.
But Authorization & Authentication are a core part of this story.
If you implement your "smart proxy" somewhere else, and not as a plugin to
the Pulsar Proxy (or Broker),
you cannot leverage the same services in the same way.
That increases the chance of behaviour that differs from
standard Pulsar.

PH developers are Pulsar experts, and you know that copy/pasting code from
Pulsar leads to unpredictable behaviour
when you run your plugin on another version of Pulsar.
If you use an API that is maintained by Pulsar, you are
safer and can reasonably expect your code to keep working.


>
> 3) Configuring authentication and authorization in proxy is already
> challenging. There are a few different combinations. A typical Pulsar setup
> is to forward the authentication credentials to the brokers to authenticate
> and authorize. If you don't do this correctly, it will introduce security
> holes because a connection can potentially grab the superuser credential
> configured in proxy and use superuser credentials to access brokers. From
> this perspective, I think proxy protocol handler doesn't make things
> simpler instead it makes things complicated when it comes to authentication
> and authorization.
>

Yes, this is a very complex problem indeed.

We can help developers by providing a standard framework to access these
services.

From my point of view, it is very important that we do not encourage
developers to create
their own versions of a Pulsar proxy.

My recent experience is that we can add many new wire protocols to Pulsar
and this will help a lot with the adoption of Pulsar.

As we do in many other places in Pulsar, we should provide tools to
write extensions
rather than leave people to be too creative.


>
> I would like to see these questions answered before moving to a vote.
>

I hope that we can reach consensus on the need for this API,
because I see a real need for it.

Pulsar has momentum now, and there are so many opportunities to reach out
to users of other systems;
let's not waste these opportunities.


Enrico



>
> - Sijie
>
>
>
>
> On Wed, Sep 1, 2021 at 12:55 PM Enrico Olivelli <eo...@gmail.com>
> wrote:
>
> > Any other comment?
> >
> > I would like to start a VOTE, but I feel we saw too few comments here
> >
> > Please take a look.
> > I believe it will be a good fit for 2.9.0 release, that is going to be
> > released in the end of September
> >
> >
> > Enrico
> >
> > On Tue, Aug 31, 2021, 18:14 Michael Marshall <mi...@gmail.com>
> > wrote:
> >
> > > +1, just read through the PIP. Looks good to me.
> > >
> > > - Michael
> > >
> > > On Mon, Aug 30, 2021 at 3:47 AM Enrico Olivelli <eo...@gmail.com>
> > > wrote:
> > >
> > > > Hello Pulsar fellows,
> > > >
> > > > I have prepared a PIP about adding support for Protocol Handlers
> > > >
> > > > This is the GDoc
> > > >
> > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/1Hlc_BOpQTkWX8FgrvWSfk6h5xTQKMXnTcSuil0Nznrg/edit?usp=sharing
> > > >
> > > >
> > > > This is the PR for the implementation
> > > > https://github.com/apache/pulsar/pull/11838/files
> > > >
> > > > I am pretty sure that this PIP will make life of developers of
> Protocol
> > > > Handlers and of Administrators who deploy Protocol Handlers very
> nicer
> > > >
> > > > We are still working on the formal PIP process, at the moment I am
> > > sharing
> > > > with you the document.
> > > > My understanding is that after the discussion, I will start a VOTE
> > > thread,
> > > > and if the VOTE passes we can move forward with reviewing the PR, and
> > > > hopefully merge this feature for Pulsar 2.9.0
> > > >
> > > > Enrico
> > > >
> > >
> >
>

Re: PIP-93 Pulsar Proxy Protocol Handlers

Posted by Sijie Guo <gu...@gmail.com>.
I would like to see the difference between broker protocol handlers
and proxy protocol handlers clarified before this moves to a vote thread.

I can see how it will cause confusion for protocol developers.

Yunze brought a good idea on KoP. But I don't think that's the right
direction. If you can give an example of the usage of a proxy handler and
how it is different from using a broker handler, that would help me
understand this PIP.

The Pulsar proxy was built to be a "smart" proxy that is
aware of the Pulsar protocol. It can now be replaced with other
mature proxy software using SNI routing or multiple advertised listeners.
Hence I am afraid that we are taking the wrong direction here, for the
following reasons.

1) The ProxyService is essentially a Pulsar admin client. Broker service
also provides a Pulsar admin client. I am not sure how Proxy PH will
simplify the protocol handler development. Please use an example to
demonstrate it.

2) The Authorization & Authentication services in ProxyService are only
used when proxies are configured to use ZooKeeper for broker discovery.
However, this option is not recommended when running Pulsar proxies in
Kubernetes. Instead, using a broker discovery service is recommended. In
order to make PH work, you are forcing the proxy to be tightly coupled to
ZooKeeper.

3) Configuring authentication and authorization in the proxy is already
challenging. There are a few different combinations. A typical Pulsar setup
forwards the authentication credentials to the brokers, which authenticate
and authorize. If you don't do this correctly, it introduces security
holes, because a connection can potentially grab the superuser credential
configured in the proxy and use it to access brokers. From
this perspective, I think a proxy protocol handler does not make things
simpler; instead it makes them more complicated when it comes to
authentication and authorization.

I would like to see these questions answered before moving to a vote.

- Sijie




On Wed, Sep 1, 2021 at 12:55 PM Enrico Olivelli <eo...@gmail.com> wrote:

> Any other comment?
>
> I would like to start a VOTE, but I feel we saw too few comments here
>
> Please take a look.
> I believe it will be a good fit for 2.9.0 release, that is going to be
> released in the end of September
>
>
> Enrico
>
> On Tue, Aug 31, 2021, 18:14 Michael Marshall <mi...@gmail.com>
> wrote:
>
> > +1, just read through the PIP. Looks good to me.
> >
> > - Michael
> >
> > On Mon, Aug 30, 2021 at 3:47 AM Enrico Olivelli <eo...@gmail.com>
> > wrote:
> >
> > > Hello Pulsar fellows,
> > >
> > > I have prepared a PIP about adding support for Protocol Handlers
> > >
> > > This is the GDoc
> > >
> > >
> > >
> >
> https://docs.google.com/document/d/1Hlc_BOpQTkWX8FgrvWSfk6h5xTQKMXnTcSuil0Nznrg/edit?usp=sharing
> > >
> > >
> > > This is the PR for the implementation
> > > https://github.com/apache/pulsar/pull/11838/files
> > >
> > > I am pretty sure that this PIP will make life of developers of Protocol
> > > Handlers and of Administrators who deploy Protocol Handlers very nicer
> > >
> > > We are still working on the formal PIP process, at the moment I am
> > sharing
> > > with you the document.
> > > My understanding is that after the discussion, I will start a VOTE
> > thread,
> > > and if the VOTE passes we can move forward with reviewing the PR, and
> > > hopefully merge this feature for Pulsar 2.9.0
> > >
> > > Enrico
> > >
> >
>

Re: PIP-93 Pulsar Proxy Protocol Handlers

Posted by Enrico Olivelli <eo...@gmail.com>.
Any other comment?

I would like to start a VOTE, but I feel we have seen too few comments here.

Please take a look.
I believe it will be a good fit for the 2.9.0 release, which is going to be
released at the end of September.


Enrico

On Tue, Aug 31, 2021, 18:14 Michael Marshall <mi...@gmail.com>
wrote:

> +1, just read through the PIP. Looks good to me.
>
> - Michael
>
> On Mon, Aug 30, 2021 at 3:47 AM Enrico Olivelli <eo...@gmail.com>
> wrote:
>
> > Hello Pulsar fellows,
> >
> > I have prepared a PIP about adding support for Protocol Handlers
> >
> > This is the GDoc
> >
> >
> >
> https://docs.google.com/document/d/1Hlc_BOpQTkWX8FgrvWSfk6h5xTQKMXnTcSuil0Nznrg/edit?usp=sharing
> >
> >
> > This is the PR for the implementation
> > https://github.com/apache/pulsar/pull/11838/files
> >
> > I am pretty sure that this PIP will make life of developers of Protocol
> > Handlers and of Administrators who deploy Protocol Handlers very nicer
> >
> > We are still working on the formal PIP process, at the moment I am
> sharing
> > with you the document.
> > My understanding is that after the discussion, I will start a VOTE
> thread,
> > and if the VOTE passes we can move forward with reviewing the PR, and
> > hopefully merge this feature for Pulsar 2.9.0
> >
> > Enrico
> >
>

Re: PIP-93 Pulsar Proxy Protocol Handlers

Posted by Michael Marshall <mi...@gmail.com>.
+1, just read through the PIP. Looks good to me.

- Michael

On Mon, Aug 30, 2021 at 3:47 AM Enrico Olivelli <eo...@gmail.com> wrote:

> Hello Pulsar fellows,
>
> I have prepared a PIP about adding support for Protocol Handlers
>
> This is the GDoc
>
>
> https://docs.google.com/document/d/1Hlc_BOpQTkWX8FgrvWSfk6h5xTQKMXnTcSuil0Nznrg/edit?usp=sharing
>
>
> This is the PR for the implementation
> https://github.com/apache/pulsar/pull/11838/files
>
> I am pretty sure that this PIP will make life of developers of Protocol
> Handlers and of Administrators who deploy Protocol Handlers very nicer
>
> We are still working on the formal PIP process, at the moment I am sharing
> with you the document.
> My understanding is that after the discussion, I will start a VOTE thread,
> and if the VOTE passes we can move forward with reviewing the PR, and
> hopefully merge this feature for Pulsar 2.9.0
>
> Enrico
>