You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pulsar.apache.org by Michael Marshall <mi...@gmail.com> on 2021/12/01 18:21:12 UTC

Re: [DISCUSS] Pulsar Protocol For Client Timeouts and Creating Producers

Following up here, I am still in need of reviews for PR [0]. It
introduces an important clarification to the Pulsar Protocol Spec.
Please take a look, if you are able.

[0] - https://github.com/apache/pulsar/pull/12948

Thanks!
Michael

On Tue, Nov 23, 2021 at 1:10 PM Michael Marshall <mi...@gmail.com> wrote:
>
> I created a PR to update the protocol's documentation:
> https://github.com/apache/pulsar/pull/12948. Please take a look, if
> you're able.
>
> Once this PR is accepted/merged, I will follow up with an update to
> the Java client.
>
> - Michael
>
> On Thu, Nov 18, 2021 at 1:29 PM Michael Marshall <mi...@gmail.com> wrote:
> >
> > I view this as an edge case of the Pulsar Protocol that requires
> > clarification. Once we clarify the spec, we can update the clients to
> > conform to the spec. I don't think we need to give end users control
> > over this small part of the protocol.
> >
> > After reviewing the design a bit more, I think we should update the
> > Java client to send the `CloseProducer` command, as well as update the
> > spec to recommend this implementation. While the `ServerCnx` class
> > "works" both ways, the current Java client implementation leads to
> > unnecessary calls, unnecessary errors, and a longer time to recovery.
> >
> > Specifically, if the client fails to send a `CloseProducer` command,
> > it ends up getting into a sequence of retries where each new
> > `Producer` command receives an immediate `ErrorResponse` because the
> > `ServerCnx` already has a pending producer. By sending a
> > `CloseProducer` command, the client gives the broker permission to
> > stop keeping track of the original create producer request. It also
> > means that if the topic eventually loads, the broker will respond to
> > the right request id with a `ProducerSuccessResponse` command.
> >
> > I will follow up with an update to the client and the protocol spec,
> > unless there are any objections.
> >
> > Thanks,
> > Michael
> >
> > On Thu, Nov 18, 2021 at 12:25 PM Neng Lu <nl...@apache.org> wrote:
> > >
> > > How about making the behavior when timeout configurable? And by default, it will send the `CloseProducer` cmd?
> > >
> > > On 2021/11/17 05:52:21 Michael Marshall wrote:
> > > > Hi All,
> > > >
> > > > I noticed that the `ServerCnxTest#testCreateProducerTimeout` test
> > > > indicates that a producer will send a `CloserProducer` command when
> > > > producer creation times out for the client.
> > > >
> > > > The Java client does not follow this protocol. When the producer
> > > > creation times out, it just schedules another attempt to create the
> > > > producer.
> > > >
> > > > The C++ client does follow this protocol and is annotated with the
> > > > following comment:
> > > >
> > > > > // Creating the producer has timed out. We need to ensure the broker closes the producer
> > > > > // in case it was indeed created, otherwise it might prevent new create producer operation,
> > > > > // since we are not closing the connection
> > > >
> > > > I don't see anything in our official protocol spec indicating the
> > > > "right" behavior. Given the test coverage, it seems like the initial
> > > > design was to expect a `CloseProducer` command. However, I also see that
> > > > the broker's `ServerCnx` class will reply to a redundant `Producer`
> > > > command with a `ProducerSuccess` command, as long as the producer
> > > > is already created.
> > > >
> > > > Should I submit a PR to update the Java client to send a
> > > > `CloseProducer` command when a `Producer` command times out?
> > > >
> > > > Thanks,
> > > > Michael
> > > >

Re: [DISCUSS] Pulsar Protocol For Client Timeouts and Creating Producers

Posted by Dave Fisher <wa...@apache.org>.

> On Dec 1, 2021, at 10:21 AM, Michael Marshall <mi...@gmail.com> wrote:
> 
> Following up here, I am still in need of reviews for PR [0]. It
> introduces an important clarification to the Pulsar Protocol Spec.
> Please take a look, if you are able.
> 
> [0] - https://github.com/apache/pulsar/pull/12948

I added some suggested reviewers to the PR.

Regards,
Dave

> 
> Thanks!
> Michael
> 
> On Tue, Nov 23, 2021 at 1:10 PM Michael Marshall <mi...@gmail.com> wrote:
>> 
>> I created a PR to update the protocol's documentation:
>> https://github.com/apache/pulsar/pull/12948. Please take a look, if
>> you're able.
>> 
>> Once this PR is accepted/merged, I will follow up with an update to
>> the Java client.
>> 
>> - Michael
>> 
>> On Thu, Nov 18, 2021 at 1:29 PM Michael Marshall <mi...@gmail.com> wrote:
>>> 
>>> I view this as an edge case of the Pulsar Protocol that requires
>>> clarification. Once we clarify the spec, we can update the clients to
>>> conform to the spec. I don't think we need to give end users control
>>> over this small part of the protocol.
>>> 
>>> After reviewing the design a bit more, I think we should update the
>>> Java client to send the `CloseProducer` command, as well as update the
>>> spec to recommend this implementation. While the `ServerCnx` class
>>> "works" both ways, the current Java client implementation leads to
>>> unnecessary calls, unnecessary errors, and a longer time to recovery.
>>> 
>>> Specifically, if the client fails to send a `CloseProducer` command,
>>> it ends up getting into a sequence of retries where each new
>>> `Producer` command receives an immediate `ErrorResponse` because the
>>> `ServerCnx` already has a pending producer. By sending a
>>> `CloseProducer` command, the client gives the broker permission to
>>> stop keeping track of the original create producer request. It also
>>> means that if the topic eventually loads, the broker will respond to
>>> the right request id with a `ProducerSuccessResponse` command.
>>> 
>>> I will follow up with an update to the client and the protocol spec,
>>> unless there are any objections.
>>> 
>>> Thanks,
>>> Michael
>>> 
>>> On Thu, Nov 18, 2021 at 12:25 PM Neng Lu <nl...@apache.org> wrote:
>>>> 
>>>> How about making the behavior when timeout configurable? And by default, it will send the `CloseProducer` cmd?
>>>> 
>>>> On 2021/11/17 05:52:21 Michael Marshall wrote:
>>>>> Hi All,
>>>>> 
>>>>> I noticed that the `ServerCnxTest#testCreateProducerTimeout` test
>>>>> indicates that a producer will send a `CloserProducer` command when
>>>>> producer creation times out for the client.
>>>>> 
>>>>> The Java client does not follow this protocol. When the producer
>>>>> creation times out, it just schedules another attempt to create the
>>>>> producer.
>>>>> 
>>>>> The C++ client does follow this protocol and is annotated with the
>>>>> following comment:
>>>>> 
>>>>>> // Creating the producer has timed out. We need to ensure the broker closes the producer
>>>>>> // in case it was indeed created, otherwise it might prevent new create producer operation,
>>>>>> // since we are not closing the connection
>>>>> 
>>>>> I don't see anything in our official protocol spec indicating the
>>>>> "right" behavior. Given the test coverage, it seems like the initial
>>>>> design was to expect a `CloseProducer` command. However, I also see that
>>>>> the broker's `ServerCnx` class will reply to a redundant `Producer`
>>>>> command with a `ProducerSuccess` command, as long as the producer
>>>>> is already created.
>>>>> 
>>>>> Should I submit a PR to update the Java client to send a
>>>>> `CloseProducer` command when a `Producer` command times out?
>>>>> 
>>>>> Thanks,
>>>>> Michael
>>>>>