Posted to dev@pulsar.apache.org by PengHui Li <pe...@apache.org> on 2022/02/09 08:25:00 UTC

Re: [DISCUSS] Apache Pulsar 2.10.0 release

Hi all,

Sorry for the late reply; my vacation over the past few days caused a delay
here.

Most of the changes for 2.10.0 have been merged. For now, there are 14
open PRs (10 approved):
https://github.com/apache/pulsar/pulls?q=is%3Aopen+is%3Apr+milestone%3A2.10.0

I will take care of them and try to get them merged.
Once the above PRs are merged, I will build the release and start the vote.
Please let me know if you have any questions about the 2.10.0 release.
I also look forward to more people taking a look at the open PRs.

Regards,
Penghui




On Tue, Jan 4, 2022 at 7:56 AM Sijie Guo <gu...@gmail.com> wrote:

> +1.
>
> All make sense to me!
>
> We probably need to move to the feature frozen stage in order to cut a
> release at the end of January.
>
> - Sijie
>
> On Sun, Dec 26, 2021 at 8:46 PM PengHui Li <pe...@apache.org> wrote:
>
> > Hi, everyone
> >
> > I hope you’ve all been doing well. I would like to start an email thread
> > to discuss features that we planned for 2.10.0.
> > According to the time-based release plan
> > https://github.com/apache/pulsar/wiki/PIP-47%3A-Time-Based-Release-Plan,
> > we should release 2.10.0 at the end of December 2021. Since we have
> > reached the end of December,
> > I would like to target 2.10.0 for the end of January 2022.
> >
> > There are some powerful features and enhancements in 2.10.0 such as
> >
> > - PIP 84: Message redelivery epoch
> > - PIP 104: Add new consumer type: TableView
> > - PIP 106: Negative acknowledgment backoff
> > - PIP 110: Topic customized metadata support
> > - PIP 117: Change Pulsar standalone defaults
> > - PIP 118: Do not restart brokers when ZooKeeper session expires
> > - PIP 119: Enable consistent hashing by default on KeyShared dispatcher
> > - PIP 120: Enable client memory limit by default
> > - PIP 121: Pulsar cluster level auto failover
> > - PIP 123: Pulsar metadata CLI tool
> > - Metadata service batch operations
> > - RocksDB metadata service backend
> > - Etcd metadata service backend
> > - Ack timeout redelivery backoff policy
> > - Global topic policies
> >
> > Most of them have been completed; some work is still in progress, and we
> > need to try to complete it within 2 weeks.
> > This gives me a 2-week buffer period to prepare for the release and
> > complete the release vote.
> > Any unfinished parts can move to 2.11.0.
> >
> > Some proposals are still being discussed, so I do not list them because
> > I'm not sure if we can complete them in two weeks.
> >
> > You can find the full list of changes at
> >
> > https://github.com/apache/pulsar/pulls?q=milestone%3A2.10.0+-label%3Arelease%2F2.9.1
> > There are more than 500 commits.
> >
> > If I missed something or you have any suggestions please let me know.
> >
> > Regards,
> > Penghui
> >
>

Re: [DISCUSS] Apache Pulsar 2.10.0 release

Posted by PengHui Li <pe...@apache.org>.
Hi all,

The release notes [1] for 2.10.0 are available for review. Please help review
them and feel free to leave comments.

[1] https://github.com/apache/pulsar/pull/14398

Best,
Penghui

On Thu, Feb 17, 2022 at 9:58 PM PengHui Li <pe...@apache.org> wrote:

> Thanks, Rui
>
> I have merged the PR and cherry-picked it into branch-2.10,
> I will take care of the tests of branch-2.10 and do more tests.
>
> Regards,
> Penghui
>
> On Thu, Feb 17, 2022 at 6:42 PM Rui Fu <rf...@apache.org> wrote:
>
>> Hi all,
>>
>> For https://github.com/apache/pulsar/pull/13376, I have done some tests
>> to verify the upgrade of Pulsar Functions from root image to non-root image.
>> The test runs on my local machine with KinD cluster, with a fresh Pulsar
>> +auth 2.8.2 cluster. The cluster uses Kubernetes Runtime and multiple
>> functions been created and runs fine.
>> Then the upgrade uses `helm upgrade` with the self built docker image
>> (freeznet/pulsar-all:2.10.0-SNAPSHOT) based on
>> https://github.com/apache/pulsar/pull/13376, after the broker and
>> functions worker been upgraded, I have used `pulsar-admin functions
>> restart` to trigger the function upgradation, and the existing function
>> upgraded to non-root image and works fine after restart. I have also tested
>> sinks and sources as well.
>>
>> So from my tests, the non-root image works fine and #13376 should not
>> block the 2.10 release.
>>
>> On 2022/02/09 08:25:00 PengHui Li wrote:
>> > Hi all,
>> >
>> > Sorry for the late reply, due to my vacation these days, we got a delay
>> > here.
>> >
>> > Most of the changes of 2.10.0 are getting merged, for now, there are 14
>> > opened PRs(10 approved)
>> >
>> https://github.com/apache/pulsar/pulls?q=is%3Aopen+is%3Apr+milestone%3A2.10.0
>> >
>> > I will take care of them and try to get them merged.
>> > After the above PRs get merged, I will build the release and start the
>> vote.
>> > Please let me know if you have any questions about the 2.10.0 release.
>> > And, also looking forward to more people taking a look at the opened
>> PRs.
>> >
>> > Regards,
>> > Penghui
>> >
>> >
>> >
>> >
>> > On Tue, Jan 4, 2022 at 7:56 AM Sijie Guo <gu...@gmail.com> wrote:
>> >
>> > > +1.
>> > >
>> > > All make sense to me!
>> > >
>> > > We probably need to move to the feature frozen stage in order to cut a
>> > > release at the end of January.
>> > >
>> > > - Sijie
>> > >
>> > > On Sun, Dec 26, 2021 at 8:46 PM PengHui Li <pe...@apache.org>
>> wrote:
>> > >
>> > > > Hi, everyone
>> > > >
>> > > > I hope you’ve all been doing well. I would like to start an email
>> thread
>> > > to
>> > > > discuss features that we planned for 2.10.0.
>> > > > According to the time-based release plan
>> > > >
>> https://github.com/apache/pulsar/wiki/PIP-47%3A-Time-Based-Release-Plan,
>> > > > we should release 2.10.0 at the end of December 2021, since we have
>> > > reached
>> > > > the end of December,
>> > > > I would like to target the 2.10.0 to the end of January 2022
>> > > >
>> > > > There are some powerful features and enhancements in 2.10.0 such as
>> > > >
>> > > > - PIP 84: Message redelivery epoch
>> > > > - PIP 104: Add new consumer type: TableView
>> > > > - PIP 106: Negative acknowledgment backoff
>> > > > - PIP 110: Topic customized metadata support
>> > > > - PIP 117: Change Pulsar standalone defaults
>> > > > - PIP 118: Do not restart brokers when ZooKeeper session expires
>> > > > - PIP 119: Enable consistent hashing by default on KeyShared
>> dispatcher
>> > > > - PIP 120: Enable client memory limit by default
>> > > > - PIP 121: Pulsar cluster level auto failover
>> > > > - PIP 123: Pulsar metadata CLI tool
>> > > > - Metadata service batch operations
>> > > > - RocksDB metadata service backend
>> > > > - Etcd metadata service backend
>> > > > - Ack timeout redelivery backoff policy
>> > > > - Global topic policies
>> > > >
>> > > > Most of them have been completed, some work in progress we need to
>> try to
>> > > > complete within 2 weeks.
>> > > > This can give me a 2 week buffer period to prepare for release and
>> > > complete
>> > > > the release vote.
>> > > > For the unfinished parts, we can move them to 2.11.0.
>> > > >
>> > > > Some proposals are just being discussed, so I do not list them
>> because
>> > > I'm
>> > > > not sure if we can complete them in two weeks.
>> > > >
>> > > > You can find all the change lists from
>> > > >
>> > > >
>> > >
>> https://github.com/apache/pulsar/pulls?q=milestone%3A2.10.0+-label%3Arelease%2F2.9.1
>> > > > There are more than 500 commits.
>> > > >
>> > > > If I missed something or you have any suggestions please let me
>> know.
>> > > >
>> > > > Regards,
>> > > > Penghui
>> > > >
>> > >
>> >
>>
>

Re: [DISCUSS] Apache Pulsar 2.10.0 release

Posted by PengHui Li <pe...@apache.org>.
Thanks, Rui

I have merged the PR and cherry-picked it into branch-2.10.
I will take care of the tests on branch-2.10 and do more testing.

Regards,
Penghui

On Thu, Feb 17, 2022 at 6:42 PM Rui Fu <rf...@apache.org> wrote:

> Hi all,
>
> For https://github.com/apache/pulsar/pull/13376, I have done some tests
> to verify the upgrade of Pulsar Functions from root image to non-root image.
> The test runs on my local machine with KinD cluster, with a fresh Pulsar
> +auth 2.8.2 cluster. The cluster uses Kubernetes Runtime and multiple
> functions been created and runs fine.
> Then the upgrade uses `helm upgrade` with the self built docker image
> (freeznet/pulsar-all:2.10.0-SNAPSHOT) based on
> https://github.com/apache/pulsar/pull/13376, after the broker and
> functions worker been upgraded, I have used `pulsar-admin functions
> restart` to trigger the function upgradation, and the existing function
> upgraded to non-root image and works fine after restart. I have also tested
> sinks and sources as well.
>
> So from my tests, the non-root image works fine and #13376 should not
> block the 2.10 release.
>
> On 2022/02/09 08:25:00 PengHui Li wrote:
> > Hi all,
> >
> > Sorry for the late reply, due to my vacation these days, we got a delay
> > here.
> >
> > Most of the changes of 2.10.0 are getting merged, for now, there are 14
> > opened PRs(10 approved)
> >
> https://github.com/apache/pulsar/pulls?q=is%3Aopen+is%3Apr+milestone%3A2.10.0
> >
> > I will take care of them and try to get them merged.
> > After the above PRs get merged, I will build the release and start the
> vote.
> > Please let me know if you have any questions about the 2.10.0 release.
> > And, also looking forward to more people taking a look at the opened PRs.
> >
> > Regards,
> > Penghui
> >
> >
> >
> >
> > On Tue, Jan 4, 2022 at 7:56 AM Sijie Guo <gu...@gmail.com> wrote:
> >
> > > +1.
> > >
> > > All make sense to me!
> > >
> > > We probably need to move to the feature frozen stage in order to cut a
> > > release at the end of January.
> > >
> > > - Sijie
> > >
> > > On Sun, Dec 26, 2021 at 8:46 PM PengHui Li <pe...@apache.org> wrote:
> > >
> > > > Hi, everyone
> > > >
> > > > I hope you’ve all been doing well. I would like to start an email
> thread
> > > to
> > > > discuss features that we planned for 2.10.0.
> > > > According to the time-based release plan
> > > >
> https://github.com/apache/pulsar/wiki/PIP-47%3A-Time-Based-Release-Plan,
> > > > we should release 2.10.0 at the end of December 2021, since we have
> > > reached
> > > > the end of December,
> > > > I would like to target the 2.10.0 to the end of January 2022
> > > >
> > > > There are some powerful features and enhancements in 2.10.0 such as
> > > >
> > > > - PIP 84: Message redelivery epoch
> > > > - PIP 104: Add new consumer type: TableView
> > > > - PIP 106: Negative acknowledgment backoff
> > > > - PIP 110: Topic customized metadata support
> > > > - PIP 117: Change Pulsar standalone defaults
> > > > - PIP 118: Do not restart brokers when ZooKeeper session expires
> > > > - PIP 119: Enable consistent hashing by default on KeyShared
> dispatcher
> > > > - PIP 120: Enable client memory limit by default
> > > > - PIP 121: Pulsar cluster level auto failover
> > > > - PIP 123: Pulsar metadata CLI tool
> > > > - Metadata service batch operations
> > > > - RocksDB metadata service backend
> > > > - Etcd metadata service backend
> > > > - Ack timeout redelivery backoff policy
> > > > - Global topic policies
> > > >
> > > > Most of them have been completed, some work in progress we need to
> try to
> > > > complete within 2 weeks.
> > > > This can give me a 2 week buffer period to prepare for release and
> > > complete
> > > > the release vote.
> > > > For the unfinished parts, we can move them to 2.11.0.
> > > >
> > > > Some proposals are just being discussed, so I do not list them
> because
> > > I'm
> > > > not sure if we can complete them in two weeks.
> > > >
> > > > You can find all the change lists from
> > > >
> > > >
> > >
> https://github.com/apache/pulsar/pulls?q=milestone%3A2.10.0+-label%3Arelease%2F2.9.1
> > > > There are more than 500 commits.
> > > >
> > > > If I missed something or you have any suggestions please let me know.
> > > >
> > > > Regards,
> > > > Penghui
> > > >
> > >
> >
>

Re: [DISCUSS] Apache Pulsar 2.10.0 release

Posted by Rui Fu <rf...@apache.org>.
Hi all,

For https://github.com/apache/pulsar/pull/13376, I have run some tests to verify the upgrade of Pulsar Functions from the root image to the non-root image.
The tests ran on my local machine in a KinD cluster, starting from a fresh Pulsar 2.8.2 cluster with auth enabled. The cluster uses the Kubernetes runtime, and multiple functions were created and ran fine.
The upgrade then used `helm upgrade` with a self-built Docker image (freeznet/pulsar-all:2.10.0-SNAPSHOT) based on https://github.com/apache/pulsar/pull/13376. After the broker and functions worker were upgraded, I used `pulsar-admin functions restart` to trigger the function upgrade, and the existing function was upgraded to the non-root image and worked fine after the restart. I also tested sinks and sources.

So from my tests, the non-root image works fine and #13376 should not block the 2.10 release.

On 2022/02/09 08:25:00 PengHui Li wrote:
> Hi all,
> 
> Sorry for the late reply, due to my vacation these days, we got a delay
> here.
> 
> Most of the changes of 2.10.0 are getting merged, for now, there are 14
> opened PRs(10 approved)
> https://github.com/apache/pulsar/pulls?q=is%3Aopen+is%3Apr+milestone%3A2.10.0
> 
> I will take care of them and try to get them merged.
> After the above PRs get merged, I will build the release and start the vote.
> Please let me know if you have any questions about the 2.10.0 release.
> And, also looking forward to more people taking a look at the opened PRs.
> 
> Regards,
> Penghui
> 
> 
> 
> 
> On Tue, Jan 4, 2022 at 7:56 AM Sijie Guo <gu...@gmail.com> wrote:
> 
> > +1.
> >
> > All make sense to me!
> >
> > We probably need to move to the feature frozen stage in order to cut a
> > release at the end of January.
> >
> > - Sijie
> >
> > On Sun, Dec 26, 2021 at 8:46 PM PengHui Li <pe...@apache.org> wrote:
> >
> > > Hi, everyone
> > >
> > > I hope you’ve all been doing well. I would like to start an email thread
> > to
> > > discuss features that we planned for 2.10.0.
> > > According to the time-based release plan
> > > https://github.com/apache/pulsar/wiki/PIP-47%3A-Time-Based-Release-Plan,
> > > we should release 2.10.0 at the end of December 2021, since we have
> > reached
> > > the end of December,
> > > I would like to target the 2.10.0 to the end of January 2022
> > >
> > > There are some powerful features and enhancements in 2.10.0 such as
> > >
> > > - PIP 84: Message redelivery epoch
> > > - PIP 104: Add new consumer type: TableView
> > > - PIP 106: Negative acknowledgment backoff
> > > - PIP 110: Topic customized metadata support
> > > - PIP 117: Change Pulsar standalone defaults
> > > - PIP 118: Do not restart brokers when ZooKeeper session expires
> > > - PIP 119: Enable consistent hashing by default on KeyShared dispatcher
> > > - PIP 120: Enable client memory limit by default
> > > - PIP 121: Pulsar cluster level auto failover
> > > - PIP 123: Pulsar metadata CLI tool
> > > - Metadata service batch operations
> > > - RocksDB metadata service backend
> > > - Etcd metadata service backend
> > > - Ack timeout redelivery backoff policy
> > > - Global topic policies
> > >
> > > Most of them have been completed, some work in progress we need to try to
> > > complete within 2 weeks.
> > > This can give me a 2 week buffer period to prepare for release and
> > complete
> > > the release vote.
> > > For the unfinished parts, we can move them to 2.11.0.
> > >
> > > Some proposals are just being discussed, so I do not list them because
> > I'm
> > > not sure if we can complete them in two weeks.
> > >
> > > You can find all the change lists from
> > >
> > >
> > https://github.com/apache/pulsar/pulls?q=milestone%3A2.10.0+-label%3Arelease%2F2.9.1
> > > There are more than 500 commits.
> > >
> > > If I missed something or you have any suggestions please let me know.
> > >
> > > Regards,
> > > Penghui
> > >
> >
> 

Re: [DISCUSS] Apache Pulsar 2.10.0 release

Posted by Lari Hotari <lh...@apache.org>.
> The contributors found there are many places that might also have the same
> problem.

I'm trying to understand "the problem" even though I haven't received replies from any of the contributors. For example, my question in https://github.com/apache/pulsar/issues/14013#issuecomment-1033528348 has been ignored. 

In the case of the "make async" changes, it seems to be about migrating from the blocking Servlet API to the Asynchronous Servlet API. For example this change:
https://github.com/apache/pulsar/pull/14188/files#diff-0a1b84bf6bf3128bd4fc80875a959888ac68b989474afe536ac7cdd75d594400

There are a lot of these "make async" changes: https://github.com/apache/pulsar/pulls?q=is%3Apr+make+async . 

In the Jetty documentation, there's an explanation of asynchronous servlets: https://wiki.eclipse.org/Jetty/Feature/Continuations#Why_Asynchronous_Servlets_.3F
It also explains the benefits of asynchronous servlets:

"The servlet API (<=2.5) supports only a synchronous call style, so that any waiting that a servlet needs to do must be with blocking. Unfortunately this means that the thread allocated to the request must be held during that wait along with all its resources: kernel thread, stack memory and often pooled buffers, character converters, EE authentication context, etc. It is wasteful of system resources to hold these resources while waiting.

Significantly better scalability and quality of service can be achieved if waiting is done asynchronously."

You can achieve better resource efficiency and better scalability with asynchronous servlets. 
Do we have such problems in Apache Pulsar with the Servlet API?
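
To make the quoted point concrete, here is a minimal, self-contained Java sketch (not Pulsar or Jetty code). It simulates a small request thread pool and compares how long request threads are held when a handler blocks on a slow call versus when it registers a callback and returns. The class name, `lookupAsync`, and the 100 ms delay are all hypothetical, chosen only for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class ServletThreadSketch {
    // Assumed latency of the slow backend call, for illustration only.
    static final long WAIT_MS = 100;

    // Hypothetical slow backend call; completes on a timer thread after WAIT_MS.
    static CompletableFuture<String> lookupAsync(ScheduledExecutorService timer) {
        CompletableFuture<String> result = new CompletableFuture<>();
        timer.schedule(() -> result.complete("ok"), WAIT_MS, TimeUnit.MILLISECONDS);
        return result;
    }

    // Handles `requests` requests on `threads` request threads and returns the
    // total time (in ms) that request threads were held by handlers.
    static long threadHeldMillis(boolean blocking, int threads, int requests)
            throws InterruptedException {
        ExecutorService requestPool = Executors.newFixedThreadPool(threads);
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        AtomicLong heldNanos = new AtomicLong();
        CountDownLatch responded = new CountDownLatch(requests);
        try {
            for (int i = 0; i < requests; i++) {
                requestPool.execute(() -> {
                    long start = System.nanoTime();
                    CompletableFuture<String> lookup = lookupAsync(timer);
                    if (blocking) {
                        // Blocking style: the request thread waits for the result.
                        lookup.join();
                        responded.countDown();
                    } else {
                        // Async style: register a callback and free the thread.
                        lookup.thenRun(responded::countDown);
                    }
                    heldNanos.addAndGet(System.nanoTime() - start);
                });
            }
            // Wait until every request has been answered.
            responded.await();
        } finally {
            requestPool.shutdown();
            timer.shutdown();
        }
        // Make sure every handler has recorded its held time before reading it.
        requestPool.awaitTermination(5, TimeUnit.SECONDS);
        return TimeUnit.NANOSECONDS.toMillis(heldNanos.get());
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("blocking: " + threadHeldMillis(true, 2, 6) + " ms held");
        System.out.println("async:    " + threadHeldMillis(false, 2, 6) + " ms held");
    }
}
```

With 2 request threads and 6 requests, the blocking variant should hold request threads for roughly 6 x 100 ms in total, while the async variant should hold them only long enough to register the callback.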

-Lari

On 2022/02/15 14:51:59 Lari Hotari wrote:
> On 2022/02/15 14:13:59 PengHui Li wrote:
> > The rationale for these changes, I think it starts from this PR
> > https://github.com/apache/pulsar/pull/13666
> > This is the only one example, we have seen the same issue again and again.
> > After #13666 get merged,
> > The contributors found there are many places that might also have the same
> > problem.
> 
> Thanks for replying, Penghui. The problem is that there is no rationale nor description in that PR, https://github.com/apache/pulsar/pull/13666 . The only sentence there is "Avoid call sync method in async rest API for delete subscription". 
> 
> > "we have seen the same issue again and again."
> 
> What issue did you see? Please share more context. Thanks for the patience.
> 
> BR, Lari
> 

Re: [DISCUSS] Apache Pulsar 2.10.0 release

Posted by Matteo Merli <ma...@gmail.com>.
On Wed, Feb 16, 2022 at 9:55 PM Michael Marshall <mm...@apache.org> wrote:
> Given that our community members who focus on testing are otherwise
> about to prepare for a quick 3 day round of testing, I don't believe
> they would object to a last minute change that gives them more time
> for testing. In your view, who needs this advanced communication to
> make our code freeze meaningful?

Developers, so that they know when the cutoff date is and can plan accordingly.

Re: [DISCUSS] Apache Pulsar 2.10.0 release

Posted by Michael Marshall <mm...@apache.org>.
Thanks for creating the branch, Penghui.

> Yes, but I think that the code freeze is only meaningful if it’s
> communicated in advance.

Given that our community members who focus on testing are otherwise
about to prepare for a quick 3-day round of testing, I don't believe
they would object to a last-minute change that gives them more time
for testing. In your view, who needs this advance communication to
make our code freeze meaningful?

Thanks,
Michael

On Wed, Feb 16, 2022 at 8:07 PM PengHui Li <pe...@apache.org> wrote:
>
> Hi all,
>
> Here is an update: I have created branch-2.10 [1].
>
> [1] https://github.com/apache/pulsar/tree/branch-2.10
>
> On Thu, Feb 17, 2022 at 9:23 AM PengHui Li <pe...@apache.org> wrote:
>
> > Hi Michael
> >
> > > Penghui, is your current plan to create branch-2.10, create the
> > > release artifacts, and start a vote on them all within a few days of
> > > each other?
> >
> > Yes, I will create branch-2.10 today.
> >
> > Before starting the vote, we need to confirm that these 2 PRs [1] will not
> > introduce breaking changes. I would be very grateful if someone could also
> > help verify them.
> >
> >
> > [1] https://github.com/apache/pulsar/pull/13376,
> > https://github.com/apache/pulsar/pull/13341
> >
> > Thanks,
> > Penghui
> >
> > On Thu, Feb 17, 2022 at 8:59 AM Matteo Merli <ma...@gmail.com>
> > wrote:
> >
> >> Yes, but I think that the code freeze is only meaningful if it’s
> >> communicated in advance.
> >>
> >> The fact that it was included in the original PIP but never followed in
> >> practice means it would be a last-minute change.
> >>
> >> On Wed, Feb 16, 2022 at 2:37 PM Michael Marshall <mm...@apache.org>
> >> wrote:
> >>
> >> > When we discussed the code freeze in the community meeting on 2/3, I
> >> > was under the impression that it was a new development to our existing
> >> > release process. I subsequently learned it was already defined in
> >> > PIP 47. Even if we haven't been following this part of PIP 47, what
> >> > is the value in waiting until 2.11 to follow our already defined
> >> process?
> >> > While I agree it is helpful to provide guidance on when a version will
> >> > ship,
> >> > I think it is more important to give the community time to test a
> >> release,
> >> > even if that means we're a little late on our release schedule. So far,
> >> > we haven't even created a branch to begin testing.
> >> >
> >> > Note also that Sijie suggested using a feature freeze early on in this
> >> > thread.
> >> >
> >> > The 2.9.0 release is relevant here. It had 4 release candidates over 4
> >> > weeks and the final result was broken. That indicates to me that tagging
> >> > an RC early does not guarantee an early release and that our current
> >> > process isn't optimal and likely needs adjustments. I do not think we
> >> > should wait to address these issues. I propose we start following
> >> > PIP 47's guidance on code freeze and release stabilization periods.
> >> >
> >> > > I don't think that changes the picture here. There are *always* last
> >> > > minute issues being discovered, and there is a call to be made on a
> >> > > case by case. The feature freeze will reduce the likelihood of
> >> > > introducing *more* issues by getting it from the master branch, but
> >> > > won't change a comma from issues that were already there.
> >> >
> >> > I thought you wanted to implement a code/feature freeze to allow for
> >> > more release stabilization. Can you clarify what you mean here?
> >> >
> >> > Thanks,
> >> > Michael
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >
> >> > On Wed, Feb 16, 2022 at 2:42 PM Matteo Merli <ma...@gmail.com>
> >> > wrote:
> >> > >
> >> > > Michael, as we chatted in last weekly meeting (though not yet
> >> > > formalized), since we have never really done a feature freeze on the
> >> > > branch during past releases, we should start from the next release,
> >> > > to give a decent preview of what to expect to developers in terms of
> >> > > dates.
> >> > >
> >> > > > While some may feel "behind" in getting out the 2.10 release, our
> >> > > > priority must be to give the community time to verify the stability
> >> of
> >> > > > the release.
> >> > >
> >> > > I don't think that changes the picture here. There are *always* last
> >> > > minute issues being discovered, and there is a call to be made on a
> >> > > case by case. The feature freeze will reduce the likelihood of
> >> > > introducing *more* issues by getting it from the master branch, but
> >> > > won't change a comma from issues that were already there.
> >> > >
> >> > >
> >> > >
> >> > >
> >> > > --
> >> > > Matteo Merli
> >> > > <ma...@gmail.com>
> >> > >
> >> > > On Wed, Feb 16, 2022 at 10:47 AM Michael Marshall <
> >> mmarshall@apache.org>
> >> > wrote:
> >> > > >
> >> > > > > I will build the release and start the vote before next
> >> Monday(GMT+8)
> >> > > >
> >> > > > Penghui, is your current plan to create branch-2.10, create the
> >> > > > release artifacts, and start a vote on them all within a few days of
> >> > > > each other?
> >> > > >
> >> > > > > I'm doing my best to follow PIP 47, but when seeing a potential
> >> break
> >> > > > > change, I need to confirm it.
> >> > > > > After all the potential break changes have been confirmed and
> >> fixed,
> >> > I will
> >> > > > > start the vote thread.
> >> > > >
> >> > > > I think we should review our current release plan before we move
> >> > > > forward as proposed above. PIP 47 explicitly says that a month
> >> before
> >> > > > the release date, the release manager will cut branches [0]. We
> >> don't
> >> > > > yet have a `branch-2.10`. PIP 47 also defines a period of time for a
> >> > > > feature freeze and then a code freeze. We have not yet had either.
> >> > > >
> >> > > > I propose we create branch-2.10 now and simultaneously announce that
> >> > > > we are past the feature freeze period. Then, we can start the 2 week
> >> > > > period for bug fixes that precedes the code freeze, as PIP 47
> >> > > > prescribes. Then, in two weeks, we can produce the first release
> >> > > > candidate (also in PIP 47).
> >> > > >
> >> > > > While some may feel "behind" in getting out the 2.10 release, our
> >> > > > priority must be to give the community time to verify the stability
> >> of
> >> > > > the release.
> >> > > >
> >> > > > Thanks,
> >> > > > Michael
> >> > > >
> >> > > > [0]
> >> > https://github.com/apache/pulsar/wiki/PIP-47%3A-Time-Based-Release-Plan
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > > On Wed, Feb 16, 2022 at 9:09 AM PengHui Li <pe...@apache.org>
> >> wrote:
> >> > > > >
> >> > > > > Hi all
> >> > > > >
> >> > > > > Just put an update here.
> >> > > > >
> >> > > > > We have 2 PRs[1] https://github.com/apache/pulsar/pull/13376 and
> >> > > > > https://github.com/apache/pulsar/pull/13341
> >> > > > > need to do the final verification, and you are also very welcome
> >> to
> >> > verify
> >> > > > > these 2 changes in your environment, cases.
> >> > > > >
> >> > > > > I will build the release and start the vote before next
> >> Monday(GMT+8)
> >> > > > >
> >> > > > > Regards
> >> > > > > Penghui
> >> > > > >
> >> > > > > On Wed, Feb 16, 2022 at 10:22 PM PengHui Li <pe...@apache.org>
> >> > wrote:
> >> > > > >
> >> > > > > > Hi lari,
> >> > > > > >
> >> > > > > > > So finally, I understand that "the problem" is that all HTTP
> >> > > > > > > server threads are blocked and this makes the Pulsar Admin
> >> > > > > > > API unavailable.
> >> > > > > >
> >> > > > > > To support the blocking servlet API, Jetty uses a default thread
> >> > > > > > pool that can grow to up to 200 threads (
> >> > > > > > https://github.com/eclipse/jetty.project/blob/4a0c91c0be53805e3fcffdcdcc9587d5301863db/jetty-util/src/main/java/org/eclipse/jetty/util/thread/ExecutorThreadPool.java#L57
> >> > > > > > ).
> >> > > > > > However this default of 200 maximum threads is not used in
> >> > > > > > Pulsar.
> >> > > > > >
> >> > > > > > Regarding the "make async" changes, It is an optimization to
> >> > > > > > migrate from the blocking servlet api to the asynchronous
> >> > > > > > servlet api. This work isn't urgent since we can simply mitigate
> >> > > > > > the HTTP server threads getting blocked by setting
> >> > > > > > "numHttpServerThreads=200" in broker.conf. "the problem" will be
> >> > > > > > resolved immediately without risks of regressions that are
> >> > > > > > involved in making the sync -> async changes.
> >> > > > > >
> >> > > > > > Yes, this is the problem. But I am against using 200 threads as
> >> > > > > > the default maximum for web server threads.
> >> > > > > > It won't work when the broker does not have that much memory,
> >> > > > > > and it will lead to more serious problems: the service quality
> >> > > > > > of the messaging API gets worse due to JVM GC, or the broker
> >> > > > > > even runs out of memory.
> >> > > > > >
> >> > > > > > Yes, it isn't urgent. That is why I said it's not a blocker for
> >> > > > > > the 2.10 release, and none of the PRs are cherry-picked to
> >> > > > > > branch-2.x.
> >> > > > > > This is an optimization for Pulsar: the current implementation
> >> > > > > > does not use the Jetty async API well, and we should fix it.
> >> > > > > > We should reduce the code with bad smells, and using the async
> >> > > > > > API is also a more efficient approach that avoids opening so
> >> > > > > > many Jetty threads.
> >> > > > > > Do you have any concerns about making the changes purely
> >> > > > > > async?
> >> > > > > >
> >> > > > > > > Penghui, would you mind adding a GitHub issue for the problem
> >> > > > > > > where all HTTP threads get blocked and the Pulsar Admin API
> >> > > > > > > stops responding?
> >> > > > > >
> >> > > > > > https://github.com/apache/pulsar/issues/4756 the attachment
> >> > > > > > from the issue is a good example
> >> > > > > >
> >> > > > > > Regards,
> >> > > > > > Penghui
> >> > > > > >
> >> > > > > >
> >> > > > > > On Wed, Feb 16, 2022 at 9:04 PM Lari Hotari <lhotari@apache.org
> >> >
> >> > wrote:
> >> > > > > >
> >> > > > > >> I created PR https://github.com/apache/pulsar/pull/14320 to
> >> set
> >> > > > > >> numHttpServerThreads=200 .
> >> > > > > >> Please review
> >> > > > > >>
> >> > > > > >> On 2022/02/16 12:39:34 Lari Hotari wrote:
> >> > > > > >> > On 2022/02/16 00:58:20 PengHui Li wrote:
> >> > > > > >> > > Which is a sync method. Ultimately this could lead to all
> >> the
> >> > > > > >> pulsar-web
> >> > > > > >> > > thread
> >> > > > > >> > > blocked. we'd better not introduce blocking calls if we use
> >> > > > > >> AsyncResponse.
> >> > > > > >> > >
> >> > > > > >> > > > What issue did you see? Please share more context. Thanks
> >> > for the
> >> > > > > >> > > patience.
> >> > > > > >> > >
> >> > > > > >> > > It happened very earlier
> >> > > > > >> > >
> >> > > > > >> > > Here is the issue
> >> > https://github.com/apache/pulsar/issues/4756
> >> > > > > >> > > And here is also a related fix
> >> > > > > >> https://github.com/apache/pulsar/pull/10619
> >> > > > > >> >
> >> > > > > >> > Penghui, Thank you for the patience, and thanks for sharing
> >> more
> >> > > > > >> context. I happened to send a reply before reading your
> >> message,
> >> > so please
> >> > > > > >> bear with me.
> >> > > > > >> >
> >> > > > > >> > So finally, I understand that "the problem" is that all HTTP
> >> > server
> >> > > > > >> threads are blocked and this makes the Pulsar Admin API
> >> > unavailable.
> >> > > > > >> >
> >> > > > > >> > To support the blocking servlet API, Jetty uses a default
> >> > thread pool
> >> > > > > >> that can grow to up to 200 threads (
> >> > > > > >>
> >> >
> >> https://github.com/eclipse/jetty.project/blob/4a0c91c0be53805e3fcffdcdcc9587d5301863db/jetty-util/src/main/java/org/eclipse/jetty/util/thread/ExecutorThreadPool.java#L57
> >> > )
> >> > > > > >> .
> >> > > > > >> > However this default of 200 maximum threads is not used in
> >> > Pulsar.
> >> > > > > >> >
> >> > > > > >> > The problem is that Pulsar uses a low value that assumes
> >> > asynchronous
> >> > > > > >> API usage:
> >> > > > > >> >
> >> > > > > >>
> >> >
> >> https://github.com/apache/pulsar/blob/5c3ddc26d6e07eb0473b11b5ecc8318c1efe414b/pulsar-broker-common/src/main/java/org/apache/pulsar/broker/ServiceConfiguration.java#L201-L204
> >> > > > > >> > Pulsar should be using a high value (for example 200) as long
> >> > as there
> >> > > > > >> are blocking calls in Admin APIs.
> >> > > > > >> >
> >> > > > > >> > The mitigation to the issue of all HTTP server threads
> >> getting
> >> > blocked
> >> > > > > >> is setting "numHttpServerThreads=200" in broker.conf.
> >> > > > > >> >
> >> > > > > >> > Regarding the "make async" changes, It is an optimization to
> >> > migrate
> >> > > > > >> from the blocking servlet api to the asynchronous servlet api.
> >> > This work
> >> > > > > >> isn't urgent since we can simply mitigate the HTTP server
> >> threads
> >> > getting
> >> > > > > >> blocked by setting "numHttpServerThreads=200" in broker.conf.
> >> > "the problem"
> >> > > > > >> will be resolved immediately without risks of regressions that
> >> > are involved
> >> > > > > >> in making the sync -> async changes.
> >> > > > > >> >
> >> > > > > >> > Penghui, would you mind adding a GitHub issue for the problem
> >> > where all
> >> > > > > >> HTTP threads get blocked and the Pulsar Admin API stops
> >> > responding?
> >> > > > > >> >
> >> > > > > >> > I can follow up with a PR which updates the default for
> >> > > > > >> numHttpServerThreads to 200. This is a maximum value and Jetty
> >> > starts with
> >> > > > > >> 8 threads. We can agree on the default value to use in the PR.
> >> > > > > >> >
> >> > > > > >> > Thank you for the great collaboration on sharing the context
> >> and
> >> > > > > >> describing the problem patiently.
> >> > > > > >> >
> >> > > > > >> > BR,
> >> > > > > >> >
> >> > > > > >> > -Lari
> >> > > > > >> >
> >> > > > > >>
> >> > > > > >
> >> >
> >> --
> >> --
> >> Matteo Merli
> >> <ma...@gmail.com>
> >>
> >

Re: [DISCUSS] Apache Pulsar 2.10.0 release

Posted by PengHui Li <pe...@apache.org>.
Hi all,

Here is an update: I have created branch-2.10 [1].

[1] https://github.com/apache/pulsar/tree/branch-2.10

On Thu, Feb 17, 2022 at 9:23 AM PengHui Li <pe...@apache.org> wrote:

> Hi Michael
>
> > Penghui, is your current plan to create branch-2.10, create the
> release artifacts, and start a vote on them all within a few days of
> each other?
>
> Yes, I will create branch-2.10 today.
>
> For starting the vote, we need to confirm these 2 PRs[1] will not introduce
> breaking changes. Very grateful if someone can also help verify them.
>
>
> [1] https://github.com/apache/pulsar/pull/13376,
> https://github.com/apache/pulsar/pull/13341
>
> Thanks,
> Penghui
>
> On Thu, Feb 17, 2022 at 8:59 AM Matteo Merli <ma...@gmail.com>
> wrote:
>
>> Yes, but I think that the code freeze is only meaningful if it’s
>> communicated in advance.
>>
>> The fact that it was included in the original PIP but never followed in
>> the
>> practice means it would be a last minute change.
>>
>> On Wed, Feb 16, 2022 at 2:37 PM Michael Marshall <mm...@apache.org>
>> wrote:
>>
>> > When we discussed the code freeze in the community meeting on 2/3, I
>> > was under the impression that it was a new development to our existing
>> > release process. I subsequently learned it was already defined in
>> > PIP 47. Even if we haven't been following this part of PIP 47, what
>> > is the value in waiting until 2.11 to follow our already defined
>> process?
>> > While I agree it is helpful to provide guidance on when a version will
>> > ship,
>> > I think it is more important to give the community time to test a
>> release,
>> > even if that means we're a little late on our release schedule. So far,
>> > we haven't even created a branch to begin testing.
>> >
>> > Note also that Sijie suggested using a feature freeze early on in this
>> > thread.
>> >
>> > The 2.9.0 release is relevant here. It had 4 release candidates over 4
>> > weeks and the final result was broken. That indicates to me that tagging
>> > an RC early does not guarantee an early release and that our current
>> > process isn't optimal and likely needs adjustments. I do not think we
>> > should wait to address these issues. I propose we start following
>> > PIP 47's guidance on code freeze and release stabilization periods.
>> >
>> > > I don't think that changes the picture here. There are *always* last
>> > > minute issues being discovered, and there is a call to be made on a
>> > > case by case. The feature freeze will reduce the likelihood of
>> > > introducing *more* issues by getting it from the master branch, but
>> > > won't change a comma from issues that were already there.
>> >
>> > I thought you wanted to implement a code/feature freeze to allow for
>> > more release stabilization. Can you clarify what you mean here?
>> >
>> > Thanks,
>> > Michael
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> > On Wed, Feb 16, 2022 at 2:42 PM Matteo Merli <ma...@gmail.com>
>> > wrote:
>> > >
>> > > Michael, as we chatted in last weekly meeting (though not yet
>> > > formalized), since we have never really done a feature freeze on the
>> > > branch during past releases, we should start from the next release,
>> > > to give a decent preview of what to expect to developers in terms of
>> > > dates.
>> > >
>> > > > While some may feel "behind" in getting out the 2.10 release, our
>> > > > priority must be to give the community time to verify the stability
>> of
>> > > > the release.
>> > >
>> > > I don't think that changes the picture here. There are *always* last
>> > > minute issues being discovered, and there is a call to be made on a
>> > > case by case. The feature freeze will reduce the likelihood of
>> > > introducing *more* issues by getting it from the master branch, but
>> > > won't change a comma from issues that were already there.
>> > >
>> > >
>> > >
>> > >
>> > > --
>> > > Matteo Merli
>> > > <ma...@gmail.com>
>> > >
>> > > On Wed, Feb 16, 2022 at 10:47 AM Michael Marshall <
>> mmarshall@apache.org>
>> > wrote:
>> > > >
>> > > > > I will build the release and start the vote before next
>> Monday(GMT+8)
>> > > >
>> > > > Penghui, is your current plan to create branch-2.10, create the
>> > > > release artifacts, and start a vote on them all within a few days of
>> > > > each other?
>> > > >
>> > > > > I'm doing my best to follow PIP 47, but when seeing a potential
>> break
>> > > > > change, I need to confirm it.
>> > > > > After all the potential break changes have been confirmed and
>> fixed,
>> > I will
>> > > > > start the vote thread.
>> > > >
>> > > > I think we should review our current release plan before we move
>> > > > forward as proposed above. PIP 47 explicitly says that a month
>> before
>> > > > the release date, the release manager will cut branches [0]. We
>> don't
>> > > > yet have a `branch-2.10`. PIP 47 also defines a period of time for a
>> > > > feature freeze and then a code freeze. We have not yet had either.
>> > > >
>> > > > I propose we create branch-2.10 now and simultaneously announce that
>> > > > we are past the feature freeze period. Then, we can start the 2 week
>> > > > period for bug fixes that precedes the code freeze, as PIP 47
>> > > > prescribes. Then, in two weeks, we can produce the first release
>> > > > candidate (also in PIP 47).
>> > > >
>> > > > While some may feel "behind" in getting out the 2.10 release, our
>> > > > priority must be to give the community time to verify the stability
>> of
>> > > > the release.
>> > > >
>> > > > Thanks,
>> > > > Michael
>> > > >
>> > > > [0]
>> > https://github.com/apache/pulsar/wiki/PIP-47%3A-Time-Based-Release-Plan
>> > > >
>> > > >
>> > > >
>> > > >
>> > > > On Wed, Feb 16, 2022 at 9:09 AM PengHui Li <pe...@apache.org>
>> wrote:
>> > > > >
>> > > > > Hi all
>> > > > >
>> > > > > Just put an update here.
>> > > > >
>> > > > > We have 2 PRs[1] https://github.com/apache/pulsar/pull/13376 and
>> > > > > https://github.com/apache/pulsar/pull/13341
>> > > > > need to do the final verification, and you are also very welcome
>> to
>> > verify
>> > > > > these 2 changes in your environment, cases.
>> > > > >
>> > > > > I will build the release and start the vote before next
>> Monday(GMT+8)
>> > > > >
>> > > > > Regards
>> > > > > Penghui
>> > > > >
>> > > > > On Wed, Feb 16, 2022 at 10:22 PM PengHui Li <pe...@apache.org>
>> > wrote:
>> > > > >
>> > > > > > Hi lari,
>> > > > > >
>> > > > > > > So finally, I understand that "the problem" is that all HTTP
>> > server
>> > > > > > threads are blocked and this makes the Pulsar Admin API
>> > unavailable.
>> > > > > >
>> > > > > > To support the blocking servlet API, Jetty uses a default thread
>> > pool that
>> > > > > > can grow to up to 200 threads (
>> > > > > >
>> >
>> https://github.com/eclipse/jetty.project/blob/4a0c91c0be53805e3fcffdcdcc9587d5301863db/jetty-util/src/main/java/org/eclipse/jetty/util/thread/ExecutorThreadPool.java#L57
>> > )
>> > > > > > .
>> > > > > > However this default of 200 maximum threads is not used in
>> Pulsar.
>> > > > > >
>> > > > > >  Regarding the "make async" changes, It is an optimization to
>> > migrate from
>> > > > > > the blocking servlet api to the asynchronous servlet api. This
>> > work isn't
>> > > > > > urgent since we can simply mitigate the HTTP server threads
>> > getting blocked
>> > > > > > by setting "numHttpServerThreads=200" in broker.conf. "the
>> > problem" will be
>> > > > > > resolved immediately without risks of regressions that are
>> > involved in
>> > > > > > making the sync -> async changes.
>> > > > > >
>> > > > > > Yes, this is the problem. But I am against using 200 threads as
>> > the max
>> > > > > > web server thread by default,
>> > > > > > it can't work for cases that the broker without that much
>> memory,
>> > it will
>> > > > > > lead to more serious problems
>> > > > > > that the service quality of messaging API gets worse due to the
>> JVM
>> > > > > > GC, even memory overflow.
>> > > > > >
>> > > > > > Yes, it isn't urgent. So I said it's not a blocker for the 2.10
>> > release,
>> > > > > > and all the PRs are not cherry-picked to branch-2.x
>> > > > > > This is an optimization for pulsar, the current implementation
>> > does not
>> > > > > > use jetty async API well, we should fix it,
>> > > > > > we should reduce the code with bad smells, and using async API
>> is
>> > also
>> > > > > > a more efficient way without opening such jetty threads.
>> > > > > > Do you have any concerns about the way the modification becomes
>> > purely
>> > > > > > async?
>> > > > > >
>> > > > > > > Penghui, would you mind adding a GitHub issue for the problem
>> > where all
>> > > > > > HTTP threads get blocked and the Pulsar Admin API stops
>> responding?
>> > > > > >
>> > > > > > https://github.com/apache/pulsar/issues/4756 the attachment
>> from
>> > the
>> > > > > > issue is a good example
>> > > > > >
>> > > > > > Regards,
>> > > > > > Penghui
>> > > > > >
>> > > > > >
>> > > > > > On Wed, Feb 16, 2022 at 9:04 PM Lari Hotari <lhotari@apache.org
>> >
>> > wrote:
>> > > > > >
>> > > > > >> I created PR https://github.com/apache/pulsar/pull/14320 to
>> set
>> > > > > >> numHttpServerThreads=200 .
>> > > > > >> Please review
>> > > > > >>
>> > > > > >> On 2022/02/16 12:39:34 Lari Hotari wrote:
>> > > > > >> > On 2022/02/16 00:58:20 PengHui Li wrote:
>> > > > > >> > > Which is a sync method. Ultimately this could lead to all
>> the
>> > > > > >> pulsar-web
>> > > > > >> > > thread
>> > > > > >> > > blocked. we'd better not introduce blocking calls if we use
>> > > > > >> AsyncResponse.
>> > > > > >> > >
>> > > > > >> > > > What issue did you see? Please share more context. Thanks
>> > for the
>> > > > > >> > > patience.
>> > > > > >> > >
>> > > > > >> > > It happened very earlier
>> > > > > >> > >
>> > > > > >> > > Here is the issue
>> > https://github.com/apache/pulsar/issues/4756
>> > > > > >> > > And here is also a related fix
>> > > > > >> https://github.com/apache/pulsar/pull/10619
>> > > > > >> >
>> > > > > >> > Penghui, Thank you for the patience, and thanks for sharing
>> more
>> > > > > >> context. I happened to send a reply before reading your
>> message,
>> > so please
>> > > > > >> bear with me.
>> > > > > >> >
>> > > > > >> > So finally, I understand that "the problem" is that all HTTP
>> > server
>> > > > > >> threads are blocked and this makes the Pulsar Admin API
>> > unavailable.
>> > > > > >> >
>> > > > > >> > To support the blocking servlet API, Jetty uses a default
>> > thread pool
>> > > > > >> that can grow to up to 200 threads (
>> > > > > >>
>> >
>> https://github.com/eclipse/jetty.project/blob/4a0c91c0be53805e3fcffdcdcc9587d5301863db/jetty-util/src/main/java/org/eclipse/jetty/util/thread/ExecutorThreadPool.java#L57
>> > )
>> > > > > >> .
>> > > > > >> > However this default of 200 maximum threads is not used in
>> > Pulsar.
>> > > > > >> >
>> > > > > >> > The problem is that Pulsar uses a low value that assumes
>> > asynchronous
>> > > > > >> API usage:
>> > > > > >> >
>> > > > > >>
>> >
>> https://github.com/apache/pulsar/blob/5c3ddc26d6e07eb0473b11b5ecc8318c1efe414b/pulsar-broker-common/src/main/java/org/apache/pulsar/broker/ServiceConfiguration.java#L201-L204
>> > > > > >> > Pulsar should be using a high value (for example 200) as long
>> > as there
>> > > > > >> are blocking calls in Admin APIs.
>> > > > > >> >
>> > > > > >> > The mitigation to the issue of all HTTP server threads
>> getting
>> > blocked
>> > > > > >> is setting "numHttpServerThreads=200" in broker.conf.
>> > > > > >> >
>> > > > > >> > Regarding the "make async" changes, It is an optimization to
>> > migrate
>> > > > > >> from the blocking servlet api to the asynchronous servlet api.
>> > This work
>> > > > > >> isn't urgent since we can simply mitigate the HTTP server
>> threads
>> > getting
>> > > > > >> blocked by setting "numHttpServerThreads=200" in broker.conf.
>> > "the problem"
>> > > > > >> will be resolved immediately without risks of regressions that
>> > are involved
>> > > > > >> in making the sync -> async changes.
>> > > > > >> >
>> > > > > >> > Penghui, would you mind adding a GitHub issue for the problem
>> > where all
>> > > > > >> HTTP threads get blocked and the Pulsar Admin API stops
>> > responding?
>> > > > > >> >
>> > > > > >> > I can follow up with a PR which updates the default for
>> > > > > >> numHttpServerThreads to 200. This is a maximum value and Jetty
>> > starts with
>> > > > > >> 8 threads. We can agree on the default value to use in the PR.
>> > > > > >> >
>> > > > > >> > Thank you for the great collaboration on sharing the context
>> and
>> > > > > >> describing the problem patiently.
>> > > > > >> >
>> > > > > >> > BR,
>> > > > > >> >
>> > > > > >> > -Lari
>> > > > > >> >
>> > > > > >>
>> > > > > >
>> >
>> --
>> --
>> Matteo Merli
>> <ma...@gmail.com>
>>
>

Re: [DISCUSS] Apache Pulsar 2.10.0 release

Posted by PengHui Li <pe...@apache.org>.
Hi Michael,

> Penghui, is your current plan to create branch-2.10, create the
release artifacts, and start a vote on them all within a few days of
each other?

Yes, I will create branch-2.10 today.

Before starting the vote, we need to confirm that these 2 PRs [1] will not
introduce breaking changes. I would be very grateful if someone could also
help verify them.


[1] https://github.com/apache/pulsar/pull/13376,
https://github.com/apache/pulsar/pull/13341

Thanks,
Penghui

On Thu, Feb 17, 2022 at 8:59 AM Matteo Merli <ma...@gmail.com> wrote:

> Yes, but I think that the code freeze is only meaningful if it’s
> communicated in advance.
>
> The fact that it was included in the original PIP but never followed in the
> practice means it would be a last minute change.
>
> On Wed, Feb 16, 2022 at 2:37 PM Michael Marshall <mm...@apache.org>
> wrote:
>
> > When we discussed the code freeze in the community meeting on 2/3, I
> > was under the impression that it was a new development to our existing
> > release process. I subsequently learned it was already defined in
> > PIP 47. Even if we haven't been following this part of PIP 47, what
> > is the value in waiting until 2.11 to follow our already defined process?
> > While I agree it is helpful to provide guidance on when a version will
> > ship,
> > I think it is more important to give the community time to test a
> release,
> > even if that means we're a little late on our release schedule. So far,
> > we haven't even created a branch to begin testing.
> >
> > Note also that Sijie suggested using a feature freeze early on in this
> > thread.
> >
> > The 2.9.0 release is relevant here. It had 4 release candidates over 4
> > weeks and the final result was broken. That indicates to me that tagging
> > an RC early does not guarantee an early release and that our current
> > process isn't optimal and likely needs adjustments. I do not think we
> > should wait to address these issues. I propose we start following
> > PIP 47's guidance on code freeze and release stabilization periods.
> >
> > > I don't think that changes the picture here. There are *always* last
> > > minute issues being discovered, and there is a call to be made on a
> > > case by case. The feature freeze will reduce the likelihood of
> > > introducing *more* issues by getting it from the master branch, but
> > > won't change a comma from issues that were already there.
> >
> > I thought you wanted to implement a code/feature freeze to allow for
> > more release stabilization. Can you clarify what you mean here?
> >
> > Thanks,
> > Michael
> >
> >
> >
> >
> >
> >
> >
> > On Wed, Feb 16, 2022 at 2:42 PM Matteo Merli <ma...@gmail.com>
> > wrote:
> > >
> > > Michael, as we chatted in last weekly meeting (though not yet
> > > formalized), since we have never really done a feature freeze on the
> > > > branch during past releases, we should start from the next release,
> > > to give a decent preview of what to expect to developers in terms of
> > > dates.
> > >
> > > > While some may feel "behind" in getting out the 2.10 release, our
> > > > priority must be to give the community time to verify the stability
> of
> > > > the release.
> > >
> > > I don't think that changes the picture here. There are *always* last
> > > minute issues being discovered, and there is a call to be made on a
> > > case by case. The feature freeze will reduce the likelihood of
> > > introducing *more* issues by getting it from the master branch, but
> > > won't change a comma from issues that were already there.
> > >
> > >
> > >
> > >
> > > --
> > > Matteo Merli
> > > <ma...@gmail.com>
> > >
> > > On Wed, Feb 16, 2022 at 10:47 AM Michael Marshall <
> mmarshall@apache.org>
> > wrote:
> > > >
> > > > > I will build the release and start the vote before next
> Monday(GMT+8)
> > > >
> > > > Penghui, is your current plan to create branch-2.10, create the
> > > > release artifacts, and start a vote on them all within a few days of
> > > > each other?
> > > >
> > > > > I'm doing my best to follow PIP 47, but when seeing a potential
> break
> > > > > change, I need to confirm it.
> > > > > After all the potential break changes have been confirmed and
> fixed,
> > I will
> > > > > start the vote thread.
> > > >
> > > > I think we should review our current release plan before we move
> > > > forward as proposed above. PIP 47 explicitly says that a month before
> > > > the release date, the release manager will cut branches [0]. We don't
> > > > yet have a `branch-2.10`. PIP 47 also defines a period of time for a
> > > > feature freeze and then a code freeze. We have not yet had either.
> > > >
> > > > I propose we create branch-2.10 now and simultaneously announce that
> > > > we are past the feature freeze period. Then, we can start the 2 week
> > > > period for bug fixes that precedes the code freeze, as PIP 47
> > > > prescribes. Then, in two weeks, we can produce the first release
> > > > candidate (also in PIP 47).
> > > >
> > > > While some may feel "behind" in getting out the 2.10 release, our
> > > > priority must be to give the community time to verify the stability
> of
> > > > the release.
> > > >
> > > > Thanks,
> > > > Michael
> > > >
> > > > [0]
> > https://github.com/apache/pulsar/wiki/PIP-47%3A-Time-Based-Release-Plan
> > > >
> > > >
> > > >
> > > >
> > > > On Wed, Feb 16, 2022 at 9:09 AM PengHui Li <pe...@apache.org>
> wrote:
> > > > >
> > > > > Hi all
> > > > >
> > > > > Just put an update here.
> > > > >
> > > > > We have 2 PRs[1] https://github.com/apache/pulsar/pull/13376 and
> > > > > https://github.com/apache/pulsar/pull/13341
> > > > > need to do the final verification, and you are also very welcome to
> > verify
> > > > > these 2 changes in your environment, cases.
> > > > >
> > > > > I will build the release and start the vote before next
> Monday(GMT+8)
> > > > >
> > > > > Regards
> > > > > Penghui
> > > > >
> > > > > On Wed, Feb 16, 2022 at 10:22 PM PengHui Li <pe...@apache.org>
> > wrote:
> > > > >
> > > > > > Hi lari,
> > > > > >
> > > > > > > So finally, I understand that "the problem" is that all HTTP
> > server
> > > > > > threads are blocked and this makes the Pulsar Admin API
> > unavailable.
> > > > > >
> > > > > > To support the blocking servlet API, Jetty uses a default thread
> > pool that
> > > > > > can grow to up to 200 threads (
> > > > > >
> >
> https://github.com/eclipse/jetty.project/blob/4a0c91c0be53805e3fcffdcdcc9587d5301863db/jetty-util/src/main/java/org/eclipse/jetty/util/thread/ExecutorThreadPool.java#L57
> > )
> > > > > > .
> > > > > > However this default of 200 maximum threads is not used in
> Pulsar.
> > > > > >
> > > > > >  Regarding the "make async" changes, It is an optimization to
> > migrate from
> > > > > > the blocking servlet api to the asynchronous servlet api. This
> > work isn't
> > > > > > urgent since we can simply mitigate the HTTP server threads
> > getting blocked
> > > > > > by setting "numHttpServerThreads=200" in broker.conf. "the
> > problem" will be
> > > > > > resolved immediately without risks of regressions that are
> > involved in
> > > > > > making the sync -> async changes.
> > > > > >
> > > > > > Yes, this is the problem. But I am against using 200 threads as
> > the max
> > > > > > web server thread by default,
> > > > > > it can't work for cases that the broker without that much memory,
> > it will
> > > > > > lead to more serious problems
> > > > > > that the service quality of messaging API gets worse due to the
> JVM
> > > > > > GC, even memory overflow.
> > > > > >
> > > > > > Yes, it isn't urgent. So I said it's not a blocker for the 2.10
> > release,
> > > > > > and all the PRs are not cherry-picked to branch-2.x
> > > > > > This is an optimization for pulsar, the current implementation
> > does not
> > > > > > use jetty async API well, we should fix it,
> > > > > > we should reduce the code with bad smells, and using async API is
> > also
> > > > > > a more efficient way without opening such jetty threads.
> > > > > > Do you have any concerns about the way the modification becomes
> > purely
> > > > > > async?
> > > > > >
> > > > > > > Penghui, would you mind adding a GitHub issue for the problem
> > where all
> > > > > > HTTP threads get blocked and the Pulsar Admin API stops
> responding?
> > > > > >
> > > > > > https://github.com/apache/pulsar/issues/4756 the attachment from
> > the
> > > > > > issue is a good example
> > > > > >
> > > > > > Regards,
> > > > > > Penghui
> > > > > >
> > > > > >
> > > > > > On Wed, Feb 16, 2022 at 9:04 PM Lari Hotari <lh...@apache.org>
> > wrote:
> > > > > >
> > > > > >> I created PR https://github.com/apache/pulsar/pull/14320 to set
> > > > > >> numHttpServerThreads=200 .
> > > > > >> Please review
> > > > > >>
> > > > > >> On 2022/02/16 12:39:34 Lari Hotari wrote:
> > > > > >> > On 2022/02/16 00:58:20 PengHui Li wrote:
> > > > > >> > > Which is a sync method. Ultimately this could lead to all
> the
> > > > > >> pulsar-web
> > > > > >> > > thread
> > > > > >> > > blocked. we'd better not introduce blocking calls if we use
> > > > > >> AsyncResponse.
> > > > > >> > >
> > > > > >> > > > What issue did you see? Please share more context. Thanks
> > for the
> > > > > >> > > patience.
> > > > > >> > >
> > > > > >> > > It happened very earlier
> > > > > >> > >
> > > > > >> > > Here is the issue
> > https://github.com/apache/pulsar/issues/4756
> > > > > >> > > And here is also a related fix
> > > > > >> https://github.com/apache/pulsar/pull/10619
> > > > > >> >
> > > > > >> > Penghui, Thank you for the patience, and thanks for sharing
> more
> > > > > >> context. I happened to send a reply before reading your message,
> > so please
> > > > > >> bear with me.
> > > > > >> >
> > > > > >> > So finally, I understand that "the problem" is that all HTTP
> > server
> > > > > >> threads are blocked and this makes the Pulsar Admin API
> > unavailable.
> > > > > >> >
> > > > > >> > To support the blocking servlet API, Jetty uses a default
> > thread pool
> > > > > >> that can grow to up to 200 threads (
> > > > > >>
> >
> https://github.com/eclipse/jetty.project/blob/4a0c91c0be53805e3fcffdcdcc9587d5301863db/jetty-util/src/main/java/org/eclipse/jetty/util/thread/ExecutorThreadPool.java#L57
> > )
> > > > > >> .
> > > > > >> > However this default of 200 maximum threads is not used in
> > Pulsar.
> > > > > >> >
> > > > > >> > The problem is that Pulsar uses a low value that assumes
> > asynchronous
> > > > > >> API usage:
> > > > > >> >
> > > > > >>
> >
> https://github.com/apache/pulsar/blob/5c3ddc26d6e07eb0473b11b5ecc8318c1efe414b/pulsar-broker-common/src/main/java/org/apache/pulsar/broker/ServiceConfiguration.java#L201-L204
> > > > > >> > Pulsar should be using a high value (for example 200) as long
> > as there
> > > > > >> are blocking calls in Admin APIs.
> > > > > >> >
> > > > > >> > The mitigation to the issue of all HTTP server threads getting
> > blocked
> > > > > >> is setting "numHttpServerThreads=200" in broker.conf.
> > > > > >> >
> > > > > >> > Regarding the "make async" changes, It is an optimization to
> > migrate
> > > > > >> from the blocking servlet api to the asynchronous servlet api.
> > This work
> > > > > >> isn't urgent since we can simply mitigate the HTTP server
> threads
> > getting
> > > > > >> blocked by setting "numHttpServerThreads=200" in broker.conf.
> > "the problem"
> > > > > >> will be resolved immediately without risks of regressions that
> > are involved
> > > > > >> in making the sync -> async changes.
> > > > > >> >
> > > > > >> > Penghui, would you mind adding a GitHub issue for the problem
> > where all
> > > > > >> HTTP threads get blocked and the Pulsar Admin API stops
> > responding?
> > > > > >> >
> > > > > >> > I can follow up with a PR which updates the default for
> > > > > >> numHttpServerThreads to 200. This is a maximum value and Jetty
> > starts with
> > > > > >> 8 threads. We can agree on the default value to use in the PR.
> > > > > >> >
> > > > > >> > Thank you for the great collaboration on sharing the context
> and
> > > > > >> describing the problem patiently.
> > > > > >> >
> > > > > >> > BR,
> > > > > >> >
> > > > > >> > -Lari
> > > > > >> >
> > > > > >>
> > > > > >
> >
> --
> --
> Matteo Merli
> <ma...@gmail.com>
>

Re: [DISCUSS] Apache Pulsar 2.10.0 release

Posted by Matteo Merli <ma...@gmail.com>.
Yes, but I think that the code freeze is only meaningful if it’s
communicated in advance.

The fact that it was included in the original PIP but never followed in
practice means it would be a last-minute change.

On Wed, Feb 16, 2022 at 2:37 PM Michael Marshall <mm...@apache.org>
wrote:

> When we discussed the code freeze in the community meeting on 2/3, I
> was under the impression that it was a new development to our existing
> release process. I subsequently learned it was already defined in
> PIP 47. Even if we haven't been following this part of PIP 47, what
> is the value in waiting until 2.11 to follow our already defined process?
> While I agree it is helpful to provide guidance on when a version will
> ship,
> I think it is more important to give the community time to test a release,
> even if that means we're a little late on our release schedule. So far,
> we haven't even created a branch to begin testing.
>
> Note also that Sijie suggested using a feature freeze early on in this
> thread.
>
> The 2.9.0 release is relevant here. It had 4 release candidates over 4
> weeks and the final result was broken. That indicates to me that tagging
> an RC early does not guarantee an early release and that our current
> process isn't optimal and likely needs adjustments. I do not think we
> should wait to address these issues. I propose we start following
> PIP 47's guidance on code freeze and release stabilization periods.
>
> > I don't think that changes the picture here. There are *always* last
> > minute issues being discovered, and there is a call to be made on a
> > case by case. The feature freeze will reduce the likelihood of
> > introducing *more* issues by getting it from the master branch, but
> > won't change a comma from issues that were already there.
>
> I thought you wanted to implement a code/feature freeze to allow for
> more release stabilization. Can you clarify what you mean here?
>
> Thanks,
> Michael
>
>
>
>
>
>
>
> On Wed, Feb 16, 2022 at 2:42 PM Matteo Merli <ma...@gmail.com>
> wrote:
> >
> > Michael, as we chatted in last weekly meeting (though not yet
> > formalized), since we have never really done a feature freeze on the
> > branch during past releases, we should start from the next release,
> > to give a decent preview of what to expect to developers in terms of
> > dates.
> >
> > > While some may feel "behind" in getting out the 2.10 release, our
> > > priority must be to give the community time to verify the stability of
> > > the release.
> >
> > I don't think that changes the picture here. There are *always* last
> > minute issues being discovered, and there is a call to be made on a
> > case by case. The feature freeze will reduce the likelihood of
> > introducing *more* issues by getting it from the master branch, but
> > won't change a comma from issues that were already there.
> >
> >
> >
> >
> > --
> > Matteo Merli
> > <ma...@gmail.com>
> >
> > On Wed, Feb 16, 2022 at 10:47 AM Michael Marshall <mm...@apache.org>
> wrote:
> > >
> > > > I will build the release and start the vote before next Monday(GMT+8)
> > >
> > > Penghui, is your current plan to create branch-2.10, create the
> > > release artifacts, and start a vote on them all within a few days of
> > > each other?
> > >
> > > > I'm doing my best to follow PIP 47, but when seeing a potential break
> > > > change, I need to confirm it.
> > > > After all the potential break changes have been confirmed and fixed,
> I will
> > > > start the vote thread.
> > >
> > > I think we should review our current release plan before we move
> > > forward as proposed above. PIP 47 explicitly says that a month before
> > > the release date, the release manager will cut branches [0]. We don't
> > > yet have a `branch-2.10`. PIP 47 also defines a period of time for a
> > > feature freeze and then a code freeze. We have not yet had either.
> > >
> > > I propose we create branch-2.10 now and simultaneously announce that
> > > we are past the feature freeze period. Then, we can start the 2 week
> > > period for bug fixes that precedes the code freeze, as PIP 47
> > > prescribes. Then, in two weeks, we can produce the first release
> > > candidate (also in PIP 47).
> > >
> > > While some may feel "behind" in getting out the 2.10 release, our
> > > priority must be to give the community time to verify the stability of
> > > the release.
> > >
> > > Thanks,
> > > Michael
> > >
> > > [0]
> https://github.com/apache/pulsar/wiki/PIP-47%3A-Time-Based-Release-Plan
> > >
> > >
> > >
> > >
> > > On Wed, Feb 16, 2022 at 9:09 AM PengHui Li <pe...@apache.org> wrote:
> > > >
> > > > Hi all
> > > >
> > > > Just put an update here.
> > > >
> > > > We have 2 PRs[1] https://github.com/apache/pulsar/pull/13376 and
> > > > https://github.com/apache/pulsar/pull/13341
> > > > need to do the final verification, and you are also very welcome to
> verify
> > > > these 2 changes in your environment, cases.
> > > >
> > > > I will build the release and start the vote before next Monday(GMT+8)
> > > >
> > > > Regards
> > > > Penghui
> > > >
> > > > On Wed, Feb 16, 2022 at 10:22 PM PengHui Li <pe...@apache.org>
> wrote:
> > > >
> > > > > Hi lari,
> > > > >
> > > > > > So finally, I understand that "the problem" is that all HTTP server
> > > > > threads are blocked and this makes the Pulsar Admin API unavailable.
> > > > >
> > > > > To support the blocking servlet API, Jetty uses a default thread pool that
> > > > > can grow to up to 200 threads (
> > > > > https://github.com/eclipse/jetty.project/blob/4a0c91c0be53805e3fcffdcdcc9587d5301863db/jetty-util/src/main/java/org/eclipse/jetty/util/thread/ExecutorThreadPool.java#L57)
> > > > > .
> > > > > However this default of 200 maximum threads is not used in Pulsar.
> > > > >
> > > > >  Regarding the "make async" changes, It is an optimization to migrate from
> > > > > the blocking servlet api to the asynchronous servlet api. This work isn't
> > > > > urgent since we can simply mitigate the HTTP server threads getting blocked
> > > > > by setting "numHttpServerThreads=200" in broker.conf. "the problem" will be
> > > > > resolved immediately without risks of regressions that are involved in
> > > > > making the sync -> async changes.
> > > > >
> > > > > Yes, this is the problem. But I am against using 200 threads as the max
> > > > > web server thread by default,
> > > > > it can't work for cases that the broker without that much memory, it will
> > > > > lead to more serious problems
> > > > > that the service quality of messaging API gets worse due to the JVM
> > > > > GC, even memory overflow.
> > > > >
> > > > > Yes, it isn't urgent. So I said it's not a blocker for the 2.10 release,
> > > > > and all the PRs are not cherry-picked to branch-2.x
> > > > > This is an optimization for pulsar, the current implementation does not
> > > > > use jetty async API well, we should fix it,
> > > > > we should reduce the code with bad smells, and using async API is also
> > > > > a more efficient way without opening such jetty threads.
> > > > > Do you have any concerns about the way the modification becomes purely
> > > > > async?
> > > > >
> > > > > > Penghui, would you mind adding a GitHub issue for the problem where all
> > > > > HTTP threads get blocked and the Pulsar Admin API stops responding?
> > > > >
> > > > > https://github.com/apache/pulsar/issues/4756 the attachment from the
> > > > > issue is a good example
> > > > >
> > > > > Regards,
> > > > > Penghui
> > > > >
> > > > >
> > > > > On Wed, Feb 16, 2022 at 9:04 PM Lari Hotari <lh...@apache.org> wrote:
> > > > >
> > > > >> I created PR https://github.com/apache/pulsar/pull/14320 to set
> > > > >> numHttpServerThreads=200 .
> > > > >> Please review
> > > > >>
> > > > >> On 2022/02/16 12:39:34 Lari Hotari wrote:
> > > > >> > On 2022/02/16 00:58:20 PengHui Li wrote:
> > > > >> > > Which is a sync method. Ultimately this could lead to all the
> > > > >> pulsar-web
> > > > >> > > thread
> > > > >> > > blocked. we'd better not introduce blocking calls if we use
> > > > >> AsyncResponse.
> > > > >> > >
> > > > >> > > > What issue did you see? Please share more context. Thanks for the
> > > > >> > > patience.
> > > > >> > >
> > > > >> > > It happened very earlier
> > > > >> > >
> > > > >> > > Here is the issue https://github.com/apache/pulsar/issues/4756
> > > > >> > > And here is also a related fix
> > > > >> https://github.com/apache/pulsar/pull/10619
> > > > >> >
> > > > >> > Penghui, Thank you for the patience, and thanks for sharing more
> > > > >> context. I happened to send a reply before reading your message, so please
> > > > >> bear with me.
> > > > >> >
> > > > >> > So finally, I understand that "the problem" is that all HTTP server
> > > > >> threads are blocked and this makes the Pulsar Admin API unavailable.
> > > > >> >
> > > > >> > To support the blocking servlet API, Jetty uses a default thread pool
> > > > >> that can grow to up to 200 threads (
> > > > >> https://github.com/eclipse/jetty.project/blob/4a0c91c0be53805e3fcffdcdcc9587d5301863db/jetty-util/src/main/java/org/eclipse/jetty/util/thread/ExecutorThreadPool.java#L57)
> > > > >> .
> > > > >> > However this default of 200 maximum threads is not used in Pulsar.
> > > > >> >
> > > > >> > The problem is that Pulsar uses a low value that assumes asynchronous
> > > > >> API usage:
> > > > >> >
> > > > >> https://github.com/apache/pulsar/blob/5c3ddc26d6e07eb0473b11b5ecc8318c1efe414b/pulsar-broker-common/src/main/java/org/apache/pulsar/broker/ServiceConfiguration.java#L201-L204
> > > > >> > Pulsar should be using a high value (for example 200) as long as there
> > > > >> are blocking calls in Admin APIs.
> > > > >> >
> > > > >> > The mitigation to the issue of all HTTP server threads getting blocked
> > > > >> is setting "numHttpServerThreads=200" in broker.conf.
> > > > >> >
> > > > >> > Regarding the "make async" changes, It is an optimization to migrate
> > > > >> from the blocking servlet api to the asynchronous servlet api. This work
> > > > >> isn't urgent since we can simply mitigate the HTTP server threads getting
> > > > >> blocked by setting "numHttpServerThreads=200" in broker.conf. "the problem"
> > > > >> will be resolved immediately without risks of regressions that are involved
> > > > >> in making the sync -> async changes.
> > > > >> >
> > > > >> > Penghui, would you mind adding a GitHub issue for the problem where all
> > > > >> HTTP threads get blocked and the Pulsar Admin API stops responding?
> > > > >> >
> > > > >> > I can follow up with a PR which updates the default for
> > > > >> numHttpServerThreads to 200. This is a maximum value and Jetty starts with
> > > > >> 8 threads. We can agree on the default value to use in the PR.
> > > > >> >
> > > > >> > Thank you for the great collaboration on sharing the context and
> > > > >> describing the problem patiently.
> > > > >> >
> > > > >> > BR,
> > > > >> >
> > > > >> > -Lari
> > > > >> >
> > > > >>
> > > > >
>
-- 
--
Matteo Merli
<ma...@gmail.com>

Re: [DISCUSS] Apache Pulsar 2.10.0 release

Posted by Michael Marshall <mm...@apache.org>.
When we discussed the code freeze in the community meeting on 2/3, I
was under the impression that it was a new addition to our existing
release process. I subsequently learned it was already defined in
PIP 47. Even if we haven't been following this part of PIP 47, what
is the value in waiting until 2.11 to follow our already-defined process?
While I agree it is helpful to provide guidance on when a version will ship,
I think it is more important to give the community time to test a release,
even if that means we're a little late on our release schedule. So far,
we haven't even created a branch to begin testing.

Note also that Sijie suggested using a feature freeze early on in this thread.

The 2.9.0 release is relevant here. It had 4 release candidates over 4
weeks and the final result was broken. That indicates to me that tagging
an RC early does not guarantee an early release and that our current
process isn't optimal and likely needs adjustments. I do not think we
should wait to address these issues. I propose we start following
PIP 47's guidance on code freeze and release stabilization periods.

> I don't think that changes the picture here. There are *always* last
> minute issues being discovered, and there is a call to be made on a
> case by case. The feature freeze will reduce the likelihood of
> introducing *more* issues by getting it from the master branch, but
> won't change a comma from issues that were already there.

I thought you wanted to implement a code/feature freeze to allow for
more release stabilization. Can you clarify what you mean here?

Thanks,
Michael







On Wed, Feb 16, 2022 at 2:42 PM Matteo Merli <ma...@gmail.com> wrote:
>
> Michael, as we chatted in last weekly meeting (though not yet
> formalized), since we have never really done a feature freeze on the
> branch during past releases, we should start from the next release,
> to give a decent preview of what to expect to developers in terms of
> dates.
>
> > While some may feel "behind" in getting out the 2.10 release, our
> > priority must be to give the community time to verify the stability of
> > the release.
>
> I don't think that changes the picture here. There are *always* last
> minute issues being discovered, and there is a call to be made on a
> case by case. The feature freeze will reduce the likelihood of
> introducing *more* issues by getting it from the master branch, but
> won't change a comma from issues that were already there.
>
>
>
>
> --
> Matteo Merli
> <ma...@gmail.com>
>
> On Wed, Feb 16, 2022 at 10:47 AM Michael Marshall <mm...@apache.org> wrote:
> >
> > > I will build the release and start the vote before next Monday(GMT+8)
> >
> > Penghui, is your current plan to create branch-2.10, create the
> > release artifacts, and start a vote on them all within a few days of
> > each other?
> >
> > > I'm doing my best to follow PIP 47, but when seeing a potential break
> > > change, I need to confirm it.
> > > After all the potential break changes have been confirmed and fixed, I will
> > > start the vote thread.
> >
> > I think we should review our current release plan before we move
> > forward as proposed above. PIP 47 explicitly says that a month before
> > the release date, the release manager will cut branches [0]. We don't
> > yet have a `branch-2.10`. PIP 47 also defines a period of time for a
> > feature freeze and then a code freeze. We have not yet had either.
> >
> > I propose we create branch-2.10 now and simultaneously announce that
> > we are past the feature freeze period. Then, we can start the 2 week
> > period for bug fixes that precedes the code freeze, as PIP 47
> > prescribes. Then, in two weeks, we can produce the first release
> > candidate (also in PIP 47).
> >
> > While some may feel "behind" in getting out the 2.10 release, our
> > priority must be to give the community time to verify the stability of
> > the release.
> >
> > Thanks,
> > Michael
> >
> > [0] https://github.com/apache/pulsar/wiki/PIP-47%3A-Time-Based-Release-Plan
> >
> >
> >
> >
> > On Wed, Feb 16, 2022 at 9:09 AM PengHui Li <pe...@apache.org> wrote:
> > >
> > > Hi all
> > >
> > > Just put an update here.
> > >
> > > We have 2 PRs[1] https://github.com/apache/pulsar/pull/13376 and
> > > https://github.com/apache/pulsar/pull/13341
> > > need to do the final verification, and you are also very welcome to verify
> > > these 2 changes in your environment, cases.
> > >
> > > I will build the release and start the vote before next Monday(GMT+8)
> > >
> > > Regards
> > > Penghui
> > >
> > > On Wed, Feb 16, 2022 at 10:22 PM PengHui Li <pe...@apache.org> wrote:
> > >
> > > > Hi lari,
> > > >
> > > > > So finally, I understand that "the problem" is that all HTTP server
> > > > threads are blocked and this makes the Pulsar Admin API unavailable.
> > > >
> > > > To support the blocking servlet API, Jetty uses a default thread pool that
> > > > can grow to up to 200 threads (
> > > > https://github.com/eclipse/jetty.project/blob/4a0c91c0be53805e3fcffdcdcc9587d5301863db/jetty-util/src/main/java/org/eclipse/jetty/util/thread/ExecutorThreadPool.java#L57)
> > > > .
> > > > However this default of 200 maximum threads is not used in Pulsar.
> > > >
> > > >  Regarding the "make async" changes, It is an optimization to migrate from
> > > > the blocking servlet api to the asynchronous servlet api. This work isn't
> > > > urgent since we can simply mitigate the HTTP server threads getting blocked
> > > > by setting "numHttpServerThreads=200" in broker.conf. "the problem" will be
> > > > resolved immediately without risks of regressions that are involved in
> > > > making the sync -> async changes.
> > > >
> > > > Yes, this is the problem. But I am against using 200 threads as the max
> > > > web server thread by default,
> > > > it can't work for cases that the broker without that much memory, it will
> > > > lead to more serious problems
> > > > that the service quality of messaging API gets worse due to the JVM
> > > > GC, even memory overflow.
> > > >
> > > > Yes, it isn't urgent. So I said it's not a blocker for the 2.10 release,
> > > > and all the PRs are not cherry-picked to branch-2.x
> > > > This is an optimization for pulsar, the current implementation does not
> > > > use jetty async API well, we should fix it,
> > > > we should reduce the code with bad smells, and using async API is also
> > > > a more efficient way without opening such jetty threads.
> > > > Do you have any concerns about the way the modification becomes purely
> > > > async?
> > > >
> > > > > Penghui, would you mind adding a GitHub issue for the problem where all
> > > > HTTP threads get blocked and the Pulsar Admin API stops responding?
> > > >
> > > > https://github.com/apache/pulsar/issues/4756 the attachment from the
> > > > issue is a good example
> > > >
> > > > Regards,
> > > > Penghui
> > > >
> > > >
> > > > On Wed, Feb 16, 2022 at 9:04 PM Lari Hotari <lh...@apache.org> wrote:
> > > >
> > > >> I created PR https://github.com/apache/pulsar/pull/14320 to set
> > > >> numHttpServerThreads=200 .
> > > >> Please review
> > > >>
> > > >> On 2022/02/16 12:39:34 Lari Hotari wrote:
> > > >> > On 2022/02/16 00:58:20 PengHui Li wrote:
> > > >> > > Which is a sync method. Ultimately this could lead to all the
> > > >> pulsar-web
> > > >> > > thread
> > > >> > > blocked. we'd better not introduce blocking calls if we use
> > > >> AsyncResponse.
> > > >> > >
> > > >> > > > What issue did you see? Please share more context. Thanks for the
> > > >> > > patience.
> > > >> > >
> > > >> > > It happened very earlier
> > > >> > >
> > > >> > > Here is the issue https://github.com/apache/pulsar/issues/4756
> > > >> > > And here is also a related fix
> > > >> https://github.com/apache/pulsar/pull/10619
> > > >> >
> > > >> > Penghui, Thank you for the patience, and thanks for sharing more
> > > >> context. I happened to send a reply before reading your message, so please
> > > >> bear with me.
> > > >> >
> > > >> > So finally, I understand that "the problem" is that all HTTP server
> > > >> threads are blocked and this makes the Pulsar Admin API unavailable.
> > > >> >
> > > >> > To support the blocking servlet API, Jetty uses a default thread pool
> > > >> that can grow to up to 200 threads (
> > > >> https://github.com/eclipse/jetty.project/blob/4a0c91c0be53805e3fcffdcdcc9587d5301863db/jetty-util/src/main/java/org/eclipse/jetty/util/thread/ExecutorThreadPool.java#L57)
> > > >> .
> > > >> > However this default of 200 maximum threads is not used in Pulsar.
> > > >> >
> > > >> > The problem is that Pulsar uses a low value that assumes asynchronous
> > > >> API usage:
> > > >> >
> > > >> https://github.com/apache/pulsar/blob/5c3ddc26d6e07eb0473b11b5ecc8318c1efe414b/pulsar-broker-common/src/main/java/org/apache/pulsar/broker/ServiceConfiguration.java#L201-L204
> > > >> > Pulsar should be using a high value (for example 200) as long as there
> > > >> are blocking calls in Admin APIs.
> > > >> >
> > > >> > The mitigation to the issue of all HTTP server threads getting blocked
> > > >> is setting "numHttpServerThreads=200" in broker.conf.
> > > >> >
> > > >> > Regarding the "make async" changes, It is an optimization to migrate
> > > >> from the blocking servlet api to the asynchronous servlet api. This work
> > > >> isn't urgent since we can simply mitigate the HTTP server threads getting
> > > >> blocked by setting "numHttpServerThreads=200" in broker.conf. "the problem"
> > > >> will be resolved immediately without risks of regressions that are involved
> > > >> in making the sync -> async changes.
> > > >> >
> > > >> > Penghui, would you mind adding a GitHub issue for the problem where all
> > > >> HTTP threads get blocked and the Pulsar Admin API stops responding?
> > > >> >
> > > >> > I can follow up with a PR which updates the default for
> > > >> numHttpServerThreads to 200. This is a maximum value and Jetty starts with
> > > >> 8 threads. We can agree on the default value to use in the PR.
> > > >> >
> > > >> > Thank you for the great collaboration on sharing the context and
> > > >> describing the problem patiently.
> > > >> >
> > > >> > BR,
> > > >> >
> > > >> > -Lari
> > > >> >
> > > >>
> > > >

Re: [DISCUSS] Apache Pulsar 2.10.0 release

Posted by Matteo Merli <ma...@gmail.com>.
Michael, as we discussed in the last weekly meeting (though it is not yet
formalized), since we have never really done a feature freeze on the
branch during past releases, we should start from the next release,
to give developers a decent preview of what to expect in terms of
dates.

> While some may feel "behind" in getting out the 2.10 release, our
> priority must be to give the community time to verify the stability of
> the release.

I don't think that changes the picture here. There are *always* last-minute
issues being discovered, and a call has to be made on a case-by-case
basis. The feature freeze will reduce the likelihood of introducing
*more* issues from the master branch, but it won't change a comma of
the issues that were already there.




--
Matteo Merli
<ma...@gmail.com>

On Wed, Feb 16, 2022 at 10:47 AM Michael Marshall <mm...@apache.org> wrote:
>
> > I will build the release and start the vote before next Monday(GMT+8)
>
> Penghui, is your current plan to create branch-2.10, create the
> release artifacts, and start a vote on them all within a few days of
> each other?
>
> > I'm doing my best to follow PIP 47, but when seeing a potential break
> > change, I need to confirm it.
> > After all the potential break changes have been confirmed and fixed, I will
> > start the vote thread.
>
> I think we should review our current release plan before we move
> forward as proposed above. PIP 47 explicitly says that a month before
> the release date, the release manager will cut branches [0]. We don't
> yet have a `branch-2.10`. PIP 47 also defines a period of time for a
> feature freeze and then a code freeze. We have not yet had either.
>
> I propose we create branch-2.10 now and simultaneously announce that
> we are past the feature freeze period. Then, we can start the 2 week
> period for bug fixes that precedes the code freeze, as PIP 47
> prescribes. Then, in two weeks, we can produce the first release
> candidate (also in PIP 47).
>
> While some may feel "behind" in getting out the 2.10 release, our
> priority must be to give the community time to verify the stability of
> the release.
>
> Thanks,
> Michael
>
> [0] https://github.com/apache/pulsar/wiki/PIP-47%3A-Time-Based-Release-Plan
>
>
>
>
> On Wed, Feb 16, 2022 at 9:09 AM PengHui Li <pe...@apache.org> wrote:
> >
> > Hi all
> >
> > Just put an update here.
> >
> > We have 2 PRs[1] https://github.com/apache/pulsar/pull/13376 and
> > https://github.com/apache/pulsar/pull/13341
> > need to do the final verification, and you are also very welcome to verify
> > these 2 changes in your environment, cases.
> >
> > I will build the release and start the vote before next Monday(GMT+8)
> >
> > Regards
> > Penghui
> >
> > On Wed, Feb 16, 2022 at 10:22 PM PengHui Li <pe...@apache.org> wrote:
> >
> > > Hi lari,
> > >
> > > > So finally, I understand that "the problem" is that all HTTP server
> > > threads are blocked and this makes the Pulsar Admin API unavailable.
> > >
> > > To support the blocking servlet API, Jetty uses a default thread pool that
> > > can grow to up to 200 threads (
> > > https://github.com/eclipse/jetty.project/blob/4a0c91c0be53805e3fcffdcdcc9587d5301863db/jetty-util/src/main/java/org/eclipse/jetty/util/thread/ExecutorThreadPool.java#L57)
> > > .
> > > However this default of 200 maximum threads is not used in Pulsar.
> > >
> > >  Regarding the "make async" changes, It is an optimization to migrate from
> > > the blocking servlet api to the asynchronous servlet api. This work isn't
> > > urgent since we can simply mitigate the HTTP server threads getting blocked
> > > by setting "numHttpServerThreads=200" in broker.conf. "the problem" will be
> > > resolved immediately without risks of regressions that are involved in
> > > making the sync -> async changes.
> > >
> > > Yes, this is the problem. But I am against using 200 threads as the max
> > > web server thread by default,
> > > it can't work for cases that the broker without that much memory, it will
> > > lead to more serious problems
> > > that the service quality of messaging API gets worse due to the JVM
> > > GC, even memory overflow.
> > >
> > > Yes, it isn't urgent. So I said it's not a blocker for the 2.10 release,
> > > and all the PRs are not cherry-picked to branch-2.x
> > > This is an optimization for pulsar, the current implementation does not
> > > use jetty async API well, we should fix it,
> > > we should reduce the code with bad smells, and using async API is also
> > > a more efficient way without opening such jetty threads.
> > > Do you have any concerns about the way the modification becomes purely
> > > async?
> > >
> > > > Penghui, would you mind adding a GitHub issue for the problem where all
> > > HTTP threads get blocked and the Pulsar Admin API stops responding?
> > >
> > > https://github.com/apache/pulsar/issues/4756 the attachment from the
> > > issue is a good example
> > >
> > > Regards,
> > > Penghui
> > >
> > >
> > > On Wed, Feb 16, 2022 at 9:04 PM Lari Hotari <lh...@apache.org> wrote:
> > >
> > >> I created PR https://github.com/apache/pulsar/pull/14320 to set
> > >> numHttpServerThreads=200 .
> > >> Please review
> > >>
> > >> On 2022/02/16 12:39:34 Lari Hotari wrote:
> > >> > On 2022/02/16 00:58:20 PengHui Li wrote:
> > >> > > Which is a sync method. Ultimately this could lead to all the
> > >> pulsar-web
> > >> > > thread
> > >> > > blocked. we'd better not introduce blocking calls if we use
> > >> AsyncResponse.
> > >> > >
> > >> > > > What issue did you see? Please share more context. Thanks for the
> > >> > > patience.
> > >> > >
> > >> > > It happened very earlier
> > >> > >
> > >> > > Here is the issue https://github.com/apache/pulsar/issues/4756
> > >> > > And here is also a related fix
> > >> https://github.com/apache/pulsar/pull/10619
> > >> >
> > >> > Penghui, Thank you for the patience, and thanks for sharing more
> > >> context. I happened to send a reply before reading your message, so please
> > >> bear with me.
> > >> >
> > >> > So finally, I understand that "the problem" is that all HTTP server
> > >> threads are blocked and this makes the Pulsar Admin API unavailable.
> > >> >
> > >> > To support the blocking servlet API, Jetty uses a default thread pool
> > >> that can grow to up to 200 threads (
> > >> https://github.com/eclipse/jetty.project/blob/4a0c91c0be53805e3fcffdcdcc9587d5301863db/jetty-util/src/main/java/org/eclipse/jetty/util/thread/ExecutorThreadPool.java#L57)
> > >> .
> > >> > However this default of 200 maximum threads is not used in Pulsar.
> > >> >
> > >> > The problem is that Pulsar uses a low value that assumes asynchronous
> > >> API usage:
> > >> >
> > >> https://github.com/apache/pulsar/blob/5c3ddc26d6e07eb0473b11b5ecc8318c1efe414b/pulsar-broker-common/src/main/java/org/apache/pulsar/broker/ServiceConfiguration.java#L201-L204
> > >> > Pulsar should be using a high value (for example 200) as long as there
> > >> are blocking calls in Admin APIs.
> > >> >
> > >> > The mitigation to the issue of all HTTP server threads getting blocked
> > >> is setting "numHttpServerThreads=200" in broker.conf.
> > >> >
> > >> > Regarding the "make async" changes, It is an optimization to migrate
> > >> from the blocking servlet api to the asynchronous servlet api. This work
> > >> isn't urgent since we can simply mitigate the HTTP server threads getting
> > >> blocked by setting "numHttpServerThreads=200" in broker.conf. "the problem"
> > >> will be resolved immediately without risks of regressions that are involved
> > >> in making the sync -> async changes.
> > >> >
> > >> > Penghui, would you mind adding a GitHub issue for the problem where all
> > >> HTTP threads get blocked and the Pulsar Admin API stops responding?
> > >> >
> > >> > I can follow up with a PR which updates the default for
> > >> numHttpServerThreads to 200. This is a maximum value and Jetty starts with
> > >> 8 threads. We can agree on the default value to use in the PR.
> > >> >
> > >> > Thank you for the great collaboration on sharing the context and
> > >> describing the problem patiently.
> > >> >
> > >> > BR,
> > >> >
> > >> > -Lari
> > >> >
> > >>
> > >

Re: [DISCUSS] Apache Pulsar 2.10.0 release

Posted by Michael Marshall <mm...@apache.org>.
> I will build the release and start the vote before next Monday(GMT+8)

Penghui, is your current plan to create branch-2.10, create the
release artifacts, and start a vote on them all within a few days of
each other?

> I'm doing my best to follow PIP 47, but when seeing a potential break
> change, I need to confirm it.
> After all the potential break changes have been confirmed and fixed, I will
> start the vote thread.

I think we should review our current release plan before we move
forward as proposed above. PIP 47 explicitly says that a month before
the release date, the release manager will cut branches [0]. We don't
yet have a `branch-2.10`. PIP 47 also defines a period of time for a
feature freeze and then a code freeze. We have not yet had either.

I propose we create branch-2.10 now and simultaneously announce that
we are past the feature freeze period. Then, we can start the 2-week
period for bug fixes that precedes the code freeze, as PIP 47
prescribes. Then, in two weeks, we can produce the first release
candidate (also in PIP 47).

While some may feel "behind" in getting out the 2.10 release, our
priority must be to give the community time to verify the stability of
the release.

Thanks,
Michael

[0] https://github.com/apache/pulsar/wiki/PIP-47%3A-Time-Based-Release-Plan




On Wed, Feb 16, 2022 at 9:09 AM PengHui Li <pe...@apache.org> wrote:
>
> Hi all
>
> Just put an update here.
>
> We have 2 PRs[1] https://github.com/apache/pulsar/pull/13376 and
> https://github.com/apache/pulsar/pull/13341
> need to do the final verification, and you are also very welcome to verify
> these 2 changes in your environment, cases.
>
> I will build the release and start the vote before next Monday(GMT+8)
>
> Regards
> Penghui
>
> On Wed, Feb 16, 2022 at 10:22 PM PengHui Li <pe...@apache.org> wrote:
>
> > Hi lari,
> >
> > > So finally, I understand that "the problem" is that all HTTP server
> > threads are blocked and this makes the Pulsar Admin API unavailable.
> >
> > To support the blocking servlet API, Jetty uses a default thread pool that
> > can grow to up to 200 threads (
> > https://github.com/eclipse/jetty.project/blob/4a0c91c0be53805e3fcffdcdcc9587d5301863db/jetty-util/src/main/java/org/eclipse/jetty/util/thread/ExecutorThreadPool.java#L57)
> > .
> > However this default of 200 maximum threads is not used in Pulsar.
> >
> >  Regarding the "make async" changes, It is an optimization to migrate from
> > the blocking servlet api to the asynchronous servlet api. This work isn't
> > urgent since we can simply mitigate the HTTP server threads getting blocked
> > by setting "numHttpServerThreads=200" in broker.conf. "the problem" will be
> > resolved immediately without risks of regressions that are involved in
> > making the sync -> async changes.
> >
> > Yes, this is the problem. But I am against using 200 threads as the max
> > web server thread by default,
> > it can't work for cases that the broker without that much memory, it will
> > lead to more serious problems
> > that the service quality of messaging API gets worse due to the JVM
> > GC, even memory overflow.
> >
> > Yes, it isn't urgent. So I said it's not a blocker for the 2.10 release,
> > and all the PRs are not cherry-picked to branch-2.x
> > This is an optimization for pulsar, the current implementation does not
> > use jetty async API well, we should fix it,
> > we should reduce the code with bad smells, and using async API is also
> > a more efficient way without opening such jetty threads.
> > Do you have any concerns about the way the modification becomes purely
> > async?
> >
> > > Penghui, would you mind adding a GitHub issue for the problem where all
> > HTTP threads get blocked and the Pulsar Admin API stops responding?
> >
> > https://github.com/apache/pulsar/issues/4756 the attachment from the
> > issue is a good example
> >
> > Regards,
> > Penghui
> >
> >
> > On Wed, Feb 16, 2022 at 9:04 PM Lari Hotari <lh...@apache.org> wrote:
> >
> >> I created PR https://github.com/apache/pulsar/pull/14320 to set
> >> numHttpServerThreads=200 .
> >> Please review
> >>
> >> On 2022/02/16 12:39:34 Lari Hotari wrote:
> >> > On 2022/02/16 00:58:20 PengHui Li wrote:
> >> > > Which is a sync method. Ultimately this could lead to all the
> >> pulsar-web
> >> > > thread
> >> > > blocked. we'd better not introduce blocking calls if we use
> >> AsyncResponse.
> >> > >
> >> > > > What issue did you see? Please share more context. Thanks for the
> >> > > patience.
> >> > >
> >> > > It happened very earlier
> >> > >
> >> > > Here is the issue https://github.com/apache/pulsar/issues/4756
> >> > > And here is also a related fix
> >> https://github.com/apache/pulsar/pull/10619
> >> >
> >> > Penghui, Thank you for the patience, and thanks for sharing more
> >> context. I happened to send a reply before reading your message, so please
> >> bear with me.
> >> >
> >> > So finally, I understand that "the problem" is that all HTTP server
> >> threads are blocked and this makes the Pulsar Admin API unavailable.
> >> >
> >> > To support the blocking servlet API, Jetty uses a default thread pool
> >> that can grow to up to 200 threads (
> >> https://github.com/eclipse/jetty.project/blob/4a0c91c0be53805e3fcffdcdcc9587d5301863db/jetty-util/src/main/java/org/eclipse/jetty/util/thread/ExecutorThreadPool.java#L57)
> >> .
> >> > However this default of 200 maximum threads is not used in Pulsar.
> >> >
> >> > The problem is that Pulsar uses a low value that assumes asynchronous
> >> API usage:
> >> >
> >> https://github.com/apache/pulsar/blob/5c3ddc26d6e07eb0473b11b5ecc8318c1efe414b/pulsar-broker-common/src/main/java/org/apache/pulsar/broker/ServiceConfiguration.java#L201-L204
> >> > Pulsar should be using a high value (for example 200) as long as there
> >> are blocking calls in Admin APIs.
> >> >
> >> > The mitigation to the issue of all HTTP server threads getting blocked
> >> is setting "numHttpServerThreads=200" in broker.conf.
> >> >
> >> > Regarding the "make async" changes, It is an optimization to migrate
> >> from the blocking servlet api to the asynchronous servlet api. This work
> >> isn't urgent since we can simply mitigate the HTTP server threads getting
> >> blocked by setting "numHttpServerThreads=200" in broker.conf. "the problem"
> >> will be resolved immediately without risks of regressions that are involved
> >> in making the sync -> async changes.
> >> >
> >> > Penghui, would you mind adding a GitHub issue for the problem where all
> >> HTTP threads get blocked and the Pulsar Admin API stops responding?
> >> >
> >> > I can follow up with a PR which updates the default for
> >> numHttpServerThreads to 200. This is a maximum value and Jetty starts with
> >> 8 threads. We can agree on the default value to use in the PR.
> >> >
> >> > Thank you for the great collaboration on sharing the context and
> >> describing the problem patiently.
> >> >
> >> > BR,
> >> >
> >> > -Lari
> >> >
> >>
> >

Re: [DISCUSS] Apache Pulsar 2.10.0 release

Posted by PengHui Li <pe...@apache.org>.
Hi all,

Just posting an update here.

We have 2 PRs, https://github.com/apache/pulsar/pull/13376 and
https://github.com/apache/pulsar/pull/13341,
that still need final verification. You are also very welcome to verify
these 2 changes in your own environments and use cases.

I will build the release and start the vote before next Monday (GMT+8).

Regards
Penghui

On Wed, Feb 16, 2022 at 10:22 PM PengHui Li <pe...@apache.org> wrote:

> Hi lari,
>
> > So finally, I understand that "the problem" is that all HTTP server
> threads are blocked and this makes the Pulsar Admin API unavailable.
>
> To support the blocking servlet API, Jetty uses a default thread pool that
> can grow to up to 200 threads (
> https://github.com/eclipse/jetty.project/blob/4a0c91c0be53805e3fcffdcdcc9587d5301863db/jetty-util/src/main/java/org/eclipse/jetty/util/thread/ExecutorThreadPool.java#L57)
> .
> However this default of 200 maximum threads is not used in Pulsar.
>
>  Regarding the "make async" changes, It is an optimization to migrate from
> the blocking servlet api to the asynchronous servlet api. This work isn't
> urgent since we can simply mitigate the HTTP server threads getting blocked
> by setting "numHttpServerThreads=200" in broker.conf. "the problem" will be
> resolved immediately without risks of regressions that are involved in
> making the sync -> async changes.
>
> Yes, this is the problem. But I am against using 200 threads as the max
> web server thread by default,
> it can't work for cases that the broker without that much memory, it will
> lead to more serious problems
> that the service quality of messaging API gets worse due to the JVM
> GC, even memory overflow.
>
> Yes, it isn't urgent. So I said it's not a blocker for the 2.10 release,
> and all the PRs are not cherry-picked to branch-2.x
> This is an optimization for pulsar, the current implementation does not
> use jetty async API well, we should fix it,
> we should reduce the code with bad smells, and using async API is also
> a more efficient way without opening such jetty threads.
> Do you have any concerns about the way the modification becomes purely
> async?
>
> > Penghui, would you mind adding a GitHub issue for the problem where all
> HTTP threads get blocked and the Pulsar Admin API stops responding?
>
> https://github.com/apache/pulsar/issues/4756 the attachment from the
> issue is a good example
>
> Regards,
> Penghui
>
>
> On Wed, Feb 16, 2022 at 9:04 PM Lari Hotari <lh...@apache.org> wrote:
>
>> I created PR https://github.com/apache/pulsar/pull/14320 to set
>> numHttpServerThreads=200 .
>> Please review
>>
>> On 2022/02/16 12:39:34 Lari Hotari wrote:
>> > On 2022/02/16 00:58:20 PengHui Li wrote:
>> > > Which is a sync method. Ultimately this could lead to all the
>> pulsar-web
>> > > thread
>> > > blocked. we'd better not introduce blocking calls if we use
>> AsyncResponse.
>> > >
>> > > > What issue did you see? Please share more context. Thanks for the
>> > > patience.
>> > >
>> > > It happened very earlier
>> > >
>> > > Here is the issue https://github.com/apache/pulsar/issues/4756
>> > > And here is also a related fix
>> https://github.com/apache/pulsar/pull/10619
>> >
>> > Penghui, Thank you for the patience, and thanks for sharing more
>> context. I happened to send a reply before reading your message, so please
>> bear with me.
>> >
>> > So finally, I understand that "the problem" is that all HTTP server
>> threads are blocked and this makes the Pulsar Admin API unavailable.
>> >
>> > To support the blocking servlet API, Jetty uses a default thread pool
>> that can grow to up to 200 threads (
>> https://github.com/eclipse/jetty.project/blob/4a0c91c0be53805e3fcffdcdcc9587d5301863db/jetty-util/src/main/java/org/eclipse/jetty/util/thread/ExecutorThreadPool.java#L57)
>> .
>> > However this default of 200 maximum threads is not used in Pulsar.
>> >
>> > The problem is that Pulsar uses a low value that assumes asynchronous
>> API usage:
>> >
>> https://github.com/apache/pulsar/blob/5c3ddc26d6e07eb0473b11b5ecc8318c1efe414b/pulsar-broker-common/src/main/java/org/apache/pulsar/broker/ServiceConfiguration.java#L201-L204
>> > Pulsar should be using a high value (for example 200) as long as there
>> are blocking calls in Admin APIs.
>> >
>> > The mitigation to the issue of all HTTP server threads getting blocked
>> is setting "numHttpServerThreads=200" in broker.conf.
>> >
>> > Regarding the "make async" changes, It is an optimization to migrate
>> from the blocking servlet api to the asynchronous servlet api. This work
>> isn't urgent since we can simply mitigate the HTTP server threads getting
>> blocked by setting "numHttpServerThreads=200" in broker.conf. "the problem"
>> will be resolved immediately without risks of regressions that are involved
>> in making the sync -> async changes.
>> >
>> > Penghui, would you mind adding a GitHub issue for the problem where all
>> HTTP threads get blocked and the Pulsar Admin API stops responding?
>> >
>> > I can follow up with a PR which updates the default for
>> numHttpServerThreads to 200. This is a maximum value and Jetty starts with
>> 8 threads. We can agree on the default value to use in the PR.
>> >
>> > Thank you for the great collaboration on sharing the context and
>> describing the problem patiently.
>> >
>> > BR,
>> >
>> > -Lari
>> >
>>
>

Re: [DISCUSS] Apache Pulsar 2.10.0 release

Posted by PengHui Li <pe...@apache.org>.
Hi Lari,

> So finally, I understand that "the problem" is that all HTTP server
> threads are blocked and this makes the Pulsar Admin API unavailable.
>
> To support the blocking servlet API, Jetty uses a default thread pool that
> can grow to up to 200 threads (
> https://github.com/eclipse/jetty.project/blob/4a0c91c0be53805e3fcffdcdcc9587d5301863db/jetty-util/src/main/java/org/eclipse/jetty/util/thread/ExecutorThreadPool.java#L57)
> .
> However this default of 200 maximum threads is not used in Pulsar.
>
> Regarding the "make async" changes, It is an optimization to migrate from
> the blocking servlet api to the asynchronous servlet api. This work isn't
> urgent since we can simply mitigate the HTTP server threads getting blocked
> by setting "numHttpServerThreads=200" in broker.conf. "the problem" will be
> resolved immediately without risks of regressions that are involved in
> making the sync -> async changes.

Yes, this is the problem. But I am against using 200 threads as the default
maximum for the web server thread pool.
It can't work for brokers without that much memory, and it will
lead to more serious problems:
the service quality of the messaging API gets worse due to JVM
GC pressure, or the broker even runs out of memory.

Yes, it isn't urgent. That is why I said it's not a blocker for the 2.10 release,
and none of the PRs are cherry-picked to branch-2.x.
This is an optimization for Pulsar. The current implementation does not use
the Jetty async API well, and we should fix it.
We should reduce the code smells, and using the async API is also
more efficient because it does not need so many Jetty threads.
Do you have any concerns about making these methods purely
async?

> Penghui, would you mind adding a GitHub issue for the problem where all
HTTP threads get blocked and the Pulsar Admin API stops responding?

See https://github.com/apache/pulsar/issues/4756; the attachment on that issue
is a good example.

Regards,
Penghui


On Wed, Feb 16, 2022 at 9:04 PM Lari Hotari <lh...@apache.org> wrote:

> I created PR https://github.com/apache/pulsar/pull/14320 to set
> numHttpServerThreads=200 .
> Please review
>
> On 2022/02/16 12:39:34 Lari Hotari wrote:
> > On 2022/02/16 00:58:20 PengHui Li wrote:
> > > Which is a sync method. Ultimately this could lead to all the
> pulsar-web
> > > thread
> > > blocked. we'd better not introduce blocking calls if we use
> AsyncResponse.
> > >
> > > > What issue did you see? Please share more context. Thanks for the
> > > patience.
> > >
> > > It happened very earlier
> > >
> > > Here is the issue https://github.com/apache/pulsar/issues/4756
> > > And here is also a related fix
> https://github.com/apache/pulsar/pull/10619
> >
> > Penghui, Thank you for the patience, and thanks for sharing more
> context. I happened to send a reply before reading your message, so please
> bear with me.
> >
> > So finally, I understand that "the problem" is that all HTTP server
> threads are blocked and this makes the Pulsar Admin API unavailable.
> >
> > To support the blocking servlet API, Jetty uses a default thread pool
> that can grow to up to 200 threads (
> https://github.com/eclipse/jetty.project/blob/4a0c91c0be53805e3fcffdcdcc9587d5301863db/jetty-util/src/main/java/org/eclipse/jetty/util/thread/ExecutorThreadPool.java#L57)
> .
> > However this default of 200 maximum threads is not used in Pulsar.
> >
> > The problem is that Pulsar uses a low value that assumes asynchronous
> API usage:
> >
> https://github.com/apache/pulsar/blob/5c3ddc26d6e07eb0473b11b5ecc8318c1efe414b/pulsar-broker-common/src/main/java/org/apache/pulsar/broker/ServiceConfiguration.java#L201-L204
> > Pulsar should be using a high value (for example 200) as long as there
> are blocking calls in Admin APIs.
> >
> > The mitigation to the issue of all HTTP server threads getting blocked
> is setting "numHttpServerThreads=200" in broker.conf.
> >
> > Regarding the "make async" changes, It is an optimization to migrate
> from the blocking servlet api to the asynchronous servlet api. This work
> isn't urgent since we can simply mitigate the HTTP server threads getting
> blocked by setting "numHttpServerThreads=200" in broker.conf. "the problem"
> will be resolved immediately without risks of regressions that are involved
> in making the sync -> async changes.
> >
> > Penghui, would you mind adding a GitHub issue for the problem where all
> HTTP threads get blocked and the Pulsar Admin API stops responding?
> >
> > I can follow up with a PR which updates the default for
> numHttpServerThreads to 200. This is a maximum value and Jetty starts with
> 8 threads. We can agree on the default value to use in the PR.
> >
> > Thank you for the great collaboration on sharing the context and
> describing the problem patiently.
> >
> > BR,
> >
> > -Lari
> >
>

Re: [DISCUSS] Apache Pulsar 2.10.0 release

Posted by Lari Hotari <lh...@apache.org>.
I created PR https://github.com/apache/pulsar/pull/14320 to set numHttpServerThreads=200.
Please review.

On 2022/02/16 12:39:34 Lari Hotari wrote:
> On 2022/02/16 00:58:20 PengHui Li wrote:
> > Which is a sync method. Ultimately this could lead to all the pulsar-web
> > thread
> > blocked. we'd better not introduce blocking calls if we use AsyncResponse.
> > 
> > > What issue did you see? Please share more context. Thanks for the
> > patience.
> > 
> > It happened very earlier
> > 
> > Here is the issue https://github.com/apache/pulsar/issues/4756
> > And here is also a related fix https://github.com/apache/pulsar/pull/10619
> 
> Penghui, Thank you for the patience, and thanks for sharing more context. I happened to send a reply before reading your message, so please bear with me.
> 
> So finally, I understand that "the problem" is that all HTTP server threads are blocked and this makes the Pulsar Admin API unavailable.
> 
> To support the blocking servlet API, Jetty uses a default thread pool that can grow to up to 200 threads (https://github.com/eclipse/jetty.project/blob/4a0c91c0be53805e3fcffdcdcc9587d5301863db/jetty-util/src/main/java/org/eclipse/jetty/util/thread/ExecutorThreadPool.java#L57) .
> However this default of 200 maximum threads is not used in Pulsar.
> 
> The problem is that Pulsar uses a low value that assumes asynchronous API usage:
> https://github.com/apache/pulsar/blob/5c3ddc26d6e07eb0473b11b5ecc8318c1efe414b/pulsar-broker-common/src/main/java/org/apache/pulsar/broker/ServiceConfiguration.java#L201-L204
> Pulsar should be using a high value (for example 200) as long as there are blocking calls in Admin APIs.
> 
> The mitigation to the issue of all HTTP server threads getting blocked is setting "numHttpServerThreads=200" in broker.conf.
> 
> Regarding the "make async" changes, It is an optimization to migrate from the blocking servlet api to the asynchronous servlet api. This work isn't urgent since we can simply mitigate the HTTP server threads getting blocked by setting "numHttpServerThreads=200" in broker.conf. "the problem" will be resolved immediately without risks of regressions that are involved in making the sync -> async changes.
> 
> Penghui, would you mind adding a GitHub issue for the problem where all HTTP threads get blocked and the Pulsar Admin API stops responding?
> 
> I can follow up with a PR which updates the default for numHttpServerThreads to 200. This is a maximum value and Jetty starts with 8 threads. We can agree on the default value to use in the PR.
> 
> Thank you for the great collaboration on sharing the context and describing the problem patiently. 
> 
> BR, 
> 
> -Lari
> 

Re: [DISCUSS] Apache Pulsar 2.10.0 release

Posted by Lari Hotari <lh...@apache.org>.
On 2022/02/16 00:58:20 PengHui Li wrote:
> Which is a sync method. Ultimately this could lead to all the pulsar-web
> thread
> blocked. we'd better not introduce blocking calls if we use AsyncResponse.
> 
> > What issue did you see? Please share more context. Thanks for the
> patience.
> 
> It happened very earlier
> 
> Here is the issue https://github.com/apache/pulsar/issues/4756
> And here is also a related fix https://github.com/apache/pulsar/pull/10619

Penghui, thank you for your patience, and thanks for sharing more context. I happened to send a reply before reading your message, so please bear with me.

So finally, I understand that "the problem" is that all HTTP server threads are blocked and this makes the Pulsar Admin API unavailable.

To support the blocking servlet API, Jetty uses a default thread pool that can grow to up to 200 threads (https://github.com/eclipse/jetty.project/blob/4a0c91c0be53805e3fcffdcdcc9587d5301863db/jetty-util/src/main/java/org/eclipse/jetty/util/thread/ExecutorThreadPool.java#L57) .
However, this default of 200 maximum threads is not used in Pulsar.

The problem is that Pulsar uses a low value that assumes asynchronous API usage:
https://github.com/apache/pulsar/blob/5c3ddc26d6e07eb0473b11b5ecc8318c1efe414b/pulsar-broker-common/src/main/java/org/apache/pulsar/broker/ServiceConfiguration.java#L201-L204
Pulsar should be using a high value (for example 200) as long as there are blocking calls in Admin APIs.

The mitigation to the issue of all HTTP server threads getting blocked is setting "numHttpServerThreads=200" in broker.conf.
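
To make the failure mode concrete, here is a minimal sketch that uses only the JDK. It is not Pulsar or Jetty code: the fixed-size executor merely stands in for a bounded HTTP server thread pool, and the class name, method name, and numbers are illustrative assumptions. Once every pool thread is parked in a blocking `get()`, one more trivial request is not served until a thread is released.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class PoolSaturationSketch {

    /**
     * Submits {@code blockingRequests} handlers that block on a future which is
     * not completed during the measurement, then submits one trivial request and
     * reports whether it was served within {@code waitMillis}.
     */
    public static boolean extraRequestServed(int maxThreads, int blockingRequests, long waitMillis) {
        // Stand-in for a bounded web server thread pool (hypothetical, not Jetty).
        ExecutorService webThreads = Executors.newFixedThreadPool(maxThreads);
        // Stand-in for slow broker work that the handlers wait on.
        CompletableFuture<Void> slowBrokerWork = new CompletableFuture<>();
        Callable<Void> blockingHandler = () -> slowBrokerWork.get(); // parks the pool thread
        try {
            for (int i = 0; i < blockingRequests; i++) {
                webThreads.submit(blockingHandler);
            }
            Future<String> extra = webThreads.submit(() -> "ok");
            try {
                extra.get(waitMillis, TimeUnit.MILLISECONDS);
                return true;  // a free thread picked up the extra request
            } catch (TimeoutException e) {
                return false; // every thread is blocked, the request sits in the queue
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            slowBrokerWork.complete(null); // release the blocked handlers
            webThreads.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("8 threads, 8 blocked:   " + extraRequestServed(8, 8, 200));
        System.out.println("200 threads, 8 blocked: " + extraRequestServed(200, 8, 200));
    }
}
```

A larger maximum only moves the point at which the pool saturates; it does not remove the blocking calls themselves.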

Regarding the "make async" changes, it is an optimization to migrate from the blocking servlet API to the asynchronous servlet API. This work isn't urgent since we can simply mitigate the HTTP server threads getting blocked by setting "numHttpServerThreads=200" in broker.conf. "The problem" will be resolved immediately without the regression risks that are involved in making the sync -> async changes.

Penghui, would you mind adding a GitHub issue for the problem where all HTTP threads get blocked and the Pulsar Admin API stops responding?

I can follow up with a PR which updates the default for numHttpServerThreads to 200. This is a maximum value and Jetty starts with 8 threads. We can agree on the default value to use in the PR.
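
The point that the setting is a maximum rather than an up-front allocation can be illustrated with a JDK analogue. This is a sketch under assumptions: it uses `java.util.concurrent.ThreadPoolExecutor`, not Jetty's thread pool, and unlike the Jetty behaviour described above the JDK executor creates even its core threads lazily. The class and method names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class GrowOnDemandPoolSketch {

    /**
     * Starts {@code concurrentBlockingCalls} tasks that all block at the same time
     * and returns how many threads the pool created to run them.
     * Assumes concurrentBlockingCalls <= maxThreads; beyond that the executor
     * rejects further tasks.
     */
    public static int threadsUsed(int coreThreads, int maxThreads, int concurrentBlockingCalls) {
        // SynchronousQueue has no capacity, so a task either goes to an idle
        // thread or forces a new thread to be created, up to maxThreads.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                coreThreads, maxThreads, 60, TimeUnit.SECONDS, new SynchronousQueue<>());
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < concurrentBlockingCalls; i++) {
                pool.execute(() -> {
                    try {
                        release.await(); // simulate a blocking admin call
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return pool.getPoolSize();
        } finally {
            release.countDown(); // let the blocked tasks finish
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("threads for 3 blocking calls:  " + threadsUsed(8, 200, 3));
        System.out.println("threads for 50 blocking calls: " + threadsUsed(8, 200, 50));
    }
}
```

In this analogue the thread count follows the number of simultaneously blocked calls, so the cost of a high maximum is only paid when that many calls are actually blocked at once.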

Thank you for the great collaboration on sharing the context and describing the problem patiently. 

BR, 

-Lari

Re: [DISCUSS] Apache Pulsar 2.10.0 release

Posted by PengHui Li <pe...@apache.org>.
Hi Lari,

> Thanks for replying, Penghui. The problem is that there is no rationale
nor description in that PR, https://github.com/apache/pulsar/pull/13666 .
The only sentence there is "Avoid call sync method in async rest API for
delete subscription".

For https://github.com/apache/pulsar/pull/13666, I have shared the
stack trace.
From the stack trace, you can see that the `pulsar-web-40-28` thread is blocked at

```
at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
at org.apache.pulsar.broker.admin.impl.PersistentTopicsBase.internalDeleteSubscriptionForNonPartitionedTopic(PersistentTopicsBase.java:1498)
```

This is a synchronous call. Ultimately this could lead to all the pulsar-web
threads
being blocked. We'd better not introduce blocking calls if we use AsyncResponse.
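
The difference between the two styles can be sketched with plain JDK types. This is an illustration under assumptions, not the code from the PR: `Consumer<String>` stands in for resuming a JAX-RS `AsyncResponse`, the status strings are placeholders, and the class and method names are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.function.Consumer;

public class AsyncHandlerSketch {

    /** Blocking style: the calling web thread is parked inside get() until the work finishes. */
    public static void handleBlocking(CompletableFuture<Void> brokerWork, Consumer<String> resume) {
        try {
            brokerWork.get(); // same kind of call as in the stack trace above
            resume.accept("204");
        } catch (InterruptedException | ExecutionException e) {
            resume.accept("500");
        }
    }

    /** Non-blocking style: register a callback and return the web thread immediately. */
    public static void handleAsync(CompletableFuture<Void> brokerWork, Consumer<String> resume) {
        brokerWork.whenComplete((ignored, error) ->
                resume.accept(error == null ? "204" : "500"));
    }

    public static void main(String[] args) {
        CompletableFuture<Void> work = new CompletableFuture<>();
        handleAsync(work, status -> System.out.println("resumed with " + status));
        System.out.println("handler returned before the work completed");
        work.complete(null); // the callback runs now, on the completing thread
    }
}
```

With the callback style the response is resumed by whichever thread completes the future, so no web thread is held while the broker work is pending.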

> What issue did you see? Please share more context. Thanks for the
patience.

It happened much earlier.

Here is the issue https://github.com/apache/pulsar/issues/4756
And here is also a related fix https://github.com/apache/pulsar/pull/10619

Thanks,
Penghui




On Tue, Feb 15, 2022 at 10:52 PM Lari Hotari <lh...@apache.org> wrote:

> On 2022/02/15 14:13:59 PengHui Li wrote:
> > The rationale for these changes, I think it starts from this PR
> > https://github.com/apache/pulsar/pull/13666
> > This is the only one example, we have seen the same issue again and
> again.
> > After #13666 get merged,
> > The contributors found there are many places that might also have the
> same
> > problem.
>
> Thanks for replying, Penghui. The problem is that there is no rationale
> nor description in that PR, https://github.com/apache/pulsar/pull/13666 .
> The only sentence there is "Avoid call sync method in async rest API for
> delete subscription".
>
> > "we have seen the same issue again and again."
>
> What issue did you see? Please share more context. Thanks for the patience.
>
> BR, Lari
>

Re: [DISCUSS] Apache Pulsar 2.10.0 release

Posted by Lari Hotari <lh...@apache.org>.
On 2022/02/15 14:13:59 PengHui Li wrote:
> The rationale for these changes, I think it starts from this PR
> https://github.com/apache/pulsar/pull/13666
> This is the only one example, we have seen the same issue again and again.
> After #13666 get merged,
> The contributors found there are many places that might also have the same
> problem.

Thanks for replying, Penghui. The problem is that there is no rationale nor description in that PR, https://github.com/apache/pulsar/pull/13666 . The only sentence there is "Avoid call sync method in async rest API for delete subscription". 

> "we have seen the same issue again and again."

What issue did you see? Please share more context. Thanks for the patience.

BR, Lari

Re: [DISCUSS] Apache Pulsar 2.10.0 release

Posted by PengHui Li <pe...@apache.org>.
> Was this about the issue which this PR
> https://github.com/apache/pulsar/pull/14283 resolved (since it is merged)?
>
> I have the feeling that some past problems haven't been analyzed properly
> before deciding on the solution. There seems to be an understanding that
> switching from synchronous programming model to asynchronous programming
> model solves problems on its own. Before making such changes, I would
> expect that the issue is shared with the community and the assumptions
> about the problem and the planned solution are also shared.

Yes, I think the issue has been fixed. We only found this issue
yesterday. It looked like a race condition,
but we couldn't confirm that at the time. Demogorgon314 put in a lot of effort to
confirm the issue with certainty, and shared the context and the PR today.

> I am referring to PRs such as
> https://github.com/apache/pulsar/issues/14013 where the only description of
> the motivation for the change is that "PersistentTopicsBase has some sync
> methods. Decide to make them async." .
> Would it be possible to improve the description on such changes since those
> changes are included in Pulsar 2.10.0 release?
> There seem to be about 20 recent PRs in total which are about sync->async
> changes: https://github.com/apache/pulsar/pulls?q=is%3Apr+is%3Aopen+async
> What was the rationale for these changes?

It's not a blocker for 2.10. We have also been testing this part over the last few days
to ensure that these changes do not introduce new problems.
You can see that nodece tested it with pulsarctl and pushed the fix
https://github.com/apache/pulsar/pull/14297.
I think the rationale for these changes starts from this PR:
https://github.com/apache/pulsar/pull/13666
This is only one example; we have seen the same issue again and again.
After #13666 got merged,
the contributors found many other places that might have the same
problem.

Thanks,
Penghui

On Tue, Feb 15, 2022 at 7:07 PM Lari Hotari <lh...@apache.org> wrote:

> Thanks for the detailed reply, Penghui.
>
> > And, for the new metadata API, we found an issue that will introduce the
> > cache inconsistent issue,
> > we are working on a fix, it should be a release blocker, otherwise,
> > 2.10 will not able to use.
>
> Was this about the issue which this PR
> https://github.com/apache/pulsar/pull/14283 resolved (since it is merged)?
>
> I have the feeling that some past problems haven't been analyzed properly
> before deciding on the solution. There seems to be an understanding that
> switching from synchronous programming model to asynchronous programming
> model solves problems on its own. Before making such changes, I would
> expect that the issue is shared with the community and the assumptions
> about the problem and the planned solution are also shared.
>
> I am referring to PRs such as
> https://github.com/apache/pulsar/issues/14013 where the only description
> of the motivation for the change is that "PersistentTopicsBase has some
> sync methods. Decide to make them async." .
> Would it be possible to improve the description on such changes since
> those changes are included in Pulsar 2.10.0 release?
> There seem to be about 20 recent PRs in total which are about sync->async
> changes: https://github.com/apache/pulsar/pulls?q=is%3Apr+is%3Aopen+async
> What was the rationale for these changes?
>
> > I'm doing my best to follow PIP 47, but when seeing a potential break
> > change, I need to confirm it.
> > After all the potential break changes have been confirmed and fixed, I
> will
> > start the vote thread.
>
> The main concern I have about 2.10.0 release is that since we haven't
> branched out branch-2.10 as it is defined in PIP-47. There are more PRs for
> new changes coming in daily (such as the sync-async changes) which can
> cause instability in Pulsar.
>
> Who can share more context about the sync->async changes? It is also
> necessary for Pulsar 2.10.0 release notes since the sync->async changes are
> included in the release. Why were these changes made?
>
> BR,
>
> -Lari
>

Re: [DISCUSS] Apache Pulsar 2.10.0 release

Posted by Lari Hotari <lh...@apache.org>.
Thanks for the detailed reply, Penghui.

> And, for the new metadata API, we found an issue that will introduce the
> cache inconsistent issue,
> we are working on a fix, it should be a release blocker, otherwise,
> 2.10 will not able to use.

Was this about the issue which this PR https://github.com/apache/pulsar/pull/14283 resolved (since it is merged)?

I have the feeling that some past problems haven't been analyzed properly before deciding on the solution. There seems to be an understanding that switching from synchronous programming model to asynchronous programming model solves problems on its own. Before making such changes, I would expect that the issue is shared with the community and the assumptions about the problem and the planned solution are also shared.

I am referring to PRs such as https://github.com/apache/pulsar/issues/14013 where the only description of the motivation for the change is that "PersistentTopicsBase has some sync methods. Decide to make them async." . 
Would it be possible to improve the description on such changes since those changes are included in Pulsar 2.10.0 release?
There seem to be about 20 recent PRs in total which are about sync->async changes: https://github.com/apache/pulsar/pulls?q=is%3Apr+is%3Aopen+async
What was the rationale for these changes?

> I'm doing my best to follow PIP 47, but when seeing a potential break
> change, I need to confirm it.
> After all the potential break changes have been confirmed and fixed, I will
> start the vote thread.

The main concern I have about the 2.10.0 release is that we haven't branched out branch-2.10 as defined in PIP-47. PRs for new changes (such as the sync->async changes) keep coming in daily, which can cause instability in Pulsar.

Who can share more context about the sync->async changes? It is also necessary for Pulsar 2.10.0 release notes since the sync->async changes are included in the release. Why were these changes made?

BR,

-Lari

Re: [DISCUSS] Apache Pulsar 2.10.0 release

Posted by PengHui Li <pe...@apache.org>.
Hi all,

The PR https://github.com/apache/pulsar/pull/14288 needs more eyes to
unblock the 2.10.0 release.
The PR fixes a breaking change in the branch-2.9, branch-2.8, and master
branches.

Thanks,
Penghui

On Tue, Feb 15, 2022 at 12:32 PM PengHui Li <pe...@apache.org> wrote:

> Hi all
>
> Please help review this PR https://github.com/apache/pulsar/pull/14283
> which should be a blocker for the 2.10.0 release.
> Tboy is working on another fix to fix the breaking change introduced in
> https://github.com/apache/pulsar/pull/13383.
> After the PR is available for review, I will update here.
>
> Regards,
> Penghui
>
> On Mon, Feb 14, 2022 at 8:55 PM PengHui Li <pe...@apache.org> wrote:
>
>> Hi Lari,
>>
>> There are 5 open PRs
>> https://github.com/apache/pulsar/pulls?q=is%3Aopen+is%3Apr+milestone%3A2.10.0
>> and #14225 is a release blocker.
>> For #13376 and #13341, we are preparing the testing, to make sure they
>> can safely ship to 2.10.0
>> For #10478, it's a critical fix for the current message redeliver which
>> will impact the transaction correctness
>> while using failover or exclusive subscription, and it has protocol
>> changes, so we don't want to move it to 2.11
>> since the PR is done and already reviewed, just need more eyes on it.
>>
>> And, for the new metadata API, we found an issue that will introduce the
>> cache inconsistent issue,
>> we are working on a fix, it should be a release blocker, otherwise,
>> 2.10 will not able to use.
>>
>> Another one is related to https://github.com/apache/pulsar/pull/13383,
>> we are doing more tests to make sure
>> it will not introduce break changes.
>>
>> > Are we planning to follow this process, or is PIP 47 obsolete?
>>
>> I'm doing my best to follow PIP 47, but when seeing a potential break
>> change, I need to confirm it.
>> After all the potential break changes have been confirmed and fixed, I
>> will start the vote thread.
>>
>> Thanks,
>> Penghui
>>
>> On Mon, Feb 14, 2022 at 7:24 PM Lari Hotari <lh...@apache.org> wrote:
>>
>>> > After the features are completed, I will create the new 2.10 branch,
>>> and
>>> > only apply
>>> > the critical bug fixes, regression fixes. So that we can have adequate
>>> > testing on branch-2.10
>>>
>>> Hi Penghui,
>>>
>>> What's the status of 2.10.0 release? What features aren't complete?
>>>
>>> In PIP 47 (
>>> https://github.com/apache/pulsar/wiki/PIP-47%3A-Time-Based-Release-Plan),
>>> there's a high-level description of the Pulsar release process:
>>>
>>> "A month before the release date, the release manager will cut branches
>>> and also publish (preferably on the wiki) a list of features that will be
>>> included in the release (these will typically be PIPs, but not always). We
>>> will leave another week for "minor" features to get in (see below for
>>> definitions), but at this point we will start efforts to stabilize the
>>> release branch and contribute mostly tests and fixes. Two weeks after
>>> branch cutting, we will announce code-freeze and start rolling out RCs,
>>> after which only fixes for blocking bugs will be merged."
>>>
>>> Are we planning to follow this process, or is PIP 47 obsolete?
>>>
>>> -Lari
>>>
>>

Re: [DISCUSS] Apache Pulsar 2.10.0 release

Posted by PengHui Li <pe...@apache.org>.
Hi all

Please help review this PR https://github.com/apache/pulsar/pull/14283
which should be a blocker for the 2.10.0 release.
Tboy is working on another fix for the breaking change introduced in
https://github.com/apache/pulsar/pull/13383.
Once that PR is available for review, I will post an update here.

Regards,
Penghui

On Mon, Feb 14, 2022 at 8:55 PM PengHui Li <pe...@apache.org> wrote:

> Hi Lari,
>
> There are 5 open PRs
> https://github.com/apache/pulsar/pulls?q=is%3Aopen+is%3Apr+milestone%3A2.10.0
> and #14225 is a release blocker.
> For #13376 and #13341, we are preparing the testing, to make sure they can
> safely ship to 2.10.0
> For #10478, it's a critical fix for the current message redeliver which
> will impact the transaction correctness
> while using failover or exclusive subscription, and it has protocol
> changes, so we don't want to move it to 2.11
> since the PR is done and already reviewed, just need more eyes on it.
>
> And, for the new metadata API, we found an issue that will introduce the
> cache inconsistent issue,
> we are working on a fix, it should be a release blocker, otherwise,
> 2.10 will not able to use.
>
> Another one is related to https://github.com/apache/pulsar/pull/13383, we
> are doing more tests to make sure
> it will not introduce break changes.
>
> > Are we planning to follow this process, or is PIP 47 obsolete?
>
> I'm doing my best to follow PIP 47, but when seeing a potential break
> change, I need to confirm it.
> After all the potential break changes have been confirmed and fixed, I
> will start the vote thread.
>
> Thanks,
> Penghui
>
> On Mon, Feb 14, 2022 at 7:24 PM Lari Hotari <lh...@apache.org> wrote:
>
>> > After the features are completed, I will create the new 2.10 branch, and
>> > only apply
>> > the critical bug fixes, regression fixes. So that we can have adequate
>> > testing on branch-2.10
>>
>> Hi Penghui,
>>
>> What's the status of 2.10.0 release? What features aren't complete?
>>
>> In PIP 47 (
>> https://github.com/apache/pulsar/wiki/PIP-47%3A-Time-Based-Release-Plan),
>> there's a high-level description of the Pulsar release process:
>>
>> "A month before the release date, the release manager will cut branches
>> and also publish (preferably on the wiki) a list of features that will be
>> included in the release (these will typically be PIPs, but not always). We
>> will leave another week for "minor" features to get in (see below for
>> definitions), but at this point we will start efforts to stabilize the
>> release branch and contribute mostly tests and fixes. Two weeks after
>> branch cutting, we will announce code-freeze and start rolling out RCs,
>> after which only fixes for blocking bugs will be merged."
>>
>> Are we planning to follow this process, or is PIP 47 obsolete?
>>
>> -Lari
>>
>

Re: [DISCUSS] Apache Pulsar 2.10.0 release

Posted by PengHui Li <pe...@apache.org>.
Hi Lari,

There are 5 open PRs
https://github.com/apache/pulsar/pulls?q=is%3Aopen+is%3Apr+milestone%3A2.10.0
and #14225 is a release blocker.
For #13376 and #13341, we are preparing tests to make sure they can
safely ship in 2.10.0.
#10478 is a critical fix for the current message redelivery behavior, which
affects transaction correctness
when using failover or exclusive subscriptions. It includes protocol
changes, so we don't want to move it to 2.11.
The PR is done and already reviewed; it just needs more eyes on it.

And, for the new metadata API, we found an issue that causes
cache inconsistency.
We are working on a fix. It should be a release blocker; otherwise,
2.10 will not be usable.

Another one is related to https://github.com/apache/pulsar/pull/13383; we
are doing more tests to make sure
it will not introduce breaking changes.

> Are we planning to follow this process, or is PIP 47 obsolete?

I'm doing my best to follow PIP 47, but when I see a potential breaking
change, I need to confirm it.
After all the potential breaking changes have been confirmed and fixed, I will
start the vote thread.

Thanks,
Penghui

On Mon, Feb 14, 2022 at 7:24 PM Lari Hotari <lh...@apache.org> wrote:

> > After the features are completed, I will create the new 2.10 branch, and
> > only apply
> > the critical bug fixes, regression fixes. So that we can have adequate
> > testing on branch-2.10
>
> Hi Penghui,
>
> What's the status of 2.10.0 release? What features aren't complete?
>
> In PIP 47 (
> https://github.com/apache/pulsar/wiki/PIP-47%3A-Time-Based-Release-Plan),
> there's a high-level description of the Pulsar release process:
>
> "A month before the release date, the release manager will cut branches
> and also publish (preferably on the wiki) a list of features that will be
> included in the release (these will typically be PIPs, but not always). We
> will leave another week for "minor" features to get in (see below for
> definitions), but at this point we will start efforts to stabilize the
> release branch and contribute mostly tests and fixes. Two weeks after
> branch cutting, we will announce code-freeze and start rolling out RCs,
> after which only fixes for blocking bugs will be merged."
>
> Are we planning to follow this process, or is PIP 47 obsolete?
>
> -Lari
>

Re: [DISCUSS] Apache Pulsar 2.10.0 release

Posted by Lari Hotari <lh...@apache.org>.
> After the features are completed, I will create the new 2.10 branch, and
> only apply
> the critical bug fixes, regression fixes. So that we can have adequate
> testing on branch-2.10

Hi Penghui, 

What's the status of 2.10.0 release? What features aren't complete?

In PIP 47 (https://github.com/apache/pulsar/wiki/PIP-47%3A-Time-Based-Release-Plan), there's a high-level description of the Pulsar release process:

"A month before the release date, the release manager will cut branches and also publish (preferably on the wiki) a list of features that will be included in the release (these will typically be PIPs, but not always). We will leave another week for "minor" features to get in (see below for definitions), but at this point we will start efforts to stabilize the release branch and contribute mostly tests and fixes. Two weeks after branch cutting, we will announce code-freeze and start rolling out RCs, after which only fixes for blocking bugs will be merged."

Are we planning to follow this process, or is PIP 47 obsolete?

-Lari
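
As a rough illustration of the timeline in the PIP 47 passage quoted above, the milestones can be worked out with simple date arithmetic. This is only a sketch: the function name `pip47_milestones` is hypothetical, "a month" is assumed to mean 30 days, and January 31, 2022 is assumed as the exact day for the "end of January 2022" target mentioned in this thread.

```python
from datetime import date, timedelta


def pip47_milestones(release_date: date) -> dict[str, date]:
    """Derive the milestones described in the quoted PIP 47 text.

    Assumption: "a month before the release date" is treated as 30 days.
    """
    # Release manager cuts the branch and publishes the feature list.
    branch_cut = release_date - timedelta(days=30)
    # "Another week" is left for minor features to get in.
    minor_feature_cutoff = branch_cut + timedelta(weeks=1)
    # Code freeze and first RCs come two weeks after branch cutting.
    code_freeze = branch_cut + timedelta(weeks=2)
    return {
        "branch_cut": branch_cut,
        "minor_feature_cutoff": minor_feature_cutoff,
        "code_freeze": code_freeze,
        "release": release_date,
    }


# Assumed target date: the thread only says "end of January 2022".
milestones = pip47_milestones(date(2022, 1, 31))
for name, day in milestones.items():
    print(f"{name}: {day.isoformat()}")
```

Under these assumptions, the branch would have been cut around the start of January, with code freeze in mid-January.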

Re: [DISCUSS] Apache Pulsar 2.10.0 release

Posted by PengHui Li <pe...@apache.org>.
Yes, I agree, it's a good idea. But it depends on the feature freeze time.
Cherry-picking fixes is OK, but it does not work for BIG PRs with protocol
changes or
API changes; such huge changes might introduce new problems during
cherry-picking.

After the features are completed, I will create the new 2.10 branch and
only apply
critical bug fixes and regression fixes, so that we can have adequate
testing on branch-2.10.

Thanks,
Penghui

On Wed, Feb 9, 2022 at 4:30 PM Enrico Olivelli <eo...@gmail.com> wrote:

> PengHui,
> There is a recent discussion with Matteo (at the community meetings)
> about preparing the release branch a couple of weeks before sending
> out the official VOTE.
>
> What about creating the branch-2.10 as soon as possible?
> We will commit to that branch only the fixes needed to make 2.10.0 stable
> This way we will be free to commit new stuff to master branch without
> impacting the stability of 2.10
>
> This way people can start validating 2.10 seriously in order to catch
> problems before sending out the RC
>
> Does it sound like a good idea to you ?
> Enrico
>
> On Wed, Feb 9, 2022 at 09:25 PengHui Li
> <pe...@apache.org> wrote:
> >
> > Hi all,
> >
> > Sorry for the late reply, due to my vacation these days, we got a delay
> > here.
> >
> > Most of the changes of 2.10.0 are getting merged, for now, there are 14
> > opened PRs(10 approved)
> >
> https://github.com/apache/pulsar/pulls?q=is%3Aopen+is%3Apr+milestone%3A2.10.0
> >
> > I will take care of them and try to get them merged.
> > After the above PRs get merged, I will build the release and start the
> vote.
> > Please let me know if you have any questions about the 2.10.0 release.
> > And, also looking forward to more people taking a look at the opened PRs.
> >
> > Regards,
> > Penghui
> >
> >
> >
> >
> > On Tue, Jan 4, 2022 at 7:56 AM Sijie Guo <gu...@gmail.com> wrote:
> >
> > > +1.
> > >
> > > All make sense to me!
> > >
> > > We probably need to move to the feature frozen stage in order to cut a
> > > release at the end of January.
> > >
> > > - Sijie
> > >
> > > On Sun, Dec 26, 2021 at 8:46 PM PengHui Li <pe...@apache.org> wrote:
> > >
> > > > Hi, everyone
> > > >
> > > > I hope you’ve all been doing well. I would like to start an email
> thread
> > > to
> > > > discuss features that we planned for 2.10.0.
> > > > According to the time-based release plan
> > > >
> https://github.com/apache/pulsar/wiki/PIP-47%3A-Time-Based-Release-Plan,
> > > > we should release 2.10.0 at the end of December 2021, since we have
> > > reached
> > > > the end of December,
> > > > I would like to target the 2.10.0 to the end of January 2022
> > > >
> > > > There are some powerful features and enhancements in 2.10.0 such as
> > > >
> > > > - PIP 84: Message redelivery epoch
> > > > - PIP 104: Add new consumer type: TableView
> > > > - PIP 106: Negative acknowledgment backoff
> > > > - PIP 110: Topic customized metadata support
> > > > - PIP 117: Change Pulsar standalone defaults
> > > > - PIP 118: Do not restart brokers when ZooKeeper session expires
> > > > - PIP 119: Enable consistent hashing by default on KeyShared
> dispatcher
> > > > - PIP 120: Enable client memory limit by default
> > > > - PIP 121: Pulsar cluster level auto failover
> > > > - PIP 123: Pulsar metadata CLI tool
> > > > - Metadata service batch operations
> > > > - RocksDB metadata service backend
> > > > - Etcd metadata service backend
> > > > - Ack timeout redelivery backoff policy
> > > > - Global topic policies
> > > >
> > > > Most of them have been completed, some work in progress we need to
> try to
> > > > complete within 2 weeks.
> > > > This can give me a 2 week buffer period to prepare for release and
> > > complete
> > > > the release vote.
> > > > For the unfinished parts, we can move them to 2.11.0.
> > > >
> > > > Some proposals are just being discussed, so I do not list them
> because
> > > I'm
> > > > not sure if we can complete them in two weeks.
> > > >
> > > > You can find all the change lists from
> > > >
> > > >
> > >
> https://github.com/apache/pulsar/pulls?q=milestone%3A2.10.0+-label%3Arelease%2F2.9.1
> > > > There are more than 500 commits.
> > > >
> > > > If I missed something or you have any suggestions please let me know.
> > > >
> > > > Regards,
> > > > Penghui
> > > >
> > >
>

Re: [DISCUSS] Apache Pulsar 2.10.0 release

Posted by Enrico Olivelli <eo...@gmail.com>.
PengHui,
There was a recent discussion with Matteo (at the community meetings)
about preparing the release branch a couple of weeks before sending
out the official VOTE.

What about creating branch-2.10 as soon as possible?
We will commit to that branch only the fixes needed to make 2.10.0 stable.
This way we will be free to commit new stuff to the master branch without
impacting the stability of 2.10.

This way people can start validating 2.10 seriously in order to catch
problems before sending out the RC.

Does it sound like a good idea to you?
Enrico

On Wed, Feb 9, 2022 at 09:25 PengHui Li
<pe...@apache.org> wrote:
>
> Hi all,
>
> Sorry for the late reply, due to my vacation these days, we got a delay
> here.
>
> Most of the changes of 2.10.0 are getting merged, for now, there are 14
> opened PRs(10 approved)
> https://github.com/apache/pulsar/pulls?q=is%3Aopen+is%3Apr+milestone%3A2.10.0
>
> I will take care of them and try to get them merged.
> After the above PRs get merged, I will build the release and start the vote.
> Please let me know if you have any questions about the 2.10.0 release.
> And, also looking forward to more people taking a look at the opened PRs.
>
> Regards,
> Penghui
>
>
>
>
> On Tue, Jan 4, 2022 at 7:56 AM Sijie Guo <gu...@gmail.com> wrote:
>
> > +1.
> >
> > All make sense to me!
> >
> > We probably need to move to the feature frozen stage in order to cut a
> > release at the end of January.
> >
> > - Sijie
> >
> > On Sun, Dec 26, 2021 at 8:46 PM PengHui Li <pe...@apache.org> wrote:
> >
> > > Hi, everyone
> > >
> > > I hope you’ve all been doing well. I would like to start an email thread
> > to
> > > discuss features that we planned for 2.10.0.
> > > According to the time-based release plan
> > > https://github.com/apache/pulsar/wiki/PIP-47%3A-Time-Based-Release-Plan,
> > > we should release 2.10.0 at the end of December 2021, since we have
> > reached
> > > the end of December,
> > > I would like to target the 2.10.0 to the end of January 2022
> > >
> > > There are some powerful features and enhancements in 2.10.0 such as
> > >
> > > - PIP 84: Message redelivery epoch
> > > - PIP 104: Add new consumer type: TableView
> > > - PIP 106: Negative acknowledgment backoff
> > > - PIP 110: Topic customized metadata support
> > > - PIP 117: Change Pulsar standalone defaults
> > > - PIP 118: Do not restart brokers when ZooKeeper session expires
> > > - PIP 119: Enable consistent hashing by default on KeyShared dispatcher
> > > - PIP 120: Enable client memory limit by default
> > > - PIP 121: Pulsar cluster level auto failover
> > > - PIP 123: Pulsar metadata CLI tool
> > > - Metadata service batch operations
> > > - RocksDB metadata service backend
> > > - Etcd metadata service backend
> > > - Ack timeout redelivery backoff policy
> > > - Global topic policies
> > >
> > > Most of them have been completed, some work in progress we need to try to
> > > complete within 2 weeks.
> > > This can give me a 2 week buffer period to prepare for release and
> > complete
> > > the release vote.
> > > For the unfinished parts, we can move them to 2.11.0.
> > >
> > > Some proposals are just being discussed, so I do not list them because
> > I'm
> > > not sure if we can complete them in two weeks.
> > >
> > > You can find all the change lists from
> > >
> > >
> > https://github.com/apache/pulsar/pulls?q=milestone%3A2.10.0+-label%3Arelease%2F2.9.1
> > > There are more than 500 commits.
> > >
> > > If I missed something or you have any suggestions please let me know.
> > >
> > > Regards,
> > > Penghui
> > >
> >