You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@zookeeper.apache.org by Andor Molnar <an...@apache.org> on 2019/01/04 13:23:25 UTC

JAVA 11 build is broken on 3.5

Hi team / Enrico,

I’d like to get feedback from the community on the following patch (moving the discussion from GitHub to here):

https://issues.apache.org/jira/browse/ZOOKEEPER-3204 <https://issues.apache.org/jira/browse/ZOOKEEPER-3204>
https://github.com/apache/zookeeper/pull/753 <https://github.com/apache/zookeeper/pull/753>

In a nutshell: looks like that Netty 3.10 is broken under Java 11: it doesn’t properly close the underlying socket (probably not closing the registered NIO selectors) and reconfig tests are unable to re-bind the ports. This problem is similar that we already fixed in NIO with the following patch:
https://github.com/apache/zookeeper/commit/c3babb94275ad667dc71c10dcb08a383a3c154c2 <https://github.com/apache/zookeeper/commit/c3babb94275ad667dc71c10dcb08a383a3c154c2>
The problem doesn’t show up on trunk which has been recently upgraded to Netty 4.

Repro: 
- Start embedded ZK, stop it and try to restart on the same port, or
- Start normal ensemble and reconfig to use different (client) port. Then reconfig back to the original port which should fail. (that’s the scenario which is covered in ReconfigTest)

I created the above patch (#753) to backport Netty 4 upgrade to 3.5 and it fixes the problem with Java 11 (it doesn’t cause regression in the pre-commit build either), but Enrico is having concerns about making such big change before the release.

I tend to agree, but let’s see what are the options.

Thoughts:
- Do we have to fix this? - Yes. Java 11 is LTS and I the bug is critical.
- Can we fix Netty 3? - Maybe. Let’s say we find the bug in Netty 3, what can we do?
       a) We cannot workaround from ZooKeeper itself and have to submit a pull request for Netty. I think it’s quite unlikely that they will accept the change given it’s not a security bug, but even if they did, only the upgraded version of Netty 3 would work properly with ZooKeeper. Err.
       b) We can workaround it from ZooKeeper: that could be option #1, but I have a strong feeling about it’s not going to be the case.
- Shall we upgrade to Netty 4? - this is option #2

Please share your thoughts, maybe you know about an option #3.

Regards,
Andor







Re: JAVA 11 build is broken on 3.5

Posted by Andor Molnar <an...@apache.org>.
Hi folks,

Looks like we’re in a very good shape: all builds of 3.5 release are green now.
Are you thinking of anything which would be good to get in before the stable release?

Regards,
Andor




> On 2019. Jan 9., at 22:18, Andor Molnár <an...@apache.org> wrote:
> 
> Thank you guys for your support.
> 
> I leave this open for the community to decide, but if none of the
> binding voters (committers, PMCs) respond in 1 week, I'll take
> responsibility and commit the patches. I know holidays are still going
> on, but 3.5 release is important we should make progress.
> 
> 
> Thanks,
> 
> Andor
> 
> 
> 
> On 1/9/19 21:54, Norbert Kalmar wrote:
>> I agree with backporting Netty 4 and the related patches to 3.5.5.
>> +1 (non binding) (and I might have already voted)
>> 
>> If it took so long to release a stable 3.5, I don't think we should leave a
>> known bug in the system, which has such an impact (not supporting current
>> LTS java).
>> The PRs shouldn't be hard to backport, no new development is required, and
>> unfortunately maven migration isn't finished yet, so it's not like we need
>> to postpone the release just because of Netty upgrade. That's my two cents
>> anyway.
>> 
>> Regards,
>> Norbert
>> 
>> On Wed, Jan 9, 2019 at 9:19 PM Enrico Olivelli <eo...@gmail.com> wrote:
>> 
>>> Il giorno mer 9 gen 2019 alle ore 21:16 Andor Molnar
>>> <an...@apache.org> ha scritto:
>>>> So, you guys saying we should upgrade to Netty 4 and backport the
>>> following patches too:
>>>> ZOOKEEPER-3131 org.apache.zookeeper.server.WatchManager resource leak
>>>> ZOOKEEPER-3163 Use session map to improve the performance when closing
>>> session in Netty
>>>> ZOOKEEPER-3146 Limit the maximum client connections per IP in
>>> NettyServerCnxnFactory
>>>> ZOOKEEPER-3177 Refactor request throttle logic in NIO and Netty to keep
>>> the same behavior and make the code easier to maintain
>>>> I think it’s still deliverable with the stable release. Regarding the
>>> timing, let’s say we release 3.5.5 with Netty 3 and do the upgrade in
>>> 3.5.6. Is it acceptable to do such a big with a minor version change?
>>> 
>>> I think we can do the upgrade in 3.5.5.
>>> We can't say Java 11 is supported if we don't have tests working. And
>>> Java 11 is the current LTS release from Oracle.
>>> 
>>> If we release now 3.5.5 then we need to work for 3.5.6 and then
>>> release....it is a double work, we don't have so much resources :-(
>>> 
>>> IMHO it is better to release 3.5.5 and start focusing on 3.6.0
>>> 
>>> Enrico
>>> 
>>>> Upgrading now isn't ideal either, but maybe somewhat less brutal.
>>>> 
>>>> Regards,
>>>> Andor
>>>> 
>>>> 
>>>> 
>>>> 
>>>>> On 2019. Jan 8., at 2:10, Brian Nixon <br...@gmail.com>
>>> wrote:
>>>>> Also worth consideration for the Netty fixes backport bundle:
>>>>> 
>>>>> ZOOKEEPER-3131 org.apache.zookeeper.server.WatchManager resource leak
>>>>> 
>>>>> Our experience with Netty has been positive so far though I don't
>>> think we
>>>>> have enough info to declare it is stable. It makes a lot of sense to
>>> me for
>>>>> 3.5 to use Netty 4, the biggest question is the timing.
>>>>> 
>>>>> 
>>>>> On Fri, Jan 4, 2019 at 10:44 AM Andor Molnár <an...@apache.org> wrote:
>>>>> 
>>>>>> I got another one:
>>>>>> 
>>>>>> ZOOKEEPER-3163 Use session map to improve the performance when closing
>>>>>> session in Netty
>>>>>> 
>>>>>> 
>>>>>> So, that doesn't seem too many. We can talk about backporting them,
>>> but
>>>>>> I don't think they're super critical for the first stable release.
>>>>>> 
>>>>>> Regarding 3.4, I need to validate the embedded scenario, but at least
>>>>>> the Java 11 build is green. :)
>>>>>> 
>>>>>> 
>>>>>> Andor
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On 1/4/19 18:08, Enrico Olivelli wrote:
>>>>>>> Il giorno ven 4 gen 2019 alle ore 17:04 Norbert Kalmar
>>>>>>> <nk...@cloudera.com.invalid> ha scritto:
>>>>>>>> +1 for Netty 4 in 3.5
>>>>>>>> 
>>>>>>>> Pretty much all the pros and cons has been said before me.
>>>>>>>> I would only add that this is not a new functionality that we wan't
>>> to
>>>>>>>> backport. It's a criticall(ish?) bugfix, which requires quite a bit
>>> of
>>>>>>>> change unfortunately.
>>>>>>>> 
>>>>>>>> Regards,
>>>>>>>> Norbert
>>>>>>>> 
>>>>>>>> On Fri, Jan 4, 2019 at 4:36 PM Andor Molnar <an...@apache.org>
>>> wrote:
>>>>>>>>> What are those patches exactly?
>>>>>>>>> 
>>>>>>>>> Comparing the ported version of 3.5 with master I’ve only found 2
>>>>>> patches
>>>>>>>>> which are missing:
>>>>>>>>> 
>>>>>>>>> ZOOKEEPER-3146 Limit the maximum client connections per IP in
>>>>>>>>> NettyServerCnxnFactory
>>>>>>>>> ZOOKEEPER-3177 Refactor request throttle logic in NIO and Netty to
>>> keep
>>>>>>>>> the same behavior and make the code easier to maintain
>>>>>>>>> 
>>>>>>>>> None of them are critical I would say.
>>>>>>>>> Is there anything else I’m missing?
>>>>>>> I have compared the histories and I think Andor you are right.
>>>>>>> 
>>>>>>> master:
>>>>>>> https://github.com/apache/zookeeper/commits/master
>>>>>>> 
>>>>>>> branch-3.5:
>>>>>>> https://github.com/apache/zookeeper/commits/branch-3.5
>>>>>>> 
>>>>>>> Sorry I was thinking to the amount of patches related to TLS stuff ,
>>>>>>> but they are not related to Netty 4.
>>>>>>> 
>>>>>>> What about branch 3.4 ?
>>>>>>> We don't have reconfig but the case "Start embedded ZK, stop it and
>>>>>>> try to restart on the same port" should apply as well.
>>>>>>> 
>>>>>>> Enrico
>>>>>>> 
>>>>>>>>> Andor
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> On 2019. Jan 4., at 16:27, Enrico Olivelli <eo...@gmail.com>
>>>>>> wrote:
>>>>>>>>>> Il giorno ven 4 gen 2019 alle ore 14:23 Andor Molnar
>>>>>>>>>> <andor@apache.org <ma...@apache.org>> ha scritto:
>>>>>>>>>>> Hi team / Enrico,
>>>>>>>>>>> 
>>>>>>>>>>> I’d like to get feedback from the community on the following
>>> patch
>>>>>>>>> (moving the discussion from GitHub to here):
>>>>>>>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-3204 <
>>>>>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-3204>
>>>>>>>>>>> https://github.com/apache/zookeeper/pull/753 <
>>>>>>>>> https://github.com/apache/zookeeper/pull/753>
>>>>>>>>>>> In a nutshell: looks like that Netty 3.10 is broken under Java
>>> 11: it
>>>>>>>>> doesn’t properly close the underlying socket (probably not closing
>>> the
>>>>>>>>> registered NIO selectors) and reconfig tests are unable to re-bind
>>> the
>>>>>>>>> ports. This problem is similar that we already fixed in NIO with
>>> the
>>>>>>>>> following patch:
>>>>>>>>> 
>>> https://github.com/apache/zookeeper/commit/c3babb94275ad667dc71c10dcb08a383a3c154c2
>>>>>>>>> <
>>>>>>>>> 
>>> https://github.com/apache/zookeeper/commit/c3babb94275ad667dc71c10dcb08a383a3c154c2
>>>>>>>>>>> The problem doesn’t show up on trunk which has been recently
>>> upgraded
>>>>>>>>> to Netty 4.
>>>>>>>>>>> Repro:
>>>>>>>>>>> - Start embedded ZK, stop it and try to restart on the same
>>> port, or
>>>>>>>>>>> - Start normal ensemble and reconfig to use different (client)
>>> port.
>>>>>>>>> Then reconfig back to the original port which should fail. (that’s
>>> the
>>>>>>>>> scenario which is covered in ReconfigTest)
>>>>>>>>>>> I created the above patch (#753) to backport Netty 4 upgrade to
>>> 3.5
>>>>>> and
>>>>>>>>> it fixes the problem with Java 11 (it doesn’t cause regression in
>>> the
>>>>>>>>> pre-commit build either), but Enrico is having concerns about
>>> making
>>>>>> such
>>>>>>>>> big change before the release.
>>>>>>>>>>> I tend to agree, but let’s see what are the options.
>>>>>>>>>>> 
>>>>>>>>>>> Thoughts:
>>>>>>>>>>> - Do we have to fix this? - Yes. Java 11 is LTS and I the bug is
>>>>>>>>> critical.
>>>>>>>>>>> - Can we fix Netty 3? - Maybe. Let’s say we find the bug in
>>> Netty 3,
>>>>>>>>> what can we do?
>>>>>>>>>>>     a) We cannot workaround from ZooKeeper itself and have to
>>>>>> submit
>>>>>>>>> a pull request for Netty. I think it’s quite unlikely that they
>>> will
>>>>>> accept
>>>>>>>>> the change given it’s not a security bug, but even if they did,
>>> only
>>>>>> the
>>>>>>>>> upgraded version of Netty 3 would work properly with ZooKeeper.
>>> Err.
>>>>>>>>>>>     b) We can workaround it from ZooKeeper: that could be option
>>>>>> #1,
>>>>>>>>> but I have a strong feeling about it’s not going to be the case.
>>>>>>>>>>> - Shall we upgrade to Netty 4? - this is option #2
>>>>>>>>>>> 
>>>>>>>>>>> Please share your thoughts, maybe you know about an option #3.
>>>>>>>>>> Thank you Andor
>>>>>>>>>> 
>>>>>>>>>> I have thought more about this problem, and I have checked that
>>> Netty
>>>>>>>>>> 3 is really dead/unmantained (last release in 2016).
>>>>>>>>>> If I understand correctly there is no easy workaround (nothing
>>> without
>>>>>>>>>> hacking Netty 3 internals)
>>>>>>>>>> 
>>>>>>>>>> As soon as we will declare 3.5.5 "stable" the world will hopefully
>>>>>>>>>> abandon 3.4 and switch to 3.5 + Netty (because of SSL support).
>>>>>>>>>> The network stack is very important so it is better to have Netty
>>> 4 as
>>>>>>>>>> foundation, I am thinking about security issues, we won't make an
>>>>>>>>>> "hotfix" release with the switch to Netty 4 because there is a
>>> bad bug
>>>>>>>>>> in Netty 3.
>>>>>>>>>> So better to switch now.
>>>>>>>>>> 
>>>>>>>>>> But Facebooks friends, expecially @ivmaykov did a lot of bugfixes
>>>>>>>>>> around Netty on master branch, we must be sure that what we are
>>>>>>>>>> delivering in 3.5.5 is stable.
>>>>>>>>>> 
>>>>>>>>>> We will also have to state clearly in the "release notes" that
>>> Netty
>>>>>>>>>> version is changed, as this may have a non trivial impact to
>>> memory
>>>>>>>>>> usage (i.e. Netty 4 uses more Direct memory by default)
>>>>>>>>>> 
>>>>>>>>>> So to recap my final opinion: +1 to switch to Netty 4 if we take
>>> care
>>>>>>>>>> of port all of the fixes around Netty 4 from master branch and we
>>>>>>>>>> state the switch clearly in the release notes
>>>>>>>>>> 
>>>>>>>>>> Enrico
>>>>>>>>>> 
>>>>>>>>>>> Regards,
>>>>>>>>>>> Andor


Re: JAVA 11 build is broken on 3.5

Posted by Andor Molnár <an...@apache.org>.
Thank you guys for your support.

I leave this open for the community to decide, but if none of the
binding voters (committers, PMCs) respond in 1 week, I'll take
responsibility and commit the patches. I know holidays are still going
on, but 3.5 release is important we should make progress.


Thanks,

Andor



On 1/9/19 21:54, Norbert Kalmar wrote:
> I agree with backporting Netty 4 and the related patches to 3.5.5.
> +1 (non binding) (and I might have already voted)
>
> If it took so long to release a stable 3.5, I don't think we should leave a
> known bug in the system, which has such an impact (not supporting current
> LTS java).
> The PRs shouldn't be hard to backport, no new development is required, and
> unfortunately maven migration isn't finished yet, so it's not like we need
> to postpone the release just because of Netty upgrade. That's my two cents
> anyway.
>
> Regards,
> Norbert
>
> On Wed, Jan 9, 2019 at 9:19 PM Enrico Olivelli <eo...@gmail.com> wrote:
>
>> Il giorno mer 9 gen 2019 alle ore 21:16 Andor Molnar
>> <an...@apache.org> ha scritto:
>>> So, you guys saying we should upgrade to Netty 4 and backport the
>> following patches too:
>>> ZOOKEEPER-3131 org.apache.zookeeper.server.WatchManager resource leak
>>> ZOOKEEPER-3163 Use session map to improve the performance when closing
>> session in Netty
>>> ZOOKEEPER-3146 Limit the maximum client connections per IP in
>> NettyServerCnxnFactory
>>> ZOOKEEPER-3177 Refactor request throttle logic in NIO and Netty to keep
>> the same behavior and make the code easier to maintain
>>> I think it’s still deliverable with the stable release. Regarding the
>> timing, let’s say we release 3.5.5 with Netty 3 and do the upgrade in
>> 3.5.6. Is it acceptable to do such a big with a minor version change?
>>
>> I think we can do the upgrade in 3.5.5.
>> We can't say Java 11 is supported if we don't have tests working. And
>> Java 11 is the current LTS release from Oracle.
>>
>> If we release now 3.5.5 then we need to work for 3.5.6 and then
>> release....it is a double work, we don't have so much resources :-(
>>
>> IMHO it is better to release 3.5.5 and start focusing on 3.6.0
>>
>> Enrico
>>
>>> Upgrading now isn't ideal either, but maybe somewhat less brutal.
>>>
>>> Regards,
>>> Andor
>>>
>>>
>>>
>>>
>>>> On 2019. Jan 8., at 2:10, Brian Nixon <br...@gmail.com>
>> wrote:
>>>> Also worth consideration for the Netty fixes backport bundle:
>>>>
>>>> ZOOKEEPER-3131 org.apache.zookeeper.server.WatchManager resource leak
>>>>
>>>> Our experience with Netty has been positive so far though I don't
>> think we
>>>> have enough info to declare it is stable. It makes a lot of sense to
>> me for
>>>> 3.5 to use Netty 4, the biggest question is the timing.
>>>>
>>>>
>>>> On Fri, Jan 4, 2019 at 10:44 AM Andor Molnár <an...@apache.org> wrote:
>>>>
>>>>> I got another one:
>>>>>
>>>>> ZOOKEEPER-3163 Use session map to improve the performance when closing
>>>>> session in Netty
>>>>>
>>>>>
>>>>> So, that doesn't seem too many. We can talk about backporting them,
>> but
>>>>> I don't think they're super critical for the first stable release.
>>>>>
>>>>> Regarding 3.4, I need to validate the embedded scenario, but at least
>>>>> the Java 11 build is green. :)
>>>>>
>>>>>
>>>>> Andor
>>>>>
>>>>>
>>>>>
>>>>> On 1/4/19 18:08, Enrico Olivelli wrote:
>>>>>> Il giorno ven 4 gen 2019 alle ore 17:04 Norbert Kalmar
>>>>>> <nk...@cloudera.com.invalid> ha scritto:
>>>>>>> +1 for Netty 4 in 3.5
>>>>>>>
>>>>>>> Pretty much all the pros and cons has been said before me.
>>>>>>> I would only add that this is not a new functionality that we wan't
>> to
>>>>>>> backport. It's a criticall(ish?) bugfix, which requires quite a bit
>> of
>>>>>>> change unfortunately.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Norbert
>>>>>>>
>>>>>>> On Fri, Jan 4, 2019 at 4:36 PM Andor Molnar <an...@apache.org>
>> wrote:
>>>>>>>> What are those patches exactly?
>>>>>>>>
>>>>>>>> Comparing the ported version of 3.5 with master I’ve only found 2
>>>>> patches
>>>>>>>> which are missing:
>>>>>>>>
>>>>>>>> ZOOKEEPER-3146 Limit the maximum client connections per IP in
>>>>>>>> NettyServerCnxnFactory
>>>>>>>> ZOOKEEPER-3177 Refactor request throttle logic in NIO and Netty to
>> keep
>>>>>>>> the same behavior and make the code easier to maintain
>>>>>>>>
>>>>>>>> None of them are critical I would say.
>>>>>>>> Is there anything else I’m missing?
>>>>>> I have compared the histories and I think Andor you are right.
>>>>>>
>>>>>> master:
>>>>>> https://github.com/apache/zookeeper/commits/master
>>>>>>
>>>>>> branch-3.5:
>>>>>> https://github.com/apache/zookeeper/commits/branch-3.5
>>>>>>
>>>>>> Sorry I was thinking to the amount of patches related to TLS stuff ,
>>>>>> but they are not related to Netty 4.
>>>>>>
>>>>>> What about branch 3.4 ?
>>>>>> We don't have reconfig but the case "Start embedded ZK, stop it and
>>>>>> try to restart on the same port" should apply as well.
>>>>>>
>>>>>> Enrico
>>>>>>
>>>>>>>> Andor
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> On 2019. Jan 4., at 16:27, Enrico Olivelli <eo...@gmail.com>
>>>>> wrote:
>>>>>>>>> Il giorno ven 4 gen 2019 alle ore 14:23 Andor Molnar
>>>>>>>>> <andor@apache.org <ma...@apache.org>> ha scritto:
>>>>>>>>>> Hi team / Enrico,
>>>>>>>>>>
>>>>>>>>>> I’d like to get feedback from the community on the following
>> patch
>>>>>>>> (moving the discussion from GitHub to here):
>>>>>>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-3204 <
>>>>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-3204>
>>>>>>>>>> https://github.com/apache/zookeeper/pull/753 <
>>>>>>>> https://github.com/apache/zookeeper/pull/753>
>>>>>>>>>> In a nutshell: looks like that Netty 3.10 is broken under Java
>> 11: it
>>>>>>>> doesn’t properly close the underlying socket (probably not closing
>> the
>>>>>>>> registered NIO selectors) and reconfig tests are unable to re-bind
>> the
>>>>>>>> ports. This problem is similar that we already fixed in NIO with
>> the
>>>>>>>> following patch:
>>>>>>>>
>> https://github.com/apache/zookeeper/commit/c3babb94275ad667dc71c10dcb08a383a3c154c2
>>>>>>>> <
>>>>>>>>
>> https://github.com/apache/zookeeper/commit/c3babb94275ad667dc71c10dcb08a383a3c154c2
>>>>>>>>>> The problem doesn’t show up on trunk which has been recently
>> upgraded
>>>>>>>> to Netty 4.
>>>>>>>>>> Repro:
>>>>>>>>>> - Start embedded ZK, stop it and try to restart on the same
>> port, or
>>>>>>>>>> - Start normal ensemble and reconfig to use different (client)
>> port.
>>>>>>>> Then reconfig back to the original port which should fail. (that’s
>> the
>>>>>>>> scenario which is covered in ReconfigTest)
>>>>>>>>>> I created the above patch (#753) to backport Netty 4 upgrade to
>> 3.5
>>>>> and
>>>>>>>> it fixes the problem with Java 11 (it doesn’t cause regression in
>> the
>>>>>>>> pre-commit build either), but Enrico is having concerns about
>> making
>>>>> such
>>>>>>>> big change before the release.
>>>>>>>>>> I tend to agree, but let’s see what are the options.
>>>>>>>>>>
>>>>>>>>>> Thoughts:
>>>>>>>>>> - Do we have to fix this? - Yes. Java 11 is LTS and I the bug is
>>>>>>>> critical.
>>>>>>>>>> - Can we fix Netty 3? - Maybe. Let’s say we find the bug in
>> Netty 3,
>>>>>>>> what can we do?
>>>>>>>>>>      a) We cannot workaround from ZooKeeper itself and have to
>>>>> submit
>>>>>>>> a pull request for Netty. I think it’s quite unlikely that they
>> will
>>>>> accept
>>>>>>>> the change given it’s not a security bug, but even if they did,
>> only
>>>>> the
>>>>>>>> upgraded version of Netty 3 would work properly with ZooKeeper.
>> Err.
>>>>>>>>>>      b) We can workaround it from ZooKeeper: that could be option
>>>>> #1,
>>>>>>>> but I have a strong feeling about it’s not going to be the case.
>>>>>>>>>> - Shall we upgrade to Netty 4? - this is option #2
>>>>>>>>>>
>>>>>>>>>> Please share your thoughts, maybe you know about an option #3.
>>>>>>>>> Thank you Andor
>>>>>>>>>
>>>>>>>>> I have thought more about this problem, and I have checked that
>> Netty
>>>>>>>>> 3 is really dead/unmantained (last release in 2016).
>>>>>>>>> If I understand correctly there is no easy workaround (nothing
>> without
>>>>>>>>> hacking Netty 3 internals)
>>>>>>>>>
>>>>>>>>> As soon as we will declare 3.5.5 "stable" the world will hopefully
>>>>>>>>> abandon 3.4 and switch to 3.5 + Netty (because of SSL support).
>>>>>>>>> The network stack is very important so it is better to have Netty
>> 4 as
>>>>>>>>> foundation, I am thinking about security issues, we won't make an
>>>>>>>>> "hotfix" release with the switch to Netty 4 because there is a
>> bad bug
>>>>>>>>> in Netty 3.
>>>>>>>>> So better to switch now.
>>>>>>>>>
>>>>>>>>> But Facebooks friends, expecially @ivmaykov did a lot of bugfixes
>>>>>>>>> around Netty on master branch, we must be sure that what we are
>>>>>>>>> delivering in 3.5.5 is stable.
>>>>>>>>>
>>>>>>>>> We will also have to state clearly in the "release notes" that
>> Netty
>>>>>>>>> version is changed, as this may have a non trivial impact to
>> memory
>>>>>>>>> usage (i.e. Netty 4 uses more Direct memory by default)
>>>>>>>>>
>>>>>>>>> So to recap my final opinion: +1 to switch to Netty 4 if we take
>> care
>>>>>>>>> of port all of the fixes around Netty 4 from master branch and we
>>>>>>>>> state the switch clearly in the release notes
>>>>>>>>>
>>>>>>>>> Enrico
>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Andor

Re: JAVA 11 build is broken on 3.5

Posted by Norbert Kalmar <nk...@cloudera.com.INVALID>.
I agree with backporting Netty 4 and the related patches to 3.5.5.
+1 (non binding) (and I might have already voted)

If it took so long to release a stable 3.5, I don't think we should leave a
known bug in the system, which has such an impact (not supporting current
LTS java).
The PRs shouldn't be hard to backport, no new development is required, and
unfortunately maven migration isn't finished yet, so it's not like we need
to postpone the release just because of Netty upgrade. That's my two cents
anyway.

Regards,
Norbert

On Wed, Jan 9, 2019 at 9:19 PM Enrico Olivelli <eo...@gmail.com> wrote:

> Il giorno mer 9 gen 2019 alle ore 21:16 Andor Molnar
> <an...@apache.org> ha scritto:
> >
> > So, you guys saying we should upgrade to Netty 4 and backport the
> following patches too:
> >
> > ZOOKEEPER-3131 org.apache.zookeeper.server.WatchManager resource leak
> > ZOOKEEPER-3163 Use session map to improve the performance when closing
> session in Netty
> > ZOOKEEPER-3146 Limit the maximum client connections per IP in
> NettyServerCnxnFactory
> > ZOOKEEPER-3177 Refactor request throttle logic in NIO and Netty to keep
> the same behavior and make the code easier to maintain
> >
> > I think it’s still deliverable with the stable release. Regarding the
> timing, let’s say we release 3.5.5 with Netty 3 and do the upgrade in
> 3.5.6. Is it acceptable to do such a big with a minor version change?
>
> I think we can do the upgrade in 3.5.5.
> We can't say Java 11 is supported if we don't have tests working. And
> Java 11 is the current LTS release from Oracle.
>
> If we release now 3.5.5 then we need to work for 3.5.6 and then
> release....it is a double work, we don't have so much resources :-(
>
> IMHO it is better to release 3.5.5 and start focusing on 3.6.0
>
> Enrico
>
> >
> > Upgrading now isn't ideal either, but maybe somewhat less brutal.
> >
> > Regards,
> > Andor
> >
> >
> >
> >
> > > On 2019. Jan 8., at 2:10, Brian Nixon <br...@gmail.com>
> wrote:
> > >
> > > Also worth consideration for the Netty fixes backport bundle:
> > >
> > > ZOOKEEPER-3131 org.apache.zookeeper.server.WatchManager resource leak
> > >
> > > Our experience with Netty has been positive so far though I don't
> think we
> > > have enough info to declare it is stable. It makes a lot of sense to
> me for
> > > 3.5 to use Netty 4, the biggest question is the timing.
> > >
> > >
> > > On Fri, Jan 4, 2019 at 10:44 AM Andor Molnár <an...@apache.org> wrote:
> > >
> > >> I got another one:
> > >>
> > >> ZOOKEEPER-3163 Use session map to improve the performance when closing
> > >> session in Netty
> > >>
> > >>
> > >> So, that doesn't seem too many. We can talk about backporting them,
> but
> > >> I don't think they're super critical for the first stable release.
> > >>
> > >> Regarding 3.4, I need to validate the embedded scenario, but at least
> > >> the Java 11 build is green. :)
> > >>
> > >>
> > >> Andor
> > >>
> > >>
> > >>
> > >> On 1/4/19 18:08, Enrico Olivelli wrote:
> > >>> Il giorno ven 4 gen 2019 alle ore 17:04 Norbert Kalmar
> > >>> <nk...@cloudera.com.invalid> ha scritto:
> > >>>> +1 for Netty 4 in 3.5
> > >>>>
> > >>>> Pretty much all the pros and cons has been said before me.
> > >>>> I would only add that this is not a new functionality that we wan't
> to
> > >>>> backport. It's a criticall(ish?) bugfix, which requires quite a bit
> of
> > >>>> change unfortunately.
> > >>>>
> > >>>> Regards,
> > >>>> Norbert
> > >>>>
> > >>>> On Fri, Jan 4, 2019 at 4:36 PM Andor Molnar <an...@apache.org>
> wrote:
> > >>>>
> > >>>>> What are those patches exactly?
> > >>>>>
> > >>>>> Comparing the ported version of 3.5 with master I’ve only found 2
> > >> patches
> > >>>>> which are missing:
> > >>>>>
> > >>>>> ZOOKEEPER-3146 Limit the maximum client connections per IP in
> > >>>>> NettyServerCnxnFactory
> > >>>>> ZOOKEEPER-3177 Refactor request throttle logic in NIO and Netty to
> keep
> > >>>>> the same behavior and make the code easier to maintain
> > >>>>>
> > >>>>> None of them are critical I would say.
> > >>>>> Is there anything else I’m missing?
> > >>> I have compared the histories and I think Andor you are right.
> > >>>
> > >>> master:
> > >>> https://github.com/apache/zookeeper/commits/master
> > >>>
> > >>> branch-3.5:
> > >>> https://github.com/apache/zookeeper/commits/branch-3.5
> > >>>
> > >>> Sorry I was thinking to the amount of patches related to TLS stuff ,
> > >>> but they are not related to Netty 4.
> > >>>
> > >>> What about branch 3.4 ?
> > >>> We don't have reconfig but the case "Start embedded ZK, stop it and
> > >>> try to restart on the same port" should apply as well.
> > >>>
> > >>> Enrico
> > >>>
> > >>>>> Andor
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>>> On 2019. Jan 4., at 16:27, Enrico Olivelli <eo...@gmail.com>
> > >> wrote:
> > >>>>>>
> > >>>>>> Il giorno ven 4 gen 2019 alle ore 14:23 Andor Molnar
> > >>>>>> <andor@apache.org <ma...@apache.org>> ha scritto:
> > >>>>>>> Hi team / Enrico,
> > >>>>>>>
> > >>>>>>> I’d like to get feedback from the community on the following
> patch
> > >>>>> (moving the discussion from GitHub to here):
> > >>>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-3204 <
> > >>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-3204>
> > >>>>>>> https://github.com/apache/zookeeper/pull/753 <
> > >>>>> https://github.com/apache/zookeeper/pull/753>
> > >>>>>>> In a nutshell: looks like that Netty 3.10 is broken under Java
> 11: it
> > >>>>> doesn’t properly close the underlying socket (probably not closing
> the
> > >>>>> registered NIO selectors) and reconfig tests are unable to re-bind
> the
> > >>>>> ports. This problem is similar that we already fixed in NIO with
> the
> > >>>>> following patch:
> > >>>>>
> > >>
> https://github.com/apache/zookeeper/commit/c3babb94275ad667dc71c10dcb08a383a3c154c2
> > >>>>> <
> > >>>>>
> > >>
> https://github.com/apache/zookeeper/commit/c3babb94275ad667dc71c10dcb08a383a3c154c2
> > >>>>>>> The problem doesn’t show up on trunk which has been recently
> upgraded
> > >>>>> to Netty 4.
> > >>>>>>> Repro:
> > >>>>>>> - Start embedded ZK, stop it and try to restart on the same
> port, or
> > >>>>>>> - Start normal ensemble and reconfig to use different (client)
> port.
> > >>>>> Then reconfig back to the original port which should fail. (that’s
> the
> > >>>>> scenario which is covered in ReconfigTest)
> > >>>>>>> I created the above patch (#753) to backport Netty 4 upgrade to
> 3.5
> > >> and
> > >>>>> it fixes the problem with Java 11 (it doesn’t cause regression in
> the
> > >>>>> pre-commit build either), but Enrico is having concerns about
> making
> > >> such
> > >>>>> big change before the release.
> > >>>>>>> I tend to agree, but let’s see what are the options.
> > >>>>>>>
> > >>>>>>> Thoughts:
> > >>>>>>> - Do we have to fix this? - Yes. Java 11 is LTS and I the bug is
> > >>>>> critical.
> > >>>>>>> - Can we fix Netty 3? - Maybe. Let’s say we find the bug in
> Netty 3,
> > >>>>> what can we do?
> > >>>>>>>      a) We cannot workaround from ZooKeeper itself and have to
> > >> submit
> > >>>>> a pull request for Netty. I think it’s quite unlikely that they
> will
> > >> accept
> > >>>>> the change given it’s not a security bug, but even if they did,
> only
> > >> the
> > >>>>> upgraded version of Netty 3 would work properly with ZooKeeper.
> Err.
> > >>>>>>>      b) We can workaround it from ZooKeeper: that could be option
> > >> #1,
> > >>>>> but I have a strong feeling about it’s not going to be the case.
> > >>>>>>> - Shall we upgrade to Netty 4? - this is option #2
> > >>>>>>>
> > >>>>>>> Please share your thoughts, maybe you know about an option #3.
> > >>>>>> Thank you Andor
> > >>>>>>
> > >>>>>> I have thought more about this problem, and I have checked that
> Netty
> > >>>>>> 3 is really dead/unmantained (last release in 2016).
> > >>>>>> If I understand correctly there is no easy workaround (nothing
> without
> > >>>>>> hacking Netty 3 internals)
> > >>>>>>
> > >>>>>> As soon as we will declare 3.5.5 "stable" the world will hopefully
> > >>>>>> abandon 3.4 and switch to 3.5 + Netty (because of SSL support).
> > >>>>>> The network stack is very important so it is better to have Netty
> 4 as
> > >>>>>> foundation, I am thinking about security issues, we won't make an
> > >>>>>> "hotfix" release with the switch to Netty 4 because there is a
> bad bug
> > >>>>>> in Netty 3.
> > >>>>>> So better to switch now.
> > >>>>>>
> > >>>>>> But Facebooks friends, expecially @ivmaykov did a lot of bugfixes
> > >>>>>> around Netty on master branch, we must be sure that what we are
> > >>>>>> delivering in 3.5.5 is stable.
> > >>>>>>
> > >>>>>> We will also have to state clearly in the "release notes" that
> Netty
> > >>>>>> version is changed, as this may have a non trivial impact to
> memory
> > >>>>>> usage (i.e. Netty 4 uses more Direct memory by default)
> > >>>>>>
> > >>>>>> So to recap my final opinion: +1 to switch to Netty 4 if we take
> care
> > >>>>>> of port all of the fixes around Netty 4 from master branch and we
> > >>>>>> state the switch clearly in the release notes
> > >>>>>>
> > >>>>>> Enrico
> > >>>>>>
> > >>>>>>> Regards,
> > >>>>>>> Andor
> > >>>>>
> > >>
> >
>

Re: JAVA 11 build is broken on 3.5

Posted by Enrico Olivelli <eo...@gmail.com>.
Il giorno mer 9 gen 2019 alle ore 21:16 Andor Molnar
<an...@apache.org> ha scritto:
>
> So, you guys saying we should upgrade to Netty 4 and backport the following patches too:
>
> ZOOKEEPER-3131 org.apache.zookeeper.server.WatchManager resource leak
> ZOOKEEPER-3163 Use session map to improve the performance when closing session in Netty
> ZOOKEEPER-3146 Limit the maximum client connections per IP in NettyServerCnxnFactory
> ZOOKEEPER-3177 Refactor request throttle logic in NIO and Netty to keep the same behavior and make the code easier to maintain
>
> I think it’s still deliverable with the stable release. Regarding the timing, let’s say we release 3.5.5 with Netty 3 and do the upgrade in 3.5.6. Is it acceptable to do such a big with a minor version change?

I think we can do the upgrade in 3.5.5.
We can't say Java 11 is supported if we don't have tests working. And
Java 11 is the current LTS release from Oracle.

If we release now 3.5.5 then we need to work for 3.5.6 and then
release....it is a double work, we don't have so much resources :-(

IMHO it is better to release 3.5.5 and start focusing on 3.6.0

Enrico

>
> Upgrading now isn't ideal either, but maybe somewhat less brutal.
>
> Regards,
> Andor
>
>
>
>
> > On 2019. Jan 8., at 2:10, Brian Nixon <br...@gmail.com> wrote:
> >
> > Also worth consideration for the Netty fixes backport bundle:
> >
> > ZOOKEEPER-3131 org.apache.zookeeper.server.WatchManager resource leak
> >
> > Our experience with Netty has been positive so far though I don't think we
> > have enough info to declare it is stable. It makes a lot of sense to me for
> > 3.5 to use Netty 4, the biggest question is the timing.
> >
> >
> > On Fri, Jan 4, 2019 at 10:44 AM Andor Molnár <an...@apache.org> wrote:
> >
> >> I got another one:
> >>
> >> ZOOKEEPER-3163 Use session map to improve the performance when closing
> >> session in Netty
> >>
> >>
> >> So, that doesn't seem too many. We can talk about backporting them, but
> >> I don't think they're super critical for the first stable release.
> >>
> >> Regarding 3.4, I need to validate the embedded scenario, but at least
> >> the Java 11 build is green. :)
> >>
> >>
> >> Andor
> >>
> >>
> >>
> >> On 1/4/19 18:08, Enrico Olivelli wrote:
> >>> Il giorno ven 4 gen 2019 alle ore 17:04 Norbert Kalmar
> >>> <nk...@cloudera.com.invalid> ha scritto:
> >>>> +1 for Netty 4 in 3.5
> >>>>
> >>>> Pretty much all the pros and cons has been said before me.
> >>>> I would only add that this is not a new functionality that we wan't to
> >>>> backport. It's a criticall(ish?) bugfix, which requires quite a bit of
> >>>> change unfortunately.
> >>>>
> >>>> Regards,
> >>>> Norbert
> >>>>
> >>>> On Fri, Jan 4, 2019 at 4:36 PM Andor Molnar <an...@apache.org> wrote:
> >>>>
> >>>>> What are those patches exactly?
> >>>>>
> >>>>> Comparing the ported version of 3.5 with master I’ve only found 2
> >> patches
> >>>>> which are missing:
> >>>>>
> >>>>> ZOOKEEPER-3146 Limit the maximum client connections per IP in
> >>>>> NettyServerCnxnFactory
> >>>>> ZOOKEEPER-3177 Refactor request throttle logic in NIO and Netty to keep
> >>>>> the same behavior and make the code easier to maintain
> >>>>>
> >>>>> None of them are critical I would say.
> >>>>> Is there anything else I’m missing?
> >>> I have compared the histories and I think Andor you are right.
> >>>
> >>> master:
> >>> https://github.com/apache/zookeeper/commits/master
> >>>
> >>> branch-3.5:
> >>> https://github.com/apache/zookeeper/commits/branch-3.5
> >>>
> >>> Sorry I was thinking to the amount of patches related to TLS stuff ,
> >>> but they are not related to Netty 4.
> >>>
> >>> What about branch 3.4 ?
> >>> We don't have reconfig but the case "Start embedded ZK, stop it and
> >>> try to restart on the same port" should apply as well.
> >>>
> >>> Enrico
> >>>
> >>>>> Andor
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>> On 2019. Jan 4., at 16:27, Enrico Olivelli <eo...@gmail.com>
> >> wrote:
> >>>>>>
> >>>>>> Il giorno ven 4 gen 2019 alle ore 14:23 Andor Molnar
> >>>>>> <andor@apache.org <ma...@apache.org>> ha scritto:
> >>>>>>> Hi team / Enrico,
> >>>>>>>
> >>>>>>> I’d like to get feedback from the community on the following patch
> >>>>> (moving the discussion from GitHub to here):
> >>>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-3204 <
> >>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-3204>
> >>>>>>> https://github.com/apache/zookeeper/pull/753 <
> >>>>> https://github.com/apache/zookeeper/pull/753>
> >>>>>>> In a nutshell: looks like that Netty 3.10 is broken under Java 11: it
> >>>>> doesn’t properly close the underlying socket (probably not closing the
> >>>>> registered NIO selectors) and reconfig tests are unable to re-bind the
> >>>>> ports. This problem is similar that we already fixed in NIO with the
> >>>>> following patch:
> >>>>>
> >> https://github.com/apache/zookeeper/commit/c3babb94275ad667dc71c10dcb08a383a3c154c2
> >>>>> <
> >>>>>
> >> https://github.com/apache/zookeeper/commit/c3babb94275ad667dc71c10dcb08a383a3c154c2
> >>>>>>> The problem doesn’t show up on trunk which has been recently upgraded
> >>>>> to Netty 4.
> >>>>>>> Repro:
> >>>>>>> - Start embedded ZK, stop it and try to restart on the same port, or
> >>>>>>> - Start normal ensemble and reconfig to use different (client) port.
> >>>>> Then reconfig back to the original port which should fail. (that’s the
> >>>>> scenario which is covered in ReconfigTest)
> >>>>>>> I created the above patch (#753) to backport Netty 4 upgrade to 3.5
> >> and
> >>>>> it fixes the problem with Java 11 (it doesn’t cause regression in the
> >>>>> pre-commit build either), but Enrico is having concerns about making
> >> such
> >>>>> big change before the release.
> >>>>>>> I tend to agree, but let’s see what are the options.
> >>>>>>>
> >>>>>>> Thoughts:
> >>>>>>> - Do we have to fix this? - Yes. Java 11 is LTS and I the bug is
> >>>>> critical.
> >>>>>>> - Can we fix Netty 3? - Maybe. Let’s say we find the bug in Netty 3,
> >>>>> what can we do?
> >>>>>>>      a) We cannot workaround from ZooKeeper itself and have to
> >> submit
> >>>>> a pull request for Netty. I think it’s quite unlikely that they will
> >> accept
> >>>>> the change given it’s not a security bug, but even if they did, only
> >> the
> >>>>> upgraded version of Netty 3 would work properly with ZooKeeper. Err.
> >>>>>>>      b) We can workaround it from ZooKeeper: that could be option
> >> #1,
> >>>>> but I have a strong feeling about it’s not going to be the case.
> >>>>>>> - Shall we upgrade to Netty 4? - this is option #2
> >>>>>>>
> >>>>>>> Please share your thoughts, maybe you know about an option #3.
> >>>>>> Thank you Andor
> >>>>>>
> >>>>>> I have thought more about this problem, and I have checked that Netty
> >>>>>> 3 is really dead/unmantained (last release in 2016).
> >>>>>> If I understand correctly there is no easy workaround (nothing without
> >>>>>> hacking Netty 3 internals)
> >>>>>>
> >>>>>> As soon as we will declare 3.5.5 "stable" the world will hopefully
> >>>>>> abandon 3.4 and switch to 3.5 + Netty (because of SSL support).
> >>>>>> The network stack is very important so it is better to have Netty 4 as
> >>>>>> foundation, I am thinking about security issues, we won't make an
> >>>>>> "hotfix" release with the switch to Netty 4 because there is a bad bug
> >>>>>> in Netty 3.
> >>>>>> So better to switch now.
> >>>>>>
> >>>>>> But Facebooks friends, expecially @ivmaykov did a lot of bugfixes
> >>>>>> around Netty on master branch, we must be sure that what we are
> >>>>>> delivering in 3.5.5 is stable.
> >>>>>>
> >>>>>> We will also have to state clearly in the "release notes" that Netty
> >>>>>> version is changed, as this may have a non trivial impact to memory
> >>>>>> usage (i.e. Netty 4 uses more Direct memory by default)
> >>>>>>
> >>>>>> So to recap my final opinion: +1 to switch to Netty 4 if we take care
> >>>>>> of port all of the fixes around Netty 4 from master branch and we
> >>>>>> state the switch clearly in the release notes
> >>>>>>
> >>>>>> Enrico
> >>>>>>
> >>>>>>> Regards,
> >>>>>>> Andor
> >>>>>
> >>
>

Re: JAVA 11 build is broken on 3.5

Posted by Andor Molnar <an...@apache.org>.
So, you guys saying we should upgrade to Netty 4 and backport the following patches too:

ZOOKEEPER-3131 org.apache.zookeeper.server.WatchManager resource leak
ZOOKEEPER-3163 Use session map to improve the performance when closing session in Netty
ZOOKEEPER-3146 Limit the maximum client connections per IP in NettyServerCnxnFactory
ZOOKEEPER-3177 Refactor request throttle logic in NIO and Netty to keep the same behavior and make the code easier to maintain

I think it’s still deliverable with the stable release. Regarding the timing, let’s say we release 3.5.5 with Netty 3 and do the upgrade in 3.5.6. Is it acceptable to do such a big with a minor version change?

Upgrading now isn't ideal either, but maybe somewhat less brutal.

Regards,
Andor




> On 2019. Jan 8., at 2:10, Brian Nixon <br...@gmail.com> wrote:
> 
> Also worth consideration for the Netty fixes backport bundle:
> 
> ZOOKEEPER-3131 org.apache.zookeeper.server.WatchManager resource leak
> 
> Our experience with Netty has been positive so far though I don't think we
> have enough info to declare it is stable. It makes a lot of sense to me for
> 3.5 to use Netty 4, the biggest question is the timing.
> 
> 
> On Fri, Jan 4, 2019 at 10:44 AM Andor Molnár <an...@apache.org> wrote:
> 
>> I got another one:
>> 
>> ZOOKEEPER-3163 Use session map to improve the performance when closing
>> session in Netty
>> 
>> 
>> So, that doesn't seem too many. We can talk about backporting them, but
>> I don't think they're super critical for the first stable release.
>> 
>> Regarding 3.4, I need to validate the embedded scenario, but at least
>> the Java 11 build is green. :)
>> 
>> 
>> Andor
>> 
>> 
>> 
>> On 1/4/19 18:08, Enrico Olivelli wrote:
>>> Il giorno ven 4 gen 2019 alle ore 17:04 Norbert Kalmar
>>> <nk...@cloudera.com.invalid> ha scritto:
>>>> +1 for Netty 4 in 3.5
>>>> 
>>>> Pretty much all the pros and cons has been said before me.
>>>> I would only add that this is not a new functionality that we wan't to
>>>> backport. It's a criticall(ish?) bugfix, which requires quite a bit of
>>>> change unfortunately.
>>>> 
>>>> Regards,
>>>> Norbert
>>>> 
>>>> On Fri, Jan 4, 2019 at 4:36 PM Andor Molnar <an...@apache.org> wrote:
>>>> 
>>>>> What are those patches exactly?
>>>>> 
>>>>> Comparing the ported version of 3.5 with master I’ve only found 2
>> patches
>>>>> which are missing:
>>>>> 
>>>>> ZOOKEEPER-3146 Limit the maximum client connections per IP in
>>>>> NettyServerCnxnFactory
>>>>> ZOOKEEPER-3177 Refactor request throttle logic in NIO and Netty to keep
>>>>> the same behavior and make the code easier to maintain
>>>>> 
>>>>> None of them are critical I would say.
>>>>> Is there anything else I’m missing?
>>> I have compared the histories and I think Andor you are right.
>>> 
>>> master:
>>> https://github.com/apache/zookeeper/commits/master
>>> 
>>> branch-3.5:
>>> https://github.com/apache/zookeeper/commits/branch-3.5
>>> 
>>> Sorry I was thinking to the amount of patches related to TLS stuff ,
>>> but they are not related to Netty 4.
>>> 
>>> What about branch 3.4 ?
>>> We don't have reconfig but the case "Start embedded ZK, stop it and
>>> try to restart on the same port" should apply as well.
>>> 
>>> Enrico
>>> 
>>>>> Andor
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>>> On 2019. Jan 4., at 16:27, Enrico Olivelli <eo...@gmail.com>
>> wrote:
>>>>>> 
>>>>>> Il giorno ven 4 gen 2019 alle ore 14:23 Andor Molnar
>>>>>> <andor@apache.org <ma...@apache.org>> ha scritto:
>>>>>>> Hi team / Enrico,
>>>>>>> 
>>>>>>> I’d like to get feedback from the community on the following patch
>>>>> (moving the discussion from GitHub to here):
>>>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-3204 <
>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-3204>
>>>>>>> https://github.com/apache/zookeeper/pull/753 <
>>>>> https://github.com/apache/zookeeper/pull/753>
>>>>>>> In a nutshell: looks like that Netty 3.10 is broken under Java 11: it
>>>>> doesn’t properly close the underlying socket (probably not closing the
>>>>> registered NIO selectors) and reconfig tests are unable to re-bind the
>>>>> ports. This problem is similar that we already fixed in NIO with the
>>>>> following patch:
>>>>> 
>> https://github.com/apache/zookeeper/commit/c3babb94275ad667dc71c10dcb08a383a3c154c2
>>>>> <
>>>>> 
>> https://github.com/apache/zookeeper/commit/c3babb94275ad667dc71c10dcb08a383a3c154c2
>>>>>>> The problem doesn’t show up on trunk which has been recently upgraded
>>>>> to Netty 4.
>>>>>>> Repro:
>>>>>>> - Start embedded ZK, stop it and try to restart on the same port, or
>>>>>>> - Start normal ensemble and reconfig to use different (client) port.
>>>>> Then reconfig back to the original port which should fail. (that’s the
>>>>> scenario which is covered in ReconfigTest)
>>>>>>> I created the above patch (#753) to backport Netty 4 upgrade to 3.5
>> and
>>>>> it fixes the problem with Java 11 (it doesn’t cause regression in the
>>>>> pre-commit build either), but Enrico is having concerns about making
>> such
>>>>> big change before the release.
>>>>>>> I tend to agree, but let’s see what are the options.
>>>>>>> 
>>>>>>> Thoughts:
>>>>>>> - Do we have to fix this? - Yes. Java 11 is LTS and I the bug is
>>>>> critical.
>>>>>>> - Can we fix Netty 3? - Maybe. Let’s say we find the bug in Netty 3,
>>>>> what can we do?
>>>>>>>      a) We cannot workaround from ZooKeeper itself and have to
>> submit
>>>>> a pull request for Netty. I think it’s quite unlikely that they will
>> accept
>>>>> the change given it’s not a security bug, but even if they did, only
>> the
>>>>> upgraded version of Netty 3 would work properly with ZooKeeper. Err.
>>>>>>>      b) We can workaround it from ZooKeeper: that could be option
>> #1,
>>>>> but I have a strong feeling about it’s not going to be the case.
>>>>>>> - Shall we upgrade to Netty 4? - this is option #2
>>>>>>> 
>>>>>>> Please share your thoughts, maybe you know about an option #3.
>>>>>> Thank you Andor
>>>>>> 
>>>>>> I have thought more about this problem, and I have checked that Netty
>>>>>> 3 is really dead/unmantained (last release in 2016).
>>>>>> If I understand correctly there is no easy workaround (nothing without
>>>>>> hacking Netty 3 internals)
>>>>>> 
>>>>>> As soon as we will declare 3.5.5 "stable" the world will hopefully
>>>>>> abandon 3.4 and switch to 3.5 + Netty (because of SSL support).
>>>>>> The network stack is very important so it is better to have Netty 4 as
>>>>>> foundation, I am thinking about security issues, we won't make an
>>>>>> "hotfix" release with the switch to Netty 4 because there is a bad bug
>>>>>> in Netty 3.
>>>>>> So better to switch now.
>>>>>> 
>>>>>> But Facebooks friends, expecially @ivmaykov did a lot of bugfixes
>>>>>> around Netty on master branch, we must be sure that what we are
>>>>>> delivering in 3.5.5 is stable.
>>>>>> 
>>>>>> We will also have to state clearly in the "release notes" that Netty
>>>>>> version is changed, as this may have a non trivial impact to memory
>>>>>> usage (i.e. Netty 4 uses more Direct memory by default)
>>>>>> 
>>>>>> So to recap my final opinion: +1 to switch to Netty 4 if we take care
>>>>>> of port all of the fixes around Netty 4 from master branch and we
>>>>>> state the switch clearly in the release notes
>>>>>> 
>>>>>> Enrico
>>>>>> 
>>>>>>> Regards,
>>>>>>> Andor
>>>>> 
>> 


Re: JAVA 11 build is broken on 3.5

Posted by Brian Nixon <br...@gmail.com>.
Also worth consideration for the Netty fixes backport bundle:

ZOOKEEPER-3131 org.apache.zookeeper.server.WatchManager resource leak

Our experience with Netty has been positive so far though I don't think we
have enough info to declare it is stable. It makes a lot of sense to me for
3.5 to use Netty 4, the biggest question is the timing.


On Fri, Jan 4, 2019 at 10:44 AM Andor Molnár <an...@apache.org> wrote:

> I got another one:
>
> ZOOKEEPER-3163 Use session map to improve the performance when closing
> session in Netty
>
>
> So, that doesn't seem too many. We can talk about backporting them, but
> I don't think they're super critical for the first stable release.
>
> Regarding 3.4, I need to validate the embedded scenario, but at least
> the Java 11 build is green. :)
>
>
> Andor
>
>
>
> On 1/4/19 18:08, Enrico Olivelli wrote:
> > Il giorno ven 4 gen 2019 alle ore 17:04 Norbert Kalmar
> > <nk...@cloudera.com.invalid> ha scritto:
> >> +1 for Netty 4 in 3.5
> >>
> >> Pretty much all the pros and cons has been said before me.
> >> I would only add that this is not a new functionality that we wan't to
> >> backport. It's a criticall(ish?) bugfix, which requires quite a bit of
> >> change unfortunately.
> >>
> >> Regards,
> >> Norbert
> >>
> >> On Fri, Jan 4, 2019 at 4:36 PM Andor Molnar <an...@apache.org> wrote:
> >>
> >>> What are those patches exactly?
> >>>
> >>> Comparing the ported version of 3.5 with master I’ve only found 2
> patches
> >>> which are missing:
> >>>
> >>> ZOOKEEPER-3146 Limit the maximum client connections per IP in
> >>> NettyServerCnxnFactory
> >>> ZOOKEEPER-3177 Refactor request throttle logic in NIO and Netty to keep
> >>> the same behavior and make the code easier to maintain
> >>>
> >>> None of them are critical I would say.
> >>> Is there anything else I’m missing?
> > I have compared the histories and I think Andor you are right.
> >
> > master:
> > https://github.com/apache/zookeeper/commits/master
> >
> > branch-3.5:
> > https://github.com/apache/zookeeper/commits/branch-3.5
> >
> > Sorry I was thinking to the amount of patches related to TLS stuff ,
> > but they are not related to Netty 4.
> >
> > What about branch 3.4 ?
> > We don't have reconfig but the case "Start embedded ZK, stop it and
> > try to restart on the same port" should apply as well.
> >
> > Enrico
> >
> >>> Andor
> >>>
> >>>
> >>>
> >>>
> >>>> On 2019. Jan 4., at 16:27, Enrico Olivelli <eo...@gmail.com>
> wrote:
> >>>>
> >>>> Il giorno ven 4 gen 2019 alle ore 14:23 Andor Molnar
> >>>> <andor@apache.org <ma...@apache.org>> ha scritto:
> >>>>> Hi team / Enrico,
> >>>>>
> >>>>> I’d like to get feedback from the community on the following patch
> >>> (moving the discussion from GitHub to here):
> >>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-3204 <
> >>> https://issues.apache.org/jira/browse/ZOOKEEPER-3204>
> >>>>> https://github.com/apache/zookeeper/pull/753 <
> >>> https://github.com/apache/zookeeper/pull/753>
> >>>>> In a nutshell: looks like that Netty 3.10 is broken under Java 11: it
> >>> doesn’t properly close the underlying socket (probably not closing the
> >>> registered NIO selectors) and reconfig tests are unable to re-bind the
> >>> ports. This problem is similar that we already fixed in NIO with the
> >>> following patch:
> >>>
> https://github.com/apache/zookeeper/commit/c3babb94275ad667dc71c10dcb08a383a3c154c2
> >>> <
> >>>
> https://github.com/apache/zookeeper/commit/c3babb94275ad667dc71c10dcb08a383a3c154c2
> >>>>> The problem doesn’t show up on trunk which has been recently upgraded
> >>> to Netty 4.
> >>>>> Repro:
> >>>>> - Start embedded ZK, stop it and try to restart on the same port, or
> >>>>> - Start normal ensemble and reconfig to use different (client) port.
> >>> Then reconfig back to the original port which should fail. (that’s the
> >>> scenario which is covered in ReconfigTest)
> >>>>> I created the above patch (#753) to backport Netty 4 upgrade to 3.5
> and
> >>> it fixes the problem with Java 11 (it doesn’t cause regression in the
> >>> pre-commit build either), but Enrico is having concerns about making
> such
> >>> big change before the release.
> >>>>> I tend to agree, but let’s see what are the options.
> >>>>>
> >>>>> Thoughts:
> >>>>> - Do we have to fix this? - Yes. Java 11 is LTS and I the bug is
> >>> critical.
> >>>>> - Can we fix Netty 3? - Maybe. Let’s say we find the bug in Netty 3,
> >>> what can we do?
> >>>>>       a) We cannot workaround from ZooKeeper itself and have to
> submit
> >>> a pull request for Netty. I think it’s quite unlikely that they will
> accept
> >>> the change given it’s not a security bug, but even if they did, only
> the
> >>> upgraded version of Netty 3 would work properly with ZooKeeper. Err.
> >>>>>       b) We can workaround it from ZooKeeper: that could be option
> #1,
> >>> but I have a strong feeling about it’s not going to be the case.
> >>>>> - Shall we upgrade to Netty 4? - this is option #2
> >>>>>
> >>>>> Please share your thoughts, maybe you know about an option #3.
> >>>> Thank you Andor
> >>>>
> >>>> I have thought more about this problem, and I have checked that Netty
> >>>> 3 is really dead/unmantained (last release in 2016).
> >>>> If I understand correctly there is no easy workaround (nothing without
> >>>> hacking Netty 3 internals)
> >>>>
> >>>> As soon as we will declare 3.5.5 "stable" the world will hopefully
> >>>> abandon 3.4 and switch to 3.5 + Netty (because of SSL support).
> >>>> The network stack is very important so it is better to have Netty 4 as
> >>>> foundation, I am thinking about security issues, we won't make an
> >>>> "hotfix" release with the switch to Netty 4 because there is a bad bug
> >>>> in Netty 3.
> >>>> So better to switch now.
> >>>>
> >>>> But Facebooks friends, expecially @ivmaykov did a lot of bugfixes
> >>>> around Netty on master branch, we must be sure that what we are
> >>>> delivering in 3.5.5 is stable.
> >>>>
> >>>> We will also have to state clearly in the "release notes" that Netty
> >>>> version is changed, as this may have a non trivial impact to memory
> >>>> usage (i.e. Netty 4 uses more Direct memory by default)
> >>>>
> >>>> So to recap my final opinion: +1 to switch to Netty 4 if we take care
> >>>> of port all of the fixes around Netty 4 from master branch and we
> >>>> state the switch clearly in the release notes
> >>>>
> >>>> Enrico
> >>>>
> >>>>> Regards,
> >>>>> Andor
> >>>
>

Re: JAVA 11 build is broken on 3.5

Posted by Andor Molnár <an...@apache.org>.
I got another one:

ZOOKEEPER-3163 Use session map to improve the performance when closing
session in Netty


So, that doesn't seem too many. We can talk about backporting them, but
I don't think they're super critical for the first stable release.

Regarding 3.4, I need to validate the embedded scenario, but at least
the Java 11 build is green. :)


Andor



On 1/4/19 18:08, Enrico Olivelli wrote:
> Il giorno ven 4 gen 2019 alle ore 17:04 Norbert Kalmar
> <nk...@cloudera.com.invalid> ha scritto:
>> +1 for Netty 4 in 3.5
>>
>> Pretty much all the pros and cons has been said before me.
>> I would only add that this is not a new functionality that we wan't to
>> backport. It's a criticall(ish?) bugfix, which requires quite a bit of
>> change unfortunately.
>>
>> Regards,
>> Norbert
>>
>> On Fri, Jan 4, 2019 at 4:36 PM Andor Molnar <an...@apache.org> wrote:
>>
>>> What are those patches exactly?
>>>
>>> Comparing the ported version of 3.5 with master I’ve only found 2 patches
>>> which are missing:
>>>
>>> ZOOKEEPER-3146 Limit the maximum client connections per IP in
>>> NettyServerCnxnFactory
>>> ZOOKEEPER-3177 Refactor request throttle logic in NIO and Netty to keep
>>> the same behavior and make the code easier to maintain
>>>
>>> None of them are critical I would say.
>>> Is there anything else I’m missing?
> I have compared the histories and I think Andor you are right.
>
> master:
> https://github.com/apache/zookeeper/commits/master
>
> branch-3.5:
> https://github.com/apache/zookeeper/commits/branch-3.5
>
> Sorry I was thinking to the amount of patches related to TLS stuff ,
> but they are not related to Netty 4.
>
> What about branch 3.4 ?
> We don't have reconfig but the case "Start embedded ZK, stop it and
> try to restart on the same port" should apply as well.
>
> Enrico
>
>>> Andor
>>>
>>>
>>>
>>>
>>>> On 2019. Jan 4., at 16:27, Enrico Olivelli <eo...@gmail.com> wrote:
>>>>
>>>> Il giorno ven 4 gen 2019 alle ore 14:23 Andor Molnar
>>>> <andor@apache.org <ma...@apache.org>> ha scritto:
>>>>> Hi team / Enrico,
>>>>>
>>>>> I’d like to get feedback from the community on the following patch
>>> (moving the discussion from GitHub to here):
>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-3204 <
>>> https://issues.apache.org/jira/browse/ZOOKEEPER-3204>
>>>>> https://github.com/apache/zookeeper/pull/753 <
>>> https://github.com/apache/zookeeper/pull/753>
>>>>> In a nutshell: looks like that Netty 3.10 is broken under Java 11: it
>>> doesn’t properly close the underlying socket (probably not closing the
>>> registered NIO selectors) and reconfig tests are unable to re-bind the
>>> ports. This problem is similar that we already fixed in NIO with the
>>> following patch:
>>> https://github.com/apache/zookeeper/commit/c3babb94275ad667dc71c10dcb08a383a3c154c2
>>> <
>>> https://github.com/apache/zookeeper/commit/c3babb94275ad667dc71c10dcb08a383a3c154c2
>>>>> The problem doesn’t show up on trunk which has been recently upgraded
>>> to Netty 4.
>>>>> Repro:
>>>>> - Start embedded ZK, stop it and try to restart on the same port, or
>>>>> - Start normal ensemble and reconfig to use different (client) port.
>>> Then reconfig back to the original port which should fail. (that’s the
>>> scenario which is covered in ReconfigTest)
>>>>> I created the above patch (#753) to backport Netty 4 upgrade to 3.5 and
>>> it fixes the problem with Java 11 (it doesn’t cause regression in the
>>> pre-commit build either), but Enrico is having concerns about making such
>>> big change before the release.
>>>>> I tend to agree, but let’s see what are the options.
>>>>>
>>>>> Thoughts:
>>>>> - Do we have to fix this? - Yes. Java 11 is LTS and I the bug is
>>> critical.
>>>>> - Can we fix Netty 3? - Maybe. Let’s say we find the bug in Netty 3,
>>> what can we do?
>>>>>       a) We cannot workaround from ZooKeeper itself and have to submit
>>> a pull request for Netty. I think it’s quite unlikely that they will accept
>>> the change given it’s not a security bug, but even if they did, only the
>>> upgraded version of Netty 3 would work properly with ZooKeeper. Err.
>>>>>       b) We can workaround it from ZooKeeper: that could be option #1,
>>> but I have a strong feeling about it’s not going to be the case.
>>>>> - Shall we upgrade to Netty 4? - this is option #2
>>>>>
>>>>> Please share your thoughts, maybe you know about an option #3.
>>>> Thank you Andor
>>>>
>>>> I have thought more about this problem, and I have checked that Netty
>>>> 3 is really dead/unmantained (last release in 2016).
>>>> If I understand correctly there is no easy workaround (nothing without
>>>> hacking Netty 3 internals)
>>>>
>>>> As soon as we will declare 3.5.5 "stable" the world will hopefully
>>>> abandon 3.4 and switch to 3.5 + Netty (because of SSL support).
>>>> The network stack is very important so it is better to have Netty 4 as
>>>> foundation, I am thinking about security issues, we won't make an
>>>> "hotfix" release with the switch to Netty 4 because there is a bad bug
>>>> in Netty 3.
>>>> So better to switch now.
>>>>
>>>> But Facebooks friends, expecially @ivmaykov did a lot of bugfixes
>>>> around Netty on master branch, we must be sure that what we are
>>>> delivering in 3.5.5 is stable.
>>>>
>>>> We will also have to state clearly in the "release notes" that Netty
>>>> version is changed, as this may have a non trivial impact to memory
>>>> usage (i.e. Netty 4 uses more Direct memory by default)
>>>>
>>>> So to recap my final opinion: +1 to switch to Netty 4 if we take care
>>>> of port all of the fixes around Netty 4 from master branch and we
>>>> state the switch clearly in the release notes
>>>>
>>>> Enrico
>>>>
>>>>> Regards,
>>>>> Andor
>>>

Re: JAVA 11 build is broken on 3.5

Posted by Enrico Olivelli <eo...@gmail.com>.
Il giorno ven 4 gen 2019 alle ore 17:04 Norbert Kalmar
<nk...@cloudera.com.invalid> ha scritto:
>
> +1 for Netty 4 in 3.5
>
> Pretty much all the pros and cons has been said before me.
> I would only add that this is not a new functionality that we wan't to
> backport. It's a criticall(ish?) bugfix, which requires quite a bit of
> change unfortunately.
>
> Regards,
> Norbert
>
> On Fri, Jan 4, 2019 at 4:36 PM Andor Molnar <an...@apache.org> wrote:
>
> > What are those patches exactly?
> >
> > Comparing the ported version of 3.5 with master I’ve only found 2 patches
> > which are missing:
> >
> > ZOOKEEPER-3146 Limit the maximum client connections per IP in
> > NettyServerCnxnFactory
> > ZOOKEEPER-3177 Refactor request throttle logic in NIO and Netty to keep
> > the same behavior and make the code easier to maintain
> >
> > None of them are critical I would say.
> > Is there anything else I’m missing?

I have compared the histories and I think Andor you are right.

master:
https://github.com/apache/zookeeper/commits/master

branch-3.5:
https://github.com/apache/zookeeper/commits/branch-3.5

Sorry I was thinking to the amount of patches related to TLS stuff ,
but they are not related to Netty 4.

What about branch 3.4 ?
We don't have reconfig but the case "Start embedded ZK, stop it and
try to restart on the same port" should apply as well.

Enrico

> >
> > Andor
> >
> >
> >
> >
> > > On 2019. Jan 4., at 16:27, Enrico Olivelli <eo...@gmail.com> wrote:
> > >
> > > Il giorno ven 4 gen 2019 alle ore 14:23 Andor Molnar
> > > <andor@apache.org <ma...@apache.org>> ha scritto:
> > >>
> > >> Hi team / Enrico,
> > >>
> > >> I’d like to get feedback from the community on the following patch
> > (moving the discussion from GitHub to here):
> > >>
> > >> https://issues.apache.org/jira/browse/ZOOKEEPER-3204 <
> > https://issues.apache.org/jira/browse/ZOOKEEPER-3204>
> > >> https://github.com/apache/zookeeper/pull/753 <
> > https://github.com/apache/zookeeper/pull/753>
> > >>
> > >> In a nutshell: looks like that Netty 3.10 is broken under Java 11: it
> > doesn’t properly close the underlying socket (probably not closing the
> > registered NIO selectors) and reconfig tests are unable to re-bind the
> > ports. This problem is similar that we already fixed in NIO with the
> > following patch:
> > >>
> > https://github.com/apache/zookeeper/commit/c3babb94275ad667dc71c10dcb08a383a3c154c2
> > <
> > https://github.com/apache/zookeeper/commit/c3babb94275ad667dc71c10dcb08a383a3c154c2
> > >
> > >> The problem doesn’t show up on trunk which has been recently upgraded
> > to Netty 4.
> > >>
> > >> Repro:
> > >> - Start embedded ZK, stop it and try to restart on the same port, or
> > >> - Start normal ensemble and reconfig to use different (client) port.
> > Then reconfig back to the original port which should fail. (that’s the
> > scenario which is covered in ReconfigTest)
> > >>
> > >> I created the above patch (#753) to backport Netty 4 upgrade to 3.5 and
> > it fixes the problem with Java 11 (it doesn’t cause regression in the
> > pre-commit build either), but Enrico is having concerns about making such
> > big change before the release.
> > >>
> > >> I tend to agree, but let’s see what are the options.
> > >>
> > >> Thoughts:
> > >> - Do we have to fix this? - Yes. Java 11 is LTS and I the bug is
> > critical.
> > >> - Can we fix Netty 3? - Maybe. Let’s say we find the bug in Netty 3,
> > what can we do?
> > >>       a) We cannot workaround from ZooKeeper itself and have to submit
> > a pull request for Netty. I think it’s quite unlikely that they will accept
> > the change given it’s not a security bug, but even if they did, only the
> > upgraded version of Netty 3 would work properly with ZooKeeper. Err.
> > >>       b) We can workaround it from ZooKeeper: that could be option #1,
> > but I have a strong feeling about it’s not going to be the case.
> > >> - Shall we upgrade to Netty 4? - this is option #2
> > >>
> > >> Please share your thoughts, maybe you know about an option #3.
> > >
> > > Thank you Andor
> > >
> > > I have thought more about this problem, and I have checked that Netty
> > > 3 is really dead/unmantained (last release in 2016).
> > > If I understand correctly there is no easy workaround (nothing without
> > > hacking Netty 3 internals)
> > >
> > > As soon as we will declare 3.5.5 "stable" the world will hopefully
> > > abandon 3.4 and switch to 3.5 + Netty (because of SSL support).
> > > The network stack is very important so it is better to have Netty 4 as
> > > foundation, I am thinking about security issues, we won't make an
> > > "hotfix" release with the switch to Netty 4 because there is a bad bug
> > > in Netty 3.
> > > So better to switch now.
> > >
> > > But Facebooks friends, expecially @ivmaykov did a lot of bugfixes
> > > around Netty on master branch, we must be sure that what we are
> > > delivering in 3.5.5 is stable.
> > >
> > > We will also have to state clearly in the "release notes" that Netty
> > > version is changed, as this may have a non trivial impact to memory
> > > usage (i.e. Netty 4 uses more Direct memory by default)
> > >
> > > So to recap my final opinion: +1 to switch to Netty 4 if we take care
> > > of port all of the fixes around Netty 4 from master branch and we
> > > state the switch clearly in the release notes
> > >
> > > Enrico
> > >
> > >>
> > >> Regards,
> > >> Andor
> >
> >

Re: JAVA 11 build is broken on 3.5

Posted by Norbert Kalmar <nk...@cloudera.com.INVALID>.
+1 for Netty 4 in 3.5

Pretty much all the pros and cons has been said before me.
I would only add that this is not a new functionality that we wan't to
backport. It's a criticall(ish?) bugfix, which requires quite a bit of
change unfortunately.

Regards,
Norbert

On Fri, Jan 4, 2019 at 4:36 PM Andor Molnar <an...@apache.org> wrote:

> What are those patches exactly?
>
> Comparing the ported version of 3.5 with master I’ve only found 2 patches
> which are missing:
>
> ZOOKEEPER-3146 Limit the maximum client connections per IP in
> NettyServerCnxnFactory
> ZOOKEEPER-3177 Refactor request throttle logic in NIO and Netty to keep
> the same behavior and make the code easier to maintain
>
> None of them are critical I would say.
> Is there anything else I’m missing?
>
> Andor
>
>
>
>
> > On 2019. Jan 4., at 16:27, Enrico Olivelli <eo...@gmail.com> wrote:
> >
> > Il giorno ven 4 gen 2019 alle ore 14:23 Andor Molnar
> > <andor@apache.org <ma...@apache.org>> ha scritto:
> >>
> >> Hi team / Enrico,
> >>
> >> I’d like to get feedback from the community on the following patch
> (moving the discussion from GitHub to here):
> >>
> >> https://issues.apache.org/jira/browse/ZOOKEEPER-3204 <
> https://issues.apache.org/jira/browse/ZOOKEEPER-3204>
> >> https://github.com/apache/zookeeper/pull/753 <
> https://github.com/apache/zookeeper/pull/753>
> >>
> >> In a nutshell: looks like that Netty 3.10 is broken under Java 11: it
> doesn’t properly close the underlying socket (probably not closing the
> registered NIO selectors) and reconfig tests are unable to re-bind the
> ports. This problem is similar that we already fixed in NIO with the
> following patch:
> >>
> https://github.com/apache/zookeeper/commit/c3babb94275ad667dc71c10dcb08a383a3c154c2
> <
> https://github.com/apache/zookeeper/commit/c3babb94275ad667dc71c10dcb08a383a3c154c2
> >
> >> The problem doesn’t show up on trunk which has been recently upgraded
> to Netty 4.
> >>
> >> Repro:
> >> - Start embedded ZK, stop it and try to restart on the same port, or
> >> - Start normal ensemble and reconfig to use different (client) port.
> Then reconfig back to the original port which should fail. (that’s the
> scenario which is covered in ReconfigTest)
> >>
> >> I created the above patch (#753) to backport Netty 4 upgrade to 3.5 and
> it fixes the problem with Java 11 (it doesn’t cause regression in the
> pre-commit build either), but Enrico is having concerns about making such
> big change before the release.
> >>
> >> I tend to agree, but let’s see what are the options.
> >>
> >> Thoughts:
> >> - Do we have to fix this? - Yes. Java 11 is LTS and I the bug is
> critical.
> >> - Can we fix Netty 3? - Maybe. Let’s say we find the bug in Netty 3,
> what can we do?
> >>       a) We cannot workaround from ZooKeeper itself and have to submit
> a pull request for Netty. I think it’s quite unlikely that they will accept
> the change given it’s not a security bug, but even if they did, only the
> upgraded version of Netty 3 would work properly with ZooKeeper. Err.
> >>       b) We can workaround it from ZooKeeper: that could be option #1,
> but I have a strong feeling about it’s not going to be the case.
> >> - Shall we upgrade to Netty 4? - this is option #2
> >>
> >> Please share your thoughts, maybe you know about an option #3.
> >
> > Thank you Andor
> >
> > I have thought more about this problem, and I have checked that Netty
> > 3 is really dead/unmantained (last release in 2016).
> > If I understand correctly there is no easy workaround (nothing without
> > hacking Netty 3 internals)
> >
> > As soon as we will declare 3.5.5 "stable" the world will hopefully
> > abandon 3.4 and switch to 3.5 + Netty (because of SSL support).
> > The network stack is very important so it is better to have Netty 4 as
> > foundation, I am thinking about security issues, we won't make an
> > "hotfix" release with the switch to Netty 4 because there is a bad bug
> > in Netty 3.
> > So better to switch now.
> >
> > But Facebooks friends, expecially @ivmaykov did a lot of bugfixes
> > around Netty on master branch, we must be sure that what we are
> > delivering in 3.5.5 is stable.
> >
> > We will also have to state clearly in the "release notes" that Netty
> > version is changed, as this may have a non trivial impact to memory
> > usage (i.e. Netty 4 uses more Direct memory by default)
> >
> > So to recap my final opinion: +1 to switch to Netty 4 if we take care
> > of port all of the fixes around Netty 4 from master branch and we
> > state the switch clearly in the release notes
> >
> > Enrico
> >
> >>
> >> Regards,
> >> Andor
>
>

Re: JAVA 11 build is broken on 3.5

Posted by Andor Molnar <an...@apache.org>.
What are those patches exactly?

Comparing the ported version of 3.5 with master I’ve only found 2 patches which are missing:

ZOOKEEPER-3146 Limit the maximum client connections per IP in NettyServerCnxnFactory
ZOOKEEPER-3177 Refactor request throttle logic in NIO and Netty to keep the same behavior and make the code easier to maintain

None of them are critical I would say.
Is there anything else I’m missing?

Andor




> On 2019. Jan 4., at 16:27, Enrico Olivelli <eo...@gmail.com> wrote:
> 
> Il giorno ven 4 gen 2019 alle ore 14:23 Andor Molnar
> <andor@apache.org <ma...@apache.org>> ha scritto:
>> 
>> Hi team / Enrico,
>> 
>> I’d like to get feedback from the community on the following patch (moving the discussion from GitHub to here):
>> 
>> https://issues.apache.org/jira/browse/ZOOKEEPER-3204 <https://issues.apache.org/jira/browse/ZOOKEEPER-3204>
>> https://github.com/apache/zookeeper/pull/753 <https://github.com/apache/zookeeper/pull/753>
>> 
>> In a nutshell: looks like that Netty 3.10 is broken under Java 11: it doesn’t properly close the underlying socket (probably not closing the registered NIO selectors) and reconfig tests are unable to re-bind the ports. This problem is similar that we already fixed in NIO with the following patch:
>> https://github.com/apache/zookeeper/commit/c3babb94275ad667dc71c10dcb08a383a3c154c2 <https://github.com/apache/zookeeper/commit/c3babb94275ad667dc71c10dcb08a383a3c154c2>
>> The problem doesn’t show up on trunk which has been recently upgraded to Netty 4.
>> 
>> Repro:
>> - Start embedded ZK, stop it and try to restart on the same port, or
>> - Start normal ensemble and reconfig to use different (client) port. Then reconfig back to the original port which should fail. (that’s the scenario which is covered in ReconfigTest)
>> 
>> I created the above patch (#753) to backport Netty 4 upgrade to 3.5 and it fixes the problem with Java 11 (it doesn’t cause regression in the pre-commit build either), but Enrico is having concerns about making such big change before the release.
>> 
>> I tend to agree, but let’s see what are the options.
>> 
>> Thoughts:
>> - Do we have to fix this? - Yes. Java 11 is LTS and I the bug is critical.
>> - Can we fix Netty 3? - Maybe. Let’s say we find the bug in Netty 3, what can we do?
>>       a) We cannot workaround from ZooKeeper itself and have to submit a pull request for Netty. I think it’s quite unlikely that they will accept the change given it’s not a security bug, but even if they did, only the upgraded version of Netty 3 would work properly with ZooKeeper. Err.
>>       b) We can workaround it from ZooKeeper: that could be option #1, but I have a strong feeling about it’s not going to be the case.
>> - Shall we upgrade to Netty 4? - this is option #2
>> 
>> Please share your thoughts, maybe you know about an option #3.
> 
> Thank you Andor
> 
> I have thought more about this problem, and I have checked that Netty
> 3 is really dead/unmantained (last release in 2016).
> If I understand correctly there is no easy workaround (nothing without
> hacking Netty 3 internals)
> 
> As soon as we will declare 3.5.5 "stable" the world will hopefully
> abandon 3.4 and switch to 3.5 + Netty (because of SSL support).
> The network stack is very important so it is better to have Netty 4 as
> foundation, I am thinking about security issues, we won't make an
> "hotfix" release with the switch to Netty 4 because there is a bad bug
> in Netty 3.
> So better to switch now.
> 
> But Facebooks friends, expecially @ivmaykov did a lot of bugfixes
> around Netty on master branch, we must be sure that what we are
> delivering in 3.5.5 is stable.
> 
> We will also have to state clearly in the "release notes" that Netty
> version is changed, as this may have a non trivial impact to memory
> usage (i.e. Netty 4 uses more Direct memory by default)
> 
> So to recap my final opinion: +1 to switch to Netty 4 if we take care
> of port all of the fixes around Netty 4 from master branch and we
> state the switch clearly in the release notes
> 
> Enrico
> 
>> 
>> Regards,
>> Andor


Re: JAVA 11 build is broken on 3.5

Posted by Enrico Olivelli <eo...@gmail.com>.
Il giorno ven 4 gen 2019 alle ore 14:23 Andor Molnar
<an...@apache.org> ha scritto:
>
> Hi team / Enrico,
>
> I’d like to get feedback from the community on the following patch (moving the discussion from GitHub to here):
>
> https://issues.apache.org/jira/browse/ZOOKEEPER-3204 <https://issues.apache.org/jira/browse/ZOOKEEPER-3204>
> https://github.com/apache/zookeeper/pull/753 <https://github.com/apache/zookeeper/pull/753>
>
> In a nutshell: looks like that Netty 3.10 is broken under Java 11: it doesn’t properly close the underlying socket (probably not closing the registered NIO selectors) and reconfig tests are unable to re-bind the ports. This problem is similar that we already fixed in NIO with the following patch:
> https://github.com/apache/zookeeper/commit/c3babb94275ad667dc71c10dcb08a383a3c154c2 <https://github.com/apache/zookeeper/commit/c3babb94275ad667dc71c10dcb08a383a3c154c2>
> The problem doesn’t show up on trunk which has been recently upgraded to Netty 4.
>
> Repro:
> - Start embedded ZK, stop it and try to restart on the same port, or
> - Start normal ensemble and reconfig to use different (client) port. Then reconfig back to the original port which should fail. (that’s the scenario which is covered in ReconfigTest)
>
> I created the above patch (#753) to backport Netty 4 upgrade to 3.5 and it fixes the problem with Java 11 (it doesn’t cause regression in the pre-commit build either), but Enrico is having concerns about making such big change before the release.
>
> I tend to agree, but let’s see what are the options.
>
> Thoughts:
> - Do we have to fix this? - Yes. Java 11 is LTS and I the bug is critical.
> - Can we fix Netty 3? - Maybe. Let’s say we find the bug in Netty 3, what can we do?
>        a) We cannot workaround from ZooKeeper itself and have to submit a pull request for Netty. I think it’s quite unlikely that they will accept the change given it’s not a security bug, but even if they did, only the upgraded version of Netty 3 would work properly with ZooKeeper. Err.
>        b) We can workaround it from ZooKeeper: that could be option #1, but I have a strong feeling about it’s not going to be the case.
> - Shall we upgrade to Netty 4? - this is option #2
>
> Please share your thoughts, maybe you know about an option #3.

Thank you Andor

I have thought more about this problem, and I have checked that Netty
3 is really dead/unmantained (last release in 2016).
If I understand correctly there is no easy workaround (nothing without
hacking Netty 3 internals)

As soon as we will declare 3.5.5 "stable" the world will hopefully
abandon 3.4 and switch to 3.5 + Netty (because of SSL support).
The network stack is very important so it is better to have Netty 4 as
foundation, I am thinking about security issues, we won't make an
"hotfix" release with the switch to Netty 4 because there is a bad bug
in Netty 3.
So better to switch now.

But Facebooks friends, expecially @ivmaykov did a lot of bugfixes
around Netty on master branch, we must be sure that what we are
delivering in 3.5.5 is stable.

We will also have to state clearly in the "release notes" that Netty
version is changed, as this may have a non trivial impact to memory
usage (i.e. Netty 4 uses more Direct memory by default)

So to recap my final opinion: +1 to switch to Netty 4 if we take care
of port all of the fixes around Netty 4 from master branch and we
state the switch clearly in the release notes

Enrico

>
> Regards,
> Andor
>
>
>
>
>
>