Posted to dev@zookeeper.apache.org by Fangmin Lv <lv...@gmail.com> on 2018/11/02 18:46:03 UTC

Re: ZooKeeper 3.5 blocker issues

Hi Andor,

Is anyone working on ZK-2778? I can pick it up if there is no one working
on it yet.

I'll open a 3.5 PR for ZK-3104 today.

Fangmin

On Fri, Oct 26, 2018 at 3:33 AM Andor Molnar <an...@apache.org> wrote:

> Hi folks,
>
> You’ve probably noticed lots of update emails coming from Jira. Please be
> aware that we’ve updated a bunch of open blocker/critical 3.5 tickets to
> reflect what we discussed in this email.
>
> If you open up the following jira filter:
>
> project = ZooKeeper and resolution = Unresolved and fixVersion = 3.5.5 AND
> priority in (blocker, critical) ORDER BY priority DESC, key ASC
>
> You’ll see the most up-to-date list of tickets which need to be addressed
> before the stable 3.5 release.
>
> Thank you for your efforts to get this done.
>
> Fangmin, ZK-3104 is waiting for backport, but the ticket has already been
> resolved. Have you created a separate ticket for the backport or shall I
> just reopen it with the right fix versions?
>
> Thanks,
> Andor
>
>
>
> > On 2018. Oct 8., at 12:34, Andor Molnar <an...@apache.org> wrote:
> >
> > Hi,
> >
> > Let me summarize and give a quick update on the outstanding issues for
> 3.5 GA:
> >
> > - ZOOKEEPER-1818 (Fix don't care for trunk)
> > - ZOOKEEPER-2778 (Potential server deadlock between follower sync with
> leader and follower receiving external connection requests.)
> > - ZOOKEEPER-3021 Migrate project structure to Maven (ongoing)
> > - ZOOKEEPER-925 Docs generation to Maven
> > - ZOOKEEPER-3104 (waiting for backport)
> > - ZOOKEEPER-3125 (waiting for backport PR #647)
> >
> > The 2 Maven related tickets are no-brainers as well as the backports.
> ZK-2778 has been picked up by Maoling (thanks!) as far as I can see,
> ZK-1818 is the only one waiting for a volunteer.
> >
> > Please correct me if I’ve missed something.
> >
> > Regards,
> > Andor
> >
> >
> >
> >
> >> On 2018. Sep 28., at 18:32, Tamas Penzes <ta...@cloudera.com.INVALID>
> wrote:
> >>
> >> Hi All,
> >>
> >> I would add ZOOKEEPER-3021
> >> <https://issues.apache.org/jira/browse/ZOOKEEPER-3021> Migrate project
> >> structure to Maven build as a blocker too. Since the migration has
> started
> >> it would be good to finish before releasing ZK 3.5.x GA.
> >>
> >> ZOOKEEPER-925 <https://issues.apache.org/jira/browse/ZOOKEEPER-925>
> replace
> >> our forrest site and documentation generation might also be a good idea,
> >> since then we could deliver the new MarkDown based documentation.
> >>
> >> Regards, Tamaas
> >>
> >> On Fri, Sep 14, 2018 at 10:09 AM Fangmin Lv <lv...@gmail.com>
> wrote:
> >>
> >>> Oh, sorry for the confusion, I should provide more context.
> >>>
> >>> The leader uses on-disk txn sync with followers if the peer zxid is not
> >>> in its in-memory commit log; the code is here: Leader on-disk txn sync
> >>> <https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/quorum/LearnerHandler.java#L774>.
> >>> There is a bug where gaps can appear in the txn files, for example after
> >>> a snap sync, so the peer may miss txns because of this.
> >>>
> >>> The option to disable it is snapshotSizeFactor
> >>> <https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/ZKDatabase.java#L81>;
> >>> setting it to -1 disables this feature. On 3.5, it would be better to have a
> >>> PR that sets this to -1 by default. That may lead to more SNAP syncs, but from
> >>> what we see in production that doesn't seem to be a big problem.
> >>>
> >>> I can send out the diff to disable it by default on 3.5 if you guys think
> >>> this is the right way to go.
> >>>
> >>> Thanks,
> >>> Fangmin
> >>>
> >>> On Thu, Sep 13, 2018 at 1:58 AM Andor Molnar <an...@apache.org> wrote:
> >>>
> >>>> What’s needed to turn it off?
> >>>> Do we need a PR or is it just a config option?
> >>>> Shall we implement a feature switch for that and turn it off by
> default?
> >>>>
> >>>> Sorry I don’t have too much insight on disk txn sync.
> >>>>
> >>>> Andor
> >>>>
> >>>>
> >>>>
> >>>>> On 2018. Sep 13., at 9:16, Fangmin Lv <lv...@gmail.com> wrote:
> >>>>>
> >>>>> And to be clear, ZOOKEEPER-2418 is actually just one case of inconsistency
> >>>>> that could be caused by on-disk txn sync. As I mentioned in a newer JIRA,
> >>>>> ZOOKEEPER-2846 <https://issues.apache.org/jira/browse/ZOOKEEPER-2846>, the
> >>>>> snap sync or txn sync could also leave gaps in the txn file, which is a
> >>>>> more common case that could trigger this issue.
> >>>>>
> >>>>> I would suggest turning off the on-disk txn sync by default for now to
> >>>>> avoid this issue. After we finish ZOOKEEPER-3114, we can use that to
> >>>>> validate the on-disk txns during syncing.
> >>>>>
> >>>>> Thanks,
> >>>>> Fangmin
> >>>>>
> >>>>> On Wed, Sep 12, 2018 at 9:55 AM Fangmin Lv <lv...@gmail.com>
> >>> wrote:
> >>>>>
> >>>>>> Andor,
> >>>>>>
> >>>>>> ZOOKEEPER-3114 is about adding real-time digest checking to help detect
> >>>>>> inconsistency. It's a new feature with a large amount of code change. I'll
> >>>>>> start upstreaming it part by part, but I don't expect it to be merged in the
> >>>>>> next few weeks. So yes, it's a nice-to-have, but definitely not a blocker for
> >>>>>> 3.5.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Fangmin
> >>>>>>
> >>>>>> On Wed, Sep 12, 2018 at 2:55 AM Andor Molnar <an...@apache.org>
> >>> wrote:
> >>>>>>
> >>>>>>> Fangmin,
> >>>>>>>
> >>>>>>> Sorry, I just noticed that you want to include the consistency
> fixes
> >>> in
> >>>>>>> the stable version which is fine. Let’s finish the backports and
> >>> we’ll
> >>>> be
> >>>>>>> done with them.
> >>>>>>>
> >>>>>>> ZOOKEEPER-3114 is essentially a new feature, I wouldn’t block 3.5
> >>> with
> >>>>>>> that. What do you think?
> >>>>>>>
> >>>>>>> Andor
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>> On 2018. Sep 12., at 11:52, Andor Molnar <an...@apache.org>
> wrote:
> >>>>>>>>
> >>>>>>>> Cool, thanks for the clarification.
> >>>>>>>>
> >>>>>>>> The updated list is as follows:
> >>>>>>>>
> >>>>>>>> - ZOOKEEPER-236 (SSL/TLS support for Atomic Broadcast protocol)
> >>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
> >>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock between follower sync
> >>> with
> >>>>>>> leader and follower receiving external connection requests.)
> >>>>>>>>
> >>>>>>>> The following are not critical and not blockers for the stable release:
> >>>>>>>>
> >>>>>>>> Waiting to be ported to 3.5:
> >>>>>>>> - ZOOKEEPER-3104
> >>>>>>>> - ZOOKEEPER-3125
> >>>>>>>> - ZOOKEEPER-3127
> >>>>>>>>
> >>>>>>>> New feature:
> >>>>>>>> - ZOOKEEPER-3114 (fixes ZOOKEEPER-2184 too)
> >>>>>>>>
> >>>>>>>> Regards,
> >>>>>>>> Andor
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> On 2018. Sep 12., at 0:42, Fangmin Lv <lv...@gmail.com>
> wrote:
> >>>>>>>>>
> >>>>>>>>> Hi Andor,
> >>>>>>>>>
> >>>>>>>>> That's the on-disk txn sync feature, which we disabled internally after
> >>>>>>>>> we found the potential inconsistency issue. The only solution we have for
> >>>>>>>>> now is to wait for the new digest checking feature I mentioned in
> >>>>>>>>> ZOOKEEPER-3114.
> >>>>>>>>>
> >>>>>>>>> I think there are some other critical consistency issues we just fixed on
> >>>>>>>>> master recently: ZOOKEEPER-3104, ZOOKEEPER-3125, ZOOKEEPER-3127. I think we
> >>>>>>>>> should include those in the official 3.5 release as well.
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> Fangmin
> >>>>>>>>>
> >>>>>>>>> On Tue, Sep 11, 2018 at 11:58 AM Andor Molnár <an...@apache.org>
> >>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hi Jeelani,
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Thanks for letting me know. I'm happy to remove it from the list
> >>> to
> >>>>>>> get
> >>>>>>>>>> closer to a stable release. :)
> >>>>>>>>>>
> >>>>>>>>>> What's the feature which can be disabled to avoid data
> >>>> inconsistency?
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Andor
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On 09/10/2018 11:33 PM, Mohamed Jeelani wrote:
> >>>>>>>>>>> Thanks Andor for compiling this. Should we be ignoring
> >>>>>>> ZOOKEEPER-2418 as
> >>>>>>>>>> well? This exists in 3.4 as well and the feature can be
> disabled.
> >>> We
> >>>>>>> are
> >>>>>>>>>> working on a longer term fix for it in 3.6.
> >>>>>>>>>>>
> >>>>>>>>>>> Regards,
> >>>>>>>>>>>
> >>>>>>>>>>> Jeelani
> >>>>>>>>>>>
> >>>>>>>>>>> On 9/10/18, 5:19 AM, "Andor Molnar"
> <andor@cloudera.com.INVALID
> >>>>
> >>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> Fine.
> >>>>>>>>>>>
> >>>>>>>>>>> I'm happy to ignore 1549, 2846 and 2930. Still we have the list
> >>>> of:
> >>>>>>>>>>>
> >>>>>>>>>>> - ZOOKEEPER-236 (SSL/TLS support for Atomic Broadcast protocol)
> >>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
> >>>>>>>>>>> - ZOOKEEPER-2418 (txnlog diff sync can skip sending some
> >>>>>>>>>> transactions to
> >>>>>>>>>>> followers)
> >>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock between follower
> >>> sync
> >>>>>>>>>> with
> >>>>>>>>>>> leader and follower receiving external connection requests.)
> >>>>>>>>>>>
> >>>>>>>>>>> SSL (ZK-236) is a feature which is essential for the 3.5 release, hence I
> >>>>>>>>>>> wouldn't leave it out or postpone it to the next stable release. The PR has
> >>>>>>>>>>> been out for a long time, so please get on with reviewing it.
> >>>>>>>>>>> The rest are also long-outstanding issues which have been found in the 3.5
> >>>>>>>>>>> branch.
> >>>>>>>>>>> ZK-1818 is something which was found in 3.4 and fixed in 3.4, but has never
> >>>>>>>>>>> been fixed in 3.5. Quite a serious issue if still present.
> >>>>>>>>>>>
> >>>>>>>>>>> I think we should at least run some manual testing and see if
> we
> >>>>>>>>>> could
> >>>>>>>>>>> repro any of these issues before going ahead with a stable
> >>>> release.
> >>>>>>>>>>>
> >>>>>>>>>>> Regards,
> >>>>>>>>>>> Andor
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Fri, Sep 7, 2018 at 3:24 AM, Michael Han <ha...@apache.org>
> >>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> I haven't gone through the entire list, but it looks like lots of the JIRA
> >>>>>>>>>>>> issues listed in this thread, such as ZOOKEEPER-1549 and 2846, also affect
> >>>>>>>>>>>> 3.4 releases. Should we scope these issues out?
> >>>>>>>>>>>>
> >>>>>>>>>>>> I think historically the single outstanding blocking issue
> for a
> >>>>>>>>>> stable 3.5
> >>>>>>>>>>>> release is the reconfig feature and security concerns around
> it
> >>>>>>>>>> (somehow
> >>>>>>>>>>>> addressed in ZOOKEEPER-2014), and the alpha and beta releases
> >>> were
> >>>>>>>>>> created
> >>>>>>>>>>>> to stabilize that feature.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> http://zookeeper-user.578899.n2.nabble.com/Zookeeper-with-SSL-release-date-tt7581744.html
> >>>>>>>>>>>>
> >>>>>>>>>>>> So it looks like we are in good shape to release. Some things that might be
> >>>>>>>>>>>> worth doing to claim the quality of 3.5 is on par with 3.4:
> >>>>>>>>>>>>
> >>>>>>>>>>>> * Run Jepsen on 3.5 - 3.4 passed the test for the record
> >>>>>>>>>>>>
> >>>>>>>>>>>> https://aphyr.com/posts/291-jepsen-zookeeper
> >>>>>>>>>>>> * Fix all flaky tests on 3.5 - 3.4 has few or no flaky tests at all.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Tue, Sep 4, 2018 at 1:48 AM, Andor Molnar
> >>>>>>>>>> <an...@cloudera.com.invalid>
> >>>>>>>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks Maoling! That would be huge help, I appreciate it.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Andor
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>
> >>>>
> >
>
>

Re: ZooKeeper 3.5 blocker issues

Posted by Enrico Olivelli <eo...@gmail.com>.
On Thu, Dec 20, 2018 at 10:20 Norbert Kalmar
<nk...@cloudera.com.invalid> wrote:
>
> One more thing - for the CI integration, Java reports from Clover
> - by maven - should already be available on master. Basically we only
> analyse zookeeper-server, but as it is planned to be separated, I set up
> Clover so it starts from the root pom and aggregates the results.

Got it.

>
> On Thu, Dec 20, 2018 at 10:17 AM Norbert Kalmar <nk...@cloudera.com>
> wrote:
>
> > Sure, sorry about that, I didn't thoroughly check the static analyser
> > part. Spotbugs works for me!
> >
> > Thanks Enrico!
> >
> > On Thu, Dec 20, 2018 at 10:10 AM Enrico Olivelli <eo...@gmail.com>
> > wrote:
> >
> >> Great
> >>
> >> On Thu, Dec 20, 2018 at 10:07 Norbert Kalmar
> >> <nk...@cloudera.com.invalid> wrote:
> >> >
> >> > Subtasks:
> >> > Findbugs, checkstyle -
> >> https://issues.apache.org/jira/browse/ZOOKEEPER-3223
> >>
> >> We don't have checkstyle. In my experience, introducing checkstyle breaks
> >> every pending patch.
> >> I would like to narrow this issue down to "Spotbugs" and pick it up
> >>
> >> > CI integration - https://issues.apache.org/jira/browse/ZOOKEEPER-3224
> >> I would like to pick this up
> >>
> >> Enrico
> >>
> >> > Code coverage - https://issues.apache.org/jira/browse/ZOOKEEPER-3225 - I
> >> > already started this one and some of it is committed with the patch, so I
> >> > will continue to work on it.
> >> > Recipes and contrib -
> >> https://issues.apache.org/jira/browse/ZOOKEEPER-3171
> >> > - Already on it, recipes is done, PR soon available.
> >> > Assembly - https://issues.apache.org/jira/browse/ZOOKEEPER-3029
> >> >
> >> > These are the tasks left that I can think of. If anything is missing, feel
> >> > free to create a jira, or let me know.
> >> > For the ones I'm already working on - 3225, 3171 - I left a comment. Those
> >> > should be ready this week.
> >>
> >> >
> >> > Thanks,
> >> > Norbert
> >> >
> >> >
> >> > On Thu, Dec 20, 2018 at 9:07 AM Enrico Olivelli <eo...@gmail.com>
> >> wrote:
> >> >
> >> > > Great.
> >> > > Can you create JIRA tickets for the remaining subtasks, so that I can pick
> >> > > them up?
> >> > > I volunteer for spotbugs and for CI integration, but let's see the list
> >> > > Enrico
> >> > >
> >> > > On Thu, Dec 20, 2018, 07:21 Andor Molnar <an...@apache.org> wrote:
> >> > >
> >> > > > Ok. Looks like ant still works properly, so let’s commit this patch
> >> and
> >> > > > you guys can collaborate to polish the Maven build.
> >> > > >
> >> > > > For now, it’s master-only.
> >> > > >
> >> > > > Thanks,
> >> > > > Andor
> >> > > >
> >> > > >
> >> > > >
> >> > > > > On 2018. Dec 19., at 16:44, Norbert Kalmar
> >> > > <nk...@cloudera.com.INVALID>
> >> > > > wrote:
> >> > > > >
> >> > > > > Thank you Enrico, I agree that we could commit this patch in its
> >> > > > > current state; it fulfills the original jira anyway.
> >> > > > >
> >> > > > > I'll see what's wrong with the java tests, but honestly, it looks like
> >> > > > > they're just flaky... they run well on local builds with 8 threads.
> >> > > > >
> >> > > > > Regards,
> >> > > > > Norbert
> >> > > > >
> >> > > > > On Wed, Dec 19, 2018 at 2:50 PM Tamas Penzes
> >> > > <tamaas@cloudera.com.invalid
> >> > > > >
> >> > > > > wrote:
> >> > > > >
> >> > > > >> Hi All,
> >> > > > >>
> >> > > > >> For the assembly task I would promote the way HBase does it.
> >> > > > >> They create a pure source and a bin tarball separately. Please
> >> see how
> >> > > > they
> >> > > > >> create a release here:
> >> > > > >>
> >> https://github.com/apache/hbase/blob/master/dev-support/make_rc.sh
> >> > > > >> We could probably use the well known "copy+paste technology" to
> >> have
> >> > > it
> >> > > > >> within ZooKeeper the same way. ;-)
> >> > > > >>
> >> > > > >> Regards, Tamaas
> >> > > > >>
> >> > > > >> On Wed, Dec 19, 2018 at 2:28 PM Enrico Olivelli <
> >> eolivelli@gmail.com>
> >> > > > >> wrote:
> >> > > > >>
> >> > > > >>> Great work Norbert
> >> > > > >>> If you want I can help, especially with rat, findbugs (we need to switch to
> >> > > > >>> spotbugs anyway) and OWASP stuff (I recently started using the Maven
> >> > > > >>> plugin in other projects)
> >> > > > >>> But I am not sure how I can help you concretely if we do not commit your
> >> > > > >>> work.
> >> > > > >>> We could commit the work as it is now, leaving "ant" as the official build
> >> > > > >>> method, but having the poms committed will ease collaboration.
> >> > > > >>>
> >> > > > >>> We will also have to work on CI jobs; I can help on that part as well
> >> > > > >>>
> >> > > > >>> Enrico
> >> > > > >>>
> >> > > > >>> On Wed, Dec 19, 2018 at 12:26 Norbert Kalmar
> >> > > > >>> <nk...@cloudera.com.invalid> wrote:
> >> > > > >>>>
> >> > > > >>>> Hi everyone,
> >> > > > >>>>
> >> > > > >>>> Some update on the maven migration: I had a few bumps here and there
> >> > > > >>>> (just look at the latest patch Andor linked -
> >> > > > >>>> https://github.com/apache/zookeeper/pull/708 - you can see it in the
> >> > > > >>>> commits).
> >> > > > >>>> The current state is that the build works and tests run, but reports like
> >> > > > >>>> findbugs, clover etc. are not yet implemented. Maven usually has plugins
> >> > > > >>>> for them, but it's not always trivial, especially with the C client. The
> >> > > > >>>> assembly is also left to be done, but it should be fairly easy to build a
> >> > > > >>>> tarball similar to the one ant produces (although this will also be an
> >> > > > >>>> interesting task, as ant does some strange things, like duplicating the
> >> > > > >>>> sources of most contrib projects).
> >> > > > >>>>
> >> > > > >>>> I had a separate jira for the recipes and contrib maven build. I do not
> >> > > > >>>> have an open PR for it, but recipes is done and I am now working on the
> >> > > > >>>> contrib projects. Most of them are built manually and never get called
> >> > > > >>>> from the main build.xml. I will not integrate these into the maven build
> >> > > > >>>> either. One reason is that there are plans to remove some of them from the
> >> > > > >>>> ZK repo anyway. The other reason is that, for starters, we want to
> >> > > > >>>> replicate the ant build as closely as possible, without doing any nasty
> >> > > > >>>> workarounds in maven to achieve that. From there, once it is stable and
> >> > > > >>>> proven to have all the functionality required for build and release, we can
> >> > > > >>>> improve and use maven's advantages to shape the build of ZooKeeper.
> >> > > > >>>>
> >> > > > >>>> Right now, I am trying to stabilize the build as much as possible. Andor
> >> > > > >>>> also fixed some flaky C tests that, for some strange reason, became
> >> > > > >>>> extremely flaky with the maven build:
> >> > > > >>>> https://github.com/apache/zookeeper/pull/740
> >> > > > >>>>
> >> > > > >>>> Regards,
> >> > > > >>>> Norbert
> >> > > > >>>>
> >> > > > >>>> On Tue, Dec 18, 2018 at 9:52 AM Andor Molnar
> >> > > > >> <andor@cloudera.com.invalid
> >> > > > >>>>
> >> > > > >>>> wrote:
> >> > > > >>>>
> >> > > > >>>>> Sure, good point. Let's put it on the list.
> >> > > > >>>>>
> >> > > > >>>>> Andor
> >> > > > >>>>>
> >> > > > >>>>>
> >> > > > >>>>> On Tue, Dec 18, 2018 at 12:17 AM Patrick Hunt <
> >> phunt@apache.org>
> >> > > > >>> wrote:
> >> > > > >>>>>
> >> > > > >>>>>> Are folks OK to wait on that OWASP issue I documented over
> >> the
> >> > > > >>> weekend?
> >> > > > >>>>>> afaict we are not affected but it would be good to get
> >> another
> >> > > pair
> >> > > > >>> of
> >> > > > >>>>> eyes
> >> > > > >>>>>> on it.
> >> > > > >>>>>>
> >> > > > >>>>>> Patrick
> >> > > > >>>>>>
> >> > > > >>>>>> On Mon, Dec 17, 2018 at 2:55 PM Andor Molnár <
> >> andor@apache.org>
> >> > > > >>> wrote:
> >> > > > >>>>>>
> >> > > > >>>>>>> Hi team,
> >> > > > >>>>>>>
> >> > > > >>>>>>>
> >> > > > >>>>>>> I'm proud to announce that, thanks to the joint effort of the community,
> >> > > > >>>>>>> the 3.5 blockers list has become empty:
> >> > > > >>>>>>>
> >> > > > >>>>>>> "project = ZooKeeper AND resolution = Unresolved AND
> >> fixVersion =
> >> > > > >>> 3.5.5
> >> > > > >>>>>>> AND priority in (blocker, critical) ORDER BY priority DESC,
> >> key
> >> > > > >>> ASC"
> >> > > > >>>>>>>
> >> > > > >>>>>>>
> >> > > > >>>>>>> Well... almost. All the blocker issues have gone, but we
> >> still
> >> > > > >>> have the
> >> > > > >>>>>>> Maven migration to complete before the stable release. If
> >> you
> >> > > > >> have
> >> > > > >>> some
> >> > > > >>>>>>> free cycles, please join us testing the Maven build on this
> >> PR:
> >> > > > >>>>>>>
> >> > > > >>>>>>> https://github.com/apache/zookeeper/pull/708
> >> > > > >>>>>>>
> >> > > > >>>>>>> I hope we can merge it pretty soon.
> >> > > > >>>>>>>
> >> > > > >>>>>>>
> >> > > > >>>>>>> In terms of the builds, the weather at 3.5 branch is quite
> >> sunny
> >> > > > >>>>>> nowadays:
> >> > > > >>>>>>>
> >> > > > >>>>>>> https://builds.apache.org/view/S-Z/view/ZooKeeper/
> >> > > > >>>>>>>
> >> > > > >>>>>>> The Java 11 build is still having some difficulties, which
> >> > > > >>> hopefully I
> >> > > > >>>>>>> can address before the holidays:
> >> > > > >>>>>>>
> >> > > > >>>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-3204
> >> > > > >>>>>>>
> >> > > > >>>>>>>
> >> > > > >>>>>>> If you happen to know about something which is important
> >> from
> >> > > > >> 3.5's
> >> > > > >>>>>>> perspective and missing from the above, please don't
> >> hesitate to
> >> > > > >>> share.
> >> > > > >>>>>>>
> >> > > > >>>>>>>
> >> > > > >>>>>>> Happy ZooKeeping!
> >> > > > >>>>>>>
> >> > > > >>>>>>> Andor
> >> > > > >>>>>>>
> >> > > > >>>>>>>
> >> > > > >>>>>>>
> >> > > > >>>>>>> On 11/2/18 21:12, Fangmin Lv wrote:
> >> > > > >>>>>>>> Andor,
> >> > > > >>>>>>>>
> >> > > > >>>>>>>> Here is the PR to port ZK-3104 from master to 3.4:
> >> > > > >>>>>>>> https://github.com/apache/zookeeper/pull/685.
> >> > > > >>>>>>>>
> >> > > > >>>>>>>> Fangmin
> >> > > > >>>>>>>>
> >> > > > >>>>>>>> On Fri, Nov 2, 2018 at 11:46 AM Fangmin Lv <
> >> > > > >> lvfangmin@gmail.com>
> >> > > > >>>>>> wrote:
> >> > > > >>>>>>>>
> >> > > > >>>>>>>>> Hi Andor,
> >> > > > >>>>>>>>>
> >> > > > >>>>>>>>> Is anyone working on ZK-2778? I can pick it up if there
> >> is no
> >> > > > >>> one
> >> > > > >>>>>>> working
> >> > > > >>>>>>>>> on it yet.
> >> > > > >>>>>>>>>
> >> > > > >>>>>>>>> I'll open a 3.5 PR for ZK-3104 today.
> >> > > > >>>>>>>>>
> >> > > > >>>>>>>>> Fangmin
> >> > > > >>>>>>>>>
> >> > > > >>>>>>>>> On Fri, Oct 26, 2018 at 3:33 AM Andor Molnar <
> >> > > > >> andor@apache.org>
> >> > > > >>>>>> wrote:
> >> > > > >>>>>>>>>
> >> > > > >>>>>>>>>> Hi folks,
> >> > > > >>>>>>>>>>
> >> > > > >>>>>>>>>> You’ve probably realised lots of update emails coming
> >> from
> >> > > > >>> Jira.
> >> > > > >>>>>> Please
> >> > > > >>>>>>>>>> be aware that we’ve updated a bunch of open
> >> blocker/critical
> >> > > > >>> 3.5
> >> > > > >>>>>>> tickets to
> >> > > > >>>>>>>>>> reflect to what we discussed in this email.
> >> > > > >>>>>>>>>>
> >> > > > >>>>>>>>>> If you open up the following jira filter:
> >> > > > >>>>>>>>>>
> >> > > > >>>>>>>>>> project = ZooKeeper and resolution = Unresolved and
> >> > > > >> fixVersion
> >> > > > >>> =
> >> > > > >>>>>> 3.5.5
> >> > > > >>>>>>>>>> AND priority in (blocker, critical) ORDER BY priority
> >> DESC,
> >> > > > >>> key ASC
> >> > > > >>>>>>>>>>
> >> > > > >>>>>>>>>> You’ll see the most up-to-date list of tickets which
> >> need to
> >> > > > >> be
> >> > > > >>>>>>> addressed
> >> > > > >>>>>>>>>> before the stable 3.5 release.
> >> > > > >>>>>>>>>>
> >> > > > >>>>>>>>>> Thank you for your efforts to get this done.
> >> > > > >>>>>>>>>>
> >> > > > >>>>>>>>>> Fangmin, ZK-3104 is waiting for backport, but ticket has
> >> > > > >>> already
> >> > > > >>>>> been
> >> > > > >>>>>>>>>> resolved. Have you created a separate ticket for the
> >> backport
> >> > > > >>> or
> >> > > > >>>>>> shall
> >> > > > >>>>>>> I
> >> > > > >>>>>>>>>> just reopen it with the right fix versions?
> >> > > > >>>>>>>>>>
> >> > > > >>>>>>>>>> Thanks,
> >> > > > >>>>>>>>>> Andor
> >> > > > >>>>>>>>>>
> >> > > > >>>>>>>>>>
> >> > > > >>>>>>>>>>
> >> > > > >>>>>>>>>>> On 2018. Oct 8., at 12:34, Andor Molnar <
> >> andor@apache.org>
> >> > > > >>> wrote:
> >> > > > >>>>>>>>>>>
> >> > > > >>>>>>>>>>> Hi,
> >> > > > >>>>>>>>>>>
> >> > > > >>>>>>>>>>> Let me summarize and give a quick update on the
> >> outstanding
> >> > > > >>> issues
> >> > > > >>>>>> for
> >> > > > >>>>>>>>>> 3.5 GA:
> >> > > > >>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
> >> > > > >>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock between
> >> follower
> >> > > > >>> sync
> >> > > > >>>>>> with
> >> > > > >>>>>>>>>> leader and follower receiving external connection
> >> requests.)
> >> > > > >>>>>>>>>>> - ZOOKEEPER-3021 Migrate project structure to Maven
> >> > > > >> (ongoing)
> >> > > > >>>>>>>>>>> - ZOOKEEPER-925 Docs generation to Maven
> >> > > > >>>>>>>>>>> - ZOOKEEPER-3104 (waiting for backport)
> >> > > > >>>>>>>>>>> - ZOOKEEPER-3125 (waiting for backport PR #647)
> >> > > > >>>>>>>>>>>
> >> > > > >>>>>>>>>>> The 2 Maven related tickets are no-brainers as well as
> >> the
> >> > > > >>>>>> backports.
> >> > > > >>>>>>>>>> ZK-2778 has been picked up by Maoling (thanks!) as far
> >> as I
> >> > > > >> can
> >> > > > >>>>> see,
> >> > > > >>>>>>>>>> ZK-1818 is the only one waiting for a volunteer.
> >> > > > >>>>>>>>>>> Please correct me if I’ve missed something.
> >> > > > >>>>>>>>>>>
> >> > > > >>>>>>>>>>> Regards,
> >> > > > >>>>>>>>>>> Andor
> >> > > > >>>>>>>>>>>
> >> > > > >>>>>>>>>>>
> >> > > > >>>>>>>>>>>
> >> > > > >>>>>>>>>>>
> >> > > > >>>>>>>>>>>> On 2018. Sep 28., at 18:32, Tamas Penzes
> >> > > > >>>>>> <tamaas@cloudera.com.INVALID
> >> > > > >>>>>>>>
> >> > > > >>>>>>>>>> wrote:
> >> > > > >>>>>>>>>>>> Hi All,
> >> > > > >>>>>>>>>>>>
> >> > > > >>>>>>>>>>>> I would add ZOOKEEPER-3021
> >> > > > >>>>>>>>>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-3021>
> >> > > > >>> Migrate
> >> > > > >>>>>>> project
> >> > > > >>>>>>>>>>>> structure to Maven build as a blocker too. Since the
> >> > > > >>> migration
> >> > > > >>>>> has
> >> > > > >>>>>>>>>> started
> >> > > > >>>>>>>>>>>> it would be good to finish before releasing ZK 3.5.x
> >> GA.
> >> > > > >>>>>>>>>>>>
> >> > > > >>>>>>>>>>>> ZOOKEEPER-925 <
> >> > > > >>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-925
> >> > > > >>>>>>>
> >> > > > >>>>>>>>>> replace
> >> > > > >>>>>>>>>>>> our forrest site and documentation generation might
> >> also
> >> > > > >> be a
> >> > > > >>>>> good
> >> > > > >>>>>>>>>> idea,
> >> > > > >>>>>>>>>>>> since then we could deliver the new MarkDown based
> >> > > > >>> documentation.
> >> > > > >>>>>>>>>>>>
> >> > > > >>>>>>>>>>>> Regards, Tamaas
> >> > > > >>>>>>>>>>>>
> >> > > > >>>>>>>>>>>> On Fri, Sep 14, 2018 at 10:09 AM Fangmin Lv <
> >> > > > >>> lvfangmin@gmail.com
> >> > > > >>>>>>
> >> > > > >>>>>>>>>> wrote:
> >> > > > >>>>>>>>>>>>> Oh, sorry for the confusion, I should provide more
> >> > > > >> context.
> >> > > > >>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>> The leader uses on-disk txn sync with followers if the peer zxid is not
> >> > > > >>>>>>>>>>>>> in its in-memory commit log; the code is here: Leader on-disk txn sync
> >> > > > >>>>>>>>>>>>> <https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/quorum/LearnerHandler.java#L774>.
> >> > > > >>>>>>>>>>>>> There is a bug where gaps can appear in the txn files, for example after
> >> > > > >>>>>>>>>>>>> a snap sync, so the peer may miss txns because of this.
> >> > > > >>>>>>>>>>>>> The option to disable it is snapshotSizeFactor
> >> > > > >>>>>>>>>>>>> <https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/ZKDatabase.java#L81>;
> >> > > > >>>>>>>>>>>>> setting it to -1 disables this feature. On 3.5, it would be better to have a
> >> > > > >>>>>>>>>>>>> PR that sets this to -1 by default. That may lead to more SNAP syncs, but from
> >> > > > >>>>>>>>>>>>> what we see in production that doesn't seem to be a big problem.
> >> > > > >>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>> I can send out the diff to disable it by default on 3.5 if you guys think
> >> > > > >>>>>>>>>>>>> this is the right way to go.
> >> > > > >>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>> Thanks,
> >> > > > >>>>>>>>>>>>> Fangmin
> >> > > > >>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>> On Thu, Sep 13, 2018 at 1:58 AM Andor Molnar <
> >> > > > >>> andor@apache.org>
> >> > > > >>>>>>>>>> wrote:
> >> > > > >>>>>>>>>>>>>> What’s needed to turn it off?
> >> > > > >>>>>>>>>>>>>> Do we need a PR or it’s just a config option?
> >> > > > >>>>>>>>>>>>>> Shall we implement a feature switch for that and
> >> turn it
> >> > > > >>> off by
> >> > > > >>>>>>>>>> default?
> >> > > > >>>>>>>>>>>>>> Sorry I don’t have too much insight on disk txn sync.
> >> > > > >>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>> Andor
> >> > > > >>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>> On 2018. Sep 13., at 9:16, Fangmin Lv <
> >> > > > >>> lvfangmin@gmail.com>
> >> > > > >>>>>>> wrote:
> >> > > > >>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>> And to be clear, ZOOKEEPER-2418 is actually just one case of
> >> > > > >>>>>>>>>>>>>>> inconsistency which could be caused by on-disk txn sync. As I
> >> > > > >>>>>>>>>>>>>>> mentioned in a newer JIRA, ZOOKEEPER-2846
> >> > > > >>>>>>>>>>>>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-2846>, a snap sync
> >> > > > >>>>>>>>>>>>>>> or txn sync could also leave a gap of txns in the txn file, which is
> >> > > > >>>>>>>>>>>>>>> a more common case that could trigger this issue.
> >> > > > >>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>> I would suggest turning off on-disk txn sync by default for now to
> >> > > > >>>>>>>>>>>>>>> avoid this issue. After we have finished ZOOKEEPER-3114, we can use
> >> > > > >>>>>>>>>>>>>>> that to validate the on-disk txns during syncing.
> >> > > > >>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>> Thanks,
> >> > > > >>>>>>>>>>>>>>> Fangmin
> >> > > > >>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>> On Wed, Sep 12, 2018 at 9:55 AM Fangmin Lv <
> >> > > > >>>>> lvfangmin@gmail.com
> >> > > > >>>>>>>
> >> > > > >>>>>>>>>>>>> wrote:
> >> > > > >>>>>>>>>>>>>>>> Andor,
> >> > > > >>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>> ZOOKEEPER-3114 is about adding real-time digest checking to help
> >> > > > >>>>>>>>>>>>>>>> detect inconsistency. It's a new feature with a large amount of code
> >> > > > >>>>>>>>>>>>>>>> change. I'll start upstreaming it part by part, but I don't expect it
> >> > > > >>>>>>>>>>>>>>>> to be merged in the next few weeks. So yes, it's a nice-to-have, but
> >> > > > >>>>>>>>>>>>>>>> definitely not a blocker for 3.5.
> >> > > > >>>>>>>>>>>>>>>> Thanks,
> >> > > > >>>>>>>>>>>>>>>> Fangmin
> >> > > > >>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>> On Wed, Sep 12, 2018 at 2:55 AM Andor Molnar <
> >> > > > >>>>> andor@apache.org
> >> > > > >>>>>>>
> >> > > > >>>>>>>>>>>>> wrote:
> >> > > > >>>>>>>>>>>>>>>>> Fangmin,
> >> > > > >>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>> Sorry, I just noticed that you want to include the
> >> > > > >>>>> consistency
> >> > > > >>>>>>>>>> fixes
> >> > > > >>>>>>>>>>>>> in
> >> > > > >>>>>>>>>>>>>>>>> the stable version which is fine. Let’s finish the
> >> > > > >>> backports
> >> > > > >>>>>> and
> >> > > > >>>>>>>>>>>>> we’ll
> >> > > > >>>>>>>>>>>>>> be
> >> > > > >>>>>>>>>>>>>>>>> done with them.
> >> > > > >>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>> ZOOKEEPER-3114 is essentially a new feature, I
> >> > > > >> wouldn’t
> >> > > > >>>>> block
> >> > > > >>>>>>> 3.5
> >> > > > >>>>>>>>>>>>> with
> >> > > > >>>>>>>>>>>>>>>>> that. What do you think?
> >> > > > >>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>> Andor
> >> > > > >>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>> On 2018. Sep 12., at 11:52, Andor Molnar <
> >> > > > >>> andor@apache.org
> >> > > > >>>>>>
> >> > > > >>>>>>>>>> wrote:
> >> > > > >>>>>>>>>>>>>>>>>> Cool, thanks for the clarification.
> >> > > > >>>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>> The updated list is as follows:
> >> > > > >>>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>> - ZOOKEEPER-236 (SSL/TLS support for Atomic
> >> Broadcast
> >> > > > >>>>>> protocol)
> >> > > > >>>>>>>>>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
> >> > > > >>>>>>>>>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock
> >> between
> >> > > > >>>>> follower
> >> > > > >>>>>>> sync
> >> > > > >>>>>>>>>>>>> with
> >> > > > >>>>>>>>>>>>>>>>> leader and follower receiving external connection
> >> > > > >>> requests.)
> >> > > > >>>>>>>>>>>>>>>>>> The following are not critical and no blockers
> >> for
> >> > > > >> the
> >> > > > >>>>> stable
> >> > > > >>>>>>>>>>>>> release:
> >> > > > >>>>>>>>>>>>>>>>>> Waiting for to be ported to 3.5:
> >> > > > >>>>>>>>>>>>>>>>>> - ZOOKEEPER-3104
> >> > > > >>>>>>>>>>>>>>>>>> - ZOOKEEPER-3125
> >> > > > >>>>>>>>>>>>>>>>>> - ZOOKEEPER-3127
> >> > > > >>>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>> New feature:
> >> > > > >>>>>>>>>>>>>>>>>> - ZOOKEEPER-3114 (fixes ZOOKEEPER-2184 too)
> >> > > > >>>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>> Regards,
> >> > > > >>>>>>>>>>>>>>>>>> Andor
> >> > > > >>>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>>> On 2018. Sep 12., at 0:42, Fangmin Lv <
> >> > > > >>>>> lvfangmin@gmail.com>
> >> > > > >>>>>>>>>> wrote:
> >> > > > >>>>>>>>>>>>>>>>>>> Hi Andor,
> >> > > > >>>>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>>> That's the on disk txn feature, which was
> >> disabled
> >> > > > >>>>>> internally
> >> > > > >>>>>>>>>> after
> >> > > > >>>>>>>>>>>>>> we
> >> > > > >>>>>>>>>>>>>>>>>>> found the potentially inconsistent issue. The
> >> only
> >> > > > >>>>> solution
> >> > > > >>>>>> we
> >> > > > >>>>>>>>>> have
> >> > > > >>>>>>>>>>>>>>>>> for now
> >> > > > >>>>>>>>>>>>>>>>>>> is waiting for the new digest checking feature I
> >> > > > >>> mentioned
> >> > > > >>>>>> in
> >> > > > >>>>>>>>>>>>>>>>>>> ZOOKEEPER-3114.
> >> > > > >>>>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>>> I think there are some other critical consistent
> >> > > > >>> issues we
> >> > > > >>>>>>> just
> >> > > > >>>>>>>>>>>>> fixed
> >> > > > >>>>>>>>>>>>>>>>> on
> >> > > > >>>>>>>>>>>>>>>>>>> master recently: ZOOKEEPER-3104, ZOOKEEPER-3125,
> >> > > > >>>>>>>>>> ZOOKEEPER-3127, I
> >> > > > >>>>>>>>>>>>>>>>> think we
> >> > > > >>>>>>>>>>>>>>>>>>> should include that in the official 3.5 release
> >> as
> >> > > > >>> well.
> >> > > > >>>>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>>> Thanks,
> >> > > > >>>>>>>>>>>>>>>>>>> Fangmin
> >> > > > >>>>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>>> On Tue, Sep 11, 2018 at 11:58 AM Andor Molnár <
> >> > > > >>>>>>> andor@apache.org
> >> > > > >>>>>>>>>>>>>>>>> wrote:
> >> > > > >>>>>>>>>>>>>>>>>>>> Hi Jeelani,
> >> > > > >>>>>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>>>> Thanks for letting me know. I'm happy to
> >> remove it
> >> > > > >>> from
> >> > > > >>>>> the
> >> > > > >>>>>>>>>> list
> >> > > > >>>>>>>>>>>>> to
> >> > > > >>>>>>>>>>>>>>>>> get
> >> > > > >>>>>>>>>>>>>>>>>>>> closer to a stable release. :)
> >> > > > >>>>>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>>>> What's the feature which can be disabled to
> >> avoid
> >> > > > >>> data
> >> > > > >>>>>>>>>>>>>> inconsistency?
> >> > > > >>>>>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>>>> Andor
> >> > > > >>>>>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>>>> On 09/10/2018 11:33 PM, Mohamed Jeelani wrote:
> >> > > > >>>>>>>>>>>>>>>>>>>>> Thanks Andor for compiling this. Should we be
> >> > > > >>> ignoring
> >> > > > >>>>>>>>>>>>>>>>> ZOOKEEPER-2418 as
> >> > > > >>>>>>>>>>>>>>>>>>>> well? This exists in 3.4 as well and the
> >> feature
> >> > > > >> can
> >> > > > >>> be
> >> > > > >>>>>>>>>> disabled.
> >> > > > >>>>>>>>>>>>> We
> >> > > > >>>>>>>>>>>>>>>>> are
> >> > > > >>>>>>>>>>>>>>>>>>>> working on a longer term fix for it in 3.6.
> >> > > > >>>>>>>>>>>>>>>>>>>>> Regards,
> >> > > > >>>>>>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>>>>> Jeelani
> >> > > > >>>>>>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>>>>> On 9/10/18, 5:19 AM, "Andor Molnar"
> >> > > > >>>>>>>>>> <andor@cloudera.com.INVALID
> >> > > > >>>>>>>>>>>>>>>>> wrote:
> >> > > > >>>>>>>>>>>>>>>>>>>>> Fine.
> >> > > > >>>>>>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>>>>> I'm happy to ignore 1549, 2846 and 2930.
> >> Still we
> >> > > > >>> have
> >> > > > >>>>> the
> >> > > > >>>>>>>>>> list
> >> > > > >>>>>>>>>>>>>> of:
> >> > > > >>>>>>>>>>>>>>>>>>>>> - ZOOKEEPER-236 (SSL/TLS support for Atomic
> >> > > > >>> Broadcast
> >> > > > >>>>>>>>>> protocol)
> >> > > > >>>>>>>>>>>>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
> >> > > > >>>>>>>>>>>>>>>>>>>>> - ZOOKEEPER-2418 (txnlog diff sync can skip
> >> > > > >> sending
> >> > > > >>> some
> >> > > > >>>>>>>>>>>>>>>>>>>> transactions to
> >> > > > >>>>>>>>>>>>>>>>>>>>> followers)
> >> > > > >>>>>>>>>>>>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock
> >> > > > >> between
> >> > > > >>>>>> follower
> >> > > > >>>>>>>>>>>>> sync
> >> > > > >>>>>>>>>>>>>>>>>>>> with
> >> > > > >>>>>>>>>>>>>>>>>>>>> leader and follower receiving external
> >> connection
> >> > > > >>>>>> requests.)
> >> > > > >>>>>>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>>>>> SSL (ZK-236) is a feature which is essential for the 3.5 release,
> >> > > > >>>>>>>>>>>>>>>>>>>>> hence I wouldn't leave it out or postpone it to the next stable
> >> > > > >>>>>>>>>>>>>>>>>>>>> release. The PR has been out for a long time, so please get on with
> >> > > > >>>>>>>>>>>>>>>>>>>>> reviewing it.
> >> > > > >>>>>>>>>>>>>>>>>>>>> The rest are also long outstanding issues
> >> which
> >> > > > >> have
> >> > > > >>>>> been
> >> > > > >>>>>>>>>> found
> >> > > > >>>>>>>>>>>>> in
> >> > > > >>>>>>>>>>>>>>>>>>>> the 3.5
> >> > > > >>>>>>>>>>>>>>>>>>>>> branch.
> >> > > > >>>>>>>>>>>>>>>>>>>>> ZK-1818 is something which was found in 3.4
> >> and
> >> > > > >>> fixed in
> >> > > > >>>>>>> 3.4,
> >> > > > >>>>>>>>>>>>> but
> >> > > > >>>>>>>>>>>>>>>>>>>> never has
> >> > > > >>>>>>>>>>>>>>>>>>>>> been fixed in 3.5. Quite a serious issue if
> >> still
> >> > > > >>>>> present.
> >> > > > >>>>>>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>>>>> I think we should at least run some manual
> >> testing
> >> > > > >>> and
> >> > > > >>>>> see
> >> > > > >>>>>>> if
> >> > > > >>>>>>>>>> we
> >> > > > >>>>>>>>>>>>>>>>>>>> could
> >> > > > >>>>>>>>>>>>>>>>>>>>> repro any of these issues before going ahead
> >> with
> >> > > > >> a
> >> > > > >>>>> stable
> >> > > > >>>>>>>>>>>>>> release.
> >> > > > >>>>>>>>>>>>>>>>>>>>> Regards,
> >> > > > >>>>>>>>>>>>>>>>>>>>> Andor
> >> > > > >>>>>>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>>>>> On Fri, Sep 7, 2018 at 3:24 AM, Michael Han <
> >> > > > >>>>>>> hanm@apache.org>
> >> > > > >>>>>>>>>>>>>>>>> wrote:
> >> > > > >>>>>>>>>>>>>>>>>>>>>> I haven't gone through the entire list, but it looks like lots of the
> >> > > > >>>>>>>>>>>>>>>>>>>>>> JIRA issues listed in this thread, such as ZOOKEEPER-1549 and 2846,
> >> > > > >>>>>>>>>>>>>>>>>>>>>> also affect 3.4 releases. Should we scope these issues out?
> >> > > > >>>>>>>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>>>>>> I think historically the single outstanding blocking issue for a
> >> > > > >>>>>>>>>>>>>>>>>>>>>> stable 3.5 release is the reconfig feature and the security concerns
> >> > > > >>>>>>>>>>>>>>>>>>>>>> around it (somehow addressed in ZOOKEEPER-2014), and the alpha and
> >> > > > >>>>>>>>>>>>>>>>>>>>>> beta releases were created to stabilize that feature.
> >> > > > >>>>>>>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>>>>>> http://zookeeper-user.578899.n2.nabble.com/Zookeeper-with-SSL-release-date-tt7581744.html
> >> > > > >>>>>>>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>>>>>> So it looks like we are in good shape to release. Something that might
> >> > > > >>>>>>>>>>>>>>>>>>>>>> be worth doing to claim that the quality of 3.5 is on par with 3.4:
> >> > > > >>>>>>>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>>>>>> * Run Jepsen on 3.5 - 3.4 passed the test for the record
> >> > > > >>>>>>>>>>>>>>>>>>>>>> https://aphyr.com/posts/291-jepsen-zookeeper
> >> > > > >>>>>>>>>>>>>>>>>>>>>> * Fix all flaky tests on 3.5 - 3.4 has little or no flaky tests at
> >> > > > >>>>>>>>>>>>>>>>>>>>>> all.
> >> > > > >>>>>>>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>>>>>> On Tue, Sep 4, 2018 at 1:48 AM, Andor Molnar
> >> > > > >>>>>>>>>>>>>>>>>>>> <an...@cloudera.com.invalid>
> >> > > > >>>>>>>>>>>>>>>>>>>>>> wrote:
> >> > > > >>>>>>>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>>>>>>> Thanks Maoling! That would be huge help, I
> >> > > > >>> appreciate
> >> > > > >>>>>> it.
> >> > > > >>>>>>>>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>>>>>>> Andor
> >> > > > >>>>>>>>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>>>>>
> >> > > > >>>>>>>>>>
> >> > > > >>>>>>>
> >> > > > >>>>>>
> >> > > > >>>>>
> >> > > > >>>
> >> > > > >>
> >> > > >
> >> > > > --
> >> > >
> >> > >
> >> > > -- Enrico Olivelli
> >> > >
> >>
> >

Re: ZooKeeper 3.5 blocker issues

Posted by Norbert Kalmar <nk...@cloudera.com.INVALID>.
One more thing - for the CI integration, the Java coverage reports from Clover
(generated by Maven) should already be available on master. Basically we only
analyse zookeeper-server, but as it is planned to be separated, I set up
Clover so it starts from the root pom and aggregates the results.

On Thu, Dec 20, 2018 at 10:17 AM Norbert Kalmar <nk...@cloudera.com>
wrote:

> Sure, sorry about that, I didn't check thoroughly the static analyser
> part. Spotbugs works for me!
>
> Thanks Enrico!
>
> On Thu, Dec 20, 2018 at 10:10 AM Enrico Olivelli <eo...@gmail.com>
> wrote:
>
>> Great
>>
>> Il giorno gio 20 dic 2018 alle ore 10:07 Norbert Kalmar
>> <nk...@cloudera.com.invalid> ha scritto:
>> >
>> > Subtasks:
>> > Findbugs, checkstyle - https://issues.apache.org/jira/browse/ZOOKEEPER-3223
>>
>> We don't have checkstyle. In my experience, introducing checkstyle breaks
>> every pending patch.
>> I would like to narrow this issue down to "Spotbugs" and pick it up.
>>
>> > CI integration - https://issues.apache.org/jira/browse/ZOOKEEPER-3224
>> I would like to pick this up
>>
>> Enrico
>>
>> > Code coverage - https://issues.apache.org/jira/browse/ZOOKEEPER-3225 - I
>> > already started this one and some of it is committed with the patch, so I
>> > will continue to work on it.
>> > Recipes and contrib - https://issues.apache.org/jira/browse/ZOOKEEPER-3171
>> > - Already on it, recipes are done, PR available soon.
>> > Assembly - https://issues.apache.org/jira/browse/ZOOKEEPER-3029
>> >
>> > These are the tasks left that I can think of. If anything is missing, feel
>> > free to create a jira, or let me know.
>> > For the ones I'm already working on - 3225, 3171 - I left a comment. Those
>> > should be ready this week.
>>
>> >
>> > Thanks,
>> > Norbert
>> >
>> >
>> > On Thu, Dec 20, 2018 at 9:07 AM Enrico Olivelli <eo...@gmail.com>
>> wrote:
>> >
>> > > Great.
>> > > Can you create JIRA tickets for the remaining subtasks, so that I can
>> > > pick them up?
>> > > I volunteer for spotbugs and for CI integration, but let's see the list.
>> > > Enrico
>> > >
>> > > Il gio 20 dic 2018, 07:21 Andor Molnar <an...@apache.org> ha scritto:
>> > >
>> > > > Ok. Looks like ant still works properly, so let’s commit this patch
>> > > > and you guys can collaborate to polish the Maven build.
>> > > >
>> > > > For now, it’s master-only.
>> > > >
>> > > > Thanks,
>> > > > Andor
>> > > >
>> > > >
>> > > >
>> > > > > On 2018. Dec 19., at 16:44, Norbert Kalmar
>> > > <nk...@cloudera.com.INVALID>
>> > > > wrote:
>> > > > >
>> > > > > Thank you Enrico, I agree that we could commit this patch in its
>> > > > > current state; it fulfills the original jira anyway.
>> > > > >
>> > > > > I'll see what's wrong with the Java tests, but honestly, it looks
>> > > > > like they're just flaky... they run well on local builds with 8 threads.
>> > > > >
>> > > > > Regards,
>> > > > > Norbert
>> > > > >
>> > > > > On Wed, Dec 19, 2018 at 2:50 PM Tamas Penzes
>> > > <tamaas@cloudera.com.invalid
>> > > > >
>> > > > > wrote:
>> > > > >
>> > > > >> Hi All,
>> > > > >>
>> > > > >> For the assembly task I would promote the way HBase works.
>> > > > >> They create a pure source and a bin tarball separately. Please see
>> > > > >> how they create a release here:
>> > > > >> https://github.com/apache/hbase/blob/master/dev-support/make_rc.sh
>> > > > >> We could probably use the well known "copy+paste technology" to have
>> > > > >> it within ZooKeeper the same way. ;-)
>> > > > >>
>> > > > >> Regards, Tamaas
>> > > > >>
>> > > > >> On Wed, Dec 19, 2018 at 2:28 PM Enrico Olivelli <
>> eolivelli@gmail.com>
>> > > > >> wrote:
>> > > > >>
>> > > > >>> Great work Norbert
>> > > > >>> If you want I can help, especially with rat, findbugs (we need to
>> > > > >>> switch to spotbugs anyway) and the OWASP stuff (I recently started
>> > > > >>> using the Maven plugin in other projects).
>> > > > >>> But I am not sure how I can help you concretely if we do not commit
>> > > > >>> your work.
>> > > > >>> We could commit the work as it is now, leaving "ant" as the official
>> > > > >>> build method, but having the poms committed will ease collaboration.
>> > > > >>>
>> > > > >>> We will also have to work on CI jobs; I can help with that part as well.
>> > > > >>>
>> > > > >>> Enrico
>> > > > >>>
>> > > > >>> Il giorno mer 19 dic 2018 alle ore 12:26 Norbert Kalmar
>> > > > >>> <nk...@cloudera.com.invalid> ha scritto:
>> > > > >>>>
>> > > > >>>> Hi everyone,
>> > > > >>>>
>> > > > >>>> Some update on the maven migration: I had a few bumps here and there
>> > > > >>>> (just look at the latest patch Andor linked -
>> > > > >>>> https://github.com/apache/zookeeper/pull/708 - you can see it in the
>> > > > >>>> commits).
>> > > > >>>> The current state is that the build works and tests run, but reports
>> > > > >>>> like findbugs, clover etc. are not yet implemented. Maven usually has
>> > > > >>>> plugins for them, but it's not always trivial, especially with the C
>> > > > >>>> client. The assembly is also left to be done, but it should be fairly
>> > > > >>>> easy to produce a tarball similar to the one ant builds (although this
>> > > > >>>> will also be an interesting task, as ant does some strange things,
>> > > > >>>> like duplicating the sources of most contrib projects).
>> > > > >>>>
>> > > > >>>> I had a separate jira for the recipes and contrib maven build. I do
>> > > > >>>> not have an open PR for it, but recipes is done and I am now working
>> > > > >>>> on the contrib projects. Most of them are built manually and never get
>> > > > >>>> called from the main build.xml. I will not integrate these into the
>> > > > >>>> maven build either. One reason is that there are plans to remove some
>> > > > >>>> of them from the ZK repo anyway. The other reason is that, for
>> > > > >>>> starters, we want to replicate the ant build as closely as possible,
>> > > > >>>> without doing any nasty workarounds in maven to achieve that. From
>> > > > >>>> there, once the build is stable and proven to have all the
>> > > > >>>> functionality required for build and release, we can improve it and
>> > > > >>>> use maven's advantages to shape the build of ZooKeeper.
>> > > > >>>>
>> > > > >>>> Right now, I am trying to stabilize the build as much as possible.
>> > > > >>>> Andor also fixed some flaky C tests that, for some strange reason,
>> > > > >>>> became extremely flaky with the maven build:
>> > > > >>>> https://github.com/apache/zookeeper/pull/740
>> > > > >>>>
>> > > > >>>> Regards,
>> > > > >>>> Norbert
>> > > > >>>>
>> > > > >>>> On Tue, Dec 18, 2018 at 9:52 AM Andor Molnar
>> > > > >> <andor@cloudera.com.invalid
>> > > > >>>>
>> > > > >>>> wrote:
>> > > > >>>>
>> > > > >>>>> Sure, good point. Let's put it on the list.
>> > > > >>>>>
>> > > > >>>>> Andor
>> > > > >>>>>
>> > > > >>>>>
>> > > > >>>>> On Tue, Dec 18, 2018 at 12:17 AM Patrick Hunt <
>> phunt@apache.org>
>> > > > >>> wrote:
>> > > > >>>>>
>> > > > >>>>>> Are folks OK to wait on that OWASP issue I documented over the
>> > > > >>>>>> weekend? afaict we are not affected but it would be good to get
>> > > > >>>>>> another pair of eyes on it.
>> > > > >>>>>>
>> > > > >>>>>> Patrick
>> > > > >>>>>>
>> > > > >>>>>> On Mon, Dec 17, 2018 at 2:55 PM Andor Molnár <
>> andor@apache.org>
>> > > > >>> wrote:
>> > > > >>>>>>
>> > > > >>>>>>> Hi team,
>> > > > >>>>>>>
>> > > > >>>>>>>
>> > > > >>>>>>> I'm proud to announce that, thanks to the joint effort of the
>> > > > >>>>>>> community, the 3.5 blockers list has become empty:
>> > > > >>>>>>>
>> > > > >>>>>>> "project = ZooKeeper AND resolution = Unresolved AND fixVersion = 3.5.5
>> > > > >>>>>>> AND priority in (blocker, critical) ORDER BY priority DESC, key ASC"
>> > > > >>>>>>>
>> > > > >>>>>>>
>> > > > >>>>>>> Well... almost. All the blocker issues are gone, but we still have
>> > > > >>>>>>> the Maven migration to complete before the stable release. If you
>> > > > >>>>>>> have some free cycles, please join us in testing the Maven build on
>> > > > >>>>>>> this PR:
>> > > > >>>>>>>
>> > > > >>>>>>> https://github.com/apache/zookeeper/pull/708
>> > > > >>>>>>>
>> > > > >>>>>>> I hope we can merge it pretty soon.
>> > > > >>>>>>>
>> > > > >>>>>>>
>> > > > >>>>>>> In terms of the builds, the weather on the 3.5 branch is quite sunny
>> > > > >>>>>>> nowadays:
>> > > > >>>>>>>
>> > > > >>>>>>> https://builds.apache.org/view/S-Z/view/ZooKeeper/
>> > > > >>>>>>>
>> > > > >>>>>>> The Java 11 build is still having some difficulties, which hopefully
>> > > > >>>>>>> I can address before the holidays:
>> > > > >>>>>>>
>> > > > >>>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-3204
>> > > > >>>>>>>
>> > > > >>>>>>>
>> > > > >>>>>>> If you happen to know about something which is important from 3.5's
>> > > > >>>>>>> perspective and missing from the above, please don't hesitate to
>> > > > >>>>>>> share.
>> > > > >>>>>>>
>> > > > >>>>>>>
>> > > > >>>>>>> Happy ZooKeeping!
>> > > > >>>>>>>
>> > > > >>>>>>> Andor
>> > > > >>>>>>>
>> > > > >>>>>>>
>> > > > >>>>>>>
>> > > > >>>>>>> On 11/2/18 21:12, Fangmin Lv wrote:
>> > > > >>>>>>>> Andor,
>> > > > >>>>>>>>
>> > > > >>>>>>>> Here is the PR to port ZK-3104 from master to 3.5:
>> > > > >>>>>>>> https://github.com/apache/zookeeper/pull/685.
>> > > > >>>>>>>>
>> > > > >>>>>>>> Fangmin
>> > > > >>>>>>>>
>> > > > >>>>>>>> On Fri, Nov 2, 2018 at 11:46 AM Fangmin Lv <
>> > > > >> lvfangmin@gmail.com>
>> > > > >>>>>> wrote:
>> > > > >>>>>>>>
>> > > > >>>>>>>>> Hi Andor,
>> > > > >>>>>>>>>
>> > > > >>>>>>>>> Is anyone working on ZK-2778? I can pick it up if there is no one
>> > > > >>>>>>>>> working on it yet.
>> > > > >>>>>>>>>
>> > > > >>>>>>>>> I'll open a 3.5 PR for ZK-3104 today.
>> > > > >>>>>>>>>
>> > > > >>>>>>>>> Fangmin
>> > > > >>>>>>>>>
>> > > > >>>>>>>>> On Fri, Oct 26, 2018 at 3:33 AM Andor Molnar <
>> > > > >> andor@apache.org>
>> > > > >>>>>> wrote:
>> > > > >>>>>>>>>
>> > > > >>>>>>>>>> Hi folks,
>> > > > >>>>>>>>>>
>> > > > >>>>>>>>>> You’ve probably noticed lots of update emails coming from Jira.
>> > > > >>>>>>>>>> Please be aware that we’ve updated a bunch of open blocker/critical
>> > > > >>>>>>>>>> 3.5 tickets to reflect what we discussed in this email.
>> > > > >>>>>>>>>>
>> > > > >>>>>>>>>> If you open up the following jira filter:
>> > > > >>>>>>>>>>
>> > > > >>>>>>>>>> project = ZooKeeper and resolution = Unresolved and fixVersion = 3.5.5
>> > > > >>>>>>>>>> AND priority in (blocker, critical) ORDER BY priority DESC, key ASC
>> > > > >>>>>>>>>>
>> > > > >>>>>>>>>> You’ll see the most up-to-date list of tickets which need to be
>> > > > >>>>>>>>>> addressed before the stable 3.5 release.
>> > > > >>>>>>>>>>
>> > > > >>>>>>>>>> Thank you for your efforts to get this done.
>> > > > >>>>>>>>>>
>> > > > >>>>>>>>>> Fangmin, ZK-3104 is waiting for backport, but the ticket has already
>> > > > >>>>>>>>>> been resolved. Have you created a separate ticket for the backport or
>> > > > >>>>>>>>>> shall I just reopen it with the right fix versions?
>> > > > >>>>>>>>>>
>> > > > >>>>>>>>>> Thanks,
>> > > > >>>>>>>>>> Andor
>> > > > >>>>>>>>>>
>> > > > >>>>>>>>>>
>> > > > >>>>>>>>>>
>> > > > >>>>>>>>>>> On 2018. Oct 8., at 12:34, Andor Molnar <
>> andor@apache.org>
>> > > > >>> wrote:
>> > > > >>>>>>>>>>>
>> > > > >>>>>>>>>>> Hi,
>> > > > >>>>>>>>>>>
>> > > > >>>>>>>>>>> Let me summarize and give a quick update on the outstanding issues
>> > > > >>>>>>>>>>> for 3.5 GA:
>> > > > >>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
>> > > > >>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock between follower sync
>> > > > >>>>>>>>>>> with leader and follower receiving external connection requests.)
>> > > > >>>>>>>>>>> - ZOOKEEPER-3021 Migrate project structure to Maven (ongoing)
>> > > > >>>>>>>>>>> - ZOOKEEPER-925 Docs generation to Maven
>> > > > >>>>>>>>>>> - ZOOKEEPER-3104 (waiting for backport)
>> > > > >>>>>>>>>>> - ZOOKEEPER-3125 (waiting for backport PR #647)
>> > > > >>>>>>>>>>>
>> > > > >>>>>>>>>>> The 2 Maven related tickets are no-brainers as well as the backports.
>> > > > >>>>>>>>>>> ZK-2778 has been picked up by Maoling (thanks!) as far as I can see,
>> > > > >>>>>>>>>>> ZK-1818 is the only one waiting for a volunteer.
>> > > > >>>>>>>>>>> Please correct me if I’ve missed something.
>> > > > >>>>>>>>>>>
>> > > > >>>>>>>>>>> Regards,
>> > > > >>>>>>>>>>> Andor
>> > > > >>>>>>>>>>>
>> > > > >>>>>>>>>>>
>> > > > >>>>>>>>>>>
>> > > > >>>>>>>>>>>
>> > > > >>>>>>>>>>>> On 2018. Sep 28., at 18:32, Tamas Penzes
>> > > > >>>>>> <tamaas@cloudera.com.INVALID
>> > > > >>>>>>>>
>> > > > >>>>>>>>>> wrote:
>> > > > >>>>>>>>>>>> Hi All,
>> > > > >>>>>>>>>>>>
>> > > > >>>>>>>>>>>> I would add ZOOKEEPER-3021
>> > > > >>>>>>>>>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-3021> Migrate
>> > > > >>>>>>>>>>>> project structure to Maven build as a blocker too. Since the
>> > > > >>>>>>>>>>>> migration has started, it would be good to finish it before
>> > > > >>>>>>>>>>>> releasing ZK 3.5.x GA.
>> > > > >>>>>>>>>>>>
>> > > > >>>>>>>>>>>> ZOOKEEPER-925 <https://issues.apache.org/jira/browse/ZOOKEEPER-925>
>> > > > >>>>>>>>>>>> (replace our forrest site and documentation generation) might also
>> > > > >>>>>>>>>>>> be a good idea, since then we could deliver the new Markdown-based
>> > > > >>>>>>>>>>>> documentation.
>> > > > >>>>>>>>>>>>
>> > > > >>>>>>>>>>>> Regards, Tamaas
>> > > > >>>>>>>>>>>>
>> > > > >>>>>>>>>>>> On Fri, Sep 14, 2018 at 10:09 AM Fangmin Lv <
>> > > > >>> lvfangmin@gmail.com
>> > > > >>>>>>
>> > > > >>>>>>>>>> wrote:
>> > > > >>>>>>>>>>>>> Oh, sorry for the confusion, I should provide more context.
>> > > > >>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>> The leader will use on-disk txn sync with a follower if the peer zxid
>> > > > >>>>>>>>>>>>> is not in its in-memory commit log. The code is here: Leader on disk
>> > > > >>>>>>>>>>>>> txn sync
>> > > > >>>>>>>>>>>>> <https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/quorum/LearnerHandler.java#L774>.
>> > > > >>>>>>>>>>>>> There is a bug: there can potentially be a gap in the txn files, for
>> > > > >>>>>>>>>>>>> example after a snap sync, so it's possible that the peer will miss
>> > > > >>>>>>>>>>>>> txns because of this.
>> > > > >>>>>>>>>>>>> The option to disable it is snapshotSizeFactor
>> > > > >>>>>>>>>>>>> <https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/ZKDatabase.java#L81>;
>> > > > >>>>>>>>>>>>> setting it to -1 will disable this feature. On 3.5, it's better to
>> > > > >>>>>>>>>>>>> have a PR to set this to -1 by default. That might lead to more SNAP
>> > > > >>>>>>>>>>>>> syncs, but from what we see in our prod it doesn't seem to be a big
>> > > > >>>>>>>>>>>>> problem.
>> > > > >>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>> I can send out the diff to disable it by default on 3.5 if you guys
>> > > > >>>>>>>>>>>>> think this is the right way to go.
>> > > > >>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>> Thanks,
>> > > > >>>>>>>>>>>>> Fangmin
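
For readers following along, here is a minimal Java sketch of the switch Fangmin describes: a negative snapshotSizeFactor yields a negative txn log size budget, which turns on-disk txn sync off. It is a simplified model, not the ZooKeeper source. The property name "zookeeper.snapshotSizeFactor" and the 0.33 default are assumptions based on what ZKDatabase appears to use, and the class and helper names (TxnLogSyncSwitch, txnLogSizeLimit, onDiskTxnSyncEnabled) are hypothetical.

```java
public class TxnLogSyncSwitch {
    // Assumed property name and default; check ZKDatabase in your ZooKeeper version.
    static final String SNAPSHOT_SIZE_FACTOR = "zookeeper.snapshotSizeFactor";
    static final double DEFAULT_SNAPSHOT_SIZE_FACTOR = 0.33;

    /** Reads the factor from the JVM system properties, falling back to the default. */
    static double readFactor() {
        return Double.parseDouble(
            System.getProperty(SNAPSHOT_SIZE_FACTOR,
                               Double.toString(DEFAULT_SNAPSHOT_SIZE_FACTOR)));
    }

    /** Hypothetical helper: txn log size budget derived from the latest snapshot size. */
    static long txnLogSizeLimit(long snapshotBytes, double factor) {
        return (long) (snapshotBytes * factor);
    }

    /** Hypothetical helper: a negative budget means on-disk txn sync is never used. */
    static boolean onDiskTxnSyncEnabled(long sizeLimit) {
        return sizeLimit >= 0;
    }

    public static void main(String[] args) {
        // Equivalent to starting the JVM with -Dzookeeper.snapshotSizeFactor=-1
        System.setProperty(SNAPSHOT_SIZE_FACTOR, "-1");
        long limit = txnLogSizeLimit(1_000_000L, readFactor());
        System.out.println("size limit = " + limit
            + ", on-disk txn sync enabled = " + onDiskTxnSyncEnabled(limit));
    }
}
```

With the feature off, a lagging follower whose zxid is not covered by the leader's in-memory commit log would fall back to a SNAP sync, which is the trade-off mentioned in the message above.
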
>> > > > >>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>> On Thu, Sep 13, 2018 at 1:58 AM Andor Molnar <
>> > > > >>> andor@apache.org>
>> > > > >>>>>>>>>> wrote:
>> > > > >>>>>>>>>>>>>> What’s needed to turn it off?
>> > > > >>>>>>>>>>>>>> Do we need a PR or it’s just a config option?
>> > > > >>>>>>>>>>>>>> Shall we implement a feature switch for that and
>> turn it
>> > > > >>> off by
>> > > > >>>>>>>>>> default?
>> > > > >>>>>>>>>>>>>> Sorry I don’t have too much insight on disk txn sync.
>> > > > >>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>> Andor
>> > > > >>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>> On 2018. Sep 13., at 9:16, Fangmin Lv <
>> > > > >>> lvfangmin@gmail.com>
>> > > > >>>>>>> wrote:
>> > > > >>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>> And to be clear, ZOOKEEPER-2418 is actually just one case of
>> > > > >>>>>>>>>>>>>>> inconsistency which could be caused by on-disk txn sync. As I
>> > > > >>>>>>>>>>>>>>> mentioned in a newer JIRA, ZOOKEEPER-2846
>> > > > >>>>>>>>>>>>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-2846>, a snap sync
>> > > > >>>>>>>>>>>>>>> or txn sync could also leave a gap of txns in the txn file, which is
>> > > > >>>>>>>>>>>>>>> a more common case that could trigger this issue.
>> > > > >>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>> I would suggest turning off on-disk txn sync by default for now to
>> > > > >>>>>>>>>>>>>>> avoid this issue. After we have finished ZOOKEEPER-3114, we can use
>> > > > >>>>>>>>>>>>>>> that to validate the on-disk txns during syncing.
>> > > > >>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>> Thanks,
>> > > > >>>>>>>>>>>>>>> Fangmin
>> > > > >>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>> On Wed, Sep 12, 2018 at 9:55 AM Fangmin Lv <
>> > > > >>>>> lvfangmin@gmail.com
>> > > > >>>>>>>
>> > > > >>>>>>>>>>>>> wrote:
>> > > > >>>>>>>>>>>>>>>> Andor,
>> > > > >>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>> ZOOKEEPER-3114 is about adding real time digest
>> > > > >> checking
>> > > > >>> to
>> > > > >>>>>> help
>> > > > >>>>>>>>>>>>>> detecting
>> > > > >>>>>>>>>>>>>>>> inconsistency, it's a new feature with amounts of
>> code
>> > > > >>>>> change.
>> > > > >>>>>>> I'll
>> > > > >>>>>>>>>>>>>> start
>> > > > >>>>>>>>>>>>>>>> upstream it part by part, but I don't expect it's
>> being
>> > > > >>>>> merged
>> > > > >>>>>> in
>> > > > >>>>>>>>>> the
>> > > > >>>>>>>>>>>>>> next
>> > > > >>>>>>>>>>>>>>>> few weeks. So yes, it's a nice to have, but
>> definitely
>> > > > >>> not a
>> > > > >>>>>>> block
>> > > > >>>>>>>>>> for
>> > > > >>>>>>>>>>>>>> 3.5.
>> > > > >>>>>>>>>>>>>>>> Thanks,
>> > > > >>>>>>>>>>>>>>>> Fangmin
>> > > > >>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>> On Wed, Sep 12, 2018 at 2:55 AM Andor Molnar <
>> > > > >>>>> andor@apache.org
>> > > > >>>>>>>
>> > > > >>>>>>>>>>>>> wrote:
>> > > > >>>>>>>>>>>>>>>>> Fangmin,
>> > > > >>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>> Sorry, I just noticed that you want to include the
>> > > > >>>>> consistency
>> > > > >>>>>>>>>> fixes
>> > > > >>>>>>>>>>>>> in
>> > > > >>>>>>>>>>>>>>>>> the stable version which is fine. Let’s finish the
>> > > > >>> backports
>> > > > >>>>>> and
>> > > > >>>>>>>>>>>>> we’ll
>> > > > >>>>>>>>>>>>>> be
>> > > > >>>>>>>>>>>>>>>>> done with them.
>> > > > >>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>> ZOOKEEPER-3114 is essentially a new feature, I
>> > > > >> wouldn’t
>> > > > >>>>> block
>> > > > >>>>>>> 3.5
>> > > > >>>>>>>>>>>>> with
>> > > > >>>>>>>>>>>>>>>>> that. What do you think?
>> > > > >>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>> Andor
>> > > > >>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>> On 2018. Sep 12., at 11:52, Andor Molnar <
>> > > > >>> andor@apache.org
>> > > > >>>>>>
>> > > > >>>>>>>>>> wrote:
>> > > > >>>>>>>>>>>>>>>>>> Cool, thanks for the clarification.
>> > > > >>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>> The updated list is as follows:
>> > > > >>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>> - ZOOKEEPER-236 (SSL/TLS support for Atomic
>> Broadcast
>> > > > >>>>>> protocol)
>> > > > >>>>>>>>>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
>> > > > >>>>>>>>>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock
>> between
>> > > > >>>>> follower
>> > > > >>>>>>> sync
>> > > > >>>>>>>>>>>>> with
>> > > > >>>>>>>>>>>>>>>>> leader and follower receiving external connection
>> > > > >>> requests.)
>> > > > >>>>>>>>>>>>>>>>>> The following are not critical and no blockers
>> for
>> > > > >> the
>> > > > >>>>> stable
>> > > > >>>>>>>>>>>>> release:
>> > > > >>>>>>>>>>>>>>>>>> Waiting for to be ported to 3.5:
>> > > > >>>>>>>>>>>>>>>>>> - ZOOKEEPER-3104
>> > > > >>>>>>>>>>>>>>>>>> - ZOOKEEPER-3125
>> > > > >>>>>>>>>>>>>>>>>> - ZOOKEEPER-3127
>> > > > >>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>> New feature:
>> > > > >>>>>>>>>>>>>>>>>> - ZOOKEEPER-3114 (fixes ZOOKEEPER-2184 too)
>> > > > >>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>> Regards,
>> > > > >>>>>>>>>>>>>>>>>> Andor
>> > > > >>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>>> On 2018. Sep 12., at 0:42, Fangmin Lv <
>> > > > >>>>> lvfangmin@gmail.com>
>> > > > >>>>>>>>>> wrote:
>> > > > >>>>>>>>>>>>>>>>>>> Hi Andor,
>> > > > >>>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>>> That's the on disk txn feature, which was
>> disabled
>> > > > >>>>>> internally
>> > > > >>>>>>>>>> after
>> > > > >>>>>>>>>>>>>> we
>> > > > >>>>>>>>>>>>>>>>>>> found the potentially inconsistent issue. The
>> only
>> > > > >>>>> solution
>> > > > >>>>>> we
>> > > > >>>>>>>>>> have
>> > > > >>>>>>>>>>>>>>>>> for now
>> > > > >>>>>>>>>>>>>>>>>>> is waiting for the new digest checking feature I
>> > > > >>> mentioned
>> > > > >>>>>> in
>> > > > >>>>>>>>>>>>>>>>>>> ZOOKEEPER-3114.
>> > > > >>>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>>> I think there are some other critical consistent
>> > > > >>> issues we
>> > > > >>>>>>> just
>> > > > >>>>>>>>>>>>> fixed
>> > > > >>>>>>>>>>>>>>>>> on
>> > > > >>>>>>>>>>>>>>>>>>> master recently: ZOOKEEPER-3104, ZOOKEEPER-3125,
>> > > > >>>>>>>>>> ZOOKEEPER-3127, I
>> > > > >>>>>>>>>>>>>>>>> think we
>> > > > >>>>>>>>>>>>>>>>>>> should include that in the official 3.5 release
>> as
>> > > > >>> well.
>> > > > >>>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>>> Thanks,
>> > > > >>>>>>>>>>>>>>>>>>> Fangmin
>> > > > >>>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>>> On Tue, Sep 11, 2018 at 11:58 AM Andor Molnár <
>> > > > >>>>>>> andor@apache.org
>> > > > >>>>>>>>>>>>>>>>> wrote:
>> > > > >>>>>>>>>>>>>>>>>>>> Hi Jeelani,
>> > > > >>>>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>>>> Thanks for letting me know. I'm happy to
>> remove it
>> > > > >>> from
>> > > > >>>>> the
>> > > > >>>>>>>>>> list
>> > > > >>>>>>>>>>>>> to
>> > > > >>>>>>>>>>>>>>>>> get
>> > > > >>>>>>>>>>>>>>>>>>>> closer to a stable release. :)
>> > > > >>>>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>>>> What's the feature which can be disabled to
>> avoid
>> > > > >>> data
>> > > > >>>>>>>>>>>>>> inconsistency?
>> > > > >>>>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>>>> Andor
>> > > > >>>>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>>>> On 09/10/2018 11:33 PM, Mohamed Jeelani wrote:
>> > > > >>>>>>>>>>>>>>>>>>>>> Thanks Andor for compiling this. Should we be
>> > > > >>> ignoring
>> > > > >>>>>>>>>>>>>>>>> ZOOKEEPER-2418 as
>> > > > >>>>>>>>>>>>>>>>>>>> well? This exists in 3.4 as well and the
>> feature
>> > > > >> can
>> > > > >>> be
>> > > > >>>>>>>>>> disabled.
>> > > > >>>>>>>>>>>>> We
>> > > > >>>>>>>>>>>>>>>>> are
>> > > > >>>>>>>>>>>>>>>>>>>> working on a longer term fix for it in 3.6.
>> > > > >>>>>>>>>>>>>>>>>>>>> Regards,
>> > > > >>>>>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>>>>> Jeelani
>> > > > >>>>>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>>>>> On 9/10/18, 5:19 AM, "Andor Molnar"
>> > > > >>>>>>>>>> <andor@cloudera.com.INVALID
>> > > > >>>>>>>>>>>>>>>>> wrote:
>> > > > >>>>>>>>>>>>>>>>>>>>> Fine.
>> > > > >>>>>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>>>>> I'm happy to ignore 1549, 2846 and 2930.
>> Still we
>> > > > >>> have
>> > > > >>>>> the
>> > > > >>>>>>>>>> list
>> > > > >>>>>>>>>>>>>> of:
>> > > > >>>>>>>>>>>>>>>>>>>>> - ZOOKEEPER-236 (SSL/TLS support for Atomic
>> > > > >>> Broadcast
>> > > > >>>>>>>>>> protocol)
>> > > > >>>>>>>>>>>>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
>> > > > >>>>>>>>>>>>>>>>>>>>> - ZOOKEEPER-2418 (txnlog diff sync can skip
>> > > > >> sending
>> > > > >>> some
>> > > > >>>>>>>>>>>>>>>>>>>> transactions to
>> > > > >>>>>>>>>>>>>>>>>>>>> followers)
>> > > > >>>>>>>>>>>>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock
>> > > > >> between
>> > > > >>>>>> follower
>> > > > >>>>>>>>>>>>> sync
>> > > > >>>>>>>>>>>>>>>>>>>> with
>> > > > >>>>>>>>>>>>>>>>>>>>> leader and follower receiving external
>> connection
>> > > > >>>>>> requests.)
>> > > > >>>>>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>>>>> SSL (ZK-236) is a feature which is essential for the 3.5 release,
>> > > > >>>>>>>>>>>>>>>>>>>>> hence I wouldn't leave it out or postpone it for the next stable
>> > > > >>>>>>>>>>>>>>>>>>>>> release. The PR has been out for a long time, get on reviewing please.
>> > > > >>>>>>>>>>>>>>>>>>>>> The rest are also long outstanding issues
>> which
>> > > > >> have
>> > > > >>>>> been
>> > > > >>>>>>>>>> found
>> > > > >>>>>>>>>>>>> in
>> > > > >>>>>>>>>>>>>>>>>>>> the 3.5
>> > > > >>>>>>>>>>>>>>>>>>>>> branch.
>> > > > >>>>>>>>>>>>>>>>>>>>> ZK-1818 is something which was found in 3.4
>> and
>> > > > >>> fixed in
>> > > > >>>>>>> 3.4,
>> > > > >>>>>>>>>>>>> but
>> > > > >>>>>>>>>>>>>>>>>>>> never has
>> > > > >>>>>>>>>>>>>>>>>>>>> been fixed in 3.5. Quite a serious issue if
>> still
>> > > > >>>>> present.
>> > > > >>>>>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>>>>> I think we should at least run some manual
>> testing
>> > > > >>> and
>> > > > >>>>> see
>> > > > >>>>>>> if
>> > > > >>>>>>>>>> we
>> > > > >>>>>>>>>>>>>>>>>>>> could
>> > > > >>>>>>>>>>>>>>>>>>>>> repro any of these issues before going ahead
>> with
>> > > > >> a
>> > > > >>>>> stable
>> > > > >>>>>>>>>>>>>> release.
>> > > > >>>>>>>>>>>>>>>>>>>>> Regards,
>> > > > >>>>>>>>>>>>>>>>>>>>> Andor
>> > > > >>>>>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>>>>> On Fri, Sep 7, 2018 at 3:24 AM, Michael Han <
>> > > > >>>>>>> hanm@apache.org>
>> > > > >>>>>>>>>>>>>>>>> wrote:
>> > > > >>>>>>>>>>>>>>>>>>>>>> I haven't gone through the entire list, but it looks like lots of the
>> > > > >>>>>>>>>>>>>>>>>>>>>> JIRA issues listed in this thread, such as ZOOKEEPER-1549 and 2846,
>> > > > >>>>>>>>>>>>>>>>>>>>>> also affect 3.4 releases. Should we scope these issues out?
>> > > > >>>>>>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>>>>>> I think historically the single outstanding blocking issue for a
>> > > > >>>>>>>>>>>>>>>>>>>>>> stable 3.5 release is the reconfig feature and security concerns
>> > > > >>>>>>>>>>>>>>>>>>>>>> around it (somehow addressed in ZOOKEEPER-2014), and the alpha and
>> > > > >>>>>>>>>>>>>>>>>>>>>> beta releases were created to stabilize that feature.
>> > > > >>>>>>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>>>>>> http://zookeeper-user.578899.n2.nabble.com/Zookeeper-with-SSL-release-date-tt7581744.html
>> > > > >>>>>>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>>>>>> So it looks like we are in good shape to release. Something that
>> > > > >>>>>>>>>>>>>>>>>>>>>> might be worth doing to claim the quality of 3.5 is on par with 3.4:
>> > > > >>>>>>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>>>>>> * Run Jepsen on 3.5 - 3.4 passed the test for the record
>> > > > >>>>>>>>>>>>>>>>>>>>>> https://aphyr.com/posts/291-jepsen-zookeeper
>> > > > >>>>>>>>>>>>>>>>>>>>>> * Fix all flaky tests on 3.5 - 3.4 has little or no flaky tests at
>> > > > >>>>>>>>>>>>>>>>>>>>>> all.
>> > > > >>>>>>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>>>>>> On Tue, Sep 4, 2018 at 1:48 AM, Andor Molnar
>> > > > >>>>>>>>>>>>>>>>>>>> <an...@cloudera.com.invalid>
>> > > > >>>>>>>>>>>>>>>>>>>>>> wrote:
>> > > > >>>>>>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>>>>>>> Thanks Maoling! That would be huge help, I
>> > > > >>> appreciate
>> > > > >>>>>> it.
>> > > > >>>>>>>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>>>>>>> Andor
>> > > > >>>>>>>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>
>> > > > >>>>>>>>>>
>> > > > >>>>>>>
>> > > > >>>>>>
>> > > > >>>>>
>> > > > >>>
>> > > > >>
>> > > >
>> > > > --
>> > >
>> > >
>> > > -- Enrico Olivelli
>> > >
>>
>
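
Note on the on-disk txn sync switch discussed in the quoted thread: the
messages say that setting snapshotSizeFactor to -1 disables on-disk txn log
sync. The sketch below only models that rule; it is not the ZooKeeper source.
The class and method names are hypothetical, and the system property name
"zookeeper.snapshotSizeFactor" and the default of 0.33 are assumptions that
should be checked against ZKDatabase in the release you run.

```java
// Minimal sketch, assuming a negative factor means "txn log sync disabled".
// All names here are illustrative, not the actual ZooKeeper API.
final class TxnLogSyncSketch {
    // Assumed property name and default; verify against ZKDatabase.
    static final String SNAPSHOT_SIZE_FACTOR = "zookeeper.snapshotSizeFactor";
    static final double DEFAULT_FACTOR = 0.33;

    // Reads the factor from system properties, falling back to the default
    // when the property is missing or not a number.
    static double readFactor() {
        String raw = System.getProperty(SNAPSHOT_SIZE_FACTOR);
        if (raw == null) {
            return DEFAULT_FACTOR;
        }
        try {
            return Double.parseDouble(raw);
        } catch (NumberFormatException e) {
            return DEFAULT_FACTOR;
        }
    }

    // The rule described in the thread: -1 (any negative value in this
    // sketch) turns on-disk txn log sync off.
    static boolean isTxnLogSyncEnabled(double factor) {
        return factor >= 0;
    }

    public static void main(String[] args) {
        double factor = readFactor();
        System.out.println("snapshotSizeFactor = " + factor);
        System.out.println("txn log sync enabled: " + isTxnLogSyncEnabled(factor));
    }
}
```

If the property name assumed above is right, starting the JVM with
-Dzookeeper.snapshotSizeFactor=-1 would make this sketch report the sync as
disabled, at the cost of more SNAP syncs as noted in the thread.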

Re: ZooKeeper 3.5 blocker issues

Posted by Norbert Kalmar <nk...@cloudera.com.INVALID>.
Sure, sorry about that, I didn't check thoroughly the static analyser part.
Spotbugs works for me!

Thanks Enrico!

On Thu, Dec 20, 2018 at 10:10 AM Enrico Olivelli <eo...@gmail.com>
wrote:

> Great
>
> Il giorno gio 20 dic 2018 alle ore 10:07 Norbert Kalmar
> <nk...@cloudera.com.invalid> ha scritto:
> >
> > Subtasks:
> > Findbugs, checkstyle -
> https://issues.apache.org/jira/browse/ZOOKEEPER-3223
>
> We don't have checkstyle. In my experience introducing checkstyle breaks
> every pending patch.
> I would like to narrow down this issue to "Spotbugs" and pick it up
>
> > CI integration - https://issues.apache.org/jira/browse/ZOOKEEPER-3224
> I would like to pick this up
>
> Enrico
>
> > Code coverage - https://issues.apache.org/jira/browse/ZOOKEEPER-3225 - I
> > already started this one and some of it is committed with the patch, so I
> > will continue to work on it.
> > Recipes and contrib -
> https://issues.apache.org/jira/browse/ZOOKEEPER-3171
> > - Already on it, recipes is done, PR soon available.
> > Assembly - https://issues.apache.org/jira/browse/ZOOKEEPER-3029
> >
> > These are the tasks left I can think of. If anything is missing, feel free
> > to create a jira, or let me know.
> > The ones I'm already working on - 3225, 3171 - I made a comment. Those
> > should be ready this week.
>
> >
> > Thanks,
> > Norbert
> >
> >
> > On Thu, Dec 20, 2018 at 9:07 AM Enrico Olivelli <eo...@gmail.com>
> wrote:
> >
> > > Great.
> > > Can you create JIRA tickets for the remaining subtasks, so that I can
> > > pick them up?
> > > I volunteer for spotbugs and for CI integration, but let's see the list
> > > Enrico
> > >
> > > Il gio 20 dic 2018, 07:21 Andor Molnar <an...@apache.org> ha scritto:
> > >
> > > > Ok. Looks like ant still works properly, so let’s commit this patch
> and
> > > > you guys can collaborate to polish the Maven build.
> > > >
> > > > For now, it’s master-only.
> > > >
> > > > Thanks,
> > > > Andor
> > > >
> > > >
> > > >
> > > > > On 2018. Dec 19., at 16:44, Norbert Kalmar
> > > <nk...@cloudera.com.INVALID>
> > > > wrote:
> > > > >
> > > > > Thank you Enrico, I agree that we could commit this patch in its
> > > > > current state, it fulfills the original jira anyways.
> > > > >
> > > > > I'll see what's wrong with the java tests, but honestly, it looks
> > > > > like they're just flaky... runs well on local builds with 8 threads.
> > > > >
> > > > > Regards,
> > > > > Norbert
> > > > >
> > > > > On Wed, Dec 19, 2018 at 2:50 PM Tamas Penzes
> > > <tamaas@cloudera.com.invalid
> > > > >
> > > > > wrote:
> > > > >
> > > > >> Hi All,
> > > > >>
> > > > >> For assembly task I would promote the way how HBase works.
> > > > >> They create a pure source and a bin tarball separately. Please
> see how
> > > > they
> > > > >> create a release here:
> > > > >>
> https://github.com/apache/hbase/blob/master/dev-support/make_rc.sh
> > > > >> We could probably use the well known "copy+paste technology" to
> have
> > > it
> > > > >> within ZooKeeper the same way. ;-)
> > > > >>
> > > > >> Regards, Tamaas
> > > > >>
> > > > >> On Wed, Dec 19, 2018 at 2:28 PM Enrico Olivelli <
> eolivelli@gmail.com>
> > > > >> wrote:
> > > > >>
> > > > >>> Great work Norbert
> > > > >>> If you want I can help, especially for rat, findbugs (need to
> > > > >>> switch to spotbugs anyway) and OWASP stuff (recently I started
> > > > >>> using the Maven Plugin in other projects).
> > > > >>> But I am not sure how I can help you concretely if we do not
> > > > >>> commit your work.
> > > > >>> We could commit the work as it is now, leaving "ant" as official
> > > build
> > > > >>> method, but having the poms committed will ease collaboration.
> > > > >>>
> > > > >>> We will also have to work on CI jobs, I can help on that part as
> well
> > > > >>>
> > > > >>> Enrico
> > > > >>>
> > > > >>> Il giorno mer 19 dic 2018 alle ore 12:26 Norbert Kalmar
> > > > >>> <nk...@cloudera.com.invalid> ha scritto:
> > > > >>>>
> > > > >>>> Hi everyone,
> > > > >>>>
> > > > >>>> Some update on the maven migration: I had a few bumps here and
> there
> > > > >>> (just
> > > > >>>> looking at the latest patch Andor linked -
> > > > >>>> https://github.com/apache/zookeeper/pull/708 - you can see on
> the
> > > > >>> commits).
> > > > >>>> Current state is that the build works, tests run, but reports
> like
> > > > >>>> findbugs, clover etc. are not yet implemented. Maven has
> plugins for
> > > > >> them
> > > > >>>> usually, but it's not always trivial, especially with the C
> client.
> > > > The
> > > > >>>> assembly is also left to be done, but it should be fairly easy
> to
> > > do a
> > > > >>>> similar tarball than ant does (although this will also be an
> > > > >> interesting
> > > > >>>> task, as ant does some strange things, like duplicated sources
> of
> > > most
> > > > >>>> contrib projects).
> > > > >>>>
> > > > >>>> I had a separate jira to do the recipes and contrib maven build.
> > > > >>>> I do not have an open PR for it, but recipes is done and I am now
> > > > >>>> working on the contrib projects. Most of them are built manually
> > > > >>>> and never get called from the main build.xml. I will not
> > > > >>>> integrate these into the maven build either.
> > > > >>>> The reason is that there are plans to remove some of them from
> ZK
> > > repo
> > > > >>>> anyway. The other reason is that for starters, we want to
> replicate
> > > > the
> > > > >>> ant
> > > > >>>> build as closely as possible, without doing any nasty
> workarounds in
> > > > >>> maven
> > > > >>>> to achieve that. And from there, we can improve, use maven's
> > > > advantages
> > > > >>> to
> > > > >>>> shape the build of ZooKeeper. Once it is stable and proven to
> have
> > > all
> > > > >>> the
> > > > >>>> functionality required for build and release.
> > > > >>>>
> > > > >>>> Right now, I am trying to stabilize the build as much as possible.
> > > > >>>> Andor also fixed some flaky C tests that, for some strange reason,
> > > > >>>> became extremely flaky with the maven build:
> > > > >>>> https://github.com/apache/zookeeper/pull/740
> > > > >>>>
> > > > >>>> Regards,
> > > > >>>> Norbert
> > > > >>>>
> > > > >>>> On Tue, Dec 18, 2018 at 9:52 AM Andor Molnar
> > > > >> <andor@cloudera.com.invalid
> > > > >>>>
> > > > >>>> wrote:
> > > > >>>>
> > > > >>>>> Sure, good point. Let's put it on the list.
> > > > >>>>>
> > > > >>>>> Andor
> > > > >>>>>
> > > > >>>>>
> > > > >>>>> On Tue, Dec 18, 2018 at 12:17 AM Patrick Hunt <
> phunt@apache.org>
> > > > >>> wrote:
> > > > >>>>>
> > > > >>>>>> Are folks OK to wait on that OWASP issue I documented over the
> > > > >>> weekend?
> > > > >>>>>> afaict we are not affected but it would be good to get another
> > > pair
> > > > >>> of
> > > > >>>>> eyes
> > > > >>>>>> on it.
> > > > >>>>>>
> > > > >>>>>> Patrick
> > > > >>>>>>
> > > > >>>>>> On Mon, Dec 17, 2018 at 2:55 PM Andor Molnár <
> andor@apache.org>
> > > > >>> wrote:
> > > > >>>>>>
> > > > >>>>>>> Hi team,
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> I'm proud to announce that, thanks to the joint effort from the
> > > > >>>>>>> community, the 3.5 blockers list has become empty:
> > > > >>>>>>>
> > > > >>>>>>> "project = ZooKeeper AND resolution = Unresolved AND fixVersion = 3.5.5
> > > > >>>>>>> AND priority in (blocker, critical) ORDER BY priority DESC, key ASC"
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> Well... almost. All the blocker issues have gone, but we
> still
> > > > >>> have the
> > > > >>>>>>> Maven migration to complete before the stable release. If you
> > > > >> have
> > > > >>> some
> > > > >>>>>>> free cycles, please join us testing the Maven build on this
> PR:
> > > > >>>>>>>
> > > > >>>>>>> https://github.com/apache/zookeeper/pull/708
> > > > >>>>>>>
> > > > >>>>>>> I hope we can merge it pretty soon.
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> In terms of the builds, the weather at 3.5 branch is quite
> sunny
> > > > >>>>>> nowadays:
> > > > >>>>>>>
> > > > >>>>>>> https://builds.apache.org/view/S-Z/view/ZooKeeper/
> > > > >>>>>>>
> > > > >>>>>>> The Java 11 build is still having some difficulties, which
> > > > >>> hopefully I
> > > > >>>>>>> can address before the holidays:
> > > > >>>>>>>
> > > > >>>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-3204
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> If you happen to know about something which is important from
> > > > >> 3.5's
> > > > >>>>>>> perspective and missing from the above, please don't
> hesitate to
> > > > >>> share.
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> Happy ZooKeeping!
> > > > >>>>>>>
> > > > >>>>>>> Andor
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> On 11/2/18 21:12, Fangmin Lv wrote:
> > > > >>>>>>>> Andor,
> > > > >>>>>>>>
> > > > >>>>>>>> Here is the PR to port ZK-3104 from master to 3.4:
> > > > >>>>>>>> https://github.com/apache/zookeeper/pull/685.
> > > > >>>>>>>>
> > > > >>>>>>>> Fangmin
> > > > >>>>>>>>
> > > > >>>>>>>> On Fri, Nov 2, 2018 at 11:46 AM Fangmin Lv <
> > > > >> lvfangmin@gmail.com>
> > > > >>>>>> wrote:
> > > > >>>>>>>>
> > > > >>>>>>>>> Hi Andor,
> > > > >>>>>>>>>
> > > > >>>>>>>>> Is anyone working on ZK-2778? I can pick it up if there is
> no
> > > > >>> one
> > > > >>>>>>> working
> > > > >>>>>>>>> on it yet.
> > > > >>>>>>>>>
> > > > >>>>>>>>> I'll open a 3.5 PR for ZK-3104 today.
> > > > >>>>>>>>>
> > > > >>>>>>>>> Fangmin
> > > > >>>>>>>>>
> > > > >>>>>>>>> On Fri, Oct 26, 2018 at 3:33 AM Andor Molnar <
> > > > >> andor@apache.org>
> > > > >>>>>> wrote:
> > > > >>>>>>>>>
> > > > >>>>>>>>>> Hi folks,
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> You’ve probably realised lots of update emails coming from
> > > > >>> Jira.
> > > > >>>>>> Please
> > > > >>>>>>>>>> be aware that we’ve updated a bunch of open
> blocker/critical
> > > > >>> 3.5
> > > > >>>>>>> tickets to
> > > > >>>>>>>>>> reflect to what we discussed in this email.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> If you open up the following jira filter:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> project = ZooKeeper and resolution = Unresolved and
> > > > >> fixVersion
> > > > >>> =
> > > > >>>>>> 3.5.5
> > > > >>>>>>>>>> AND priority in (blocker, critical) ORDER BY priority
> DESC,
> > > > >>> key ASC
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> You’ll see the most up-to-date list of tickets which need
> to
> > > > >> be
> > > > >>>>>>> addressed
> > > > >>>>>>>>>> before the stable 3.5 release.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Thank you for your efforts to get this done.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Fangmin, ZK-3104 is waiting for backport, but ticket has
> > > > >>> already
> > > > >>>>> been
> > > > >>>>>>>>>> resolved. Have you created a separate ticket for the
> backport
> > > > >>> or
> > > > >>>>>> shall
> > > > >>>>>>> I
> > > > >>>>>>>>>> just reopen it with the right fix versions?
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Thanks,
> > > > >>>>>>>>>> Andor
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>> On 2018. Oct 8., at 12:34, Andor Molnar <
> andor@apache.org>
> > > > >>> wrote:
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Hi,
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Let me summarize and give a quick update on the
> outstanding
> > > > >>> issues
> > > > >>>>>> for
> > > > >>>>>>>>>> 3.5 GA:
> > > > >>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
> > > > >>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock between
> follower
> > > > >>> sync
> > > > >>>>>> with
> > > > >>>>>>>>>> leader and follower receiving external connection
> requests.)
> > > > >>>>>>>>>>> - ZOOKEEPER-3021 Migrate project structure to Maven
> > > > >> (ongoing)
> > > > >>>>>>>>>>> - ZOOKEEPER-925 Docs generation to Maven
> > > > >>>>>>>>>>> - ZOOKEEPER-3104 (waiting for backport)
> > > > >>>>>>>>>>> - ZOOKEEPER-3125 (waiting for backport PR #647)
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> The 2 Maven related tickets are no-brainers as well as
> the
> > > > >>>>>> backports.
> > > > >>>>>>>>>> ZK-2778 has been picked up by Maoling (thanks!) as far as
> I
> > > > >> can
> > > > >>>>> see,
> > > > >>>>>>>>>> ZK-1818 is the only one waiting for a volunteer.
> > > > >>>>>>>>>>> Please correct me if I’ve missed something.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Regards,
> > > > >>>>>>>>>>> Andor
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> On 2018. Sep 28., at 18:32, Tamas Penzes
> > > > >>>>>> <tamaas@cloudera.com.INVALID
> > > > >>>>>>>>
> > > > >>>>>>>>>> wrote:
> > > > >>>>>>>>>>>> Hi All,
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> I would add ZOOKEEPER-3021
> > > > >>>>>>>>>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-3021>
> > > > >>> Migrate
> > > > >>>>>>> project
> > > > >>>>>>>>>>>> structure to Maven build as a blocker too. Since the
> > > > >>> migration
> > > > >>>>> has
> > > > >>>>>>>>>> started
> > > > >>>>>>>>>>>> it would be good to finish before releasing ZK 3.5.x GA.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> ZOOKEEPER-925 <
> > > > >>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-925
> > > > >>>>>>>
> > > > >>>>>>>>>> replace
> > > > >>>>>>>>>>>> our forrest site and documentation generation might also
> > > > >> be a
> > > > >>>>> good
> > > > >>>>>>>>>> idea,
> > > > >>>>>>>>>>>> since then we could deliver the new MarkDown based
> > > > >>> documentation.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Regards, Tamaas
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> On Fri, Sep 14, 2018 at 10:09 AM Fangmin Lv <
> > > > >>> lvfangmin@gmail.com
> > > > >>>>>>
> > > > >>>>>>>>>> wrote:
> > > > >>>>>>>>>>>>> Oh, sorry for the confusion, I should provide more
> > > > >> context.
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Leader will use on disk txn sync with followers if the peer zxid
> > > > >>>>>>>>>>>>> is not in its in-memory commit logs, the code is here: Leader on
> > > > >>>>>>>>>>>>> disk txn sync
> > > > >>>>>>>>>>>>> <https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/quorum/LearnerHandler.java#L774>.
> > > > >>>>>>>>>>>>> There is a bug where there can be a gap in the txn files, for
> > > > >>>>>>>>>>>>> example after a snap sync, so it's possible the peer will miss txns
> > > > >>>>>>>>>>>>> due to this.
> > > > >>>>>>>>>>>>> The option to disable it is snapshotSizeFactor
> > > > >>>>>>>>>>>>> <https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/ZKDatabase.java#L81>;
> > > > >>>>>>>>>>>>> setting it to -1 will disable this feature. On 3.5, it's better to
> > > > >>>>>>>>>>>>> have a PR to set this to -1 by default. It might cause more SNAP
> > > > >>>>>>>>>>>>> syncs, but from our prod it doesn't seem to be a big problem to me.
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> I can send out the diff to disable it by default on
> 3.5 if
> > > > >>> you
> > > > >>>>>> guys
> > > > >>>>>>>>>> think
> > > > >>>>>>>>>>>>> this is the right way to do.
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Thanks,
> > > > >>>>>>>>>>>>> Fangmin
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> On Thu, Sep 13, 2018 at 1:58 AM Andor Molnar <
> > > > >>> andor@apache.org>
> > > > >>>>>>>>>> wrote:
> > > > >>>>>>>>>>>>>> What’s needed to turn it off?
> > > > >>>>>>>>>>>>>> Do we need a PR or it’s just a config option?
> > > > >>>>>>>>>>>>>> Shall we implement a feature switch for that and turn
> it
> > > > >>> off by
> > > > >>>>>>>>>> default?
> > > > >>>>>>>>>>>>>> Sorry I don’t have too much insight on disk txn sync.
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> Andor
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> On 2018. Sep 13., at 9:16, Fangmin Lv <
> > > > >>> lvfangmin@gmail.com>
> > > > >>>>>>> wrote:
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> And to be clear, ZOOKEEPER-2418 is actually just one case of
> > > > >>>>>>>>>>>>>>> inconsistency which could be caused by on disk txn sync. As I
> > > > >>>>>>>>>>>>>>> mentioned in a newer JIRA, ZOOKEEPER-2846
> > > > >>>>>>>>>>>>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-2846>, the snap sync
> > > > >>>>>>>>>>>>>>> or txn sync could also leave a txn gap in the txn file, which is a
> > > > >>>>>>>>>>>>>>> more common case that could trigger this issue.
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> I would suggest to turn off the on disk txn sync by
> > > > >>> default
> > > > >>>>> for
> > > > >>>>>>> now
> > > > >>>>>>>>>> to
> > > > >>>>>>>>>>>>>>> avoid this issue, after we finished ZOOKEEPER-3114,
> we
> > > > >>> can use
> > > > >>>>>>> that
> > > > >>>>>>>>>> to
> > > > >>>>>>>>>>>>>>> validate the on disk txns during syncing.
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> Thanks,
> > > > >>>>>>>>>>>>>>> Fangmin
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> On Wed, Sep 12, 2018 at 9:55 AM Fangmin Lv <
> > > > >>>>> lvfangmin@gmail.com
> > > > >>>>>>>
> > > > >>>>>>>>>>>>> wrote:
> > > > >>>>>>>>>>>>>>>> Andor,
> > > > >>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>> ZOOKEEPER-3114 is about adding real time digest
> > > > >> checking
> > > > >>> to
> > > > >>>>>> help
> > > > >>>>>>>>>>>>>> detecting
> > > > >>>>>>>>>>>>>>>> inconsistency, it's a new feature with amounts of
> code
> > > > >>>>> change.
> > > > >>>>>>> I'll
> > > > >>>>>>>>>>>>>> start
> > > > >>>>>>>>>>>>>>>> upstream it part by part, but I don't expect it's
> being
> > > > >>>>> merged
> > > > >>>>>> in
> > > > >>>>>>>>>> the
> > > > >>>>>>>>>>>>>> next
> > > > >>>>>>>>>>>>>>>> few weeks. So yes, it's a nice to have, but
> definitely
> > > > >>> not a
> > > > >>>>>>> block
> > > > >>>>>>>>>> for
> > > > >>>>>>>>>>>>>> 3.5.
> > > > >>>>>>>>>>>>>>>> Thanks,
> > > > >>>>>>>>>>>>>>>> Fangmin
> > > > >>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>> On Wed, Sep 12, 2018 at 2:55 AM Andor Molnar <
> > > > >>>>> andor@apache.org
> > > > >>>>>>>
> > > > >>>>>>>>>>>>> wrote:
> > > > >>>>>>>>>>>>>>>>> Fangmin,
> > > > >>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>> Sorry, I just noticed that you want to include the consistency fixes
> > > > >>>>>>>>>>>>>>>>> in the stable version, which is fine. Let’s finish the backports and
> > > > >>>>>>>>>>>>>>>>> we’ll be done with them.
> > > > >>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>> ZOOKEEPER-3114 is essentially a new feature, I wouldn’t block 3.5
> > > > >>>>>>>>>>>>>>>>> with that. What do you think?
> > > > >>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>> Andor
> > > > >>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>> On 2018. Sep 12., at 11:52, Andor Molnar <andor@apache.org> wrote:
> > > > >>>>>>>>>>>>>>>>>> Cool, thanks for the clarification.
> > > > >>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>> The updated list is as follows:
> > > > >>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>> - ZOOKEEPER-236 (SSL/TLS support for Atomic Broadcast protocol)
> > > > >>>>>>>>>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
> > > > >>>>>>>>>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock between follower sync
> > > > >>>>>>>>>>>>>>>>>>   with leader and follower receiving external connection requests.)
> > > > >>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>> The following are not critical and not blockers for the stable release:
> > > > >>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>> Waiting to be ported to 3.5:
> > > > >>>>>>>>>>>>>>>>>> - ZOOKEEPER-3104
> > > > >>>>>>>>>>>>>>>>>> - ZOOKEEPER-3125
> > > > >>>>>>>>>>>>>>>>>> - ZOOKEEPER-3127
> > > > >>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>> New feature:
> > > > >>>>>>>>>>>>>>>>>> - ZOOKEEPER-3114 (fixes ZOOKEEPER-2184 too)
> > > > >>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>> Regards,
> > > > >>>>>>>>>>>>>>>>>> Andor
> > > > >>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>> On 2018. Sep 12., at 0:42, Fangmin Lv <lvfangmin@gmail.com> wrote:
> > > > >>>>>>>>>>>>>>>>>>> Hi Andor,
> > > > >>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>> That's the on-disk txn feature, which was disabled internally after
> > > > >>>>>>>>>>>>>>>>>>> we found the potential inconsistency issue. The only solution we
> > > > >>>>>>>>>>>>>>>>>>> have for now is waiting for the new digest checking feature I
> > > > >>>>>>>>>>>>>>>>>>> mentioned in ZOOKEEPER-3114.
> > > > >>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>> I think there are some other critical consistency issues we
> > > > >>>>>>>>>>>>>>>>>>> recently fixed on master: ZOOKEEPER-3104, ZOOKEEPER-3125, and
> > > > >>>>>>>>>>>>>>>>>>> ZOOKEEPER-3127. I think we should include those in the official
> > > > >>>>>>>>>>>>>>>>>>> 3.5 release as well.
> > > > >>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>> Thanks,
> > > > >>>>>>>>>>>>>>>>>>> Fangmin
> > > > >>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>> On Tue, Sep 11, 2018 at 11:58 AM Andor Molnár <andor@apache.org> wrote:
> > > > >>>>>>>>>>>>>>>>>>>> Hi Jeelani,
> > > > >>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>> Thanks for letting me know. I'm happy to remove it from the list
> > > > >>>>>>>>>>>>>>>>>>>> to get closer to a stable release. :)
> > > > >>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>> What's the feature which can be disabled to avoid data
> > > > >>>>>>>>>>>>>>>>>>>> inconsistency?
> > > > >>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>> Andor
> > > > >>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>> On 09/10/2018 11:33 PM, Mohamed Jeelani wrote:
> > > > >>>>>>>>>>>>>>>>>>>>> Thanks Andor for compiling this. Should we be ignoring
> > > > >>>>>>>>>>>>>>>>>>>>> ZOOKEEPER-2418 as well? This exists in 3.4 as well and the
> > > > >>>>>>>>>>>>>>>>>>>>> feature can be disabled. We are working on a longer term fix
> > > > >>>>>>>>>>>>>>>>>>>>> for it in 3.6.
> > > > >>>>>>>>>>>>>>>>>>>>> Regards,
> > > > >>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>> Jeelani
> > > > >>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>> On 9/10/18, 5:19 AM, "Andor Molnar" <andor@cloudera.com.INVALID> wrote:
> > > > >>>>>>>>>>>>>>>>>>>>> Fine.
> > > > >>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>> I'm happy to ignore 1549, 2846 and 2930. Still, we have the
> > > > >>>>>>>>>>>>>>>>>>>>> list of:
> > > > >>>>>>>>>>>>>>>>>>>>> - ZOOKEEPER-236 (SSL/TLS support for Atomic Broadcast protocol)
> > > > >>>>>>>>>>>>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
> > > > >>>>>>>>>>>>>>>>>>>>> - ZOOKEEPER-2418 (txnlog diff sync can skip sending some
> > > > >>>>>>>>>>>>>>>>>>>>>   transactions to followers)
> > > > >>>>>>>>>>>>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock between follower
> > > > >>>>>>>>>>>>>>>>>>>>>   sync with leader and follower receiving external connection
> > > > >>>>>>>>>>>>>>>>>>>>>   requests.)
> > > > >>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>> SSL (ZK-236) is a feature which is essential for the 3.5
> > > > >>>>>>>>>>>>>>>>>>>>> release, hence I wouldn't leave it out or postpone it to the
> > > > >>>>>>>>>>>>>>>>>>>>> next stable release. The PR has been out for a long time, so
> > > > >>>>>>>>>>>>>>>>>>>>> please get on with reviewing it.
> > > > >>>>>>>>>>>>>>>>>>>>> The rest are also long-outstanding issues which have been
> > > > >>>>>>>>>>>>>>>>>>>>> found in the 3.5 branch.
> > > > >>>>>>>>>>>>>>>>>>>>> ZK-1818 is something which was found in 3.4 and fixed in 3.4,
> > > > >>>>>>>>>>>>>>>>>>>>> but has never been fixed in 3.5. Quite a serious issue if
> > > > >>>>>>>>>>>>>>>>>>>>> still present.
> > > > >>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>> I think we should at least run some manual testing and see if
> > > > >>>>>>>>>>>>>>>>>>>>> we can repro any of these issues before going ahead with a
> > > > >>>>>>>>>>>>>>>>>>>>> stable release.
> > > > >>>>>>>>>>>>>>>>>>>>> Regards,
> > > > >>>>>>>>>>>>>>>>>>>>> Andor
> > > > >>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>> On Fri, Sep 7, 2018 at 3:24 AM, Michael Han <hanm@apache.org> wrote:
> > > > >>>>>>>>>>>>>>>>>>>>>> I haven't gone through the entire list, but it looks like
> > > > >>>>>>>>>>>>>>>>>>>>>> lots of the JIRA issues listed in this thread, such as
> > > > >>>>>>>>>>>>>>>>>>>>>> ZOOKEEPER-1549 and 2846, also affect 3.4 releases. Should we
> > > > >>>>>>>>>>>>>>>>>>>>>> scope these issues out?
> > > > >>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>> I think historically the single outstanding blocking issue
> > > > >>>>>>>>>>>>>>>>>>>>>> for a stable 3.5 release is the reconfig feature and the
> > > > >>>>>>>>>>>>>>>>>>>>>> security concerns around it (somehow addressed in
> > > > >>>>>>>>>>>>>>>>>>>>>> ZOOKEEPER-2014), and the alpha and beta releases were created
> > > > >>>>>>>>>>>>>>>>>>>>>> to stabilize that feature.
> > > > >>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>> http://zookeeper-user.578899.n2.nabble.com/Zookeeper-with-SSL-release-date-tt7581744.html
> > > > >>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>> So it looks like we are in good shape to release. Some things
> > > > >>>>>>>>>>>>>>>>>>>>>> might be worth doing to claim the quality of 3.5 is on par
> > > > >>>>>>>>>>>>>>>>>>>>>> with 3.4:
> > > > >>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>> * Run Jepsen on 3.5 - 3.4 passed the test, for the record:
> > > > >>>>>>>>>>>>>>>>>>>>>>   https://aphyr.com/posts/291-jepsen-zookeeper
> > > > >>>>>>>>>>>>>>>>>>>>>> * Fix all flaky tests on 3.5 - 3.4 has few or no flaky tests.
> > > > >>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>> On Tue, Sep 4, 2018 at 1:48 AM, Andor Molnar <an...@cloudera.com.invalid> wrote:
> > > > >>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>>> Thanks Maoling! That would be a huge help, I appreciate it.
> > > > >>>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>>> Andor
> > > > >>>>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>
> > > > >>
> > > >
> > > > --
> > >
> > >
> > > -- Enrico Olivelli
> > >
>

Re: ZooKeeper 3.5 blocker issues

Posted by Enrico Olivelli <eo...@gmail.com>.
Great

On Thu, Dec 20, 2018 at 10:07 AM Norbert Kalmar
<nk...@cloudera.com.invalid> wrote:
>
> Subtasks:
> Findbugs, checkstyle - https://issues.apache.org/jira/browse/ZOOKEEPER-3223

We don't have checkstyle. In my experience, introducing checkstyle breaks
every pending patch.
I would like to narrow this issue down to "Spotbugs" and pick it up.

> CI integration - https://issues.apache.org/jira/browse/ZOOKEEPER-3224
I would like to pick this up

Enrico

> Code coverage - https://issues.apache.org/jira/browse/ZOOKEEPER-3225 - I
> already started this one and some of it is committed with the patch, so I
> will continue to work on it.
> Recipes and contrib - https://issues.apache.org/jira/browse/ZOOKEEPER-3171
> - Already on it, recipes is done, PR soon available.
> Assembly - https://issues.apache.org/jira/browse/ZOOKEEPER-3029
>
> These are the tasks left that I can think of. If anything is missing, feel
> free to create a jira, or let me know.
> For the ones I'm already working on (3225, 3171) I left a comment. Those
> should be ready this week.

>
> Thanks,
> Norbert
>
>
> On Thu, Dec 20, 2018 at 9:07 AM Enrico Olivelli <eo...@gmail.com> wrote:
>
> > Great.
> > Can you create JIRA tickets for the remaining subtasks, so that I can pick
> > them up?
> > I volunteer for spotbugs and for CI integration, but let's see the list.
> > Enrico
> >
> > On Thu, Dec 20, 2018, 07:21 Andor Molnar <an...@apache.org> wrote:
> >
> > > Ok. Looks like ant still works properly, so let’s commit this patch and
> > > you guys can collaborate to polish the Maven build.
> > >
> > > For now, it’s master-only.
> > >
> > > Thanks,
> > > Andor
> > >
> > >
> > >
> > > > On 2018. Dec 19., at 16:44, Norbert Kalmar <nk...@cloudera.com.INVALID> wrote:
> > > >
> > > > Thank you Enrico. I agree that we could commit this patch in its current
> > > > state; it fulfills the original jira anyway.
> > > >
> > > > I'll see what's wrong with the java tests, but honestly, it looks like
> > > > they're just flaky... they run well on local builds with 8 threads.
> > > >
> > > > Regards,
> > > > Norbert
> > > >
> > > > On Wed, Dec 19, 2018 at 2:50 PM Tamas Penzes <tamaas@cloudera.com.invalid> wrote:
> > > >
> > > >> Hi All,
> > > >>
> > > >> For the assembly task I would promote the way HBase works.
> > > >> They create a pure source tarball and a bin tarball separately. Please
> > > >> see how they create a release here:
> > > >> https://github.com/apache/hbase/blob/master/dev-support/make_rc.sh
> > > >> We could probably use the well known "copy+paste technology" to have it
> > > >> within ZooKeeper the same way. ;-)
> > > >>
> > > >> Regards, Tamaas
> > > >>
> > > >> On Wed, Dec 19, 2018 at 2:28 PM Enrico Olivelli <eo...@gmail.com> wrote:
> > > >>
> > > >>> Great work Norbert
> > > >>> If you want I can help, especially with rat, findbugs (we need to
> > > >>> switch to spotbugs anyway) and OWASP stuff (I recently started using
> > > >>> the Maven plugin in other projects).
> > > >>> But I am not sure how I can help you concretely if we do not commit
> > > >>> your work.
> > > >>> We could commit the work as it is now, leaving "ant" as the official
> > > >>> build method, but having the poms committed will ease collaboration.
> > > >>>
> > > >>> We will also have to work on CI jobs, I can help on that part as well
> > > >>>
> > > >>> Enrico
> > > >>>
> > > >>> On Wed, Dec 19, 2018 at 12:26 PM Norbert Kalmar
> > > >>> <nk...@cloudera.com.invalid> wrote:
> > > >>>>
> > > >>>> Hi everyone,
> > > >>>>
> > > >>>> Some updates on the maven migration: I hit a few bumps here and there
> > > >>>> (just look at the commits on the latest patch Andor linked,
> > > >>>> https://github.com/apache/zookeeper/pull/708).
> > > >>>> The current state is that the build works and tests run, but reports
> > > >>>> like findbugs, clover etc. are not yet implemented. Maven usually has
> > > >>>> plugins for them, but it's not always trivial, especially with the C
> > > >>>> client. The assembly is also left to be done, but it should be fairly
> > > >>>> easy to produce a tarball similar to what ant does (although this will
> > > >>>> also be an interesting task, as ant does some strange things, like
> > > >>>> duplicated sources of most contrib projects).
> > > >>>>
> > > >>>> I had a separate jira for the recipes and contrib maven build. I do
> > > >>>> not have an open PR for it, but recipes is done and I am now working
> > > >>>> on the contrib projects. Most of them are built manually and never get
> > > >>>> called from the main build.xml. I will not integrate these into the
> > > >>>> maven build either. One reason is that there are plans to remove some
> > > >>>> of them from the ZK repo anyway. The other reason is that, for
> > > >>>> starters, we want to replicate the ant build as closely as possible,
> > > >>>> without doing any nasty workarounds in maven to achieve that. From
> > > >>>> there, once it is stable and proven to have all the functionality
> > > >>>> required for build and release, we can improve and use maven's
> > > >>>> advantages to shape the build of ZooKeeper.
> > > >>>>
> > > >>>> Right now, I am trying to stabilize the build as much as possible.
> > > >>>> Andor also fixed some flaky C tests that, for some strange reason,
> > > >>>> became extremely flaky with the maven build:
> > > >>>> https://github.com/apache/zookeeper/pull/740
> > > >>>>
> > > >>>> Regards,
> > > >>>> Norbert
> > > >>>>
> > > >>>> On Tue, Dec 18, 2018 at 9:52 AM Andor Molnar <andor@cloudera.com.invalid> wrote:
> > > >>>>
> > > >>>>> Sure, good point. Let's put it on the list.
> > > >>>>>
> > > >>>>> Andor
> > > >>>>>
> > > >>>>>
> > > >>>>> On Tue, Dec 18, 2018 at 12:17 AM Patrick Hunt <ph...@apache.org> wrote:
> > > >>>>>
> > > >>>>>> Are folks OK to wait on that OWASP issue I documented over the
> > > >>>>>> weekend? afaict we are not affected but it would be good to get
> > > >>>>>> another pair of eyes on it.
> > > >>>>>>
> > > >>>>>> Patrick
> > > >>>>>>
> > > >>>>>> On Mon, Dec 17, 2018 at 2:55 PM Andor Molnár <an...@apache.org> wrote:
> > > >>>>>>
> > > >>>>>>> Hi team,
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> I'm proud to announce that, thanks to the joint effort of the
> > > >>>>>>> community, the 3.5 blockers list has become empty:
> > > >>>>>>>
> > > >>>>>>> "project = ZooKeeper AND resolution = Unresolved AND fixVersion = 3.5.5
> > > >>>>>>> AND priority in (blocker, critical) ORDER BY priority DESC, key ASC"
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> Well... almost. All the blocker issues are gone, but we still have
> > > >>>>>>> the Maven migration to complete before the stable release. If you
> > > >>>>>>> have some free cycles, please join us in testing the Maven build on
> > > >>>>>>> this PR:
> > > >>>>>>>
> > > >>>>>>> https://github.com/apache/zookeeper/pull/708
> > > >>>>>>>
> > > >>>>>>> I hope we can merge it pretty soon.
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> In terms of the builds, the weather on the 3.5 branch is quite sunny
> > > >>>>>>> nowadays:
> > > >>>>>>>
> > > >>>>>>> https://builds.apache.org/view/S-Z/view/ZooKeeper/
> > > >>>>>>>
> > > >>>>>>> The Java 11 build is still having some difficulties, which hopefully
> > > >>>>>>> I can address before the holidays:
> > > >>>>>>>
> > > >>>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-3204
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> If you happen to know about something which is important from 3.5's
> > > >>>>>>> perspective and missing from the above, please don't hesitate to
> > > >>>>>>> share.
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> Happy ZooKeeping!
> > > >>>>>>>
> > > >>>>>>> Andor
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> On 11/2/18 21:12, Fangmin Lv wrote:
> > > >>>>>>>> Andor,
> > > >>>>>>>>
> > > >>>>>>>> Here is the PR to port ZK-3104 from master to 3.4:
> > > >>>>>>>> https://github.com/apache/zookeeper/pull/685.
> > > >>>>>>>>
> > > >>>>>>>> Fangmin
> > > >>>>>>>>
> > > >>>>>>>> On Fri, Nov 2, 2018 at 11:46 AM Fangmin Lv <lvfangmin@gmail.com> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> Hi Andor,
> > > >>>>>>>>>
> > > >>>>>>>>> Is anyone working on ZK-2778? I can pick it up if there is no one
> > > >>>>>>>>> working on it yet.
> > > >>>>>>>>>
> > > >>>>>>>>> I'll open a 3.5 PR for ZK-3104 today.
> > > >>>>>>>>>
> > > >>>>>>>>> Fangmin
> > > >>>>>>>>>
> > > >>>>>>>>> On Fri, Oct 26, 2018 at 3:33 AM Andor Molnar <andor@apache.org> wrote:
> > > >>>>>>>>>
> > > >>>>>>>>>> Hi folks,
> > > >>>>>>>>>>
> > > >>>>>>>>>> You’ve probably noticed lots of update emails coming from Jira.
> > > >>>>>>>>>> Please be aware that we’ve updated a bunch of open blocker/critical
> > > >>>>>>>>>> 3.5 tickets to reflect what we discussed in this email.
> > > >>>>>>>>>>
> > > >>>>>>>>>> If you open up the following jira filter:
> > > >>>>>>>>>>
> > > >>>>>>>>>> project = ZooKeeper and resolution = Unresolved and fixVersion = 3.5.5
> > > >>>>>>>>>> AND priority in (blocker, critical) ORDER BY priority DESC, key ASC
> > > >>>>>>>>>>
> > > >>>>>>>>>> You’ll see the most up-to-date list of tickets which need to be
> > > >>>>>>>>>> addressed before the stable 3.5 release.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Thank you for your efforts to get this done.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Fangmin, ZK-3104 is waiting for backport, but the ticket has
> > > >>>>>>>>>> already been resolved. Have you created a separate ticket for the
> > > >>>>>>>>>> backport or shall I just reopen it with the right fix versions?
> > > >>>>>>>>>>
> > > >>>>>>>>>> Thanks,
> > > >>>>>>>>>> Andor
> > > >>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>>> On 2018. Oct 8., at 12:34, Andor Molnar <an...@apache.org> wrote:
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Hi,
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Let me summarize and give a quick update on the outstanding issues
> > > >>>>>>>>>>> for 3.5 GA:
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
> > > >>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock between follower sync
> > > >>>>>>>>>>>   with leader and follower receiving external connection requests.)
> > > >>>>>>>>>>> - ZOOKEEPER-3021 Migrate project structure to Maven (ongoing)
> > > >>>>>>>>>>> - ZOOKEEPER-925 Docs generation to Maven
> > > >>>>>>>>>>> - ZOOKEEPER-3104 (waiting for backport)
> > > >>>>>>>>>>> - ZOOKEEPER-3125 (waiting for backport PR #647)
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> The 2 Maven related tickets are no-brainers as well as the
> > > >>>>>>>>>>> backports. ZK-2778 has been picked up by Maoling (thanks!) as far
> > > >>>>>>>>>>> as I can see, ZK-1818 is the only one waiting for a volunteer.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Please correct me if I’ve missed something.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Regards,
> > > >>>>>>>>>>> Andor
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> On 2018. Sep 28., at 18:32, Tamas Penzes <tamaas@cloudera.com.INVALID> wrote:
> > > >>>>>>>>>>>> Hi All,
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> I would add ZOOKEEPER-3021
> > > >>>>>>>>>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-3021> Migrate
> > > >>>>>>>>>>>> project structure to Maven build as a blocker too. Since the
> > > >>>>>>>>>>>> migration has started, it would be good to finish it before
> > > >>>>>>>>>>>> releasing ZK 3.5.x GA.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> ZOOKEEPER-925 <https://issues.apache.org/jira/browse/ZOOKEEPER-925>
> > > >>>>>>>>>>>> (replace our forrest site and documentation generation) might also
> > > >>>>>>>>>>>> be a good idea, since then we could deliver the new MarkDown based
> > > >>>>>>>>>>>> documentation.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Regards, Tamaas
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> On Fri, Sep 14, 2018 at 10:09 AM Fangmin Lv <lvfangmin@gmail.com> wrote:
> > > >>>>>>>>>>>>> Oh, sorry for the confusion, I should provide more context.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> The leader will use on-disk txn sync with followers if the peer
> > > >>>>>>>>>>>>> zxid is not in its in-memory commit logs. The code is here:
> > > >>>>>>>>>>>>> Leader on disk txn sync
> > > >>>>>>>>>>>>> <https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/quorum/LearnerHandler.java#L774>.
> > > >>>>>>>>>>>>> There is a bug where there can potentially be gaps in the txn
> > > >>>>>>>>>>>>> files, e.g. after snap sync, so it's possible the peer will miss
> > > >>>>>>>>>>>>> txns due to this.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> The option to disable it is snapshotSizeFactor
> > > >>>>>>>>>>>>> <https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/ZKDatabase.java#L81>;
> > > >>>>>>>>>>>>> setting it to -1 will disable this feature. On 3.5, it's better
> > > >>>>>>>>>>>>> to have a PR to set this to -1 by default. It might result in
> > > >>>>>>>>>>>>> more SNAP syncs, but from our prod it doesn't seem to be a big
> > > >>>>>>>>>>>>> problem to me.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> I can send out the diff to disable it by default on 3.5 if you
> > > >>>>>>>>>>>>> guys think this is the right way to go.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> Thanks,
> > > >>>>>>>>>>>>> Fangmin
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> On Thu, Sep 13, 2018 at 1:58 AM Andor Molnar <andor@apache.org> wrote:
> > > >>>>>>>>>>>>>> What’s needed to turn it off?
> > > >>>>>>>>>>>>>> Do we need a PR or it’s just a config option?
> > > >>>>>>>>>>>>>> Shall we implement a feature switch for that and turn it off by
> > > >>>>>>>>>>>>>> default?
> > > >>>>>>>>>>>>>> Sorry I don’t have too much insight on disk txn sync.
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> Andor
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> On 2018. Sep 13., at 9:16, Fangmin Lv <lvfangmin@gmail.com> wrote:
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> And to be clear, ZOOKEEPER-2418 is actually just one case of
> > > >>>>>>>>>>>>>>> inconsistency which could be caused by on-disk txn sync. As I
> > > >>>>>>>>>>>>>>> mentioned in a newer JIRA, ZOOKEEPER-2846
> > > >>>>>>>>>>>>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-2846>, the snap
> > > >>>>>>>>>>>>>>> sync or txn sync could also leave txn gaps in the txn file,
> > > >>>>>>>>>>>>>>> which is a more common case that could trigger this issue.
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> I would suggest turning off the on-disk txn sync by default for
> > > >>>>>>>>>>>>>>> now to avoid this issue. Once ZOOKEEPER-3114 is finished, we can
> > > >>>>>>>>>>>>>>> use it to validate the on-disk txns during syncing.
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> Thanks,
> > > >>>>>>>>>>>>>>> Fangmin
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> On Wed, Sep 12, 2018 at 9:55 AM Fangmin Lv <lvfangmin@gmail.com> wrote:
> > > >>>>>>>>>>>>>>>> Andor,
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> ZOOKEEPER-3114 is about adding real-time digest checking to help
> > > >>>>>>>>>>>>>>>> detect inconsistency. It's a new feature with a large amount of
> > > >>>>>>>>>>>>>>>> code change. I'll start upstreaming it part by part, but I
> > > >>>>>>>>>>>>>>>> don't expect it to be merged in the next few weeks. So yes,
> > > >>>>>>>>>>>>>>>> it's a nice-to-have, but definitely not a blocker for 3.5.
> > > >>>>>>>>>>>>>>>> Thanks,
> > > >>>>>>>>>>>>>>>> Fangmin
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> On Wed, Sep 12, 2018 at 2:55 AM Andor Molnar <andor@apache.org> wrote:
> > > >>>>>>>>>>>>>>>>> Fangmin,
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> Sorry, I just noticed that you want to include the consistency
> > > >>>>>>>>>>>>>>>>> fixes in the stable version, which is fine. Let’s finish the
> > > >>>>>>>>>>>>>>>>> backports and we’ll be done with them.
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> ZOOKEEPER-3114 is essentially a new feature, I wouldn’t block
> > > >>>>>>>>>>>>>>>>> 3.5 with that. What do you think?
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> Andor
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> On 2018. Sep 12., at 11:52, Andor Molnar <andor@apache.org> wrote:
> > > >>>>>>>>>>>>>>>>>> Cool, thanks for the clarification.
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> The updated list is as follows:
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> - ZOOKEEPER-236 (SSL/TLS support for Atomic Broadcast protocol)
> > > >>>>>>>>>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
> > > >>>>>>>>>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock between follower
> > > >>>>>>>>>>>>>>>>>>   sync with leader and follower receiving external connection
> > > >>>>>>>>>>>>>>>>>>   requests.)
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> The following are not critical and not blockers for the stable
> > > >>>>>>>>>>>>>>>>>> release:
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> Waiting to be ported to 3.5:
> > > >>>>>>>>>>>>>>>>>> - ZOOKEEPER-3104
> > > >>>>>>>>>>>>>>>>>> - ZOOKEEPER-3125
> > > >>>>>>>>>>>>>>>>>> - ZOOKEEPER-3127
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> New feature:
> > > >>>>>>>>>>>>>>>>>> - ZOOKEEPER-3114 (fixes ZOOKEEPER-2184 too)
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> Regards,
> > > >>>>>>>>>>>>>>>>>> Andor
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>> On 2018. Sep 12., at 0:42, Fangmin Lv <lvfangmin@gmail.com> wrote:
> > > >>>>>>>>>>>>>>>>>>> Hi Andor,
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>> That's the on-disk txn feature, which was disabled internally
> > > >>>>>>>>>>>>>>>>>>> after we found the potential inconsistency issue. The only
> > > >>>>>>>>>>>>>>>>>>> solution we have for now is waiting for the new digest
> > > >>>>>>>>>>>>>>>>>>> checking feature I mentioned in ZOOKEEPER-3114.
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>> I think there are some other critical consistency issues we
> > > >>>>>>>>>>>>>>>>>>> recently fixed on master: ZOOKEEPER-3104, ZOOKEEPER-3125, and
> > > >>>>>>>>>>>>>>>>>>> ZOOKEEPER-3127. I think we should include those in the
> > > >>>>>>>>>>>>>>>>>>> official 3.5 release as well.
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>> Thanks,
> > > >>>>>>>>>>>>>>>>>>> Fangmin
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>> On Tue, Sep 11, 2018 at 11:58 AM Andor Molnár <andor@apache.org> wrote:
> > > >>>>>>>>>>>>>>>>>>>> Hi Jeelani,
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> Thanks for letting me know. I'm happy to remove it from the
> > > >>>>>>>>>>>>>>>>>>>> list to get closer to a stable release. :)
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> What's the feature which can be disabled to avoid data
> > > >>>>>>>>>>>>>>>>>>>> inconsistency?
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> Andor
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> On 09/10/2018 11:33 PM, Mohamed Jeelani wrote:
> > > >>>>>>>>>>>>>>>>>>>>> Thanks Andor for compiling this. Should we be
> > > >>> ignoring
> > > >>>>>>>>>>>>>>>>> ZOOKEEPER-2418 as
> > > >>>>>>>>>>>>>>>>>>>> well? This exists in 3.4 as well and the feature
> > > >> can
> > > >>> be
> > > >>>>>>>>>> disabled.
> > > >>>>>>>>>>>>> We
> > > >>>>>>>>>>>>>>>>> are
> > > >>>>>>>>>>>>>>>>>>>> working on a longer term fix for it in 3.6.
> > > >>>>>>>>>>>>>>>>>>>>> Regards,
> > > >>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>> Jeelani
> > > >>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>> On 9/10/18, 5:19 AM, "Andor Molnar"
> > > >>>>>>>>>> <andor@cloudera.com.INVALID
> > > >>>>>>>>>>>>>>>>> wrote:
> > > >>>>>>>>>>>>>>>>>>>>> Fine.
> > > >>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>> I'm happy to ignore 1549, 2846 and 2930. Still we
> > > >>> have
> > > >>>>> the
> > > >>>>>>>>>> list
> > > >>>>>>>>>>>>>> of:
> > > >>>>>>>>>>>>>>>>>>>>> - ZOOKEEPER-236 (SSL/TLS support for Atomic
> > > >>> Broadcast
> > > >>>>>>>>>> protocol)
> > > >>>>>>>>>>>>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
> > > >>>>>>>>>>>>>>>>>>>>> - ZOOKEEPER-2418 (txnlog diff sync can skip
> > > >> sending
> > > >>> some
> > > >>>>>>>>>>>>>>>>>>>> transactions to
> > > >>>>>>>>>>>>>>>>>>>>> followers)
> > > >>>>>>>>>>>>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock
> > > >> between
> > > >>>>>> follower
> > > >>>>>>>>>>>>> sync
> > > >>>>>>>>>>>>>>>>>>>> with
> > > >>>>>>>>>>>>>>>>>>>>> leader and follower receiving external connection
> > > >>>>>> requests.)
> > > >>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>> SSL (ZK-236) is a feature which essential for the
> > > >>> 3.5
> > > >>>>>>> release,
> > > >>>>>>>>>>>>>>>>> hence
> > > >>>>>>>>>>>>>>>>>>>> I
> > > >>>>>>>>>>>>>>>>>>>>> wouldn't leave it out or postpone it for the next
> > > >>> stable
> > > >>>>>>>>>>>>> release.
> > > >>>>>>>>>>>>>>>>> PR
> > > >>>>>>>>>>>>>>>>>>>> has
> > > >>>>>>>>>>>>>>>>>>>>> been out for a long time, get on reviewing please.
> > > >>>>>>>>>>>>>>>>>>>>> The rest are also long outstanding issues which
> > > >> have
> > > >>>>> been
> > > >>>>>>>>>> found
> > > >>>>>>>>>>>>> in
> > > >>>>>>>>>>>>>>>>>>>> the 3.5
> > > >>>>>>>>>>>>>>>>>>>>> branch.
> > > >>>>>>>>>>>>>>>>>>>>> ZK-1818 is something which was found in 3.4 and
> > > >>> fixed in
> > > >>>>>>> 3.4,
> > > >>>>>>>>>>>>> but
> > > >>>>>>>>>>>>>>>>>>>> never has
> > > >>>>>>>>>>>>>>>>>>>>> been fixed in 3.5. Quite a serious issue if still
> > > >>>>> present.
> > > >>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>> I think we should at least run some manual testing
> > > >>> and
> > > >>>>> see
> > > >>>>>>> if
> > > >>>>>>>>>> we
> > > >>>>>>>>>>>>>>>>>>>> could
> > > >>>>>>>>>>>>>>>>>>>>> repro any of these issues before going ahead with
> > > >> a
> > > >>>>> stable
> > > >>>>>>>>>>>>>> release.
> > > >>>>>>>>>>>>>>>>>>>>> Regards,
> > > >>>>>>>>>>>>>>>>>>>>> Andor
> > > >>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>> On Fri, Sep 7, 2018 at 3:24 AM, Michael Han <
> > > >>>>>>> hanm@apache.org>
> > > >>>>>>>>>>>>>>>>> wrote:
> > > >>>>>>>>>>>>>>>>>>>>>> I haven't went through the entire list, but looks
> > > >>> like
> > > >>>>>> lots
> > > >>>>>>>>>> of
> > > >>>>>>>>>>>>> the
> > > >>>>>>>>>>>>>>>>>>>> JIRA
> > > >>>>>>>>>>>>>>>>>>>>>> issues listed in this thread, such as
> > > >>> ZOOKEEPER-1549,
> > > >>>>>> 2846,
> > > >>>>>>>>>> also
> > > >>>>>>>>>>>>>>>>>>>> affects
> > > >>>>>>>>>>>>>>>>>>>>>> 3.4 releases. Should we scope these issues out?
> > > >>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>> I think historically the single outstanding
> > > >>> blocking
> > > >>>>>> issue
> > > >>>>>>>>>> for a
> > > >>>>>>>>>>>>>>>>>>>> stable 3.5
> > > >>>>>>>>>>>>>>>>>>>>>> release is the reconfig feature and security
> > > >>> concerns
> > > >>>>>>> around
> > > >>>>>>>>>> it
> > > >>>>>>>>>>>>>>>>>>>> (somehow
> > > >>>>>>>>>>>>>>>>>>>>>> addressed in ZOOKEEPER-2014), and the alpha and
> > > >>> beta
> > > >>>>>>> releases
> > > >>>>>>>>>>>>> were
> > > >>>>>>>>>>>>>>>>>>>> created
> > > >>>>>>>>>>>>>>>>>>>>>> to stabilize that feature.
> > > >>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>
> > > >>
> > >
> > https://urldefense.proofpoint.com/v2/url?u=http-3A__zookeeper-2Duser.578899.n2.nabble.com_Zookeeper-2Dwith-2D&d=DwIBaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=Vl4oKanLQehvaulUvoKg8A&m=wqlhnot9c-pQLdkGkccSGNpELUNUnB-wy_h0iA3PRqI&s=_tGtL3nMWtuPrXKXDx27AIWOzyyT7W-CjIVLDFZwT0E&e=
> > > >>>>>>>>>>>>>>>>>>>>>> SSL-release-date-tt7581744.html
> > > >>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>> So it looks like we are in good shape to release.
> > > >>>>>> Something
> > > >>>>>>>>>>>>> might
> > > >>>>>>>>>>>>>>>>>>>> worth
> > > >>>>>>>>>>>>>>>>>>>>>> doing to claim the quality of 3.5 is on par with
> > > >>> 3.4
> > > >>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>> * Run Jepsen on 3.5 - 3.4 passed the test for the
> > > >>>>> record
> > > >>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>
> > > >>
> > >
> > https://urldefense.proofpoint.com/v2/url?u=https-3A__aphyr.com_posts_291-2Djepsen-2Dzookeeper&d=DwIBaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=Vl4oKanLQehvaulUvoKg8A&m=wqlhnot9c-pQLdkGkccSGNpELUNUnB-wy_h0iA3PRqI&s=VjORkX5s7hrJyl8mW9Q4cfeSWF4qfTdyRjcuAiBt0y4&e=
> > > >>>>>>>>>>>>>>>>>>>>>> * Fix all flaky tests on 3.5 - 3.4 has little or
> > > >> no
> > > >>>>> flaky
> > > >>>>>>>>>> tests
> > > >>>>>>>>>>>>> at
> > > >>>>>>>>>>>>>>>>>>>> all.
> > > >>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>> On Tue, Sep 4, 2018 at 1:48 AM, Andor Molnar
> > > >>>>>>>>>>>>>>>>>>>> <an...@cloudera.com.invalid>
> > > >>>>>>>>>>>>>>>>>>>>>> wrote:
> > > >>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>>> Thanks Maoling! That would be huge help, I
> > > >>> appreciate
> > > >>>>>> it.
> > > >>>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>>> Andor
> > > >>>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>
> > > >>
> > >
> > > --
> >
> >
> > -- Enrico Olivelli
> >

Re: ZooKeeper 3.5 blocker issues

Posted by Norbert Kalmar <nk...@cloudera.com.INVALID>.
Subtasks:
Findbugs, checkstyle - https://issues.apache.org/jira/browse/ZOOKEEPER-3223
CI integration - https://issues.apache.org/jira/browse/ZOOKEEPER-3224
Code coverage - https://issues.apache.org/jira/browse/ZOOKEEPER-3225 - I
already started this one and some of it is committed with the patch, so I
will continue to work on it.
Recipes and contrib - https://issues.apache.org/jira/browse/ZOOKEEPER-3171
- Already on it, recipes is done, PR soon available.
Assembly - https://issues.apache.org/jira/browse/ZOOKEEPER-3029

These are the remaining tasks I can think of. If anything is missing, feel free
to create a jira, or let me know.
For the ones I'm already working on (3225, 3171), I left a comment; those
should be ready this week.

Thanks,
Norbert


On Thu, Dec 20, 2018 at 9:07 AM Enrico Olivelli <eo...@gmail.com> wrote:

> Great.
> Can you create JIRA tickets for the remaining subtasks, so that I can pick
> them up?
> I volunteer for spotbugs and for CI integration, but let's see the list.
> Enrico
>
> On Thu, Dec 20, 2018, 07:21 Andor Molnar <an...@apache.org> wrote:
>
> > Ok. Looks like ant still works properly, so let’s commit this patch and
> > you guys can collaborate to polish the Maven build.
> >
> > For now, it’s master-only.
> >
> > Thanks,
> > Andor
> >
> >
> >
> > > On 2018. Dec 19., at 16:44, Norbert Kalmar <nk...@cloudera.com.INVALID>
> > > wrote:
> > >
> > > Thank you Enrico, I agree that we could commit this patch in its current
> > > state; it fulfills the original jira anyway.
> > >
> > > I'll see what's wrong with the java tests, but honestly, it looks like
> > > they're just flaky... they run well on local builds with 8 threads.
> > >
> > > Regards,
> > > Norbert
> > >
> > > On Wed, Dec 19, 2018 at 2:50 PM Tamas Penzes <tamaas@cloudera.com.invalid>
> > > wrote:
> > >
> > >> Hi All,
> > >>
> > >> For the assembly task I would suggest following the way HBase does it.
> > >> They create a pure source tarball and a bin tarball separately. Please see
> > >> how they create a release here:
> > >> https://github.com/apache/hbase/blob/master/dev-support/make_rc.sh
> > >> We could probably use the well known "copy+paste technology" to have it
> > >> within ZooKeeper the same way. ;-)
> > >>
> > >> Regards, Tamaas
> > >>
> > >> On Wed, Dec 19, 2018 at 2:28 PM Enrico Olivelli <eo...@gmail.com>
> > >> wrote:
> > >>
> > >>> Great work Norbert
> > >>> If you want I can help, especially with rat, findbugs (we need to switch to
> > >>> spotbugs anyway) and OWASP stuff (I recently started using the Maven
> > >>> plugin in other projects)
> > >>> But I am not sure how I can help you concretely if we do not commit your
> > >>> work.
> > >>> We could commit the work as it is now, leaving "ant" as the official build
> > >>> method; having the poms committed will ease collaboration.
> > >>>
> > >>> We will also have to work on CI jobs, I can help on that part as well
> > >>>
> > >>> Enrico
> > >>>
> > >>> On Wed, Dec 19, 2018 at 12:26 Norbert Kalmar
> > >>> <nk...@cloudera.com.invalid> wrote:
> > >>>>
> > >>>> Hi everyone,
> > >>>>
> > >>>> Some update on the maven migration: I had a few bumps here and there (just
> > >>>> looking at the latest patch Andor linked -
> > >>>> https://github.com/apache/zookeeper/pull/708 - you can see it in the commits).
> > >>>> The current state is that the build works and tests run, but reports like
> > >>>> findbugs, clover etc. are not yet implemented. Maven usually has plugins for
> > >>>> them, but it's not always trivial, especially with the C client. The
> > >>>> assembly is also left to be done, but it should be fairly easy to produce a
> > >>>> tarball similar to the one ant creates (although this will also be an
> > >>>> interesting task, as ant does some strange things, like duplicating the
> > >>>> sources of most contrib projects).
> > >>>>
> > >>>> I had a separate jira for the recipes and contrib maven build. I do not
> > >>>> have an open PR for it, but recipes is done and I am now working on the
> > >>>> contrib projects. Most of them are built manually and never get called from
> > >>>> the main build.xml. I will not integrate these into the maven build either.
> > >>>> One reason is that there are plans to remove some of them from the ZK repo
> > >>>> anyway. The other reason is that, for starters, we want to replicate the ant
> > >>>> build as closely as possible, without doing any nasty workarounds in maven
> > >>>> to achieve that. From there, we can improve and use maven's advantages to
> > >>>> shape the build of ZooKeeper, once it is stable and proven to have all the
> > >>>> functionality required for build and release.
> > >>>>
> > >>>> Right now, I am trying to stabilize the build as much as possible. Andor
> > >>>> also fixed some flaky C tests that, for some strange reason, became
> > >>>> extremely flaky with the maven build:
> > >>>> https://github.com/apache/zookeeper/pull/740
> > >>>>
> > >>>> Regards,
> > >>>> Norbert
> > >>>>
> > >>>> On Tue, Dec 18, 2018 at 9:52 AM Andor Molnar <andor@cloudera.com.invalid>
> > >>>> wrote:
> > >>>>
> > >>>>> Sure, good point. Let's put it on the list.
> > >>>>>
> > >>>>> Andor
> > >>>>>
> > >>>>>
> > >>>>> On Tue, Dec 18, 2018 at 12:17 AM Patrick Hunt <ph...@apache.org> wrote:
> > >>>>>
> > >>>>>> Are folks OK to wait on that OWASP issue I documented over the weekend?
> > >>>>>> afaict we are not affected but it would be good to get another pair of
> > >>>>>> eyes on it.
> > >>>>>>
> > >>>>>> Patrick
> > >>>>>>
> > >>>>>> On Mon, Dec 17, 2018 at 2:55 PM Andor Molnár <an...@apache.org> wrote:
> > >>>>>>
> > >>>>>>> Hi team,
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> I'm proud to announce that, thanks to the joint effort from the community,
> > >>>>>>> the 3.5 blockers list has become empty:
> > >>>>>>>
> > >>>>>>> "project = ZooKeeper AND resolution = Unresolved AND fixVersion = 3.5.5
> > >>>>>>> AND priority in (blocker, critical) ORDER BY priority DESC, key ASC"
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> Well... almost. All the blocker issues are gone, but we still have the
> > >>>>>>> Maven migration to complete before the stable release. If you have some
> > >>>>>>> free cycles, please join us in testing the Maven build on this PR:
> > >>>>>>>
> > >>>>>>> https://github.com/apache/zookeeper/pull/708
> > >>>>>>>
> > >>>>>>> I hope we can merge it pretty soon.
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> In terms of the builds, the weather on the 3.5 branch is quite sunny
> > >>>>>>> nowadays:
> > >>>>>>>
> > >>>>>>> https://builds.apache.org/view/S-Z/view/ZooKeeper/
> > >>>>>>>
> > >>>>>>> The Java 11 build is still having some difficulties, which hopefully I
> > >>>>>>> can address before the holidays:
> > >>>>>>>
> > >>>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-3204
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> If you happen to know about something which is important from 3.5's
> > >>>>>>> perspective and missing from the above, please don't hesitate to share.
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> Happy ZooKeeping!
> > >>>>>>>
> > >>>>>>> Andor
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> On 11/2/18 21:12, Fangmin Lv wrote:
> > >>>>>>>> Andor,
> > >>>>>>>>
> > >>>>>>>> Here is the PR to port ZK-3104 from master to 3.4:
> > >>>>>>>> https://github.com/apache/zookeeper/pull/685.
> > >>>>>>>>
> > >>>>>>>> Fangmin
> > >>>>>>>>
> > >>>>>>>> On Fri, Nov 2, 2018 at 11:46 AM Fangmin Lv <lvfangmin@gmail.com> wrote:
> > >>>>>>>>
> > >>>>>>>>> Hi Andor,
> > >>>>>>>>>
> > >>>>>>>>> Is anyone working on ZK-2778? I can pick it up if there is no one working
> > >>>>>>>>> on it yet.
> > >>>>>>>>>
> > >>>>>>>>> I'll open a 3.5 PR for ZK-3104 today.
> > >>>>>>>>>
> > >>>>>>>>> Fangmin
> > >>>>>>>>>
> > >>>>>>>>> On Fri, Oct 26, 2018 at 3:33 AM Andor Molnar <andor@apache.org> wrote:
> > >>>>>>>>>
> > >>>>>>>>>> Hi folks,
> > >>>>>>>>>>
> > >>>>>>>>>> You’ve probably noticed lots of update emails coming from Jira. Please
> > >>>>>>>>>> be aware that we’ve updated a bunch of open blocker/critical 3.5 tickets
> > >>>>>>>>>> to reflect what we discussed in this email.
> > >>>>>>>>>>
> > >>>>>>>>>> If you open up the following jira filter:
> > >>>>>>>>>>
> > >>>>>>>>>> project = ZooKeeper and resolution = Unresolved and fixVersion = 3.5.5
> > >>>>>>>>>> AND priority in (blocker, critical) ORDER BY priority DESC, key ASC
> > >>>>>>>>>>
> > >>>>>>>>>> You’ll see the most up-to-date list of tickets which need to be addressed
> > >>>>>>>>>> before the stable 3.5 release.
> > >>>>>>>>>>
> > >>>>>>>>>> Thank you for your efforts to get this done.
> > >>>>>>>>>>
> > >>>>>>>>>> Fangmin, ZK-3104 is waiting for backport, but the ticket has already been
> > >>>>>>>>>> resolved. Have you created a separate ticket for the backport or shall I
> > >>>>>>>>>> just reopen it with the right fix versions?
> > >>>>>>>>>>
> > >>>>>>>>>> Thanks,
> > >>>>>>>>>> Andor
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>> On 2018. Oct 8., at 12:34, Andor Molnar <an...@apache.org> wrote:
> > >>>>>>>>>>>
> > >>>>>>>>>>> Hi,
> > >>>>>>>>>>>
> > >>>>>>>>>>> Let me summarize and give a quick update on the outstanding issues for
> > >>>>>>>>>>> 3.5 GA:
> > >>>>>>>>>>>
> > >>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
> > >>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock between follower sync with
> > >>>>>>>>>>> leader and follower receiving external connection requests.)
> > >>>>>>>>>>> - ZOOKEEPER-3021 Migrate project structure to Maven (ongoing)
> > >>>>>>>>>>> - ZOOKEEPER-925 Docs generation to Maven
> > >>>>>>>>>>> - ZOOKEEPER-3104 (waiting for backport)
> > >>>>>>>>>>> - ZOOKEEPER-3125 (waiting for backport PR #647)
> > >>>>>>>>>>>
> > >>>>>>>>>>> The 2 Maven related tickets are no-brainers as well as the backports.
> > >>>>>>>>>>> ZK-2778 has been picked up by Maoling (thanks!) as far as I can see,
> > >>>>>>>>>>> ZK-1818 is the only one waiting for a volunteer.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Please correct me if I’ve missed something.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Regards,
> > >>>>>>>>>>> Andor
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > > >>>>>>>>>>>> On 2018. Sep 28., at 18:32, Tamas Penzes <tamaas@cloudera.com.INVALID>
> > > >>>>>>>>>>>> wrote:
> > > >>>>>>>>>>>> Hi All,
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> I would add ZOOKEEPER-3021
> > > >>>>>>>>>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-3021> (Migrate project
> > > >>>>>>>>>>>> structure to Maven build) as a blocker too. Since the migration has
> > > >>>>>>>>>>>> started, it would be good to finish it before releasing ZK 3.5.x GA.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> ZOOKEEPER-925 <https://issues.apache.org/jira/browse/ZOOKEEPER-925>
> > > >>>>>>>>>>>> (replace our forrest site and documentation generation) might also be a
> > > >>>>>>>>>>>> good idea, since then we could deliver the new Markdown-based
> > > >>>>>>>>>>>> documentation.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Regards, Tamaas
> > >>>>>>>>>>>>
> > > >>>>>>>>>>>> On Fri, Sep 14, 2018 at 10:09 AM Fangmin Lv <lvfangmin@gmail.com>
> > > >>>>>>>>>>>> wrote:
> > > >>>>>>>>>>>>> Oh, sorry for the confusion, I should provide more context.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> The leader will use on-disk txn sync with followers if the peer zxid
> > > >>>>>>>>>>>>> is not in its in-memory commit logs. The code is here: Leader on disk
> > > >>>>>>>>>>>>> txn sync
> > > >>>>>>>>>>>>> <https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/quorum/LearnerHandler.java#L774>.
> > > >>>>>>>>>>>>> There is a bug where gaps can potentially appear in the txn files,
> > > >>>>>>>>>>>>> for example after a snap sync, so it's possible the peer will miss
> > > >>>>>>>>>>>>> txns due to this.
> > > >>>>>>>>>>>>> The option to disable it is snapshotSizeFactor
> > > >>>>>>>>>>>>> <https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/ZKDatabase.java#L81>;
> > > >>>>>>>>>>>>> setting it to -1 will disable this feature. On 3.5, it's better to
> > > >>>>>>>>>>>>> have a PR to set this to -1 by default. It might cause more SNAP
> > > >>>>>>>>>>>>> syncs, but in our prod it doesn't seem to be a big problem to me.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> I can send out the diff to disable it by default on 3.5 if you guys
> > > >>>>>>>>>>>>> think this is the right way to go.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Thanks,
> > >>>>>>>>>>>>> Fangmin
> > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> On Thu, Sep 13, 2018 at 1:58 AM Andor Molnar <andor@apache.org> wrote:
> > > >>>>>>>>>>>>>> What’s needed to turn it off?
> > > >>>>>>>>>>>>>> Do we need a PR or is it just a config option?
> > > >>>>>>>>>>>>>> Shall we implement a feature switch for that and turn it off by
> > > >>>>>>>>>>>>>> default?
> > > >>>>>>>>>>>>>> Sorry, I don’t have too much insight into on-disk txn sync.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Andor
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> On 2018. Sep 13., at 9:16, Fangmin Lv <lvfangmin@gmail.com> wrote:
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> And to be clear, ZOOKEEPER-2418 is actually just one case of
> > > >>>>>>>>>>>>>>> inconsistency which could be caused by on-disk txn sync. As I
> > > >>>>>>>>>>>>>>> mentioned in a newer JIRA, ZOOKEEPER-2846
> > > >>>>>>>>>>>>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-2846>, the snap
> > > >>>>>>>>>>>>>>> sync or txn sync could also leave txn gaps in the txn file, which
> > > >>>>>>>>>>>>>>> is a more common case that could trigger this issue.
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> I would suggest turning off the on-disk txn sync by default for
> > > >>>>>>>>>>>>>>> now to avoid this issue. After we finish ZOOKEEPER-3114, we can
> > > >>>>>>>>>>>>>>> use that to validate the on-disk txns during syncing.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Thanks,
> > >>>>>>>>>>>>>>> Fangmin
> > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> On Wed, Sep 12, 2018 at 9:55 AM Fangmin Lv <lvfangmin@gmail.com>
> > > >>>>>>>>>>>>>>> wrote:
> > > >>>>>>>>>>>>>>>> Andor,
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> ZOOKEEPER-3114 is about adding real-time digest checking to help
> > > >>>>>>>>>>>>>>>> detect inconsistency. It's a new feature with a large amount of
> > > >>>>>>>>>>>>>>>> code change. I'll start upstreaming it part by part, but I don't
> > > >>>>>>>>>>>>>>>> expect it to be merged in the next few weeks. So yes, it's a
> > > >>>>>>>>>>>>>>>> nice-to-have, but definitely not a blocker for 3.5.
> > >>>>>>>>>>>>>>>> Thanks,
> > >>>>>>>>>>>>>>>> Fangmin
> > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> On Wed, Sep 12, 2018 at 2:55 AM Andor Molnar <andor@apache.org>
> > > >>>>>>>>>>>>>>>> wrote:
> > > >>>>>>>>>>>>>>>>> Fangmin,
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> Sorry, I just noticed that you want to include the consistency
> > > >>>>>>>>>>>>>>>>> fixes in the stable version, which is fine. Let’s finish the
> > > >>>>>>>>>>>>>>>>> backports and we’ll be done with them.
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> ZOOKEEPER-3114 is essentially a new feature, I wouldn’t block
> > > >>>>>>>>>>>>>>>>> 3.5 with that. What do you think?
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Andor
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> On 2018. Sep 12., at 11:52, Andor Molnar <andor@apache.org>
> > > >>>>>>>>>>>>>>>>> wrote:
> > > >>>>>>>>>>>>>>>>>> Cool, thanks for the clarification.
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> The updated list is as follows:
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> - ZOOKEEPER-236 (SSL/TLS support for Atomic Broadcast protocol)
> > > >>>>>>>>>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
> > > >>>>>>>>>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock between follower
> > > >>>>>>>>>>>>>>>>>> sync with leader and follower receiving external connection
> > > >>>>>>>>>>>>>>>>>> requests.)
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> The following are not critical and not blockers for the stable
> > > >>>>>>>>>>>>>>>>>> release:
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> Waiting to be ported to 3.5:
> > >>>>>>>>>>>>>>>>>> - ZOOKEEPER-3104
> > >>>>>>>>>>>>>>>>>> - ZOOKEEPER-3125
> > >>>>>>>>>>>>>>>>>> - ZOOKEEPER-3127
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> New feature:
> > >>>>>>>>>>>>>>>>>> - ZOOKEEPER-3114 (fixes ZOOKEEPER-2184 too)
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Regards,
> > >>>>>>>>>>>>>>>>>> Andor
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> On 2018. Sep 12., at 0:42, Fangmin Lv <
> > >>>>> lvfangmin@gmail.com>
> > >>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>>>> Hi Andor,
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> That's the on disk txn feature, which was disabled
> > >>>>>> internally
> > >>>>>>>>>> after
> > >>>>>>>>>>>>>> we
> > >>>>>>>>>>>>>>>>>>> found the potentially inconsistent issue. The only
> > >>>>> solution
> > >>>>>> we
> > >>>>>>>>>> have
> > >>>>>>>>>>>>>>>>> for now
> > >>>>>>>>>>>>>>>>>>> is waiting for the new digest checking feature I
> > >>> mentioned
> > >>>>>> in
> > >>>>>>>>>>>>>>>>>>> ZOOKEEPER-3114.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> I think there are some other critical consistent
> > >>> issues we
> > >>>>>>> just
> > >>>>>>>>>>>>> fixed
> > >>>>>>>>>>>>>>>>> on
> > >>>>>>>>>>>>>>>>>>> master recently: ZOOKEEPER-3104, ZOOKEEPER-3125,
> > >>>>>>>>>> ZOOKEEPER-3127, I
> > >>>>>>>>>>>>>>>>> think we
> > >>>>>>>>>>>>>>>>>>> should include that in the official 3.5 release as
> > >>> well.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Thanks,
> > >>>>>>>>>>>>>>>>>>> Fangmin
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> On Tue, Sep 11, 2018 at 11:58 AM Andor Molnár <
> > >>>>>>> andor@apache.org
> > >>>>>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>>>>> Hi Jeelani,
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> Thanks for letting me know. I'm happy to remove it
> > >>> from
> > >>>>> the
> > >>>>>>>>>> list
> > >>>>>>>>>>>>> to
> > >>>>>>>>>>>>>>>>> get
> > >>>>>>>>>>>>>>>>>>>> closer to a stable release. :)
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> What's the feature which can be disabled to avoid
> > >>> data
> > >>>>>>>>>>>>>> inconsistency?
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> Andor
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> On 09/10/2018 11:33 PM, Mohamed Jeelani wrote:
> > >>>>>>>>>>>>>>>>>>>>> Thanks Andor for compiling this. Should we be
> > >>> ignoring
> > >>>>>>>>>>>>>>>>> ZOOKEEPER-2418 as
> > >>>>>>>>>>>>>>>>>>>> well? This exists in 3.4 as well and the feature
> > >> can
> > >>> be
> > >>>>>>>>>> disabled.
> > >>>>>>>>>>>>> We
> > >>>>>>>>>>>>>>>>> are
> > >>>>>>>>>>>>>>>>>>>> working on a longer term fix for it in 3.6.
> > >>>>>>>>>>>>>>>>>>>>> Regards,
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> Jeelani
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> On 9/10/18, 5:19 AM, "Andor Molnar"
> > >>>>>>>>>> <andor@cloudera.com.INVALID
> > >>>>>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>>>>>> Fine.
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> I'm happy to ignore 1549, 2846 and 2930. Still we
> > >>> have
> > >>>>> the
> > >>>>>>>>>> list
> > >>>>>>>>>>>>>> of:
> > >>>>>>>>>>>>>>>>>>>>> - ZOOKEEPER-236 (SSL/TLS support for Atomic
> > >>> Broadcast
> > >>>>>>>>>> protocol)
> > >>>>>>>>>>>>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
> > >>>>>>>>>>>>>>>>>>>>> - ZOOKEEPER-2418 (txnlog diff sync can skip
> > >> sending
> > >>> some
> > >>>>>>>>>>>>>>>>>>>> transactions to
> > >>>>>>>>>>>>>>>>>>>>> followers)
> > >>>>>>>>>>>>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock
> > >> between
> > >>>>>> follower
> > >>>>>>>>>>>>> sync
> > >>>>>>>>>>>>>>>>>>>> with
> > >>>>>>>>>>>>>>>>>>>>> leader and follower receiving external connection
> > >>>>>> requests.)
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> SSL (ZK-236) is a feature which essential for the
> > >>> 3.5
> > >>>>>>> release,
> > >>>>>>>>>>>>>>>>> hence
> > >>>>>>>>>>>>>>>>>>>> I
> > >>>>>>>>>>>>>>>>>>>>> wouldn't leave it out or postpone it for the next
> > >>> stable
> > >>>>>>>>>>>>> release.
> > >>>>>>>>>>>>>>>>> PR
> > >>>>>>>>>>>>>>>>>>>> has
> > >>>>>>>>>>>>>>>>>>>>> been out for a long time, get on reviewing please.
> > >>>>>>>>>>>>>>>>>>>>> The rest are also long outstanding issues which
> > >> have
> > >>>>> been
> > >>>>>>>>>> found
> > >>>>>>>>>>>>> in
> > >>>>>>>>>>>>>>>>>>>> the 3.5
> > >>>>>>>>>>>>>>>>>>>>> branch.
> > >>>>>>>>>>>>>>>>>>>>> ZK-1818 is something which was found in 3.4 and
> > >>> fixed in
> > >>>>>>> 3.4,
> > >>>>>>>>>>>>> but
> > >>>>>>>>>>>>>>>>>>>> never has
> > >>>>>>>>>>>>>>>>>>>>> been fixed in 3.5. Quite a serious issue if still
> > >>>>> present.
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> I think we should at least run some manual testing
> > >>> and
> > >>>>> see
> > >>>>>>> if
> > >>>>>>>>>> we
> > >>>>>>>>>>>>>>>>>>>> could
> > >>>>>>>>>>>>>>>>>>>>> repro any of these issues before going ahead with
> > >> a
> > >>>>> stable
> > >>>>>>>>>>>>>> release.
> > >>>>>>>>>>>>>>>>>>>>> Regards,
> > >>>>>>>>>>>>>>>>>>>>> Andor
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> On Fri, Sep 7, 2018 at 3:24 AM, Michael Han <
> > >>>>>>> hanm@apache.org>
> > >>>>>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>>>>>>> I haven't went through the entire list, but looks
> > >>> like
> > >>>>>> lots
> > >>>>>>>>>> of
> > >>>>>>>>>>>>> the
> > >>>>>>>>>>>>>>>>>>>> JIRA
> > >>>>>>>>>>>>>>>>>>>>>> issues listed in this thread, such as
> > >>> ZOOKEEPER-1549,
> > >>>>>> 2846,
> > >>>>>>>>>> also
> > >>>>>>>>>>>>>>>>>>>> affects
> > >>>>>>>>>>>>>>>>>>>>>> 3.4 releases. Should we scope these issues out?
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> I think historically the single outstanding
> > >>> blocking
> > >>>>>> issue
> > >>>>>>>>>> for a
> > >>>>>>>>>>>>>>>>>>>> stable 3.5
> > >>>>>>>>>>>>>>>>>>>>>> release is the reconfig feature and security
> > >>> concerns
> > >>>>>>> around
> > >>>>>>>>>> it
> > >>>>>>>>>>>>>>>>>>>> (somehow
> > >>>>>>>>>>>>>>>>>>>>>> addressed in ZOOKEEPER-2014), and the alpha and
> > >>> beta
> > >>>>>>> releases
> > >>>>>>>>>>>>> were
> > >>>>>>>>>>>>>>>>>>>> created
> > >>>>>>>>>>>>>>>>>>>>>> to stabilize that feature.
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>
> > >>>
> > >>
> >
> https://urldefense.proofpoint.com/v2/url?u=http-3A__zookeeper-2Duser.578899.n2.nabble.com_Zookeeper-2Dwith-2D&d=DwIBaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=Vl4oKanLQehvaulUvoKg8A&m=wqlhnot9c-pQLdkGkccSGNpELUNUnB-wy_h0iA3PRqI&s=_tGtL3nMWtuPrXKXDx27AIWOzyyT7W-CjIVLDFZwT0E&e=
> > >>>>>>>>>>>>>>>>>>>>>> SSL-release-date-tt7581744.html
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> So it looks like we are in good shape to release.
> > >>>>>> Something
> > >>>>>>>>>>>>> might
> > >>>>>>>>>>>>>>>>>>>> worth
> > >>>>>>>>>>>>>>>>>>>>>> doing to claim the quality of 3.5 is on par with
> > >>> 3.4
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> * Run Jepsen on 3.5 - 3.4 passed the test for the
> > >>>>> record
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>
> > >>>
> > >>
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__aphyr.com_posts_291-2Djepsen-2Dzookeeper&d=DwIBaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=Vl4oKanLQehvaulUvoKg8A&m=wqlhnot9c-pQLdkGkccSGNpELUNUnB-wy_h0iA3PRqI&s=VjORkX5s7hrJyl8mW9Q4cfeSWF4qfTdyRjcuAiBt0y4&e=
> > >>>>>>>>>>>>>>>>>>>>>> * Fix all flaky tests on 3.5 - 3.4 has little or
> > >> no
> > >>>>> flaky
> > >>>>>>>>>> tests
> > >>>>>>>>>>>>> at
> > >>>>>>>>>>>>>>>>>>>> all.
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> On Tue, Sep 4, 2018 at 1:48 AM, Andor Molnar
> > >>>>>>>>>>>>>>>>>>>> <an...@cloudera.com.invalid>
> > >>>>>>>>>>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> Thanks Maoling! That would be huge help, I
> > >>> appreciate
> > >>>>>> it.
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> Andor
> > >>>>>>>>>>>>>>>>>>>>>>>
> >
> > --
>
>
> -- Enrico Olivelli
>

Re: ZooKeeper 3.5 blocker issues

Posted by Enrico Olivelli <eo...@gmail.com>.
Great.
Can you create JIRA tickets for the remaining subtasks, so that I can pick them
up?
I volunteer for SpotBugs and for CI integration, but let's see the list first.
Enrico

On Thu, Dec 20, 2018, 07:21 Andor Molnar <an...@apache.org> wrote:

> Ok. Looks like ant still works properly, so let’s commit this patch and
> you guys can collaborate to polish the Maven build.
>
> For now, it’s master-only.
>
> Thanks,
> Andor
>
>
>
> > On 2018. Dec 19., at 16:44, Norbert Kalmar <nk...@cloudera.com.INVALID>
> > wrote:
> >
> > Thank you Enrico, I agree that we could commit this patch in its current
> > state; it fulfills the original Jira anyway.
> >
> > I'll see what's wrong with the Java tests, but honestly, it looks like
> > they're just flaky... they run well on local builds with 8 threads.
> >
> > Regards,
> > Norbert
> >
> > On Wed, Dec 19, 2018 at 2:50 PM Tamas Penzes <tamaas@cloudera.com.invalid
> >
> > wrote:
> >
> >> Hi All,
> >>
> >> For the assembly task, I would suggest following the way HBase does it.
> >> They create a pure source tarball and a bin tarball separately. Please see
> >> how they create a release here:
> >> https://github.com/apache/hbase/blob/master/dev-support/make_rc.sh
> >> We could probably use the well known "copy+paste technology" to have it
> >> within ZooKeeper the same way. ;-)
> >>
> >> Regards, Tamaas
> >>
> >> On Wed, Dec 19, 2018 at 2:28 PM Enrico Olivelli <eo...@gmail.com>
> >> wrote:
> >>
> >>> Great work Norbert
> >>> If you want, I can help, especially with RAT, FindBugs (we need to switch
> >>> to SpotBugs anyway) and the OWASP checks (I recently started using the
> >>> Maven plugin in other projects).
> >>> But I am not sure how I can help you concretely if we do not commit your
> >>> work.
> >>> We could commit the work as it is now, leaving Ant as the official build
> >>> method; having the POMs committed will ease collaboration.
> >>>
> >>> We will also have to work on the CI jobs; I can help with that part as well.
> >>>
> >>> Enrico
> >>>
> >>> On Wed, Dec 19, 2018 at 12:26 Norbert Kalmar
> >>> <nk...@cloudera.com.invalid> wrote:
> >>>>
> >>>> Hi everyone,
> >>>>
> >>>> An update on the Maven migration: I hit a few bumps here and there (you
> >>>> can see them in the commits of the latest patch Andor linked,
> >>>> https://github.com/apache/zookeeper/pull/708).
> >>>> The current state is that the build works and the tests run, but reports
> >>>> such as FindBugs and Clover are not yet implemented. Maven usually has
> >>>> plugins for them, but it's not always trivial, especially with the C
> >>>> client. The assembly is also still to be done, but it should be fairly
> >>>> easy to produce a tarball similar to the one Ant builds (although this
> >>>> will also be an interesting task, as Ant does some strange things, like
> >>>> duplicating the sources of most contrib projects).
> >>>>
> >>>> I had a separate Jira for the recipes and contrib Maven build. I do not
> >>>> have an open PR for it, but recipes is done and I am now working on the
> >>>> contrib projects. Most of them are built manually and never get called
> >>>> from the main build.xml, so I will not integrate them into the Maven
> >>>> build either. One reason is that there are plans to remove some of them
> >>>> from the ZK repo anyway. The other is that, for starters, we want to
> >>>> replicate the Ant build as closely as possible, without any nasty
> >>>> workarounds in Maven. Once the Maven build is stable and proven to have
> >>>> all the functionality required for build and release, we can improve it
> >>>> and use Maven's advantages to shape the ZooKeeper build.
> >>>>
> >>>> Right now, I am trying to stabilize the build as much as possible.
> >>>> Andor also fixed some flaky C tests that, for some strange reason,
> >>>> became extremely flaky with the Maven build:
> >>>> https://github.com/apache/zookeeper/pull/740
> >>>>
> >>>> Regards,
> >>>> Norbert
> >>>>
> >>>> On Tue, Dec 18, 2018 at 9:52 AM Andor Molnar
> >> <andor@cloudera.com.invalid
> >>>>
> >>>> wrote:
> >>>>
> >>>>> Sure, good point. Let's put it on the list.
> >>>>>
> >>>>> Andor
> >>>>>
> >>>>>
> >>>>> On Tue, Dec 18, 2018 at 12:17 AM Patrick Hunt <ph...@apache.org>
> >>> wrote:
> >>>>>
> >>>>>> Are folks OK to wait on that OWASP issue I documented over the
> >>> weekend?
> >>>>>> afaict we are not affected but it would be good to get another pair
> >>> of
> >>>>> eyes
> >>>>>> on it.
> >>>>>>
> >>>>>> Patrick
> >>>>>>
> >>>>>> On Mon, Dec 17, 2018 at 2:55 PM Andor Molnár <an...@apache.org>
> >>> wrote:
> >>>>>>
> >>>>>>> Hi team,
> >>>>>>>
> >>>>>>>
> >>>>>>> I'm proud to announce that, thanks to the joint effort of the
> >>>>>>> community, the 3.5 blockers list has become empty:
> >>>>>>>
> >>>>>>> "project = ZooKeeper AND resolution = Unresolved AND fixVersion =
> >>> 3.5.5
> >>>>>>> AND priority in (blocker, critical) ORDER BY priority DESC, key
> >>> ASC"
> >>>>>>>
> >>>>>>>
> >>>>>>> Well... almost. All the blocker issues have gone, but we still
> >>> have the
> >>>>>>> Maven migration to complete before the stable release. If you
> >> have
> >>> some
> >>>>>>> free cycles, please join us testing the Maven build on this PR:
> >>>>>>>
> >>>>>>> https://github.com/apache/zookeeper/pull/708
> >>>>>>>
> >>>>>>> I hope we can merge it pretty soon.
> >>>>>>>
> >>>>>>>
> >>>>>>> In terms of the builds, the weather at 3.5 branch is quite sunny
> >>>>>> nowadays:
> >>>>>>>
> >>>>>>> https://builds.apache.org/view/S-Z/view/ZooKeeper/
> >>>>>>>
> >>>>>>> The Java 11 build is still having some difficulties, which
> >>> hopefully I
> >>>>>>> can address before the holidays:
> >>>>>>>
> >>>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-3204
> >>>>>>>
> >>>>>>>
> >>>>>>> If you happen to know about something which is important from
> >> 3.5's
> >>>>>>> perspective and missing from the above, please don't hesitate to
> >>> share.
> >>>>>>>
> >>>>>>>
> >>>>>>> Happy ZooKeeping!
> >>>>>>>
> >>>>>>> Andor
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On 11/2/18 21:12, Fangmin Lv wrote:
> >>>>>>>> Andor,
> >>>>>>>>
> >>>>>>>> Here is the PR to port ZK-3104 from master to 3.4:
> >>>>>>>> https://github.com/apache/zookeeper/pull/685.
> >>>>>>>>
> >>>>>>>> Fangmin
> >>>>>>>>
> >>>>>>>> On Fri, Nov 2, 2018 at 11:46 AM Fangmin Lv <
> >> lvfangmin@gmail.com>
> >>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Hi Andor,
> >>>>>>>>>
> >>>>>>>>> Is anyone working on ZK-2778? I can pick it up if there is no
> >>> one
> >>>>>>> working
> >>>>>>>>> on it yet.
> >>>>>>>>>
> >>>>>>>>> I'll open a 3.5 PR for ZK-3104 today.
> >>>>>>>>>
> >>>>>>>>> Fangmin
> >>>>>>>>>
> >>>>>>>>> On Fri, Oct 26, 2018 at 3:33 AM Andor Molnar <
> >> andor@apache.org>
> >>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hi folks,
> >>>>>>>>>>
> >>>>>>>>>> You’ve probably realised lots of update emails coming from
> >>> Jira.
> >>>>>> Please
> >>>>>>>>>> be aware that we’ve updated a bunch of open blocker/critical
> >>> 3.5
> >>>>>>> tickets to
> >>>>>>>>>> reflect to what we discussed in this email.
> >>>>>>>>>>
> >>>>>>>>>> If you open up the following jira filter:
> >>>>>>>>>>
> >>>>>>>>>> project = ZooKeeper and resolution = Unresolved and
> >> fixVersion
> >>> =
> >>>>>> 3.5.5
> >>>>>>>>>> AND priority in (blocker, critical) ORDER BY priority DESC,
> >>> key ASC
> >>>>>>>>>>
> >>>>>>>>>> You’ll see the most up-to-date list of tickets which need to
> >> be
> >>>>>>> addressed
> >>>>>>>>>> before the stable 3.5 release.
> >>>>>>>>>>
> >>>>>>>>>> Thank you for your efforts to get this done.
> >>>>>>>>>>
> >>>>>>>>>> Fangmin, ZK-3104 is waiting for backport, but ticket has
> >>> already
> >>>>> been
> >>>>>>>>>> resolved. Have you created a separate ticket for the backport
> >>> or
> >>>>>> shall
> >>>>>>> I
> >>>>>>>>>> just reopen it with the right fix versions?
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Andor
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>> On 2018. Oct 8., at 12:34, Andor Molnar <an...@apache.org>
> >>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> Hi,
> >>>>>>>>>>>
> >>>>>>>>>>> Let me summarize and give a quick update on the outstanding
> >>> issues
> >>>>>> for
> >>>>>>>>>> 3.5 GA:
> >>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
> >>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock between follower
> >>> sync
> >>>>>> with
> >>>>>>>>>> leader and follower receiving external connection requests.)
> >>>>>>>>>>> - ZOOKEEPER-3021 Migrate project structure to Maven
> >> (ongoing)
> >>>>>>>>>>> - ZOOKEEPER-925 Docs generation to Maven
> >>>>>>>>>>> - ZOOKEEPER-3104 (waiting for backport)
> >>>>>>>>>>> - ZOOKEEPER-3125 (waiting for backport PR #647)
> >>>>>>>>>>>
> >>>>>>>>>>> The 2 Maven related tickets are no-brainers as well as the
> >>>>>> backports.
> >>>>>>>>>> ZK-2778 has been picked up by Maoling (thanks!) as far as I
> >> can
> >>>>> see,
> >>>>>>>>>> ZK-1818 is the only one waiting for a volunteer.
> >>>>>>>>>>> Please correct me if I’ve missed something.
> >>>>>>>>>>>
> >>>>>>>>>>> Regards,
> >>>>>>>>>>> Andor
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>> On 2018. Sep 28., at 18:32, Tamas Penzes
> >>>>>> <tamaas@cloudera.com.INVALID
> >>>>>>>>
> >>>>>>>>>> wrote:
> >>>>>>>>>>>> Hi All,
> >>>>>>>>>>>>
> >>>>>>>>>>>> I would add ZOOKEEPER-3021
> >>>>>>>>>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-3021>
> >>> Migrate
> >>>>>>> project
> >>>>>>>>>>>> structure to Maven build as a blocker too. Since the
> >>> migration
> >>>>> has
> >>>>>>>>>> started
> >>>>>>>>>>>> it would be good to finish before releasing ZK 3.5.x GA.
> >>>>>>>>>>>>
> >>>>>>>>>>>> ZOOKEEPER-925 <
> >>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-925
> >>>>>>>
> >>>>>>>>>> replace
> >>>>>>>>>>>> our forrest site and documentation generation might also
> >> be a
> >>>>> good
> >>>>>>>>>> idea,
> >>>>>>>>>>>> since then we could deliver the new MarkDown based
> >>> documentation.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Regards, Tamaas
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Fri, Sep 14, 2018 at 10:09 AM Fangmin Lv <
> >>> lvfangmin@gmail.com
> >>>>>>
> >>>>>>>>>> wrote:
> >>>>>>>>>>>>> Oh, sorry for the confusion, I should provide more
> >> context.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> The leader will use on-disk txn sync with followers if the peer
> >>>>>>>>>>>>> zxid is not in its in-memory commit log. The code is here:
> >>>>>>>>>>>>> Leader on disk txn sync
> >>>>>>>>>>>>> <https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/quorum/LearnerHandler.java#L774>.
> >>>>>>>>>>>>> There is a bug where gaps can appear in the txn files, for
> >>>>>>>>>>>>> example after a snap sync, so it's possible the peer will miss
> >>>>>>>>>>>>> txns because of this.
> >>>>>>>>>>>>> The option to disable it is snapshotSizeFactor
> >>>>>>>>>>>>> <https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/ZKDatabase.java#L81>;
> >>>>>>>>>>>>> setting it to -1 disables this feature. On 3.5, it would be
> >>>>>>>>>>>>> better to have a PR that sets this to -1 by default. That may
> >>>>>>>>>>>>> lead to more SNAP syncs, but judging from our production
> >>>>>>>>>>>>> experience that doesn't seem to be a big problem.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I can send out the diff to disable it by default on 3.5 if
> >>> you
> >>>>>> guys
> >>>>>>>>>> think
> >>>>>>>>>>>>> this is the right way to do.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>> Fangmin
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Thu, Sep 13, 2018 at 1:58 AM Andor Molnar <
> >>> andor@apache.org>
> >>>>>>>>>> wrote:
> >>>>>>>>>>>>>> What’s needed to turn it off?
> >>>>>>>>>>>>>> Do we need a PR or it’s just a config option?
> >>>>>>>>>>>>>> Shall we implement a feature switch for that and turn it
> >>> off by
> >>>>>>>>>> default?
> >>>>>>>>>>>>>> Sorry I don’t have too much insight on disk txn sync.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Andor
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On 2018. Sep 13., at 9:16, Fangmin Lv <
> >>> lvfangmin@gmail.com>
> >>>>>>> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> And to be clear, ZOOKEEPER-2418 is actually just one
> >> case
> >>> of
> >>>>>>>>>>>>>> inconsistency
> >>>>>>>>>>>>>>> which could caused by on disk txn sync, as I mentioned
> >> in
> >>> a
> >>>>>> newer
> >>>>>>>>>> JIRA
> >>>>>>>>>>>>>>> ZOOKEEPER-2846 <
> >>>>>>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-2846>,
> >>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>> snap sync or txn sync could also leave txns gap in the
> >> txn
> >>>>> file,
> >>>>>>>>>> which
> >>>>>>>>>>>>>> is a
> >>>>>>>>>>>>>>> more common case could trigger this issue.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I would suggest to turn off the on disk txn sync by
> >>> default
> >>>>> for
> >>>>>>> now
> >>>>>>>>>> to
> >>>>>>>>>>>>>>> avoid this issue, after we finished ZOOKEEPER-3114, we
> >>> can use
> >>>>>>> that
> >>>>>>>>>> to
> >>>>>>>>>>>>>>> validate the on disk txns during syncing.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>> Fangmin
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Wed, Sep 12, 2018 at 9:55 AM Fangmin Lv <
> >>>>> lvfangmin@gmail.com
> >>>>>>>
> >>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>> Andor,
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> ZOOKEEPER-3114 is about adding real time digest
> >> checking
> >>> to
> >>>>>> help
> >>>>>>>>>>>>>> detecting
> >>>>>>>>>>>>>>>> inconsistency, it's a new feature with amounts of code
> >>>>> change.
> >>>>>>> I'll
> >>>>>>>>>>>>>> start
> >>>>>>>>>>>>>>>> upstream it part by part, but I don't expect it's being
> >>>>> merged
> >>>>>> in
> >>>>>>>>>> the
> >>>>>>>>>>>>>> next
> >>>>>>>>>>>>>>>> few weeks. So yes, it's a nice to have, but definitely
> >>> not a
> >>>>>>> block
> >>>>>>>>>> for
> >>>>>>>>>>>>>> 3.5.
> >>>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>>> Fangmin
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On Wed, Sep 12, 2018 at 2:55 AM Andor Molnar <
> >>>>> andor@apache.org
> >>>>>>>
> >>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>> Fangmin,
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Sorry, I just noticed that you want to include the
> >>>>> consistency
> >>>>>>>>>> fixes
> >>>>>>>>>>>>> in
> >>>>>>>>>>>>>>>>> the stable version which is fine. Let’s finish the
> >>> backports
> >>>>>> and
> >>>>>>>>>>>>> we’ll
> >>>>>>>>>>>>>> be
> >>>>>>>>>>>>>>>>> done with them.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> ZOOKEEPER-3114 is essentially a new feature, I
> >> wouldn’t
> >>>>> block
> >>>>>>> 3.5
> >>>>>>>>>>>>> with
> >>>>>>>>>>>>>>>>> that. What do you think?
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Andor
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On 2018. Sep 12., at 11:52, Andor Molnar <
> >>> andor@apache.org
> >>>>>>
> >>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>> Cool, thanks for the clarification.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> The updated list is as follows:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> - ZOOKEEPER-236 (SSL/TLS support for Atomic Broadcast
> >>>>>> protocol)
> >>>>>>>>>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
> >>>>>>>>>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock between
> >>>>> follower
> >>>>>>> sync
> >>>>>>>>>>>>> with
> >>>>>>>>>>>>>>>>> leader and follower receiving external connection
> >>> requests.)
> >>>>>>>>>>>>>>>>>> The following are not critical and no blockers for
> >> the
> >>>>> stable
> >>>>>>>>>>>>> release:
> >>>>>>>>>>>>>>>>>> Waiting for to be ported to 3.5:
> >>>>>>>>>>>>>>>>>> - ZOOKEEPER-3104
> >>>>>>>>>>>>>>>>>> - ZOOKEEPER-3125
> >>>>>>>>>>>>>>>>>> - ZOOKEEPER-3127
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> New feature:
> >>>>>>>>>>>>>>>>>> - ZOOKEEPER-3114 (fixes ZOOKEEPER-2184 too)
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>>>>>> Andor
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> On 2018. Sep 12., at 0:42, Fangmin Lv <
> >>>>> lvfangmin@gmail.com>
> >>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>> Hi Andor,
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> That's the on disk txn feature, which was disabled
> >>>>>> internally
> >>>>>>>>>> after
> >>>>>>>>>>>>>> we
> >>>>>>>>>>>>>>>>>>> found the potentially inconsistent issue. The only
> >>>>> solution
> >>>>>> we
> >>>>>>>>>> have
> >>>>>>>>>>>>>>>>> for now
> >>>>>>>>>>>>>>>>>>> is waiting for the new digest checking feature I
> >>> mentioned
> >>>>>> in
> >>>>>>>>>>>>>>>>>>> ZOOKEEPER-3114.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> I think there are some other critical consistent
> >>> issues we
> >>>>>>> just
> >>>>>>>>>>>>> fixed
> >>>>>>>>>>>>>>>>> on
> >>>>>>>>>>>>>>>>>>> master recently: ZOOKEEPER-3104, ZOOKEEPER-3125,
> >>>>>>>>>> ZOOKEEPER-3127, I
> >>>>>>>>>>>>>>>>> think we
> >>>>>>>>>>>>>>>>>>> should include that in the official 3.5 release as
> >>> well.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>>>>>> Fangmin
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> On Tue, Sep 11, 2018 at 11:58 AM Andor Molnár <
> >>>>>>> andor@apache.org
> >>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>> Hi Jeelani,
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Thanks for letting me know. I'm happy to remove it
> >>> from
> >>>>> the
> >>>>>>>>>> list
> >>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>> get
> >>>>>>>>>>>>>>>>>>>> closer to a stable release. :)
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> What's the feature which can be disabled to avoid
> >>> data
> >>>>>>>>>>>>>> inconsistency?
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Andor
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> On 09/10/2018 11:33 PM, Mohamed Jeelani wrote:
> >>>>>>>>>>>>>>>>>>>>> Thanks Andor for compiling this. Should we be
> >>> ignoring
> >>>>>>>>>>>>>>>>> ZOOKEEPER-2418 as
> >>>>>>>>>>>>>>>>>>>> well? This exists in 3.4 as well and the feature
> >> can
> >>> be
> >>>>>>>>>> disabled.
> >>>>>>>>>>>>> We
> >>>>>>>>>>>>>>>>> are
> >>>>>>>>>>>>>>>>>>>> working on a longer term fix for it in 3.6.
> >>>>>>>>>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Jeelani
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> On 9/10/18, 5:19 AM, "Andor Molnar"
> >>>>>>>>>> <andor@cloudera.com.INVALID
> >>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>> Fine.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> I'm happy to ignore 1549, 2846 and 2930. Still we
> >>> have
> >>>>> the
> >>>>>>>>>> list
> >>>>>>>>>>>>>> of:
> >>>>>>>>>>>>>>>>>>>>> - ZOOKEEPER-236 (SSL/TLS support for Atomic
> >>> Broadcast
> >>>>>>>>>> protocol)
> >>>>>>>>>>>>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
> >>>>>>>>>>>>>>>>>>>>> - ZOOKEEPER-2418 (txnlog diff sync can skip
> >> sending
> >>> some
> >>>>>>>>>>>>>>>>>>>> transactions to
> >>>>>>>>>>>>>>>>>>>>> followers)
> >>>>>>>>>>>>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock
> >> between
> >>>>>> follower
> >>>>>>>>>>>>> sync
> >>>>>>>>>>>>>>>>>>>> with
> >>>>>>>>>>>>>>>>>>>>> leader and follower receiving external connection
> >>>>>> requests.)
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> SSL (ZK-236) is a feature which essential for the
> >>> 3.5
> >>>>>>> release,
> >>>>>>>>>>>>>>>>> hence
> >>>>>>>>>>>>>>>>>>>> I
> >>>>>>>>>>>>>>>>>>>>> wouldn't leave it out or postpone it for the next
> >>> stable
> >>>>>>>>>>>>> release.
> >>>>>>>>>>>>>>>>> PR
> >>>>>>>>>>>>>>>>>>>> has
> >>>>>>>>>>>>>>>>>>>>> been out for a long time, get on reviewing please.
> >>>>>>>>>>>>>>>>>>>>> The rest are also long outstanding issues which
> >> have
> >>>>> been
> >>>>>>>>>> found
> >>>>>>>>>>>>> in
> >>>>>>>>>>>>>>>>>>>> the 3.5
> >>>>>>>>>>>>>>>>>>>>> branch.
> >>>>>>>>>>>>>>>>>>>>> ZK-1818 is something which was found in 3.4 and
> >>> fixed in
> >>>>>>> 3.4,
> >>>>>>>>>>>>> but
> >>>>>>>>>>>>>>>>>>>> never has
> >>>>>>>>>>>>>>>>>>>>> been fixed in 3.5. Quite a serious issue if still
> >>>>> present.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> I think we should at least run some manual testing
> >>> and
> >>>>> see
> >>>>>>> if
> >>>>>>>>>> we
> >>>>>>>>>>>>>>>>>>>> could
> >>>>>>>>>>>>>>>>>>>>> repro any of these issues before going ahead with
> >> a
> >>>>> stable
> >>>>>>>>>>>>>> release.
> >>>>>>>>>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>>>>>>>>> Andor
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> On Fri, Sep 7, 2018 at 3:24 AM, Michael Han <
> >>>>>>> hanm@apache.org>
> >>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>> I haven't went through the entire list, but looks
> >>> like
> >>>>>> lots
> >>>>>>>>>> of
> >>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>> JIRA
> >>>>>>>>>>>>>>>>>>>>>> issues listed in this thread, such as
> >>> ZOOKEEPER-1549,
> >>>>>> 2846,
> >>>>>>>>>> also
> >>>>>>>>>>>>>>>>>>>> affects
> >>>>>>>>>>>>>>>>>>>>>> 3.4 releases. Should we scope these issues out?
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> I think historically the single outstanding
> >>> blocking
> >>>>>> issue
> >>>>>>>>>> for a
> >>>>>>>>>>>>>>>>>>>> stable 3.5
> >>>>>>>>>>>>>>>>>>>>>> release is the reconfig feature and security
> >>> concerns
> >>>>>>> around
> >>>>>>>>>> it
> >>>>>>>>>>>>>>>>>>>> (somehow
> >>>>>>>>>>>>>>>>>>>>>> addressed in ZOOKEEPER-2014), and the alpha and
> >>> beta
> >>>>>>> releases
> >>>>>>>>>>>>> were
> >>>>>>>>>>>>>>>>>>>> created
> >>>>>>>>>>>>>>>>>>>>>> to stabilize that feature.
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> http://zookeeper-user.578899.n2.nabble.com/Zookeeper-with-SSL-release-date-tt7581744.html
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> So it looks like we are in good shape to release.
> >>>>>> Something
> >>>>>>>>>>>>> might
> >>>>>>>>>>>>>>>>>>>> worth
> >>>>>>>>>>>>>>>>>>>>>> doing to claim the quality of 3.5 is on par with
> >>> 3.4
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> * Run Jepsen on 3.5 - 3.4 passed the test for the
> >>>>> record
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> https://aphyr.com/posts/291-jepsen-zookeeper
> >>>>>>>>>>>>>>>>>>>>>> * Fix all flaky tests on 3.5 - 3.4 has little or
> >> no
> >>>>> flaky
> >>>>>>>>>> tests
> >>>>>>>>>>>>> at
> >>>>>>>>>>>>>>>>>>>> all.
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> On Tue, Sep 4, 2018 at 1:48 AM, Andor Molnar
> >>>>>>>>>>>>>>>>>>>> <an...@cloudera.com.invalid>
> >>>>>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> Thanks Maoling! That would be huge help, I
> >>> appreciate
> >>>>>> it.
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> Andor
> >>>>>>>>>>>>>>>>>>>>>>>
>
> --


-- Enrico Olivelli
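
The snapshotSizeFactor behaviour Fangmin describes in the quoted thread above (setting it to -1 disables on-disk txn sync, at the cost of more SNAP syncs) can be pictured with a small Java sketch. This is a simplified illustration, not ZooKeeper's actual code: the class TxnLogSyncSketch, its method names, and the byte counts are hypothetical, and the idea that the size budget is the snapshot size multiplied by the factor is an assumption made for the example rather than something stated in the thread.

```java
// Hypothetical, simplified model of the decision described in the thread.
// It is NOT ZooKeeper's implementation; all names here are illustrative.
final class TxnLogSyncSketch {

    /** Assumed size budget for reading txns from disk: snapshot size scaled by the factor. */
    static long txnLogSizeLimit(long snapshotSizeBytes, double snapshotSizeFactor) {
        return (long) (snapshotSizeBytes * snapshotSizeFactor);
    }

    /** A negative limit is treated as "on-disk txn sync disabled". */
    static boolean onDiskTxnSyncEnabled(long sizeLimit) {
        return sizeLimit >= 0;
    }

    /** Picks a sync mode for a peer whose zxid is not in the in-memory commit log. */
    static String chooseSync(long snapshotSizeBytes, double factor, long txnLogBytesNeeded) {
        long limit = txnLogSizeLimit(snapshotSizeBytes, factor);
        if (!onDiskTxnSyncEnabled(limit) || txnLogBytesNeeded > limit) {
            return "SNAP";   // fall back to sending a full snapshot
        }
        return "TXNLOG";     // replay the missing txns from the on-disk log
    }

    public static void main(String[] args) {
        System.out.println(chooseSync(1000, 0.5, 200));   // budget is large enough
        System.out.println(chooseSync(1000, -1.0, 200));  // feature disabled via -1
    }
}
```

In this sketch a factor of -1 always falls back to SNAP, which is the trade-off discussed in the thread: more snapshot syncs, but no reliance on txn files that may contain gaps.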

Re: ZooKeeper 3.5 blocker issues

Posted by Andor Molnar <an...@apache.org>.
Ok. Looks like ant still works properly, so let’s commit this patch and you guys can collaborate to polish the Maven build.

For now, it’s master-only.

Thanks,
Andor



> On 2018. Dec 19., at 16:44, Norbert Kalmar <nk...@cloudera.com.INVALID> wrote:
> 
> Thank you Enrico, I agree that we could commit this patch in its current
> state; it fulfills the original Jira anyway.
>
> I'll see what's wrong with the Java tests, but honestly, it looks like
> they're just flaky... they run well on local builds with 8 threads.
> 
> Regards,
> Norbert
> 
> On Wed, Dec 19, 2018 at 2:50 PM Tamas Penzes <ta...@cloudera.com.invalid>
> wrote:
> 
>> Hi All,
>> 
>> For the assembly task, I would suggest following the way HBase does it.
>> They create a pure source tarball and a bin tarball separately. Please see
>> how they create a release here:
>> https://github.com/apache/hbase/blob/master/dev-support/make_rc.sh
>> We could probably use the well known "copy+paste technology" to have it
>> within ZooKeeper the same way. ;-)
>> 
>> Regards, Tamaas
>> 
>> On Wed, Dec 19, 2018 at 2:28 PM Enrico Olivelli <eo...@gmail.com>
>> wrote:
>> 
>>> Great work Norbert
>>> If you want, I can help, especially with RAT, FindBugs (we need to switch
>>> to SpotBugs anyway) and the OWASP checks (I recently started using the
>>> Maven plugin in other projects).
>>> But I am not sure how I can help you concretely if we do not commit your
>>> work.
>>> We could commit the work as it is now, leaving Ant as the official build
>>> method; having the POMs committed will ease collaboration.
>>>
>>> We will also have to work on the CI jobs; I can help with that part as well.
>>> 
>>> Enrico
>>> 
>>> On Wed, Dec 19, 2018 at 12:26 Norbert Kalmar
>>> <nk...@cloudera.com.invalid> wrote:
>>>> 
>>>> Hi everyone,
>>>> 
>>>> An update on the Maven migration: I hit a few bumps here and there (you
>>>> can see them in the commits of the latest patch Andor linked,
>>>> https://github.com/apache/zookeeper/pull/708).
>>>> The current state is that the build works and the tests run, but reports
>>>> such as FindBugs and Clover are not yet implemented. Maven usually has
>>>> plugins for them, but it's not always trivial, especially with the C
>>>> client. The assembly is also still to be done, but it should be fairly
>>>> easy to produce a tarball similar to the one Ant builds (although this
>>>> will also be an interesting task, as Ant does some strange things, like
>>>> duplicating the sources of most contrib projects).
>>>>
>>>> I had a separate Jira for the recipes and contrib Maven build. I do not
>>>> have an open PR for it, but recipes is done and I am now working on the
>>>> contrib projects. Most of them are built manually and never get called
>>>> from the main build.xml, so I will not integrate them into the Maven
>>>> build either. One reason is that there are plans to remove some of them
>>>> from the ZK repo anyway. The other is that, for starters, we want to
>>>> replicate the Ant build as closely as possible, without any nasty
>>>> workarounds in Maven. Once the Maven build is stable and proven to have
>>>> all the functionality required for build and release, we can improve it
>>>> and use Maven's advantages to shape the ZooKeeper build.
>>>>
>>>> Right now, I am trying to stabilize the build as much as possible.
>>>> Andor also fixed some flaky C tests that, for some strange reason,
>>>> became extremely flaky with the Maven build:
>>>> https://github.com/apache/zookeeper/pull/740
>>>> 
>>>> Regards,
>>>> Norbert
>>>> 
>>>> On Tue, Dec 18, 2018 at 9:52 AM Andor Molnar
>> <andor@cloudera.com.invalid
>>>> 
>>>> wrote:
>>>> 
>>>>> Sure, good point. Let's put it on the list.
>>>>> 
>>>>> Andor
>>>>> 
>>>>> 
>>>>> On Tue, Dec 18, 2018 at 12:17 AM Patrick Hunt <ph...@apache.org>
>>> wrote:
>>>>> 
>>>>>> Are folks OK to wait on that OWASP issue I documented over the
>>> weekend?
>>>>>> afaict we are not affected but it would be good to get another pair
>>> of
>>>>> eyes
>>>>>> on it.
>>>>>> 
>>>>>> Patrick
>>>>>> 
>>>>>> On Mon, Dec 17, 2018 at 2:55 PM Andor Molnár <an...@apache.org>
>>> wrote:
>>>>>> 
>>>>>>> Hi team,
>>>>>>> 
>>>>>>> 
>>>>>>> I'm proud to announce that, thanks to the joint effort of the
>>>>>>> community, the 3.5 blockers list has become empty:
>>>>>>> 
>>>>>>> "project = ZooKeeper AND resolution = Unresolved AND fixVersion =
>>> 3.5.5
>>>>>>> AND priority in (blocker, critical) ORDER BY priority DESC, key
>>> ASC"
>>>>>>> 
>>>>>>> 
>>>>>>> Well... almost. All the blocker issues have gone, but we still
>>> have the
>>>>>>> Maven migration to complete before the stable release. If you
>> have
>>> some
>>>>>>> free cycles, please join us testing the Maven build on this PR:
>>>>>>> 
>>>>>>> https://github.com/apache/zookeeper/pull/708
>>>>>>> 
>>>>>>> I hope we can merge it pretty soon.
>>>>>>> 
>>>>>>> 
>>>>>>> In terms of the builds, the weather at 3.5 branch is quite sunny
>>>>>> nowadays:
>>>>>>> 
>>>>>>> https://builds.apache.org/view/S-Z/view/ZooKeeper/
>>>>>>> 
>>>>>>> The Java 11 build is still having some difficulties, which
>>> hopefully I
>>>>>>> can address before the holidays:
>>>>>>> 
>>>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-3204
>>>>>>> 
>>>>>>> 
>>>>>>> If you happen to know about something which is important from
>> 3.5's
>>>>>>> perspective and missing from the above, please don't hesitate to
>>> share.
>>>>>>> 
>>>>>>> 
>>>>>>> Happy ZooKeeping!
>>>>>>> 
>>>>>>> Andor
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On 11/2/18 21:12, Fangmin Lv wrote:
>>>>>>>> Andor,
>>>>>>>> 
>>>>>>>> Here is the PR to port ZK-3104 from master to 3.4:
>>>>>>>> https://github.com/apache/zookeeper/pull/685.
>>>>>>>> 
>>>>>>>> Fangmin
>>>>>>>> 
>>>>>>>> On Fri, Nov 2, 2018 at 11:46 AM Fangmin Lv <
>> lvfangmin@gmail.com>
>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Hi Andor,
>>>>>>>>> 
>>>>>>>>> Is anyone working on ZK-2778? I can pick it up if there is no
>>> one
>>>>>>> working
>>>>>>>>> on it yet.
>>>>>>>>> 
>>>>>>>>> I'll open a 3.5 PR for ZK-3104 today.
>>>>>>>>> 
>>>>>>>>> Fangmin
>>>>>>>>> 
>>>>>>>>> On Fri, Oct 26, 2018 at 3:33 AM Andor Molnar <
>> andor@apache.org>
>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> Hi folks,
>>>>>>>>>> 
>>>>>>>>>> You’ve probably realised lots of update emails coming from
>>> Jira.
>>>>>> Please
>>>>>>>>>> be aware that we’ve updated a bunch of open blocker/critical
>>> 3.5
>>>>>>> tickets to
>>>>>>>>>> reflect to what we discussed in this email.
>>>>>>>>>> 
>>>>>>>>>> If you open up the following jira filter:
>>>>>>>>>> 
>>>>>>>>>> project = ZooKeeper and resolution = Unresolved and
>> fixVersion
>>> =
>>>>>> 3.5.5
>>>>>>>>>> AND priority in (blocker, critical) ORDER BY priority DESC,
>>> key ASC
>>>>>>>>>> 
>>>>>>>>>> You’ll see the most up-to-date list of tickets which need to
>> be
>>>>>>> addressed
>>>>>>>>>> before the stable 3.5 release.
>>>>>>>>>> 
>>>>>>>>>> Thank you for your efforts to get this done.
>>>>>>>>>> 
>>>>>>>>>> Fangmin, ZK-3104 is waiting for backport, but ticket has
>>> already
>>>>> been
>>>>>>>>>> resolved. Have you created a separate ticket for the backport
>>> or
>>>>>> shall
>>>>>>> I
>>>>>>>>>> just reopen it with the right fix versions?
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> Andor
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>> On 2018. Oct 8., at 12:34, Andor Molnar <an...@apache.org>
>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Hi,
>>>>>>>>>>> 
>>>>>>>>>>> Let me summarize and give a quick update on the outstanding
>>> issues
>>>>>> for
>>>>>>>>>> 3.5 GA:
>>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
>>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock between follower
>>> sync
>>>>>> with
>>>>>>>>>> leader and follower receiving external connection requests.)
>>>>>>>>>>> - ZOOKEEPER-3021 Migrate project structure to Maven
>> (ongoing)
>>>>>>>>>>> - ZOOKEEPER-925 Docs generation to Maven
>>>>>>>>>>> - ZOOKEEPER-3104 (waiting for backport)
>>>>>>>>>>> - ZOOKEEPER-3125 (waiting for backport PR #647)
>>>>>>>>>>> 
>>>>>>>>>>> The 2 Maven related tickets are no-brainers as well as the
>>>>>> backports.
>>>>>>>>>> ZK-2778 has been picked up by Maoling (thanks!) as far as I
>> can
>>>>> see,
>>>>>>>>>> ZK-1818 is the only one waiting for a volunteer.
>>>>>>>>>>> Please correct me if I’ve missed something.
>>>>>>>>>>> 
>>>>>>>>>>> Regards,
>>>>>>>>>>> Andor
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>>> On 2018. Sep 28., at 18:32, Tamas Penzes
>>>>>> <tamaas@cloudera.com.INVALID
>>>>>>>> 
>>>>>>>>>> wrote:
>>>>>>>>>>>> Hi All,
>>>>>>>>>>>> 
>>>>>>>>>>>> I would add ZOOKEEPER-3021
>>>>>>>>>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-3021>
>>> Migrate
>>>>>>> project
>>>>>>>>>>>> structure to Maven build as a blocker too. Since the
>>> migration
>>>>> has
>>>>>>>>>> started
>>>>>>>>>>>> it would be good to finish before releasing ZK 3.5.x GA.
>>>>>>>>>>>> 
>>>>>>>>>>>> ZOOKEEPER-925 <
>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-925
>>>>>>> 
>>>>>>>>>> replace
>>>>>>>>>>>> our forrest site and documentation generation might also
>> be a
>>>>> good
>>>>>>>>>> idea,
>>>>>>>>>>>> since then we could deliver the new MarkDown based
>>> documentation.
>>>>>>>>>>>> 
>>>>>>>>>>>> Regards, Tamaas
>>>>>>>>>>>> 
>>>>>>>>>>>> On Fri, Sep 14, 2018 at 10:09 AM Fangmin Lv <
>>> lvfangmin@gmail.com
>>>>>> 
>>>>>>>>>> wrote:
>>>>>>>>>>>>> Oh, sorry for the confusion, I should provide more
>> context.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> The leader will use on-disk txn logs to sync with followers if the peer
>>>>>>>>>>>>> zxid is not in its in-memory commit log. The code is here: Leader on-disk
>>>>>>>>>>>>> txn sync
>>>>>>>>>>>>> <https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/quorum/LearnerHandler.java#L774>.
>>>>>>>>>>>>> There is a bug where there can potentially be a gap in the txn files,
>>>>>>>>>>>>> e.g. after a snap sync, so it's possible the peer will miss txns due to
>>>>>>>>>>>>> this. The option to disable it is snapshotSizeFactor
>>>>>>>>>>>>> <https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/ZKDatabase.java#L81>;
>>>>>>>>>>>>> setting it to -1 will disable this feature. On 3.5, it's better to have a
>>>>>>>>>>>>> PR to set this to -1 by default. It might cause more SNAP syncs, but from
>>>>>>>>>>>>> our prod experience that doesn't seem to be a big problem to me.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I can send out the diff to disable it by default on 3.5 if
>>> you
>>>>>> guys
>>>>>>>>>> think
>>>>>>>>>>>>> this is the right way to do.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Fangmin
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On Thu, Sep 13, 2018 at 1:58 AM Andor Molnar <
>>> andor@apache.org>
>>>>>>>>>> wrote:
>>>>>>>>>>>>>> What’s needed to turn it off?
>>>>>>>>>>>>>> Do we need a PR or it’s just a config option?
>>>>>>>>>>>>>> Shall we implement a feature switch for that and turn it
>>> off by
>>>>>>>>>> default?
>>>>>>>>>>>>>> Sorry I don’t have too much insight on disk txn sync.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Andor
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On 2018. Sep 13., at 9:16, Fangmin Lv <
>>> lvfangmin@gmail.com>
>>>>>>> wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> And to be clear, ZOOKEEPER-2418 is actually just one
>> case
>>> of
>>>>>>>>>>>>>> inconsistency
>>>>>>>>>>>>>>> which could caused by on disk txn sync, as I mentioned
>> in
>>> a
>>>>>> newer
>>>>>>>>>> JIRA
>>>>>>>>>>>>>>> ZOOKEEPER-2846 <
>>>>>>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-2846>,
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>> snap sync or txn sync could also leave txns gap in the
>> txn
>>>>> file,
>>>>>>>>>> which
>>>>>>>>>>>>>> is a
>>>>>>>>>>>>>>> more common case could trigger this issue.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> I would suggest to turn off the on disk txn sync by
>>> default
>>>>> for
>>>>>>> now
>>>>>>>>>> to
>>>>>>>>>>>>>>> avoid this issue, after we finished ZOOKEEPER-3114, we
>>> can use
>>>>>>> that
>>>>>>>>>> to
>>>>>>>>>>>>>>> validate the on disk txns during syncing.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Fangmin
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On Wed, Sep 12, 2018 at 9:55 AM Fangmin Lv <
>>>>> lvfangmin@gmail.com
>>>>>>> 
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>> Andor,
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> ZOOKEEPER-3114 is about adding real time digest
>> checking
>>> to
>>>>>> help
>>>>>>>>>>>>>> detecting
>>>>>>>>>>>>>>>> inconsistency, it's a new feature with amounts of code
>>>>> change.
>>>>>>> I'll
>>>>>>>>>>>>>> start
>>>>>>>>>>>>>>>> upstream it part by part, but I don't expect it's being
>>>>> merged
>>>>>> in
>>>>>>>>>> the
>>>>>>>>>>>>>> next
>>>>>>>>>>>>>>>> few weeks. So yes, it's a nice to have, but definitely
>>> not a
>>>>>>> block
>>>>>>>>>> for
>>>>>>>>>>>>>> 3.5.
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Fangmin
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> On Wed, Sep 12, 2018 at 2:55 AM Andor Molnar <
>>>>> andor@apache.org
>>>>>>> 
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>> Fangmin,
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Sorry, I just noticed that you want to include the
>>>>> consistency
>>>>>>>>>> fixes
>>>>>>>>>>>>> in
>>>>>>>>>>>>>>>>> the stable version which is fine. Let’s finish the
>>> backports
>>>>>> and
>>>>>>>>>>>>> we’ll
>>>>>>>>>>>>>> be
>>>>>>>>>>>>>>>>> done with them.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> ZOOKEEPER-3114 is essentially a new feature, I
>> wouldn’t
>>>>> block
>>>>>>> 3.5
>>>>>>>>>>>>> with
>>>>>>>>>>>>>>>>> that. What do you think?
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Andor
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> On 2018. Sep 12., at 11:52, Andor Molnar <
>>> andor@apache.org
>>>>>> 
>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>> Cool, thanks for the clarification.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> The updated list is as follows:
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> - ZOOKEEPER-236 (SSL/TLS support for Atomic Broadcast
>>>>>> protocol)
>>>>>>>>>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
>>>>>>>>>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock between
>>>>> follower
>>>>>>> sync
>>>>>>>>>>>>> with
>>>>>>>>>>>>>>>>> leader and follower receiving external connection
>>> requests.)
>>>>>>>>>>>>>>>>>> The following are not critical and no blockers for
>> the
>>>>> stable
>>>>>>>>>>>>> release:
>>>>>>>>>>>>>>>>>> Waiting for to be ported to 3.5:
>>>>>>>>>>>>>>>>>> - ZOOKEEPER-3104
>>>>>>>>>>>>>>>>>> - ZOOKEEPER-3125
>>>>>>>>>>>>>>>>>> - ZOOKEEPER-3127
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> New feature:
>>>>>>>>>>>>>>>>>> - ZOOKEEPER-3114 (fixes ZOOKEEPER-2184 too)
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>> Andor
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> On 2018. Sep 12., at 0:42, Fangmin Lv <
>>>>> lvfangmin@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>> Hi Andor,
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> That's the on disk txn feature, which was disabled
>>>>>> internally
>>>>>>>>>> after
>>>>>>>>>>>>>> we
>>>>>>>>>>>>>>>>>>> found the potentially inconsistent issue. The only
>>>>> solution
>>>>>> we
>>>>>>>>>> have
>>>>>>>>>>>>>>>>> for now
>>>>>>>>>>>>>>>>>>> is waiting for the new digest checking feature I
>>> mentioned
>>>>>> in
>>>>>>>>>>>>>>>>>>> ZOOKEEPER-3114.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> I think there are some other critical consistent
>>> issues we
>>>>>>> just
>>>>>>>>>>>>> fixed
>>>>>>>>>>>>>>>>> on
>>>>>>>>>>>>>>>>>>> master recently: ZOOKEEPER-3104, ZOOKEEPER-3125,
>>>>>>>>>> ZOOKEEPER-3127, I
>>>>>>>>>>>>>>>>> think we
>>>>>>>>>>>>>>>>>>> should include that in the official 3.5 release as
>>> well.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>> Fangmin
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> On Tue, Sep 11, 2018 at 11:58 AM Andor Molnár <
>>>>>>> andor@apache.org
>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>> Hi Jeelani,
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Thanks for letting me know. I'm happy to remove it
>>> from
>>>>> the
>>>>>>>>>> list
>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>> get
>>>>>>>>>>>>>>>>>>>> closer to a stable release. :)
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> What's the feature which can be disabled to avoid
>>> data
>>>>>>>>>>>>>> inconsistency?
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Andor
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> On 09/10/2018 11:33 PM, Mohamed Jeelani wrote:
>>>>>>>>>>>>>>>>>>>>> Thanks Andor for compiling this. Should we be
>>> ignoring
>>>>>>>>>>>>>>>>> ZOOKEEPER-2418 as
>>>>>>>>>>>>>>>>>>>> well? This exists in 3.4 as well and the feature
>> can
>>> be
>>>>>>>>>> disabled.
>>>>>>>>>>>>> We
>>>>>>>>>>>>>>>>> are
>>>>>>>>>>>>>>>>>>>> working on a longer term fix for it in 3.6.
>>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Jeelani
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> On 9/10/18, 5:19 AM, "Andor Molnar"
>>>>>>>>>> <andor@cloudera.com.INVALID
>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>> Fine.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> I'm happy to ignore 1549, 2846 and 2930. Still we
>>> have
>>>>> the
>>>>>>>>>> list
>>>>>>>>>>>>>> of:
>>>>>>>>>>>>>>>>>>>>> - ZOOKEEPER-236 (SSL/TLS support for Atomic
>>> Broadcast
>>>>>>>>>> protocol)
>>>>>>>>>>>>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
>>>>>>>>>>>>>>>>>>>>> - ZOOKEEPER-2418 (txnlog diff sync can skip
>> sending
>>> some
>>>>>>>>>>>>>>>>>>>> transactions to
>>>>>>>>>>>>>>>>>>>>> followers)
>>>>>>>>>>>>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock
>> between
>>>>>> follower
>>>>>>>>>>>>> sync
>>>>>>>>>>>>>>>>>>>> with
>>>>>>>>>>>>>>>>>>>>> leader and follower receiving external connection
>>>>>> requests.)
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> SSL (ZK-236) is a feature which is essential for the 3.5 release, hence I
>>>>>>>>>>>>>>>>>>>>> wouldn't leave it out or postpone it for the next stable release. The PR
>>>>>>>>>>>>>>>>>>>>> has been out for a long time, get on reviewing please.
>>>>>>>>>>>>>>>>>>>>> The rest are also long outstanding issues which
>> have
>>>>> been
>>>>>>>>>> found
>>>>>>>>>>>>> in
>>>>>>>>>>>>>>>>>>>> the 3.5
>>>>>>>>>>>>>>>>>>>>> branch.
>>>>>>>>>>>>>>>>>>>>> ZK-1818 is something which was found in 3.4 and
>>> fixed in
>>>>>>> 3.4,
>>>>>>>>>>>>> but
>>>>>>>>>>>>>>>>>>>> never has
>>>>>>>>>>>>>>>>>>>>> been fixed in 3.5. Quite a serious issue if still
>>>>> present.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> I think we should at least run some manual testing
>>> and
>>>>> see
>>>>>>> if
>>>>>>>>>> we
>>>>>>>>>>>>>>>>>>>> could
>>>>>>>>>>>>>>>>>>>>> repro any of these issues before going ahead with
>> a
>>>>> stable
>>>>>>>>>>>>>> release.
>>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>> Andor
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> On Fri, Sep 7, 2018 at 3:24 AM, Michael Han <
>>>>>>> hanm@apache.org>
>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>> I haven't gone through the entire list, but it looks like lots of the JIRA
>>>>>>>>>>>>>>>>>>>>>> issues listed in this thread, such as ZOOKEEPER-1549 and 2846, also affect
>>>>>>>>>>>>>>>>>>>>>> 3.4 releases. Should we scope these issues out?
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> I think historically the single outstanding
>>> blocking
>>>>>> issue
>>>>>>>>>> for a
>>>>>>>>>>>>>>>>>>>> stable 3.5
>>>>>>>>>>>>>>>>>>>>>> release is the reconfig feature and security
>>> concerns
>>>>>>> around
>>>>>>>>>> it
>>>>>>>>>>>>>>>>>>>> (somehow
>>>>>>>>>>>>>>>>>>>>>> addressed in ZOOKEEPER-2014), and the alpha and
>>> beta
>>>>>>> releases
>>>>>>>>>>>>> were
>>>>>>>>>>>>>>>>>>>> created
>>>>>>>>>>>>>>>>>>>>>> to stabilize that feature.
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> http://zookeeper-user.578899.n2.nabble.com/Zookeeper-with-SSL-release-date-tt7581744.html
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> So it looks like we are in good shape to release. Some things that might be
>>>>>>>>>>>>>>>>>>>>>> worth doing to claim the quality of 3.5 is on par with 3.4:
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> * Run Jepsen on 3.5 - 3.4 passed the test for the record:
>>>>>>>>>>>>>>>>>>>>>> https://aphyr.com/posts/291-jepsen-zookeeper
>>>>>>>>>>>>>>>>>>>>>> * Fix all flaky tests on 3.5 - 3.4 has few or no flaky tests at all.
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> On Tue, Sep 4, 2018 at 1:48 AM, Andor Molnar
>>>>>>>>>>>>>>>>>>>> <an...@cloudera.com.invalid>
>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> Thanks Maoling! That would be huge help, I
>>> appreciate
>>>>>> it.
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> Andor
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>> 
>> 


Re: ZooKeeper 3.5 blocker issues

Posted by Norbert Kalmar <nk...@cloudera.com.INVALID>.
Thank you Enrico. I agree that we could commit this patch in its current
state; it fulfills the original Jira anyway.

I'll see what's wrong with the Java tests, but honestly, it looks like
they're just flaky... they run well on local builds with 8 threads.

Regards,
Norbert

On Wed, Dec 19, 2018 at 2:50 PM Tamas Penzes <ta...@cloudera.com.invalid>
wrote:

> Hi All,
>
> For the assembly task I would promote the way HBase does it.
> They create a pure source and a bin tarball separately. Please see how they
> create a release here:
> https://github.com/apache/hbase/blob/master/dev-support/make_rc.sh
> We could probably use the well known "copy+paste technology" to have it
> within ZooKeeper the same way. ;-)
>
> Regards, Tamaas
>
> On Wed, Dec 19, 2018 at 2:28 PM Enrico Olivelli <eo...@gmail.com>
> wrote:
>
> > Great work Norbert
> > If you want I can help, especially with rat, findbugs (we need to switch to
> > spotbugs anyway) and the OWASP stuff (I recently started using the Maven
> > plugin in other projects).
> > But I am not sure how I can help you concretely if we do not commit your
> > work.
> > We could commit the work as it is now, leaving "ant" as official build
> > method, but having the poms committed will ease collaboration.
> >
> > We will also have to work on CI jobs, I can help on that part as well
> >
> > Enrico
> >
> > Il giorno mer 19 dic 2018 alle ore 12:26 Norbert Kalmar
> > <nk...@cloudera.com.invalid> ha scritto:
> > >
> > > Hi everyone,
> > >
> > > Some update on the maven migration: I had a few bumps here and there (just
> > > looking at the latest patch Andor linked -
> > > https://github.com/apache/zookeeper/pull/708 - you can see it in the commits).
> > > The current state is that the build works and tests run, but reports like
> > > findbugs, clover etc. are not yet implemented. Maven usually has plugins for
> > > them, but it's not always trivial, especially with the C client. The
> > > assembly is also left to be done, but it should be fairly easy to produce a
> > > tarball similar to what ant does (although this will also be an interesting
> > > task, as ant does some strange things, like duplicating the sources of most
> > > contrib projects).
> > >
> > > I had a separate jira to do the recipes and contrib maven build. I do not
> > > have an open PR for it, but recipes is done and I am now working on the
> > > contrib projects. Most of them are built manually and never get called from
> > > the main build.xml. I will not integrate these into the maven build either.
> > > The reason is that there are plans to remove some of them from the ZK repo
> > > anyway. The other reason is that, for starters, we want to replicate the ant
> > > build as closely as possible, without doing any nasty workarounds in maven
> > > to achieve that. From there, once it is stable and proven to have all the
> > > functionality required for build and release, we can improve and use maven's
> > > advantages to shape the build of ZooKeeper.
> > >
> > > Right now, I am trying to stabilize the build as much as possible. Andor
> > > also fixed some flaky C tests that, for some strange reason, became
> > > extremely flaky with the maven build:
> > > https://github.com/apache/zookeeper/pull/740
> > >
> > > Regards,
> > > Norbert
> > >
> > > On Tue, Dec 18, 2018 at 9:52 AM Andor Molnar
> <andor@cloudera.com.invalid
> > >
> > > wrote:
> > >
> > > > Sure, good point. Let's put it on the list.
> > > >
> > > > Andor
> > > >
> > > >
> > > > On Tue, Dec 18, 2018 at 12:17 AM Patrick Hunt <ph...@apache.org>
> > wrote:
> > > >
> > > > > > Are folks OK to wait on that OWASP issue I documented over the weekend?
> > > > > > afaict we are not affected but it would be good to get another pair of
> > > > > > eyes on it.
> > > > >
> > > > > Patrick
> > > > >
> > > > > On Mon, Dec 17, 2018 at 2:55 PM Andor Molnár <an...@apache.org>
> > wrote:
> > > > >
> > > > > > Hi team,
> > > > > >
> > > > > >
> > > > > > > I'm proud to announce that, thanks to the joint effort from the community,
> > > > > > > the 3.5 blockers list has become empty:
> > > > > >
> > > > > > "project = ZooKeeper AND resolution = Unresolved AND fixVersion =
> > 3.5.5
> > > > > > AND priority in (blocker, critical) ORDER BY priority DESC, key
> > ASC"
> > > > > >
> > > > > >
> > > > > > > Well... almost. All the blocker issues have gone, but we still have the
> > > > > > > Maven migration to complete before the stable release. If you have some
> > > > > > > free cycles, please join us in testing the Maven build on this PR:
> > > > > >
> > > > > > https://github.com/apache/zookeeper/pull/708
> > > > > >
> > > > > > I hope we can merge it pretty soon.
> > > > > >
> > > > > >
> > > > > > In terms of the builds, the weather at 3.5 branch is quite sunny
> > > > > nowadays:
> > > > > >
> > > > > > https://builds.apache.org/view/S-Z/view/ZooKeeper/
> > > > > >
> > > > > > The Java 11 build is still having some difficulties, which
> > hopefully I
> > > > > > can address before the holidays:
> > > > > >
> > > > > > https://issues.apache.org/jira/browse/ZOOKEEPER-3204
> > > > > >
> > > > > >
> > > > > > If you happen to know about something which is important from
> 3.5's
> > > > > > perspective and missing from the above, please don't hesitate to
> > share.
> > > > > >
> > > > > >
> > > > > > Happy ZooKeeping!
> > > > > >
> > > > > > Andor
> > > > > >
> > > > > >
> > > > > >
> > > > > > On 11/2/18 21:12, Fangmin Lv wrote:
> > > > > > > Andor,
> > > > > > >
> > > > > > > Here is the PR to port ZK-3104 from master to 3.4:
> > > > > > > https://github.com/apache/zookeeper/pull/685.
> > > > > > >
> > > > > > > Fangmin
> > > > > > >
> > > > > > > On Fri, Nov 2, 2018 at 11:46 AM Fangmin Lv <
> lvfangmin@gmail.com>
> > > > > wrote:
> > > > > > >
> > > > > > >> Hi Andor,
> > > > > > >>
> > > > > > >> Is anyone working on ZK-2778? I can pick it up if there is no
> > one
> > > > > > working
> > > > > > >> on it yet.
> > > > > > >>
> > > > > > >> I'll open a 3.5 PR for ZK-3104 today.
> > > > > > >>
> > > > > > >> Fangmin
> > > > > > >>
> > > > > > >> On Fri, Oct 26, 2018 at 3:33 AM Andor Molnar <
> andor@apache.org>
> > > > > wrote:
> > > > > > >>
> > > > > > >>> Hi folks,
> > > > > > >>>
> > > > > > >>> You’ve probably realised lots of update emails coming from
> > Jira.
> > > > > Please
> > > > > > >>> be aware that we’ve updated a bunch of open blocker/critical
> > 3.5
> > > > > > tickets to
> > > > > > >>> reflect to what we discussed in this email.
> > > > > > >>>
> > > > > > >>> If you open up the following jira filter:
> > > > > > >>>
> > > > > > >>> project = ZooKeeper and resolution = Unresolved and
> fixVersion
> > =
> > > > > 3.5.5
> > > > > > >>> AND priority in (blocker, critical) ORDER BY priority DESC,
> > key ASC
> > > > > > >>>
> > > > > > >>> You’ll see the most up-to-date list of tickets which need to
> be
> > > > > > addressed
> > > > > > >>> before the stable 3.5 release.
> > > > > > >>>
> > > > > > >>> Thank you for your efforts to get this done.
> > > > > > >>>
> > > > > > >>> Fangmin, ZK-3104 is waiting for backport, but ticket has
> > already
> > > > been
> > > > > > >>> resolved. Have you created a separate ticket for the backport
> > or
> > > > > shall
> > > > > > I
> > > > > > >>> just reopen it with the right fix versions?
> > > > > > >>>
> > > > > > >>> Thanks,
> > > > > > >>> Andor
> > > > > > >>>
> > > > > > >>>
> > > > > > >>>
> > > > > > >>>> On 2018. Oct 8., at 12:34, Andor Molnar <an...@apache.org>
> > wrote:
> > > > > > >>>>
> > > > > > >>>> Hi,
> > > > > > >>>>
> > > > > > >>>> Let me summarize and give a quick update on the outstanding
> > issues
> > > > > for
> > > > > > >>> 3.5 GA:
> > > > > > >>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
> > > > > > >>>> - ZOOKEEPER-2778 (Potential server deadlock between follower
> > sync
> > > > > with
> > > > > > >>> leader and follower receiving external connection requests.)
> > > > > > >>>> - ZOOKEEPER-3021 Migrate project structure to Maven
> (ongoing)
> > > > > > >>>> - ZOOKEEPER-925 Docs generation to Maven
> > > > > > >>>> - ZOOKEEPER-3104 (waiting for backport)
> > > > > > >>>> - ZOOKEEPER-3125 (waiting for backport PR #647)
> > > > > > >>>>
> > > > > > >>>> The 2 Maven related tickets are no-brainers as well as the
> > > > > backports.
> > > > > > >>> ZK-2778 has been picked up by Maoling (thanks!) as far as I
> can
> > > > see,
> > > > > > >>> ZK-1818 is the only one waiting for a volunteer.
> > > > > > >>>> Please correct me if I’ve missed something.
> > > > > > >>>>
> > > > > > >>>> Regards,
> > > > > > >>>> Andor
> > > > > > >>>>
> > > > > > >>>>
> > > > > > >>>>
> > > > > > >>>>
> > > > > > >>>>> On 2018. Sep 28., at 18:32, Tamas Penzes
> > > > > <tamaas@cloudera.com.INVALID
> > > > > > >
> > > > > > >>> wrote:
> > > > > > >>>>> Hi All,
> > > > > > >>>>>
> > > > > > >>>>> I would add ZOOKEEPER-3021
> > > > > > >>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-3021>
> > Migrate
> > > > > > project
> > > > > > >>>>> structure to Maven build as a blocker too. Since the
> > migration
> > > > has
> > > > > > >>> started
> > > > > > >>>>> it would be good to finish before releasing ZK 3.5.x GA.
> > > > > > >>>>>
> > > > > > >>>>> ZOOKEEPER-925 <
> > > > https://issues.apache.org/jira/browse/ZOOKEEPER-925
> > > > > >
> > > > > > >>> replace
> > > > > > >>>>> our forrest site and documentation generation might also
> be a
> > > > good
> > > > > > >>> idea,
> > > > > > >>>>> since then we could deliver the new MarkDown based
> > documentation.
> > > > > > >>>>>
> > > > > > >>>>> Regards, Tamaas
> > > > > > >>>>>
> > > > > > >>>>> On Fri, Sep 14, 2018 at 10:09 AM Fangmin Lv <
> > lvfangmin@gmail.com
> > > > >
> > > > > > >>> wrote:
> > > > > > >>>>>> Oh, sorry for the confusion, I should provide more
> context.
> > > > > > >>>>>>
> > > > > > > >>>>>> The leader will use on-disk txn logs to sync with followers if the peer
> > > > > > > >>>>>> zxid is not in its in-memory commit log. The code is here: Leader on-disk
> > > > > > > >>>>>> txn sync
> > > > > > > >>>>>> <https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/quorum/LearnerHandler.java#L774>.
> > > > > > > >>>>>> There is a bug where there can potentially be a gap in the txn files,
> > > > > > > >>>>>> e.g. after a snap sync, so it's possible the peer will miss txns due to
> > > > > > > >>>>>> this. The option to disable it is snapshotSizeFactor
> > > > > > > >>>>>> <https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/ZKDatabase.java#L81>;
> > > > > > > >>>>>> setting it to -1 will disable this feature. On 3.5, it's better to have a
> > > > > > > >>>>>> PR to set this to -1 by default. It might cause more SNAP syncs, but from
> > > > > > > >>>>>> our prod experience that doesn't seem to be a big problem to me.
> > > > > > >>>>>>
> > > > > > >>>>>> I can send out the diff to disable it by default on 3.5 if
> > you
> > > > > guys
> > > > > > >>> think
> > > > > > >>>>>> this is the right way to do.
> > > > > > >>>>>>
> > > > > > >>>>>> Thanks,
> > > > > > >>>>>> Fangmin
> > > > > > >>>>>>
> > > > > > >>>>>> On Thu, Sep 13, 2018 at 1:58 AM Andor Molnar <
> > andor@apache.org>
> > > > > > >>> wrote:
> > > > > > >>>>>>> What’s needed to turn it off?
> > > > > > >>>>>>> Do we need a PR or it’s just a config option?
> > > > > > >>>>>>> Shall we implement a feature switch for that and turn it
> > off by
> > > > > > >>> default?
> > > > > > >>>>>>> Sorry I don’t have too much insight on disk txn sync.
> > > > > > >>>>>>>
> > > > > > >>>>>>> Andor
> > > > > > >>>>>>>
> > > > > > >>>>>>>
> > > > > > >>>>>>>
> > > > > > >>>>>>>> On 2018. Sep 13., at 9:16, Fangmin Lv <
> > lvfangmin@gmail.com>
> > > > > > wrote:
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> And to be clear, ZOOKEEPER-2418 is actually just one
> case
> > of
> > > > > > >>>>>>> inconsistency
> > > > > > >>>>>>>> which could caused by on disk txn sync, as I mentioned
> in
> > a
> > > > > newer
> > > > > > >>> JIRA
> > > > > > >>>>>>>> ZOOKEEPER-2846 <
> > > > > > >>> https://issues.apache.org/jira/browse/ZOOKEEPER-2846>,
> > > > > > >>>>>>> the
> > > > > > >>>>>>>> snap sync or txn sync could also leave txns gap in the
> txn
> > > > file,
> > > > > > >>> which
> > > > > > >>>>>>> is a
> > > > > > >>>>>>>> more common case could trigger this issue.
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> I would suggest to turn off the on disk txn sync by
> > default
> > > > for
> > > > > > now
> > > > > > >>> to
> > > > > > >>>>>>>> avoid this issue, after we finished ZOOKEEPER-3114, we
> > can use
> > > > > > that
> > > > > > >>> to
> > > > > > >>>>>>>> validate the on disk txns during syncing.
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> Thanks,
> > > > > > >>>>>>>> Fangmin
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> On Wed, Sep 12, 2018 at 9:55 AM Fangmin Lv <
> > > > lvfangmin@gmail.com
> > > > > >
> > > > > > >>>>>> wrote:
> > > > > > >>>>>>>>> Andor,
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>> ZOOKEEPER-3114 is about adding real time digest
> checking
> > to
> > > > > help
> > > > > > >>>>>>> detecting
> > > > > > >>>>>>>>> inconsistency, it's a new feature with amounts of code
> > > > change.
> > > > > > I'll
> > > > > > >>>>>>> start
> > > > > > >>>>>>>>> upstream it part by part, but I don't expect it's being
> > > > merged
> > > > > in
> > > > > > >>> the
> > > > > > >>>>>>> next
> > > > > > >>>>>>>>> few weeks. So yes, it's a nice to have, but definitely
> > not a
> > > > > > block
> > > > > > >>> for
> > > > > > >>>>>>> 3.5.
> > > > > > >>>>>>>>> Thanks,
> > > > > > >>>>>>>>> Fangmin
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>> On Wed, Sep 12, 2018 at 2:55 AM Andor Molnar <
> > > > andor@apache.org
> > > > > >
> > > > > > >>>>>> wrote:
> > > > > > >>>>>>>>>> Fangmin,
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> Sorry, I just noticed that you want to include the
> > > > consistency
> > > > > > >>> fixes
> > > > > > >>>>>> in
> > > > > > >>>>>>>>>> the stable version which is fine. Let’s finish the
> > backports
> > > > > and
> > > > > > >>>>>> we’ll
> > > > > > >>>>>>> be
> > > > > > >>>>>>>>>> done with them.
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> ZOOKEEPER-3114 is essentially a new feature, I
> wouldn’t
> > > > block
> > > > > > 3.5
> > > > > > >>>>>> with
> > > > > > >>>>>>>>>> that. What do you think?
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> Andor
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>>> On 2018. Sep 12., at 11:52, Andor Molnar <
> > andor@apache.org
> > > > >
> > > > > > >>> wrote:
> > > > > > >>>>>>>>>>> Cool, thanks for the clarification.
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>> The updated list is as follows:
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>> - ZOOKEEPER-236 (SSL/TLS support for Atomic Broadcast
> > > > > protocol)
> > > > > > >>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
> > > > > > >>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock between
> > > > follower
> > > > > > sync
> > > > > > >>>>>> with
> > > > > > >>>>>>>>>> leader and follower receiving external connection
> > requests.)
> > > > > > >>>>>>>>>>> The following are not critical and no blockers for
> the
> > > > stable
> > > > > > >>>>>> release:
> > > > > > >>>>>>>>>>> Waiting for to be ported to 3.5:
> > > > > > >>>>>>>>>>> - ZOOKEEPER-3104
> > > > > > >>>>>>>>>>> - ZOOKEEPER-3125
> > > > > > >>>>>>>>>>> - ZOOKEEPER-3127
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>> New feature:
> > > > > > >>>>>>>>>>> - ZOOKEEPER-3114 (fixes ZOOKEEPER-2184 too)
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>> Regards,
> > > > > > >>>>>>>>>>> Andor
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>>> On 2018. Sep 12., at 0:42, Fangmin Lv <
> > > > lvfangmin@gmail.com>
> > > > > > >>> wrote:
> > > > > > >>>>>>>>>>>> Hi Andor,
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>>> That's the on disk txn feature, which was disabled
> > > > > internally
> > > > > > >>> after
> > > > > > >>>>>>> we
> > > > > > >>>>>>>>>>>> found the potentially inconsistent issue. The only
> > > > solution
> > > > > we
> > > > > > >>> have
> > > > > > >>>>>>>>>> for now
> > > > > > >>>>>>>>>>>> is waiting for the new digest checking feature I
> > mentioned
> > > > > in
> > > > > > >>>>>>>>>>>> ZOOKEEPER-3114.
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>>> I think there are some other critical consistent
> > issues we
> > > > > > just
> > > > > > >>>>>> fixed
> > > > > > >>>>>>>>>> on
> > > > > > >>>>>>>>>>>> master recently: ZOOKEEPER-3104, ZOOKEEPER-3125,
> > > > > > >>> ZOOKEEPER-3127, I
> > > > > > >>>>>>>>>> think we
> > > > > > >>>>>>>>>>>> should include that in the official 3.5 release as
> > well.
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>>> Thanks,
> > > > > > >>>>>>>>>>>> Fangmin
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>>> On Tue, Sep 11, 2018 at 11:58 AM Andor Molnár <
> > > > > > andor@apache.org
> > > > > > >>>>>>>>>> wrote:
> > > > > > >>>>>>>>>>>>> Hi Jeelani,
> > > > > > >>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>> Thanks for letting me know. I'm happy to remove it
> > from
> > > > the
> > > > > > >>> list
> > > > > > >>>>>> to
> > > > > > >>>>>>>>>> get
> > > > > > >>>>>>>>>>>>> closer to a stable release. :)
> > > > > > >>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>> What's the feature which can be disabled to avoid
> > data
> > > > > > >>>>>>> inconsistency?
> > > > > > >>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>> Andor
> > > > > > >>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>> On 09/10/2018 11:33 PM, Mohamed Jeelani wrote:
> > > > > > >>>>>>>>>>>>>> Thanks Andor for compiling this. Should we be
> > ignoring
> > > > > > >>>>>>>>>> ZOOKEEPER-2418 as
> > > > > > >>>>>>>>>>>>> well? This exists in 3.4 as well and the feature
> can
> > be
> > > > > > >>> disabled.
> > > > > > >>>>>> We
> > > > > > >>>>>>>>>> are
> > > > > > >>>>>>>>>>>>> working on a longer term fix for it in 3.6.
> > > > > > >>>>>>>>>>>>>> Regards,
> > > > > > >>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>> Jeelani
> > > > > > >>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>> On 9/10/18, 5:19 AM, "Andor Molnar"
> > > > > > >>> <andor@cloudera.com.INVALID
> > > > > > >>>>>>>>>> wrote:
> > > > > > >>>>>>>>>>>>>> Fine.
> > > > > > >>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>> I'm happy to ignore 1549, 2846 and 2930. Still we
> > have
> > > > the
> > > > > > >>> list
> > > > > > >>>>>>> of:
> > > > > > >>>>>>>>>>>>>> - ZOOKEEPER-236 (SSL/TLS support for Atomic
> > Broadcast
> > > > > > >>> protocol)
> > > > > > >>>>>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
> > > > > > >>>>>>>>>>>>>> - ZOOKEEPER-2418 (txnlog diff sync can skip
> sending
> > some
> > > > > > >>>>>>>>>>>>> transactions to
> > > > > > >>>>>>>>>>>>>> followers)
> > > > > > >>>>>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock
> between
> > > > > follower
> > > > > > >>>>>> sync
> > > > > > >>>>>>>>>>>>> with
> > > > > > >>>>>>>>>>>>>> leader and follower receiving external connection
> > > > > requests.)
> > > > > > >>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>> SSL (ZK-236) is a feature which essential for the
> > 3.5
> > > > > > release,
> > > > > > >>>>>>>>>> hence
> > > > > > >>>>>>>>>>>>> I
> > > > > > >>>>>>>>>>>>>> wouldn't leave it out or postpone it for the next
> > stable
> > > > > > >>>>>> release.
> > > > > > >>>>>>>>>> PR
> > > > > > >>>>>>>>>>>>> has
> > > > > > >>>>>>>>>>>>>> been out for a long time, get on reviewing please.
> > > > > > >>>>>>>>>>>>>> The rest are also long outstanding issues which
> have
> > > > been
> > > > > > >>> found
> > > > > > >>>>>> in
> > > > > > >>>>>>>>>>>>> the 3.5
> > > > > > >>>>>>>>>>>>>> branch.
> > > > > > >>>>>>>>>>>>>> ZK-1818 is something which was found in 3.4 and
> > fixed in
> > > > > > 3.4,
> > > > > > >>>>>> but
> > > > > > >>>>>>>>>>>>> never has
> > > > > > >>>>>>>>>>>>>> been fixed in 3.5. Quite a serious issue if still
> > > > present.
> > > > > > >>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>> I think we should at least run some manual testing
> > and
> > > > see
> > > > > > if
> > > > > > >>> we
> > > > > > >>>>>>>>>>>>> could
> > > > > > >>>>>>>>>>>>>> repro any of these issues before going ahead with
> a
> > > > stable
> > > > > > >>>>>>> release.
> > > > > > >>>>>>>>>>>>>> Regards,
> > > > > > >>>>>>>>>>>>>> Andor
> > > > > > >>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>> On Fri, Sep 7, 2018 at 3:24 AM, Michael Han <
> > > > > > hanm@apache.org>
> > > > > > >>>>>>>>>> wrote:
> > > > > > >>>>>>>>>>>>>>> I haven't went through the entire list, but looks
> > like
> > > > > lots
> > > > > > >>> of
> > > > > > >>>>>> the
> > > > > > >>>>>>>>>>>>> JIRA
> > > > > > >>>>>>>>>>>>>>> issues listed in this thread, such as
> > ZOOKEEPER-1549,
> > > > > 2846,
> > > > > > >>> also
> > > > > > >>>>>>>>>>>>> affects
> > > > > > >>>>>>>>>>>>>>> 3.4 releases. Should we scope these issues out?
> > > > > > >>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>> I think historically the single outstanding
> > blocking
> > > > > issue
> > > > > > >>> for a
> > > > > > >>>>>>>>>>>>> stable 3.5
> > > > > > >>>>>>>>>>>>>>> release is the reconfig feature and security
> > concerns
> > > > > > around
> > > > > > >>> it
> > > > > > >>>>>>>>>>>>> (somehow
> > > > > > >>>>>>>>>>>>>>> addressed in ZOOKEEPER-2014), and the alpha and
> > beta
> > > > > > releases
> > > > > > >>>>>> were
> > > > > > >>>>>>>>>>>>> created
> > > > > > >>>>>>>>>>>>>>> to stabilize that feature.
> > > > > > >>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>> http://zookeeper-user.578899.n2.nabble.com/Zookeeper-with-SSL-release-date-tt7581744.html
> > > > > > >>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>> So it looks like we are in good shape to release.
> > > > > Something
> > > > > > >>>>>> might
> > > > > > >>>>>>>>>>>>> worth
> > > > > > >>>>>>>>>>>>>>> doing to claim the quality of 3.5 is on par with
> > 3.4
> > > > > > >>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>> * Run Jepsen on 3.5 - 3.4 passed the test for the
> > > > record
> > > > > > >>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>> https://aphyr.com/posts/291-jepsen-zookeeper
> > > > > > >>>>>>>>>>>>>>> * Fix all flaky tests on 3.5 - 3.4 has little or
> no
> > > > flaky
> > > > > > >>> tests
> > > > > > >>>>>> at
> > > > > > >>>>>>>>>>>>> all.
> > > > > > >>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>> On Tue, Sep 4, 2018 at 1:48 AM, Andor Molnar
> > > > > > >>>>>>>>>>>>> <an...@cloudera.com.invalid>
> > > > > > >>>>>>>>>>>>>>> wrote:
> > > > > > >>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>> Thanks Maoling! That would be huge help, I
> > appreciate
> > > > > it.
> > > > > > >>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>> Andor
> > > > > > >>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>
> > > > > > >>>
> > > > > >
> > > > >
> > > >
> >
>

Re: ZooKeeper 3.5 blocker issues

Posted by Tamas Penzes <ta...@cloudera.com.INVALID>.
Hi All,

For the assembly task, I would suggest following the way HBase does it.
They create a pure source tarball and a bin tarball separately. Please see how they
create a release here:
https://github.com/apache/hbase/blob/master/dev-support/make_rc.sh
We could probably use the well-known "copy+paste technology" to do it
the same way within ZooKeeper. ;-)

Regards, Tamaas

On Wed, Dec 19, 2018 at 2:28 PM Enrico Olivelli <eo...@gmail.com> wrote:

> Great work Norbert
> If you want I can help, especially with rat, findbugs (we need to switch to
> spotbugs anyway) and the OWASP stuff (I recently started using the Maven
> plugin in other projects).
> But I am not sure how I can help you concretely if we do not commit your
> work.
> We could commit the work as it is now, leaving "ant" as the official build
> method, but having the poms committed will ease collaboration.
>
> We will also have to work on the CI jobs; I can help with that part as well.
>
> Enrico
>
> On Wed, Dec 19, 2018 at 12:26 PM Norbert Kalmar
> <nk...@cloudera.com.invalid> wrote:
> >
> > Hi everyone,
> >
> > Some updates on the Maven migration: I hit a few bumps here and there (just
> > look at the commits on the latest patch Andor linked -
> > https://github.com/apache/zookeeper/pull/708).
> > The current state is that the build works and the tests run, but reports like
> > findbugs, clover etc. are not yet implemented. Maven usually has plugins for
> > them, but it's not always trivial, especially with the C client. The
> > assembly is also left to be done, but it should be fairly easy to produce a
> > tarball similar to the one ant builds (although this will also be an
> > interesting task, as ant does some strange things, like duplicating the
> > sources of most contrib projects).
> >
> > I had a separate jira for the recipes and contrib Maven build. I do not
> > have an open PR for it, but recipes is done and I am now working on the
> > contrib projects. Most of them are built manually and never get called from
> > the main build.xml. I will not integrate these into the Maven build either.
> > One reason is that there are plans to remove some of them from the ZK repo
> > anyway. The other reason is that, for starters, we want to replicate the ant
> > build as closely as possible, without any nasty workarounds in Maven
> > to achieve that. From there, once the build is stable and proven to have all
> > the functionality required for build and release, we can improve it and use
> > Maven's advantages to shape the build of ZooKeeper.
> >
> > Right now, I am trying to stabilize the build as much as possible. Andor
> > also fixed some flaky C tests that, for some strange reason, became
> > extremely flaky with the Maven build:
> > https://github.com/apache/zookeeper/pull/740
> >
> > Regards,
> > Norbert
> >
> > On Tue, Dec 18, 2018 at 9:52 AM Andor Molnar <andor@cloudera.com.invalid>
> > wrote:
> >
> > > Sure, good point. Let's put it on the list.
> > >
> > > Andor
> > >
> > >
> > > On Tue, Dec 18, 2018 at 12:17 AM Patrick Hunt <ph...@apache.org>
> wrote:
> > >
> > > > Are folks OK to wait on that OWASP issue I documented over the weekend?
> > > > afaict we are not affected but it would be good to get another pair of
> > > > eyes on it.
> > > >
> > > > Patrick
> > > >
> > > > On Mon, Dec 17, 2018 at 2:55 PM Andor Molnár <an...@apache.org>
> wrote:
> > > >
> > > > > Hi team,
> > > > >
> > > > >
> > > > > I'm proud to announce that, thanks to the joint effort of the
> > > > > community, the 3.5 blockers list has become empty:
> > > > >
> > > > > "project = ZooKeeper AND resolution = Unresolved AND fixVersion = 3.5.5
> > > > > AND priority in (blocker, critical) ORDER BY priority DESC, key ASC"
> > > > >
> > > > >
> > > > > Well... almost. All the blocker issues are gone, but we still have the
> > > > > Maven migration to complete before the stable release. If you have some
> > > > > free cycles, please join us in testing the Maven build on this PR:
> > > > >
> > > > > https://github.com/apache/zookeeper/pull/708
> > > > >
> > > > > I hope we can merge it pretty soon.
> > > > >
> > > > >
> > > > > In terms of the builds, the weather on the 3.5 branch is quite sunny
> > > > > nowadays:
> > > > >
> > > > > https://builds.apache.org/view/S-Z/view/ZooKeeper/
> > > > >
> > > > > The Java 11 build is still having some difficulties, which hopefully I
> > > > > can address before the holidays:
> > > > >
> > > > > https://issues.apache.org/jira/browse/ZOOKEEPER-3204
> > > > >
> > > > >
> > > > > If you happen to know about something which is important from 3.5's
> > > > > perspective and missing from the above, please don't hesitate to share.
> > > > >
> > > > >
> > > > > Happy ZooKeeping!
> > > > >
> > > > > Andor
> > > > >
> > > > >
> > > > >
> > > > > On 11/2/18 21:12, Fangmin Lv wrote:
> > > > > > Andor,
> > > > > >
> > > > > > Here is the PR to port ZK-3104 from master to 3.4:
> > > > > > https://github.com/apache/zookeeper/pull/685.
> > > > > >
> > > > > > Fangmin
> > > > > >
> > > > > > On Fri, Nov 2, 2018 at 11:46 AM Fangmin Lv <lv...@gmail.com>
> > > > wrote:
> > > > > >
> > > > > >> Hi Andor,
> > > > > >>
> > > > > >> Is anyone working on ZK-2778? I can pick it up if there is no
> one
> > > > > working
> > > > > >> on it yet.
> > > > > >>
> > > > > >> I'll open a 3.5 PR for ZK-3104 today.
> > > > > >>
> > > > > >> Fangmin
> > > > > >>
> > > > > >> On Fri, Oct 26, 2018 at 3:33 AM Andor Molnar <an...@apache.org>
> > > > wrote:
> > > > > >>
> > > > > >>> Hi folks,
> > > > > >>>
> > > > > >>> You’ve probably realised lots of update emails coming from
> Jira.
> > > > Please
> > > > > >>> be aware that we’ve updated a bunch of open blocker/critical
> 3.5
> > > > > tickets to
> > > > > >>> reflect to what we discussed in this email.
> > > > > >>>
> > > > > >>> If you open up the following jira filter:
> > > > > >>>
> > > > > >>> project = ZooKeeper and resolution = Unresolved and fixVersion
> =
> > > > 3.5.5
> > > > > >>> AND priority in (blocker, critical) ORDER BY priority DESC,
> key ASC
> > > > > >>>
> > > > > >>> You’ll see the most up-to-date list of tickets which need to be
> > > > > addressed
> > > > > >>> before the stable 3.5 release.
> > > > > >>>
> > > > > >>> Thank you for your efforts to get this done.
> > > > > >>>
> > > > > >>> Fangmin, ZK-3104 is waiting for backport, but ticket has
> already
> > > been
> > > > > >>> resolved. Have you created a separate ticket for the backport
> or
> > > > shall
> > > > > I
> > > > > >>> just reopen it with the right fix versions?
> > > > > >>>
> > > > > >>> Thanks,
> > > > > >>> Andor
> > > > > >>>
> > > > > >>>
> > > > > >>>
> > > > > >>>> On 2018. Oct 8., at 12:34, Andor Molnar <an...@apache.org>
> wrote:
> > > > > >>>>
> > > > > >>>> Hi,
> > > > > >>>>
> > > > > >>>> Let me summarize and give a quick update on the outstanding
> issues
> > > > for
> > > > > >>> 3.5 GA:
> > > > > >>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
> > > > > >>>> - ZOOKEEPER-2778 (Potential server deadlock between follower
> sync
> > > > with
> > > > > >>> leader and follower receiving external connection requests.)
> > > > > >>>> - ZOOKEEPER-3021 Migrate project structure to Maven (ongoing)
> > > > > >>>> - ZOOKEEPER-925 Docs generation to Maven
> > > > > >>>> - ZOOKEEPER-3104 (waiting for backport)
> > > > > >>>> - ZOOKEEPER-3125 (waiting for backport PR #647)
> > > > > >>>>
> > > > > >>>> The 2 Maven related tickets are no-brainers as well as the
> > > > backports.
> > > > > >>> ZK-2778 has been picked up by Maoling (thanks!) as far as I can
> > > see,
> > > > > >>> ZK-1818 is the only one waiting for a volunteer.
> > > > > >>>> Please correct me if I’ve missed something.
> > > > > >>>>
> > > > > >>>> Regards,
> > > > > >>>> Andor
> > > > > >>>>
> > > > > >>>>
> > > > > >>>>
> > > > > >>>>
> > > > > >>>>> On 2018. Sep 28., at 18:32, Tamas Penzes
> > > > <tamaas@cloudera.com.INVALID
> > > > > >
> > > > > >>> wrote:
> > > > > >>>>> Hi All,
> > > > > >>>>>
> > > > > >>>>> I would add ZOOKEEPER-3021
> > > > > >>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-3021>
> Migrate
> > > > > project
> > > > > >>>>> structure to Maven build as a blocker too. Since the
> migration
> > > has
> > > > > >>> started
> > > > > >>>>> it would be good to finish before releasing ZK 3.5.x GA.
> > > > > >>>>>
> > > > > >>>>> ZOOKEEPER-925 <
> > > https://issues.apache.org/jira/browse/ZOOKEEPER-925
> > > > >
> > > > > >>> replace
> > > > > >>>>> our forrest site and documentation generation might also be a
> > > good
> > > > > >>> idea,
> > > > > >>>>> since then we could deliver the new MarkDown based
> documentation.
> > > > > >>>>>
> > > > > >>>>> Regards, Tamaas
> > > > > >>>>>
> > > > > >>>>> On Fri, Sep 14, 2018 at 10:09 AM Fangmin Lv <
> lvfangmin@gmail.com
> > > >
> > > > > >>> wrote:
> > > > > >>>>>> Oh, sorry for the confusion, I should provide more context.
> > > > > >>>>>>
> > > > > > >>>>>> The leader uses on-disk txn sync with a follower if the peer zxid is not
> > > > > > >>>>>> in its in-memory commit log; the code is here: Leader on disk txn sync
> > > > > > >>>>>> <https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/quorum/LearnerHandler.java#L774>.
> > > > > > >>>>>> There is a bug where a gap can appear in the txn files, for example
> > > > > > >>>>>> after a snap sync, so the peer may miss txns because of this.
> > > > > > >>>>>> The option to disable it is snapshotSizeFactor
> > > > > > >>>>>> <https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/ZKDatabase.java#L81>;
> > > > > > >>>>>> setting it to -1 disables this feature. On 3.5, it would be better to have
> > > > > > >>>>>> a PR that sets this to -1 by default. That might cause more SNAP syncs, but
> > > > > > >>>>>> judging from our prod it doesn't seem to be a big problem to me.
> > > > > > >>>>>>
> > > > > > >>>>>> I can send out the diff to disable it by default on 3.5 if you guys think
> > > > > > >>>>>> this is the right way to go.
> > > > > >>>>>>
> > > > > >>>>>> Thanks,
> > > > > >>>>>> Fangmin
> > > > > >>>>>>
> > > > > >>>>>> On Thu, Sep 13, 2018 at 1:58 AM Andor Molnar <
> andor@apache.org>
> > > > > >>> wrote:
> > > > > >>>>>>> What’s needed to turn it off?
> > > > > >>>>>>> Do we need a PR or it’s just a config option?
> > > > > >>>>>>> Shall we implement a feature switch for that and turn it
> off by
> > > > > >>> default?
> > > > > >>>>>>> Sorry I don’t have too much insight on disk txn sync.
> > > > > >>>>>>>
> > > > > >>>>>>> Andor
> > > > > >>>>>>>
> > > > > >>>>>>>
> > > > > >>>>>>>
> > > > > >>>>>>>> On 2018. Sep 13., at 9:16, Fangmin Lv <
> lvfangmin@gmail.com>
> > > > > wrote:
> > > > > >>>>>>>>
> > > > > > >>>>>>>> And to be clear, ZOOKEEPER-2418 is actually just one case of
> > > > > > >>>>>>>> inconsistency which could be caused by on-disk txn sync. As I mentioned
> > > > > > >>>>>>>> in a newer JIRA, ZOOKEEPER-2846
> > > > > > >>>>>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-2846>, a snap sync or
> > > > > > >>>>>>>> txn sync could also leave a gap in the txn file, which is a more common
> > > > > > >>>>>>>> case that could trigger this issue.
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> I would suggest turning off on-disk txn sync by default for now to
> > > > > > >>>>>>>> avoid this issue. After we finish ZOOKEEPER-3114, we can use that to
> > > > > > >>>>>>>> validate the on-disk txns during syncing.
> > > > > >>>>>>>>
> > > > > >>>>>>>> Thanks,
> > > > > >>>>>>>> Fangmin
> > > > > >>>>>>>>
> > > > > >>>>>>>> On Wed, Sep 12, 2018 at 9:55 AM Fangmin Lv <
> > > lvfangmin@gmail.com
> > > > >
> > > > > >>>>>> wrote:
> > > > > >>>>>>>>> Andor,
> > > > > >>>>>>>>>
> > > > > > >>>>>>>>> ZOOKEEPER-3114 is about adding real-time digest checking to help
> > > > > > >>>>>>>>> detect inconsistency; it's a new feature with a large amount of code
> > > > > > >>>>>>>>> change. I'll start upstreaming it part by part, but I don't expect it
> > > > > > >>>>>>>>> to be merged in the next few weeks. So yes, it's a nice-to-have, but
> > > > > > >>>>>>>>> definitely not a blocker for 3.5.
> > > > > >>>>>>>>> Thanks,
> > > > > >>>>>>>>> Fangmin
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> On Wed, Sep 12, 2018 at 2:55 AM Andor Molnar <
> > > andor@apache.org
> > > > >
> > > > > >>>>>> wrote:
> > > > > >>>>>>>>>> Fangmin,
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> Sorry, I just noticed that you want to include the
> > > consistency
> > > > > >>> fixes
> > > > > >>>>>> in
> > > > > >>>>>>>>>> the stable version which is fine. Let’s finish the
> backports
> > > > and
> > > > > >>>>>> we’ll
> > > > > >>>>>>> be
> > > > > >>>>>>>>>> done with them.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> ZOOKEEPER-3114 is essentially a new feature, I wouldn’t
> > > block
> > > > > 3.5
> > > > > >>>>>> with
> > > > > >>>>>>>>>> that. What do you think?
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> Andor
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>>> On 2018. Sep 12., at 11:52, Andor Molnar <
> andor@apache.org
> > > >
> > > > > >>> wrote:
> > > > > >>>>>>>>>>> Cool, thanks for the clarification.
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>> The updated list is as follows:
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>> - ZOOKEEPER-236 (SSL/TLS support for Atomic Broadcast
> > > > protocol)
> > > > > >>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
> > > > > >>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock between
> > > follower
> > > > > sync
> > > > > >>>>>> with
> > > > > >>>>>>>>>> leader and follower receiving external connection
> requests.)
> > > > > >>>>>>>>>>> The following are not critical and no blockers for the
> > > stable
> > > > > >>>>>> release:
> > > > > >>>>>>>>>>> Waiting for to be ported to 3.5:
> > > > > >>>>>>>>>>> - ZOOKEEPER-3104
> > > > > >>>>>>>>>>> - ZOOKEEPER-3125
> > > > > >>>>>>>>>>> - ZOOKEEPER-3127
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>> New feature:
> > > > > >>>>>>>>>>> - ZOOKEEPER-3114 (fixes ZOOKEEPER-2184 too)
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>> Regards,
> > > > > >>>>>>>>>>> Andor
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> On 2018. Sep 12., at 0:42, Fangmin Lv <
> > > lvfangmin@gmail.com>
> > > > > >>> wrote:
> > > > > >>>>>>>>>>>> Hi Andor,
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> That's the on disk txn feature, which was disabled
> > > > internally
> > > > > >>> after
> > > > > >>>>>>> we
> > > > > >>>>>>>>>>>> found the potentially inconsistent issue. The only
> > > solution
> > > > we
> > > > > >>> have
> > > > > >>>>>>>>>> for now
> > > > > >>>>>>>>>>>> is waiting for the new digest checking feature I
> mentioned
> > > > in
> > > > > >>>>>>>>>>>> ZOOKEEPER-3114.
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> I think there are some other critical consistent
> issues we
> > > > > just
> > > > > >>>>>> fixed
> > > > > >>>>>>>>>> on
> > > > > >>>>>>>>>>>> master recently: ZOOKEEPER-3104, ZOOKEEPER-3125,
> > > > > >>> ZOOKEEPER-3127, I
> > > > > >>>>>>>>>> think we
> > > > > >>>>>>>>>>>> should include that in the official 3.5 release as
> well.
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> Thanks,
> > > > > >>>>>>>>>>>> Fangmin
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> On Tue, Sep 11, 2018 at 11:58 AM Andor Molnár <
> > > > > andor@apache.org
> > > > > >>>>>>>>>> wrote:
> > > > > >>>>>>>>>>>>> Hi Jeelani,
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>> Thanks for letting me know. I'm happy to remove it
> from
> > > the
> > > > > >>> list
> > > > > >>>>>> to
> > > > > >>>>>>>>>> get
> > > > > >>>>>>>>>>>>> closer to a stable release. :)
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>> What's the feature which can be disabled to avoid
> data
> > > > > >>>>>>> inconsistency?
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>> Andor
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>> On 09/10/2018 11:33 PM, Mohamed Jeelani wrote:
> > > > > >>>>>>>>>>>>>> Thanks Andor for compiling this. Should we be
> ignoring
> > > > > >>>>>>>>>> ZOOKEEPER-2418 as
> > > > > >>>>>>>>>>>>> well? This exists in 3.4 as well and the feature can
> be
> > > > > >>> disabled.
> > > > > >>>>>> We
> > > > > >>>>>>>>>> are
> > > > > >>>>>>>>>>>>> working on a longer term fix for it in 3.6.
> > > > > >>>>>>>>>>>>>> Regards,
> > > > > >>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>> Jeelani
> > > > > >>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>> On 9/10/18, 5:19 AM, "Andor Molnar"
> > > > > >>> <andor@cloudera.com.INVALID
> > > > > >>>>>>>>>> wrote:
> > > > > >>>>>>>>>>>>>> Fine.
> > > > > >>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>> I'm happy to ignore 1549, 2846 and 2930. Still we
> have
> > > the
> > > > > >>> list
> > > > > >>>>>>> of:
> > > > > >>>>>>>>>>>>>> - ZOOKEEPER-236 (SSL/TLS support for Atomic
> Broadcast
> > > > > >>> protocol)
> > > > > >>>>>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
> > > > > >>>>>>>>>>>>>> - ZOOKEEPER-2418 (txnlog diff sync can skip sending
> some
> > > > > >>>>>>>>>>>>> transactions to
> > > > > >>>>>>>>>>>>>> followers)
> > > > > >>>>>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock between
> > > > follower
> > > > > >>>>>> sync
> > > > > >>>>>>>>>>>>> with
> > > > > >>>>>>>>>>>>>> leader and follower receiving external connection
> > > > requests.)
> > > > > >>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>> SSL (ZK-236) is a feature which essential for the
> 3.5
> > > > > release,
> > > > > >>>>>>>>>> hence
> > > > > >>>>>>>>>>>>> I
> > > > > >>>>>>>>>>>>>> wouldn't leave it out or postpone it for the next
> stable
> > > > > >>>>>> release.
> > > > > >>>>>>>>>> PR
> > > > > >>>>>>>>>>>>> has
> > > > > >>>>>>>>>>>>>> been out for a long time, get on reviewing please.
> > > > > >>>>>>>>>>>>>> The rest are also long outstanding issues which have
> > > been
> > > > > >>> found
> > > > > >>>>>> in
> > > > > >>>>>>>>>>>>> the 3.5
> > > > > >>>>>>>>>>>>>> branch.
> > > > > >>>>>>>>>>>>>> ZK-1818 is something which was found in 3.4 and
> fixed in
> > > > > 3.4,
> > > > > >>>>>> but
> > > > > >>>>>>>>>>>>> never has
> > > > > >>>>>>>>>>>>>> been fixed in 3.5. Quite a serious issue if still
> > > present.
> > > > > >>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>> I think we should at least run some manual testing
> and
> > > see
> > > > > if
> > > > > >>> we
> > > > > >>>>>>>>>>>>> could
> > > > > >>>>>>>>>>>>>> repro any of these issues before going ahead with a
> > > stable
> > > > > >>>>>>> release.
> > > > > >>>>>>>>>>>>>> Regards,
> > > > > >>>>>>>>>>>>>> Andor
> > > > > >>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>> On Fri, Sep 7, 2018 at 3:24 AM, Michael Han <
> > > > > hanm@apache.org>
> > > > > >>>>>>>>>> wrote:
> > > > > >>>>>>>>>>>>>>> I haven't went through the entire list, but looks
> like
> > > > lots
> > > > > >>> of
> > > > > >>>>>> the
> > > > > >>>>>>>>>>>>> JIRA
> > > > > >>>>>>>>>>>>>>> issues listed in this thread, such as
> ZOOKEEPER-1549,
> > > > 2846,
> > > > > >>> also
> > > > > >>>>>>>>>>>>> affects
> > > > > >>>>>>>>>>>>>>> 3.4 releases. Should we scope these issues out?
> > > > > >>>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>>> I think historically the single outstanding
> blocking
> > > > issue
> > > > > >>> for a
> > > > > >>>>>>>>>>>>> stable 3.5
> > > > > >>>>>>>>>>>>>>> release is the reconfig feature and security
> concerns
> > > > > around
> > > > > >>> it
> > > > > >>>>>>>>>>>>> (somehow
> > > > > >>>>>>>>>>>>>>> addressed in ZOOKEEPER-2014), and the alpha and
> beta
> > > > > releases
> > > > > >>>>>> were
> > > > > >>>>>>>>>>>>> created
> > > > > >>>>>>>>>>>>>>> to stabilize that feature.
> > > > > >>>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>>> http://zookeeper-user.578899.n2.nabble.com/Zookeeper-with-SSL-release-date-tt7581744.html
> > > > > >>>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>>> So it looks like we are in good shape to release.
> > > > Something
> > > > > >>>>>> might
> > > > > >>>>>>>>>>>>> worth
> > > > > >>>>>>>>>>>>>>> doing to claim the quality of 3.5 is on par with
> 3.4
> > > > > >>>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>>> * Run Jepsen on 3.5 - 3.4 passed the test for the
> > > record
> > > > > >>>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>>> https://aphyr.com/posts/291-jepsen-zookeeper
> > > > > >>>>>>>>>>>>>>> * Fix all flaky tests on 3.5 - 3.4 has little or no
> > > flaky
> > > > > >>> tests
> > > > > >>>>>> at
> > > > > >>>>>>>>>>>>> all.
> > > > > >>>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>>> On Tue, Sep 4, 2018 at 1:48 AM, Andor Molnar
> > > > > >>>>>>>>>>>>> <an...@cloudera.com.invalid>
> > > > > >>>>>>>>>>>>>>> wrote:
> > > > > >>>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>>>> Thanks Maoling! That would be huge help, I
> appreciate
> > > > it.
> > > > > >>>>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>>>> Andor
> > > > > >>>>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>
> > > > > >>>>>>>
> > > > > >>>
> > > > >
> > > >
> > >
>

Re: ZooKeeper 3.5 blocker issues

Posted by Enrico Olivelli <eo...@gmail.com>.
Great work Norbert
If you want I can help, especially with rat, findbugs (we need to switch to
spotbugs anyway) and the OWASP stuff (I recently started using the Maven
plugin in other projects).
But I am not sure how I can help you concretely if we do not commit your work.
We could commit the work as it is now, leaving "ant" as the official build
method, but having the poms committed will ease collaboration.

We will also have to work on the CI jobs; I can help with that part as well.

Enrico

On Wed, Dec 19, 2018 at 12:26 PM Norbert Kalmar
<nk...@cloudera.com.invalid> wrote:
>
> Hi everyone,
>
> Some updates on the Maven migration: I hit a few bumps here and there (just
> look at the commits on the latest patch Andor linked -
> https://github.com/apache/zookeeper/pull/708).
> The current state is that the build works and the tests run, but reports like
> findbugs, clover etc. are not yet implemented. Maven usually has plugins for
> them, but it's not always trivial, especially with the C client. The
> assembly is also left to be done, but it should be fairly easy to produce a
> tarball similar to the one ant builds (although this will also be an
> interesting task, as ant does some strange things, like duplicating the
> sources of most contrib projects).
>
> I had a separate jira for the recipes and contrib Maven build. I do not
> have an open PR for it, but recipes is done and I am now working on the
> contrib projects. Most of them are built manually and never get called from
> the main build.xml. I will not integrate these into the Maven build either.
> One reason is that there are plans to remove some of them from the ZK repo
> anyway. The other reason is that, for starters, we want to replicate the ant
> build as closely as possible, without any nasty workarounds in Maven
> to achieve that. From there, once the build is stable and proven to have all
> the functionality required for build and release, we can improve it and use
> Maven's advantages to shape the build of ZooKeeper.
>
> Right now, I am trying to stabilize the build as much as possible. Andor
> also fixed some flaky C tests that, for some strange reason, became
> extremely flaky with the Maven build:
> https://github.com/apache/zookeeper/pull/740
>
> Regards,
> Norbert
>
> On Tue, Dec 18, 2018 at 9:52 AM Andor Molnar <an...@cloudera.com.invalid>
> wrote:
>
> > Sure, good point. Let's put it on the list.
> >
> > Andor
> >
> >
> > On Tue, Dec 18, 2018 at 12:17 AM Patrick Hunt <ph...@apache.org> wrote:
> >
> > > Are folks OK to wait on that OWASP issue I documented over the weekend?
> > > afaict we are not affected but it would be good to get another pair of
> > > eyes on it.
> > >
> > > Patrick
> > >
> > > On Mon, Dec 17, 2018 at 2:55 PM Andor Molnár <an...@apache.org> wrote:
> > >
> > > > Hi team,
> > > >
> > > >
> > > > I'm proud to announce that, thanks to the joint effort of the
> > > > community, the 3.5 blockers list has become empty:
> > > >
> > > > "project = ZooKeeper AND resolution = Unresolved AND fixVersion = 3.5.5
> > > > AND priority in (blocker, critical) ORDER BY priority DESC, key ASC"
> > > >
> > > >
> > > > Well... almost. All the blocker issues are gone, but we still have the
> > > > Maven migration to complete before the stable release. If you have some
> > > > free cycles, please join us in testing the Maven build on this PR:
> > > >
> > > > https://github.com/apache/zookeeper/pull/708
> > > >
> > > > I hope we can merge it pretty soon.
> > > >
> > > >
> > > > In terms of the builds, the weather on the 3.5 branch is quite sunny
> > > > nowadays:
> > > >
> > > > https://builds.apache.org/view/S-Z/view/ZooKeeper/
> > > >
> > > > The Java 11 build is still having some difficulties, which hopefully I
> > > > can address before the holidays:
> > > >
> > > > https://issues.apache.org/jira/browse/ZOOKEEPER-3204
> > > >
> > > >
> > > > If you happen to know about something which is important from 3.5's
> > > > perspective and missing from the above, please don't hesitate to share.
> > > >
> > > >
> > > > Happy ZooKeeping!
> > > >
> > > > Andor
> > > >
> > > >
> > > >
> > > > On 11/2/18 21:12, Fangmin Lv wrote:
> > > > > Andor,
> > > > >
> > > > > Here is the PR to port ZK-3104 from master to 3.4:
> > > > > https://github.com/apache/zookeeper/pull/685.
> > > > >
> > > > > Fangmin
> > > > >
> > > > > On Fri, Nov 2, 2018 at 11:46 AM Fangmin Lv <lv...@gmail.com>
> > > wrote:
> > > > >
> > > > >> Hi Andor,
> > > > >>
> > > > >> Is anyone working on ZK-2778? I can pick it up if there is no one
> > > > working
> > > > >> on it yet.
> > > > >>
> > > > >> I'll open a 3.5 PR for ZK-3104 today.
> > > > >>
> > > > >> Fangmin
> > > > >>
> > > > >> On Fri, Oct 26, 2018 at 3:33 AM Andor Molnar <an...@apache.org>
> > > wrote:
> > > > >>
> > > > >>> Hi folks,
> > > > >>>
> > > > >>> You’ve probably realised lots of update emails coming from Jira.
> > > Please
> > > > >>> be aware that we’ve updated a bunch of open blocker/critical 3.5
> > > > tickets to
> > > > >>> reflect to what we discussed in this email.
> > > > >>>
> > > > >>> If you open up the following jira filter:
> > > > >>>
> > > > >>> project = ZooKeeper and resolution = Unresolved and fixVersion =
> > > 3.5.5
> > > > >>> AND priority in (blocker, critical) ORDER BY priority DESC, key ASC
> > > > >>>
> > > > >>> You’ll see the most up-to-date list of tickets which need to be
> > > > addressed
> > > > >>> before the stable 3.5 release.
> > > > >>>
> > > > >>> Thank you for your efforts to get this done.
> > > > >>>
> > > > >>> Fangmin, ZK-3104 is waiting for backport, but ticket has already
> > been
> > > > >>> resolved. Have you created a separate ticket for the backport or
> > > shall
> > > > I
> > > > >>> just reopen it with the right fix versions?
> > > > >>>
> > > > >>> Thanks,
> > > > >>> Andor
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>> On 2018. Oct 8., at 12:34, Andor Molnar <an...@apache.org> wrote:
> > > > >>>>
> > > > >>>> Hi,
> > > > >>>>
> > > > >>>> Let me summarize and give a quick update on the outstanding issues
> > > for
> > > > >>> 3.5 GA:
> > > > >>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
> > > > >>>> - ZOOKEEPER-2778 (Potential server deadlock between follower sync
> > > with
> > > > >>> leader and follower receiving external connection requests.)
> > > > >>>> - ZOOKEEPER-3021 Migrate project structure to Maven (ongoing)
> > > > >>>> - ZOOKEEPER-925 Docs generation to Maven
> > > > >>>> - ZOOKEEPER-3104 (waiting for backport)
> > > > >>>> - ZOOKEEPER-3125 (waiting for backport PR #647)
> > > > >>>>
> > > > >>>> The 2 Maven related tickets are no-brainers as well as the
> > > backports.
> > > > >>> ZK-2778 has been picked up by Maoling (thanks!) as far as I can
> > see,
> > > > >>> ZK-1818 is the only one waiting for a volunteer.
> > > > >>>> Please correct me if I’ve missed something.
> > > > >>>>
> > > > >>>> Regards,
> > > > >>>> Andor
> > > > >>>>
> > > > >>>>
> > > > >>>>
> > > > >>>>
> > > > >>>>> On 2018. Sep 28., at 18:32, Tamas Penzes
> > > <tamaas@cloudera.com.INVALID
> > > > >
> > > > >>> wrote:
> > > > >>>>> Hi All,
> > > > >>>>>
> > > > >>>>> I would add ZOOKEEPER-3021
> > > > >>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-3021> Migrate
> > > > project
> > > > >>>>> structure to Maven build as a blocker too. Since the migration
> > has
> > > > >>> started
> > > > >>>>> it would be good to finish before releasing ZK 3.5.x GA.
> > > > >>>>>
> > > > >>>>> ZOOKEEPER-925 <
> > https://issues.apache.org/jira/browse/ZOOKEEPER-925
> > > >
> > > > >>> replace
> > > > >>>>> our forrest site and documentation generation might also be a
> > good
> > > > >>> idea,
> > > > >>>>> since then we could deliver the new MarkDown based documentation.
> > > > >>>>>
> > > > >>>>> Regards, Tamaas
> > > > >>>>>
> > > > >>>>> On Fri, Sep 14, 2018 at 10:09 AM Fangmin Lv <lvfangmin@gmail.com
> > >
> > > > >>> wrote:
> > > > >>>>>> Oh, sorry for the confusion, I should provide more context.
> > > > >>>>>>
> > > > >>>>>> Leader will use on disk txn sync with followers to if the peer
> > > zxid
> > > > >>> is not
> > > > >>>>>> in it's in memory commit logs, the code is here: Leader on disk
> > > txn
> > > > >>> sync
> > > > >>>>>> <
> > > > >>>>>>
> > > > >>>
> > > >
> > >
> > https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/quorum/LearnerHandler.java#L774
> > > > >>>>>>> .
> > > > >>>>>> There is bug that potentially there will be gap in the txn
> > files,
> > > > like
> > > > >>>>>> after snap sync, etc, so it's possible the peer will miss txns
> > due
> > > > to
> > > > >>> this.
> > > > >>>>>> The option to disable it is snapshotSizeFactor
> > > > >>>>>> <
> > > > >>>>>>
> > > > >>>
> > > >
> > >
> > https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/ZKDatabase.java#L81
> > > > >>>>>>> ,
> > > > >>>>>> set it to -1 will disable this feature. On 3.5, it's better to
> > > have
> > > > a
> > > > >>> PR to
> > > > >>>>>> set this to -1 by default. It might have more SNAP sync, but
> > from
> > > > our
> > > > >>> prod
> > > > >>>>>> it doesn't seem to be a big problem to me.
> > > > >>>>>>
> > > > >>>>>> I can send out the diff to disable it by default on 3.5 if you
> > > guys
> > > > >>> think
> > > > >>>>>> this is the right way to do.
> > > > >>>>>>
> > > > >>>>>> Thanks,
> > > > >>>>>> Fangmin
> > > > >>>>>>
> > > > >>>>>> On Thu, Sep 13, 2018 at 1:58 AM Andor Molnar <an...@apache.org>
> > > > >>> wrote:
> > > > >>>>>>> What’s needed to turn it off?
> > > > >>>>>>> Do we need a PR or it’s just a config option?
> > > > >>>>>>> Shall we implement a feature switch for that and turn it off by
> > > > >>> default?
> > > > >>>>>>> Sorry I don’t have too much insight on disk txn sync.
> > > > >>>>>>>
> > > > >>>>>>> Andor
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>>> On 2018. Sep 13., at 9:16, Fangmin Lv <lv...@gmail.com>
> > > > wrote:
> > > > >>>>>>>>
> > > > >>>>>>>> And to be clear, ZOOKEEPER-2418 is actually just one case of
> > > > >>>>>>> inconsistency
> > > > >>>>>>>> which could caused by on disk txn sync, as I mentioned in a
> > > newer
> > > > >>> JIRA
> > > > >>>>>>>> ZOOKEEPER-2846 <
> > > > >>> https://issues.apache.org/jira/browse/ZOOKEEPER-2846>,
> > > > >>>>>>> the
> > > > >>>>>>>> snap sync or txn sync could also leave txns gap in the txn
> > file,
> > > > >>> which
> > > > >>>>>>> is a
> > > > >>>>>>>> more common case could trigger this issue.
> > > > >>>>>>>>
> > > > >>>>>>>> I would suggest to turn off the on disk txn sync by default
> > for
> > > > now
> > > > >>> to
> > > > >>>>>>>> avoid this issue, after we finished ZOOKEEPER-3114, we can use
> > > > that
> > > > >>> to
> > > > >>>>>>>> validate the on disk txns during syncing.
> > > > >>>>>>>>
> > > > >>>>>>>> Thanks,
> > > > >>>>>>>> Fangmin
> > > > >>>>>>>>
> > > > >>>>>>>> On Wed, Sep 12, 2018 at 9:55 AM Fangmin Lv <
> > lvfangmin@gmail.com
> > > >
> > > > >>>>>> wrote:
> > > > >>>>>>>>> Andor,
> > > > >>>>>>>>>
> > > > >>>>>>>>> ZOOKEEPER-3114 is about adding real time digest checking to
> > > help
> > > > >>>>>>> detecting
> > > > >>>>>>>>> inconsistency, it's a new feature with amounts of code
> > change.
> > > > I'll
> > > > >>>>>>> start
> > > > >>>>>>>>> upstream it part by part, but I don't expect it's being
> > merged
> > > in
> > > > >>> the
> > > > >>>>>>> next
> > > > >>>>>>>>> few weeks. So yes, it's a nice to have, but definitely not a
> > > > block
> > > > >>> for
> > > > >>>>>>> 3.5.
> > > > >>>>>>>>> Thanks,
> > > > >>>>>>>>> Fangmin
> > > > >>>>>>>>>
> > > > >>>>>>>>> On Wed, Sep 12, 2018 at 2:55 AM Andor Molnar <
> > andor@apache.org
> > > >
> > > > >>>>>> wrote:
> > > > >>>>>>>>>> Fangmin,
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Sorry, I just noticed that you want to include the
> > consistency
> > > > >>> fixes
> > > > >>>>>> in
> > > > >>>>>>>>>> the stable version which is fine. Let’s finish the backports
> > > and
> > > > >>>>>> we’ll
> > > > >>>>>>> be
> > > > >>>>>>>>>> done with them.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> ZOOKEEPER-3114 is essentially a new feature, I wouldn’t
> > block
> > > > 3.5
> > > > >>>>>> with
> > > > >>>>>>>>>> that. What do you think?
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Andor
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>> On 2018. Sep 12., at 11:52, Andor Molnar <andor@apache.org
> > >
> > > > >>> wrote:
> > > > >>>>>>>>>>> Cool, thanks for the clarification.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> The updated list is as follows:
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> - ZOOKEEPER-236 (SSL/TLS support for Atomic Broadcast
> > > protocol)
> > > > >>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
> > > > >>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock between
> > follower
> > > > sync
> > > > >>>>>> with
> > > > >>>>>>>>>> leader and follower receiving external connection requests.)
> > > > >>>>>>>>>>> The following are not critical and no blockers for the
> > stable
> > > > >>>>>> release:
> > > > >>>>>>>>>>> Waiting for to be ported to 3.5:
> > > > >>>>>>>>>>> - ZOOKEEPER-3104
> > > > >>>>>>>>>>> - ZOOKEEPER-3125
> > > > >>>>>>>>>>> - ZOOKEEPER-3127
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> New feature:
> > > > >>>>>>>>>>> - ZOOKEEPER-3114 (fixes ZOOKEEPER-2184 too)
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Regards,
> > > > >>>>>>>>>>> Andor
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> On 2018. Sep 12., at 0:42, Fangmin Lv <
> > lvfangmin@gmail.com>
> > > > >>> wrote:
> > > > >>>>>>>>>>>> Hi Andor,
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> That's the on disk txn feature, which was disabled
> > > internally
> > > > >>> after
> > > > >>>>>>> we
> > > > >>>>>>>>>>>> found the potentially inconsistent issue. The only
> > solution
> > > we
> > > > >>> have
> > > > >>>>>>>>>> for now
> > > > >>>>>>>>>>>> is waiting for the new digest checking feature I mentioned
> > > in
> > > > >>>>>>>>>>>> ZOOKEEPER-3114.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> I think there are some other critical consistent issues we
> > > > just
> > > > >>>>>> fixed
> > > > >>>>>>>>>> on
> > > > >>>>>>>>>>>> master recently: ZOOKEEPER-3104, ZOOKEEPER-3125,
> > > > >>> ZOOKEEPER-3127, I
> > > > >>>>>>>>>> think we
> > > > >>>>>>>>>>>> should include that in the official 3.5 release as well.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Thanks,
> > > > >>>>>>>>>>>> Fangmin
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> On Tue, Sep 11, 2018 at 11:58 AM Andor Molnár <
> > > > andor@apache.org
> > > > >>>>>>>>>> wrote:
> > > > >>>>>>>>>>>>> Hi Jeelani,
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Thanks for letting me know. I'm happy to remove it from
> > the
> > > > >>> list
> > > > >>>>>> to
> > > > >>>>>>>>>> get
> > > > >>>>>>>>>>>>> closer to a stable release. :)
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> What's the feature which can be disabled to avoid data
> > > > >>>>>>> inconsistency?
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Andor
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> On 09/10/2018 11:33 PM, Mohamed Jeelani wrote:
> > > > >>>>>>>>>>>>>> Thanks Andor for compiling this. Should we be ignoring
> > > > >>>>>>>>>> ZOOKEEPER-2418 as
> > > > >>>>>>>>>>>>> well? This exists in 3.4 as well and the feature can be
> > > > >>> disabled.
> > > > >>>>>> We
> > > > >>>>>>>>>> are
> > > > >>>>>>>>>>>>> working on a longer term fix for it in 3.6.
> > > > >>>>>>>>>>>>>> Regards,
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> Jeelani
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> On 9/10/18, 5:19 AM, "Andor Molnar"
> > > > >>> <andor@cloudera.com.INVALID
> > > > >>>>>>>>>> wrote:
> > > > >>>>>>>>>>>>>> Fine.
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> I'm happy to ignore 1549, 2846 and 2930. Still we have
> > the
> > > > >>> list
> > > > >>>>>>> of:
> > > > >>>>>>>>>>>>>> - ZOOKEEPER-236 (SSL/TLS support for Atomic Broadcast
> > > > >>> protocol)
> > > > >>>>>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
> > > > >>>>>>>>>>>>>> - ZOOKEEPER-2418 (txnlog diff sync can skip sending some
> > > > >>>>>>>>>>>>> transactions to
> > > > >>>>>>>>>>>>>> followers)
> > > > >>>>>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock between
> > > follower
> > > > >>>>>> sync
> > > > >>>>>>>>>>>>> with
> > > > >>>>>>>>>>>>>> leader and follower receiving external connection
> > > requests.)
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> SSL (ZK-236) is a feature which essential for the 3.5
> > > > release,
> > > > >>>>>>>>>> hence
> > > > >>>>>>>>>>>>> I
> > > > >>>>>>>>>>>>>> wouldn't leave it out or postpone it for the next stable
> > > > >>>>>> release.
> > > > >>>>>>>>>> PR
> > > > >>>>>>>>>>>>> has
> > > > >>>>>>>>>>>>>> been out for a long time, get on reviewing please.
> > > > >>>>>>>>>>>>>> The rest are also long outstanding issues which have
> > been
> > > > >>> found
> > > > >>>>>> in
> > > > >>>>>>>>>>>>> the 3.5
> > > > >>>>>>>>>>>>>> branch.
> > > > >>>>>>>>>>>>>> ZK-1818 is something which was found in 3.4 and fixed in
> > > > 3.4,
> > > > >>>>>> but
> > > > >>>>>>>>>>>>> never has
> > > > >>>>>>>>>>>>>> been fixed in 3.5. Quite a serious issue if still
> > present.
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> I think we should at least run some manual testing and
> > see
> > > > if
> > > > >>> we
> > > > >>>>>>>>>>>>> could
> > > > >>>>>>>>>>>>>> repro any of these issues before going ahead with a
> > stable
> > > > >>>>>>> release.
> > > > >>>>>>>>>>>>>> Regards,
> > > > >>>>>>>>>>>>>> Andor
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> On Fri, Sep 7, 2018 at 3:24 AM, Michael Han <
> > > > hanm@apache.org>
> > > > >>>>>>>>>> wrote:
> > > > >>>>>>>>>>>>>>> I haven't went through the entire list, but looks like
> > > lots
> > > > >>> of
> > > > >>>>>> the
> > > > >>>>>>>>>>>>> JIRA
> > > > >>>>>>>>>>>>>>> issues listed in this thread, such as ZOOKEEPER-1549,
> > > 2846,
> > > > >>> also
> > > > >>>>>>>>>>>>> affects
> > > > >>>>>>>>>>>>>>> 3.4 releases. Should we scope these issues out?
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> I think historically the single outstanding blocking
> > > issue
> > > > >>> for a
> > > > >>>>>>>>>>>>> stable 3.5
> > > > >>>>>>>>>>>>>>> release is the reconfig feature and security concerns
> > > > around
> > > > >>> it
> > > > >>>>>>>>>>>>> (somehow
> > > > >>>>>>>>>>>>>>> addressed in ZOOKEEPER-2014), and the alpha and beta
> > > > releases
> > > > >>>>>> were
> > > > >>>>>>>>>>>>> created
> > > > >>>>>>>>>>>>>>> to stabilize that feature.
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>
> > > > >>>
> > > >
> > >
> > https://urldefense.proofpoint.com/v2/url?u=http-3A__zookeeper-2Duser.578899.n2.nabble.com_Zookeeper-2Dwith-2D&d=DwIBaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=Vl4oKanLQehvaulUvoKg8A&m=wqlhnot9c-pQLdkGkccSGNpELUNUnB-wy_h0iA3PRqI&s=_tGtL3nMWtuPrXKXDx27AIWOzyyT7W-CjIVLDFZwT0E&e=
> > > > >>>>>>>>>>>>>>> SSL-release-date-tt7581744.html
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> So it looks like we are in good shape to release.
> > > Something
> > > > >>>>>> might
> > > > >>>>>>>>>>>>> worth
> > > > >>>>>>>>>>>>>>> doing to claim the quality of 3.5 is on par with 3.4
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> * Run Jepsen on 3.5 - 3.4 passed the test for the
> > record
> > > > >>>>>>>>>>>>>>>
> > > > >>>
> > > >
> > >
> > https://urldefense.proofpoint.com/v2/url?u=https-3A__aphyr.com_posts_291-2Djepsen-2Dzookeeper&d=DwIBaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=Vl4oKanLQehvaulUvoKg8A&m=wqlhnot9c-pQLdkGkccSGNpELUNUnB-wy_h0iA3PRqI&s=VjORkX5s7hrJyl8mW9Q4cfeSWF4qfTdyRjcuAiBt0y4&e=
> > > > >>>>>>>>>>>>>>> * Fix all flaky tests on 3.5 - 3.4 has little or no
> > flaky
> > > > >>> tests
> > > > >>>>>> at
> > > > >>>>>>>>>>>>> all.
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> On Tue, Sep 4, 2018 at 1:48 AM, Andor Molnar
> > > > >>>>>>>>>>>>> <an...@cloudera.com.invalid>
> > > > >>>>>>>>>>>>>>> wrote:
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>> Thanks Maoling! That would be huge help, I appreciate
> > > it.
> > > > >>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>> Andor
> > > > >>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>
> > > > >>>
> > > >
> > >
> >

Re: ZooKeeper 3.5 blocker issues

Posted by Norbert Kalmar <nk...@cloudera.com.INVALID>.
Hi everyone,

Some updates on the Maven migration: I hit a few bumps here and there (you
can see them in the commits of the latest patch Andor linked,
https://github.com/apache/zookeeper/pull/708). The current state is that
the build works and the tests run, but reports such as findbugs and clover
are not yet implemented. Maven usually has plugins for them, but wiring
them in is not always trivial, especially with the C client. The assembly
is also still to be done, but it should be fairly easy to produce a tarball
similar to the one ant builds (although this will also be an interesting
task, as ant does some strange things, like duplicating the sources of most
contrib projects).

I had a separate jira for the recipes and contrib Maven build. I do not
have an open PR for it, but recipes is done and I am now working on the
contrib projects. Most of them are built manually and never get called from
the main build.xml, so I will not integrate them into the Maven build
either. One reason is that there are plans to remove some of them from the
ZK repo anyway. The other is that, for starters, we want to replicate the
ant build as closely as possible without any nasty workarounds in Maven.
From there, once the build is stable and proven to have all the
functionality required for build and release, we can improve it and use
Maven's advantages to shape the ZooKeeper build.

Right now, I am trying to stabilize the build as much as possible. Andor
also fixed some C tests that, for some strange reason, became extremely
flaky with the Maven build:
https://github.com/apache/zookeeper/pull/740

Regards,
Norbert

On Tue, Dec 18, 2018 at 9:52 AM Andor Molnar <an...@cloudera.com.invalid>
wrote:

> Sure, good point. Let's put it on the list.
>
> Andor
>
>
> On Tue, Dec 18, 2018 at 12:17 AM Patrick Hunt <ph...@apache.org> wrote:
>
> > Are folks OK to wait on that OWASP issue I documented over the weekend?
> > afaict we are not affected but it would be good to get another pair of
> eyes
> > on it.
> >
> > Patrick
> >
> > On Mon, Dec 17, 2018 at 2:55 PM Andor Molnár <an...@apache.org> wrote:
> >
> > > Hi team,
> > >
> > >
> > > I'm proudly announce that thanks to the joint effort from the
> community,
> > > the 3.5 blockers list has become empty:
> > >
> > > "project = ZooKeeper AND resolution = Unresolved AND fixVersion = 3.5.5
> > > AND priority in (blocker, critical) ORDER BY priority DESC, key ASC"
> > >
> > >
> > > Well... almost. All the blocker issues have gone, but we still have the
> > > Maven migration to complete before the stable release. If you have some
> > > free cycles, please join us testing the Maven build on this PR:
> > >
> > > https://github.com/apache/zookeeper/pull/708
> > >
> > > I hope we can merge it pretty soon.
> > >
> > >
> > > In terms of the builds, the weather at 3.5 branch is quite sunny
> > nowadays:
> > >
> > > https://builds.apache.org/view/S-Z/view/ZooKeeper/
> > >
> > > The Java 11 build is still having some difficulties, which hopefully I
> > > can address before the holidays:
> > >
> > > https://issues.apache.org/jira/browse/ZOOKEEPER-3204
> > >
> > >
> > > If you happen to know about something which is important from 3.5's
> > > perspective and missing from the above, please don't hesitate to share.
> > >
> > >
> > > Happy ZooKeeping!
> > >
> > > Andor
> > >
> > >
> > >
> > > On 11/2/18 21:12, Fangmin Lv wrote:
> > > > Andor,
> > > >
> > > > Here is the PR to port ZK-3104 from master to 3.4:
> > > > https://github.com/apache/zookeeper/pull/685.
> > > >
> > > > Fangmin
> > > >
> > > > On Fri, Nov 2, 2018 at 11:46 AM Fangmin Lv <lv...@gmail.com>
> > wrote:
> > > >
> > > >> Hi Andor,
> > > >>
> > > >> Is anyone working on ZK-2778? I can pick it up if there is no one
> > > working
> > > >> on it yet.
> > > >>
> > > >> I'll open a 3.5 PR for ZK-3104 today.
> > > >>
> > > >> Fangmin
> > > >>
> > > >> On Fri, Oct 26, 2018 at 3:33 AM Andor Molnar <an...@apache.org>
> > wrote:
> > > >>
> > > >>> Hi folks,
> > > >>>
> > > >>> You’ve probably realised lots of update emails coming from Jira.
> > Please
> > > >>> be aware that we’ve updated a bunch of open blocker/critical 3.5
> > > tickets to
> > > >>> reflect to what we discussed in this email.
> > > >>>
> > > >>> If you open up the following jira filter:
> > > >>>
> > > >>> project = ZooKeeper and resolution = Unresolved and fixVersion =
> > 3.5.5
> > > >>> AND priority in (blocker, critical) ORDER BY priority DESC, key ASC
> > > >>>
> > > >>> You’ll see the most up-to-date list of tickets which need to be
> > > addressed
> > > >>> before the stable 3.5 release.
> > > >>>
> > > >>> Thank you for your efforts to get this done.
> > > >>>
> > > >>> Fangmin, ZK-3104 is waiting for backport, but ticket has already
> been
> > > >>> resolved. Have you created a separate ticket for the backport or
> > shall
> > > I
> > > >>> just reopen it with the right fix versions?
> > > >>>
> > > >>> Thanks,
> > > >>> Andor
> > > >>>
> > > >>>
> > > >>>
> > > >>>> On 2018. Oct 8., at 12:34, Andor Molnar <an...@apache.org> wrote:
> > > >>>>
> > > >>>> Hi,
> > > >>>>
> > > >>>> Let me summarize and give a quick update on the outstanding issues
> > for
> > > >>> 3.5 GA:
> > > >>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
> > > >>>> - ZOOKEEPER-2778 (Potential server deadlock between follower sync
> > with
> > > >>> leader and follower receiving external connection requests.)
> > > >>>> - ZOOKEEPER-3021 Migrate project structure to Maven (ongoing)
> > > >>>> - ZOOKEEPER-925 Docs generation to Maven
> > > >>>> - ZOOKEEPER-3104 (waiting for backport)
> > > >>>> - ZOOKEEPER-3125 (waiting for backport PR #647)
> > > >>>>
> > > >>>> The 2 Maven related tickets are no-brainers as well as the
> > backports.
> > > >>> ZK-2778 has been picked up by Maoling (thanks!) as far as I can
> see,
> > > >>> ZK-1818 is the only one waiting for a volunteer.
> > > >>>> Please correct me if I’ve missed something.
> > > >>>>
> > > >>>> Regards,
> > > >>>> Andor
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>>>> On 2018. Sep 28., at 18:32, Tamas Penzes
> > <tamaas@cloudera.com.INVALID
> > > >
> > > >>> wrote:
> > > >>>>> Hi All,
> > > >>>>>
> > > >>>>> I would add ZOOKEEPER-3021
> > > >>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-3021> Migrate
> > > project
> > > >>>>> structure to Maven build as a blocker too. Since the migration
> has
> > > >>> started
> > > >>>>> it would be good to finish before releasing ZK 3.5.x GA.
> > > >>>>>
> > > >>>>> ZOOKEEPER-925 <
> https://issues.apache.org/jira/browse/ZOOKEEPER-925
> > >
> > > >>> replace
> > > >>>>> our forrest site and documentation generation might also be a
> good
> > > >>> idea,
> > > >>>>> since then we could deliver the new MarkDown based documentation.
> > > >>>>>
> > > >>>>> Regards, Tamaas
> > > >>>>>
> > > >>>>> On Fri, Sep 14, 2018 at 10:09 AM Fangmin Lv <lvfangmin@gmail.com
> >
> > > >>> wrote:
> > > >>>>>> Oh, sorry for the confusion, I should provide more context.
> > > >>>>>>
> > > >>>>>> Leader will use on disk txn sync with followers to if the peer
> > zxid
> > > >>> is not
> > > >>>>>> in it's in memory commit logs, the code is here: Leader on disk
> > txn
> > > >>> sync
> > > >>>>>> <
> > > >>>>>>
> > > >>>
> > >
> >
> https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/quorum/LearnerHandler.java#L774
> > > >>>>>>> .
> > > >>>>>> There is bug that potentially there will be gap in the txn
> files,
> > > like
> > > >>>>>> after snap sync, etc, so it's possible the peer will miss txns
> due
> > > to
> > > >>> this.
> > > >>>>>> The option to disable it is snapshotSizeFactor
> > > >>>>>> <
> > > >>>>>>
> > > >>>
> > >
> >
> https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/ZKDatabase.java#L81
> > > >>>>>>> ,
> > > >>>>>> set it to -1 will disable this feature. On 3.5, it's better to
> > have
> > > a
> > > >>> PR to
> > > >>>>>> set this to -1 by default. It might have more SNAP sync, but
> from
> > > our
> > > >>> prod
> > > >>>>>> it doesn't seem to be a big problem to me.
> > > >>>>>>
> > > >>>>>> I can send out the diff to disable it by default on 3.5 if you
> > guys
> > > >>> think
> > > >>>>>> this is the right way to do.
> > > >>>>>>
> > > >>>>>> Thanks,
> > > >>>>>> Fangmin
> > > >>>>>>
> > > >>>>>> On Thu, Sep 13, 2018 at 1:58 AM Andor Molnar <an...@apache.org>
> > > >>> wrote:
> > > >>>>>>> What’s needed to turn it off?
> > > >>>>>>> Do we need a PR or it’s just a config option?
> > > >>>>>>> Shall we implement a feature switch for that and turn it off by
> > > >>> default?
> > > >>>>>>> Sorry I don’t have too much insight on disk txn sync.
> > > >>>>>>>
> > > >>>>>>> Andor
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>>> On 2018. Sep 13., at 9:16, Fangmin Lv <lv...@gmail.com>
> > > wrote:
> > > >>>>>>>>
> > > >>>>>>>> And to be clear, ZOOKEEPER-2418 is actually just one case of
> > > >>>>>>> inconsistency
> > > >>>>>>>> which could caused by on disk txn sync, as I mentioned in a
> > newer
> > > >>> JIRA
> > > >>>>>>>> ZOOKEEPER-2846 <
> > > >>> https://issues.apache.org/jira/browse/ZOOKEEPER-2846>,
> > > >>>>>>> the
> > > >>>>>>>> snap sync or txn sync could also leave txns gap in the txn
> file,
> > > >>> which
> > > >>>>>>> is a
> > > >>>>>>>> more common case could trigger this issue.
> > > >>>>>>>>
> > > >>>>>>>> I would suggest to turn off the on disk txn sync by default
> for
> > > now
> > > >>> to
> > > >>>>>>>> avoid this issue, after we finished ZOOKEEPER-3114, we can use
> > > that
> > > >>> to
> > > >>>>>>>> validate the on disk txns during syncing.
> > > >>>>>>>>
> > > >>>>>>>> Thanks,
> > > >>>>>>>> Fangmin
> > > >>>>>>>>
> > > >>>>>>>> On Wed, Sep 12, 2018 at 9:55 AM Fangmin Lv <
> lvfangmin@gmail.com
> > >
> > > >>>>>> wrote:
> > > >>>>>>>>> Andor,
> > > >>>>>>>>>
> > > >>>>>>>>> ZOOKEEPER-3114 is about adding real time digest checking to
> > help
> > > >>>>>>> detecting
> > > >>>>>>>>> inconsistency, it's a new feature with amounts of code
> change.
> > > I'll
> > > >>>>>>> start
> > > >>>>>>>>> upstream it part by part, but I don't expect it's being
> merged
> > in
> > > >>> the
> > > >>>>>>> next
> > > >>>>>>>>> few weeks. So yes, it's a nice to have, but definitely not a
> > > block
> > > >>> for
> > > >>>>>>> 3.5.
> > > >>>>>>>>> Thanks,
> > > >>>>>>>>> Fangmin
> > > >>>>>>>>>
> > > >>>>>>>>> On Wed, Sep 12, 2018 at 2:55 AM Andor Molnar <
> andor@apache.org
> > >
> > > >>>>>> wrote:
> > > >>>>>>>>>> Fangmin,
> > > >>>>>>>>>>
> > > >>>>>>>>>> Sorry, I just noticed that you want to include the
> consistency
> > > >>> fixes
> > > >>>>>> in
> > > >>>>>>>>>> the stable version which is fine. Let’s finish the backports
> > and
> > > >>>>>> we’ll
> > > >>>>>>> be
> > > >>>>>>>>>> done with them.
> > > >>>>>>>>>>
> > > >>>>>>>>>> ZOOKEEPER-3114 is essentially a new feature, I wouldn’t
> block
> > > 3.5
> > > >>>>>> with
> > > >>>>>>>>>> that. What do you think?
> > > >>>>>>>>>>
> > > >>>>>>>>>> Andor
> > > >>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>>> On 2018. Sep 12., at 11:52, Andor Molnar <andor@apache.org
> >
> > > >>> wrote:
> > > >>>>>>>>>>> Cool, thanks for the clarification.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> The updated list is as follows:
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> - ZOOKEEPER-236 (SSL/TLS support for Atomic Broadcast
> > protocol)
> > > >>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
> > > >>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock between
> follower
> > > sync
> > > >>>>>> with
> > > >>>>>>>>>> leader and follower receiving external connection requests.)
> > > >>>>>>>>>>> The following are not critical and no blockers for the
> stable
> > > >>>>>> release:
> > > >>>>>>>>>>> Waiting for to be ported to 3.5:
> > > >>>>>>>>>>> - ZOOKEEPER-3104
> > > >>>>>>>>>>> - ZOOKEEPER-3125
> > > >>>>>>>>>>> - ZOOKEEPER-3127
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> New feature:
> > > >>>>>>>>>>> - ZOOKEEPER-3114 (fixes ZOOKEEPER-2184 too)
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Regards,
> > > >>>>>>>>>>> Andor
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> On 2018. Sep 12., at 0:42, Fangmin Lv <
> lvfangmin@gmail.com>
> > > >>> wrote:
> > > >>>>>>>>>>>> Hi Andor,
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> That's the on disk txn feature, which was disabled
> > internally
> > > >>> after
> > > >>>>>>> we
> > > >>>>>>>>>>>> found the potentially inconsistent issue. The only
> solution
> > we
> > > >>> have
> > > >>>>>>>>>> for now
> > > >>>>>>>>>>>> is waiting for the new digest checking feature I mentioned
> > in
> > > >>>>>>>>>>>> ZOOKEEPER-3114.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> I think there are some other critical consistent issues we
> > > just
> > > >>>>>> fixed
> > > >>>>>>>>>> on
> > > >>>>>>>>>>>> master recently: ZOOKEEPER-3104, ZOOKEEPER-3125,
> > > >>> ZOOKEEPER-3127, I
> > > >>>>>>>>>> think we
> > > >>>>>>>>>>>> should include that in the official 3.5 release as well.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Thanks,
> > > >>>>>>>>>>>> Fangmin
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> On Tue, Sep 11, 2018 at 11:58 AM Andor Molnár <
> > > andor@apache.org
> > > >>>>>>>>>> wrote:
> > > >>>>>>>>>>>>> Hi Jeelani,
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> Thanks for letting me know. I'm happy to remove it from
> the
> > > >>> list
> > > >>>>>> to
> > > >>>>>>>>>> get
> > > >>>>>>>>>>>>> closer to a stable release. :)
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> What's the feature which can be disabled to avoid data
> > > >>>>>>> inconsistency?
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> Andor
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> On 09/10/2018 11:33 PM, Mohamed Jeelani wrote:
> > > >>>>>>>>>>>>>> Thanks Andor for compiling this. Should we be ignoring
> > > >>>>>>>>>> ZOOKEEPER-2418 as
> > > >>>>>>>>>>>>> well? This exists in 3.4 as well and the feature can be
> > > >>> disabled.
> > > >>>>>> We
> > > >>>>>>>>>> are
> > > >>>>>>>>>>>>> working on a longer term fix for it in 3.6.
> > > >>>>>>>>>>>>>> Regards,
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> Jeelani
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> On 9/10/18, 5:19 AM, "Andor Molnar"
> > > >>> <andor@cloudera.com.INVALID
> > > >>>>>>>>>> wrote:
> > > >>>>>>>>>>>>>> Fine.
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> I'm happy to ignore 1549, 2846 and 2930. Still we have
> the
> > > >>> list
> > > >>>>>>> of:
> > > >>>>>>>>>>>>>> - ZOOKEEPER-236 (SSL/TLS support for Atomic Broadcast
> > > >>> protocol)
> > > >>>>>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
> > > >>>>>>>>>>>>>> - ZOOKEEPER-2418 (txnlog diff sync can skip sending some
> > > >>>>>>>>>>>>> transactions to
> > > >>>>>>>>>>>>>> followers)
> > > >>>>>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock between
> > follower
> > > >>>>>> sync
> > > >>>>>>>>>>>>> with
> > > >>>>>>>>>>>>>> leader and follower receiving external connection
> > requests.)
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> SSL (ZK-236) is a feature which essential for the 3.5
> > > release,
> > > >>>>>>>>>> hence
> > > >>>>>>>>>>>>> I
> > > >>>>>>>>>>>>>> wouldn't leave it out or postpone it for the next stable
> > > >>>>>> release.
> > > >>>>>>>>>> PR
> > > >>>>>>>>>>>>> has
> > > >>>>>>>>>>>>>> been out for a long time, get on reviewing please.
> > > >>>>>>>>>>>>>> The rest are also long outstanding issues which have
> been
> > > >>> found
> > > >>>>>> in
> > > >>>>>>>>>>>>> the 3.5
> > > >>>>>>>>>>>>>> branch.
> > > >>>>>>>>>>>>>> ZK-1818 is something which was found in 3.4 and fixed in
> > > 3.4,
> > > >>>>>> but
> > > >>>>>>>>>>>>> never has
> > > >>>>>>>>>>>>>> been fixed in 3.5. Quite a serious issue if still
> present.
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> I think we should at least run some manual testing and
> see
> > > if
> > > >>> we
> > > >>>>>>>>>>>>> could
> > > >>>>>>>>>>>>>> repro any of these issues before going ahead with a
> stable
> > > >>>>>>> release.
> > > >>>>>>>>>>>>>> Regards,
> > > >>>>>>>>>>>>>> Andor
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> On Fri, Sep 7, 2018 at 3:24 AM, Michael Han <
> > > hanm@apache.org>
> > > >>>>>>>>>> wrote:
> > > >>>>>>>>>>>>>>> I haven't went through the entire list, but looks like
> > lots
> > > >>> of
> > > >>>>>> the
> > > >>>>>>>>>>>>> JIRA
> > > >>>>>>>>>>>>>>> issues listed in this thread, such as ZOOKEEPER-1549,
> > 2846,
> > > >>> also
> > > >>>>>>>>>>>>> affects
> > > >>>>>>>>>>>>>>> 3.4 releases. Should we scope these issues out?
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> I think historically the single outstanding blocking
> > issue
> > > >>> for a
> > > >>>>>>>>>>>>> stable 3.5
> > > >>>>>>>>>>>>>>> release is the reconfig feature and security concerns
> > > around
> > > >>> it
> > > >>>>>>>>>>>>> (somehow
> > > >>>>>>>>>>>>>>> addressed in ZOOKEEPER-2014), and the alpha and beta
> > > releases
> > > >>>>>> were
> > > >>>>>>>>>>>>> created
> > > >>>>>>>>>>>>>>> to stabilize that feature.
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> http://zookeeper-user.578899.n2.nabble.com/Zookeeper-with-SSL-release-date-tt7581744.html
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> So it looks like we are in good shape to release.
> > Something
> > > >>>>>> might
> > > >>>>>>>>>>>>> worth
> > > >>>>>>>>>>>>>>> doing to claim the quality of 3.5 is on par with 3.4
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> * Run Jepsen on 3.5 - 3.4 passed the test for the
> record
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> https://aphyr.com/posts/291-jepsen-zookeeper
> > > >>>>>>>>>>>>>>> * Fix all flaky tests on 3.5 - 3.4 has little or no
> flaky
> > > >>> tests
> > > >>>>>> at
> > > >>>>>>>>>>>>> all.
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> On Tue, Sep 4, 2018 at 1:48 AM, Andor Molnar
> > > >>>>>>>>>>>>> <an...@cloudera.com.invalid>
> > > >>>>>>>>>>>>>>> wrote:
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> Thanks Maoling! That would be huge help, I appreciate
> > it.
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> Andor
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>
> > > >>>
> > >
> >
>

Re: ZooKeeper 3.5 blocker issues

Posted by Andor Molnar <an...@cloudera.com.INVALID>.
Sure, good point. Let's put it on the list.

Andor


On Tue, Dec 18, 2018 at 12:17 AM Patrick Hunt <ph...@apache.org> wrote:

> Are folks OK to wait on that OWASP issue I documented over the weekend?
> afaict we are not affected but it would be good to get another pair of eyes
> on it.
>
> Patrick
>
> On Mon, Dec 17, 2018 at 2:55 PM Andor Molnár <an...@apache.org> wrote:
>
> > Hi team,
> >
> >
> > I'm proud to announce that, thanks to the joint effort from the community,
> > the 3.5 blockers list has become empty:
> >
> > "project = ZooKeeper AND resolution = Unresolved AND fixVersion = 3.5.5
> > AND priority in (blocker, critical) ORDER BY priority DESC, key ASC"
> >
> >
> > Well... almost. All the blocker issues have gone, but we still have the
> > Maven migration to complete before the stable release. If you have some
> > free cycles, please join us testing the Maven build on this PR:
> >
> > https://github.com/apache/zookeeper/pull/708
> >
> > I hope we can merge it pretty soon.
> >
> >
> > In terms of the builds, the weather at 3.5 branch is quite sunny
> nowadays:
> >
> > https://builds.apache.org/view/S-Z/view/ZooKeeper/
> >
> > The Java 11 build is still having some difficulties, which hopefully I
> > can address before the holidays:
> >
> > https://issues.apache.org/jira/browse/ZOOKEEPER-3204
> >
> >
> > If you happen to know about something which is important from 3.5's
> > perspective and missing from the above, please don't hesitate to share.
> >
> >
> > Happy ZooKeeping!
> >
> > Andor
> >
> >
> >
> > On 11/2/18 21:12, Fangmin Lv wrote:
> > > Andor,
> > >
> > > Here is the PR to port ZK-3104 from master to 3.4:
> > > https://github.com/apache/zookeeper/pull/685.
> > >
> > > Fangmin
> > >
> > > On Fri, Nov 2, 2018 at 11:46 AM Fangmin Lv <lv...@gmail.com>
> wrote:
> > >
> > >> Hi Andor,
> > >>
> > >> Is anyone working on ZK-2778? I can pick it up if there is no one
> > working
> > >> on it yet.
> > >>
> > >> I'll open a 3.5 PR for ZK-3104 today.
> > >>
> > >> Fangmin
> > >>
> > >> On Fri, Oct 26, 2018 at 3:33 AM Andor Molnar <an...@apache.org>
> wrote:
> > >>
> > >>> Hi folks,
> > >>>
> > >>> You’ve probably realised lots of update emails coming from Jira.
> Please
> > >>> be aware that we’ve updated a bunch of open blocker/critical 3.5
> > tickets to
> > >>> reflect to what we discussed in this email.
> > >>>
> > >>> If you open up the following jira filter:
> > >>>
> > >>> project = ZooKeeper and resolution = Unresolved and fixVersion =
> 3.5.5
> > >>> AND priority in (blocker, critical) ORDER BY priority DESC, key ASC
> > >>>
> > >>> You’ll see the most up-to-date list of tickets which need to be
> > addressed
> > >>> before the stable 3.5 release.
> > >>>
> > >>> Thank you for your efforts to get this done.
> > >>>
> > >>> Fangmin, ZK-3104 is waiting for backport, but ticket has already been
> > >>> resolved. Have you created a separate ticket for the backport or
> shall
> > I
> > >>> just reopen it with the right fix versions?
> > >>>
> > >>> Thanks,
> > >>> Andor
> > >>>
> > >>>
> > >>>
> > >>>> On 2018. Oct 8., at 12:34, Andor Molnar <an...@apache.org> wrote:
> > >>>>
> > >>>> Hi,
> > >>>>
> > >>>> Let me summarize and give a quick update on the outstanding issues
> for
> > >>> 3.5 GA:
> > >>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
> > >>>> - ZOOKEEPER-2778 (Potential server deadlock between follower sync
> with
> > >>> leader and follower receiving external connection requests.)
> > >>>> - ZOOKEEPER-3021 Migrate project structure to Maven (ongoing)
> > >>>> - ZOOKEEPER-925 Docs generation to Maven
> > >>>> - ZOOKEEPER-3104 (waiting for backport)
> > >>>> - ZOOKEEPER-3125 (waiting for backport PR #647)
> > >>>>
> > >>>> The 2 Maven related tickets are no-brainers as well as the
> backports.
> > >>> ZK-2778 has been picked up by Maoling (thanks!) as far as I can see,
> > >>> ZK-1818 is the only one waiting for a volunteer.
> > >>>> Please correct me if I’ve missed something.
> > >>>>
> > >>>> Regards,
> > >>>> Andor
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>>> On 2018. Sep 28., at 18:32, Tamas Penzes
> <tamaas@cloudera.com.INVALID
> > >
> > >>> wrote:
> > >>>>> Hi All,
> > >>>>>
> > >>>>> I would add ZOOKEEPER-3021
> > >>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-3021> Migrate
> > project
> > >>>>> structure to Maven build as a blocker too. Since the migration has
> > >>> started
> > >>>>> it would be good to finish before releasing ZK 3.5.x GA.
> > >>>>>
> > >>>>> ZOOKEEPER-925 <https://issues.apache.org/jira/browse/ZOOKEEPER-925
> >
> > >>> replace
> > >>>>> our forrest site and documentation generation might also be a good
> > >>> idea,
> > >>>>> since then we could deliver the new MarkDown based documentation.
> > >>>>>
> > >>>>> Regards, Tamaas
> > >>>>>
> > >>>>> On Fri, Sep 14, 2018 at 10:09 AM Fangmin Lv <lv...@gmail.com>
> > >>> wrote:
> > >>>>>> Oh, sorry for the confusion, I should provide more context.
> > >>>>>>
> > >>>>>> The leader uses on-disk txn sync with a follower if the peer's zxid
> > >>>>>> is not in its in-memory commit log; the code is here: Leader on-disk
> > >>>>>> txn sync
> > >>>>>> <https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/quorum/LearnerHandler.java#L774>.
> > >>>>>> There is a bug where gaps can appear in the txn files, for example
> > >>>>>> after a snap sync, so it's possible for the peer to miss txns because
> > >>>>>> of this.
> > >>>>>> The option to disable it is snapshotSizeFactor
> > >>>>>> <https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/ZKDatabase.java#L81>;
> > >>>>>> setting it to -1 disables this feature. On 3.5, it would be better to
> > >>>>>> have a PR that sets this to -1 by default. That might lead to more SNAP
> > >>>>>> syncs, but judging from our prod it doesn't seem to be a big problem
> > >>>>>> to me.
> > >>>>>>
> > >>>>>> I can send out the diff to disable it by default on 3.5 if you
> guys
> > >>> think
> > >>>>>> this is the right way to do.
> > >>>>>>
> > >>>>>> Thanks,
> > >>>>>> Fangmin
> > >>>>>>
> > >>>>>> On Thu, Sep 13, 2018 at 1:58 AM Andor Molnar <an...@apache.org>
> > >>> wrote:
> > >>>>>>> What’s needed to turn it off?
> > >>>>>>> Do we need a PR or it’s just a config option?
> > >>>>>>> Shall we implement a feature switch for that and turn it off by
> > >>> default?
> > >>>>>>> Sorry I don’t have too much insight on disk txn sync.
> > >>>>>>>
> > >>>>>>> Andor
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>> On 2018. Sep 13., at 9:16, Fangmin Lv <lv...@gmail.com>
> > wrote:
> > >>>>>>>>
> > >>>>>>>> And to be clear, ZOOKEEPER-2418 is actually just one case of
> > >>>>>>> inconsistency
> > >>>>>>>> which could caused by on disk txn sync, as I mentioned in a
> newer
> > >>> JIRA
> > >>>>>>>> ZOOKEEPER-2846 <
> > >>> https://issues.apache.org/jira/browse/ZOOKEEPER-2846>,
> > >>>>>>> the
> > >>>>>>>> snap sync or txn sync could also leave txns gap in the txn file,
> > >>> which
> > >>>>>>> is a
> > >>>>>>>> more common case could trigger this issue.
> > >>>>>>>>
> > >>>>>>>> I would suggest to turn off the on disk txn sync by default for
> > now
> > >>> to
> > >>>>>>>> avoid this issue, after we finished ZOOKEEPER-3114, we can use
> > that
> > >>> to
> > >>>>>>>> validate the on disk txns during syncing.
> > >>>>>>>>
> > >>>>>>>> Thanks,
> > >>>>>>>> Fangmin
> > >>>>>>>>
> > >>>>>>>> On Wed, Sep 12, 2018 at 9:55 AM Fangmin Lv <lvfangmin@gmail.com
> >
> > >>>>>> wrote:
> > >>>>>>>>> Andor,
> > >>>>>>>>>
> > >>>>>>>>> ZOOKEEPER-3114 is about adding real time digest checking to
> help
> > >>>>>>> detecting
> > >>>>>>>>> inconsistency, it's a new feature with amounts of code change.
> > I'll
> > >>>>>>> start
> > >>>>>>>>> upstream it part by part, but I don't expect it's being merged
> in
> > >>> the
> > >>>>>>> next
> > >>>>>>>>> few weeks. So yes, it's a nice to have, but definitely not a
> > block
> > >>> for
> > >>>>>>> 3.5.
> > >>>>>>>>> Thanks,
> > >>>>>>>>> Fangmin
> > >>>>>>>>>
> > >>>>>>>>> On Wed, Sep 12, 2018 at 2:55 AM Andor Molnar <andor@apache.org
> >
> > >>>>>> wrote:
> > >>>>>>>>>> Fangmin,
> > >>>>>>>>>>
> > >>>>>>>>>> Sorry, I just noticed that you want to include the consistency
> > >>> fixes
> > >>>>>> in
> > >>>>>>>>>> the stable version which is fine. Let’s finish the backports
> and
> > >>>>>> we’ll
> > >>>>>>> be
> > >>>>>>>>>> done with them.
> > >>>>>>>>>>
> > >>>>>>>>>> ZOOKEEPER-3114 is essentially a new feature, I wouldn’t block
> > 3.5
> > >>>>>> with
> > >>>>>>>>>> that. What do you think?
> > >>>>>>>>>>
> > >>>>>>>>>> Andor
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>> On 2018. Sep 12., at 11:52, Andor Molnar <an...@apache.org>
> > >>> wrote:
> > >>>>>>>>>>> Cool, thanks for the clarification.
> > >>>>>>>>>>>
> > >>>>>>>>>>> The updated list is as follows:
> > >>>>>>>>>>>
> > >>>>>>>>>>> - ZOOKEEPER-236 (SSL/TLS support for Atomic Broadcast
> protocol)
> > >>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
> > >>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock between follower
> > sync
> > >>>>>> with
> > >>>>>>>>>> leader and follower receiving external connection requests.)
> > >>>>>>>>>>> The following are not critical and no blockers for the stable
> > >>>>>> release:
> > >>>>>>>>>>> Waiting for to be ported to 3.5:
> > >>>>>>>>>>> - ZOOKEEPER-3104
> > >>>>>>>>>>> - ZOOKEEPER-3125
> > >>>>>>>>>>> - ZOOKEEPER-3127
> > >>>>>>>>>>>
> > >>>>>>>>>>> New feature:
> > >>>>>>>>>>> - ZOOKEEPER-3114 (fixes ZOOKEEPER-2184 too)
> > >>>>>>>>>>>
> > >>>>>>>>>>> Regards,
> > >>>>>>>>>>> Andor
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>> On 2018. Sep 12., at 0:42, Fangmin Lv <lv...@gmail.com>
> > >>> wrote:
> > >>>>>>>>>>>> Hi Andor,
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> That's the on disk txn feature, which was disabled
> internally
> > >>> after
> > >>>>>>> we
> > >>>>>>>>>>>> found the potentially inconsistent issue. The only solution
> we
> > >>> have
> > >>>>>>>>>> for now
> > >>>>>>>>>>>> is waiting for the new digest checking feature I mentioned
> in
> > >>>>>>>>>>>> ZOOKEEPER-3114.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> I think there are some other critical consistent issues we
> > just
> > >>>>>> fixed
> > >>>>>>>>>> on
> > >>>>>>>>>>>> master recently: ZOOKEEPER-3104, ZOOKEEPER-3125,
> > >>> ZOOKEEPER-3127, I
> > >>>>>>>>>> think we
> > >>>>>>>>>>>> should include that in the official 3.5 release as well.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Thanks,
> > >>>>>>>>>>>> Fangmin
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> On Tue, Sep 11, 2018 at 11:58 AM Andor Molnár <
> > andor@apache.org
> > >>>>>>>>>> wrote:
> > >>>>>>>>>>>>> Hi Jeelani,
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Thanks for letting me know. I'm happy to remove it from the
> > >>> list
> > >>>>>> to
> > >>>>>>>>>> get
> > >>>>>>>>>>>>> closer to a stable release. :)
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> What's the feature which can be disabled to avoid data
> > >>>>>>> inconsistency?
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Andor
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> On 09/10/2018 11:33 PM, Mohamed Jeelani wrote:
> > >>>>>>>>>>>>>> Thanks Andor for compiling this. Should we be ignoring
> > >>>>>>>>>> ZOOKEEPER-2418 as
> > >>>>>>>>>>>>> well? This exists in 3.4 as well and the feature can be
> > >>> disabled.
> > >>>>>> We
> > >>>>>>>>>> are
> > >>>>>>>>>>>>> working on a longer term fix for it in 3.6.
> > >>>>>>>>>>>>>> Regards,
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Jeelani
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> On 9/10/18, 5:19 AM, "Andor Molnar"
> > >>> <andor@cloudera.com.INVALID
> > >>>>>>>>>> wrote:
> > >>>>>>>>>>>>>> Fine.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> I'm happy to ignore 1549, 2846 and 2930. Still we have the
> > >>> list
> > >>>>>>> of:
> > >>>>>>>>>>>>>> - ZOOKEEPER-236 (SSL/TLS support for Atomic Broadcast
> > >>> protocol)
> > >>>>>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
> > >>>>>>>>>>>>>> - ZOOKEEPER-2418 (txnlog diff sync can skip sending some
> > >>>>>>>>>>>>> transactions to
> > >>>>>>>>>>>>>> followers)
> > >>>>>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock between
> follower
> > >>>>>> sync
> > >>>>>>>>>>>>> with
> > >>>>>>>>>>>>>> leader and follower receiving external connection
> requests.)
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> SSL (ZK-236) is a feature which essential for the 3.5
> > release,
> > >>>>>>>>>> hence
> > >>>>>>>>>>>>> I
> > >>>>>>>>>>>>>> wouldn't leave it out or postpone it for the next stable
> > >>>>>> release.
> > >>>>>>>>>> PR
> > >>>>>>>>>>>>> has
> > >>>>>>>>>>>>>> been out for a long time, get on reviewing please.
> > >>>>>>>>>>>>>> The rest are also long outstanding issues which have been
> > >>> found
> > >>>>>> in
> > >>>>>>>>>>>>> the 3.5
> > >>>>>>>>>>>>>> branch.
> > >>>>>>>>>>>>>> ZK-1818 is something which was found in 3.4 and fixed in
> > 3.4,
> > >>>>>> but
> > >>>>>>>>>>>>> never has
> > >>>>>>>>>>>>>> been fixed in 3.5. Quite a serious issue if still present.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> I think we should at least run some manual testing and see
> > if
> > >>> we
> > >>>>>>>>>>>>> could
> > >>>>>>>>>>>>>> repro any of these issues before going ahead with a stable
> > >>>>>>> release.
> > >>>>>>>>>>>>>> Regards,
> > >>>>>>>>>>>>>> Andor
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> On Fri, Sep 7, 2018 at 3:24 AM, Michael Han <
> > hanm@apache.org>
> > >>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>> I haven't went through the entire list, but looks like
> lots
> > >>> of
> > >>>>>> the
> > >>>>>>>>>>>>> JIRA
> > >>>>>>>>>>>>>>> issues listed in this thread, such as ZOOKEEPER-1549,
> 2846,
> > >>> also
> > >>>>>>>>>>>>> affects
> > >>>>>>>>>>>>>>> 3.4 releases. Should we scope these issues out?
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> I think historically the single outstanding blocking
> issue
> > >>> for a
> > >>>>>>>>>>>>> stable 3.5
> > >>>>>>>>>>>>>>> release is the reconfig feature and security concerns
> > around
> > >>> it
> > >>>>>>>>>>>>> (somehow
> > >>>>>>>>>>>>>>> addressed in ZOOKEEPER-2014), and the alpha and beta
> > releases
> > >>>>>> were
> > >>>>>>>>>>>>> created
> > >>>>>>>>>>>>>>> to stabilize that feature.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> http://zookeeper-user.578899.n2.nabble.com/Zookeeper-with-SSL-release-date-tt7581744.html
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> So it looks like we are in good shape to release.
> Something
> > >>>>>> might
> > >>>>>>>>>>>>> worth
> > >>>>>>>>>>>>>>> doing to claim the quality of 3.5 is on par with 3.4
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> * Run Jepsen on 3.5 - 3.4 passed the test for the record
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> https://aphyr.com/posts/291-jepsen-zookeeper
> > >>>>>>>>>>>>>>> * Fix all flaky tests on 3.5 - 3.4 has little or no flaky
> > >>> tests
> > >>>>>> at
> > >>>>>>>>>>>>> all.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> On Tue, Sep 4, 2018 at 1:48 AM, Andor Molnar
> > >>>>>>>>>>>>> <an...@cloudera.com.invalid>
> > >>>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Thanks Maoling! That would be huge help, I appreciate
> it.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Andor
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>
> > >>>
> >
>

Re: ZooKeeper 3.5 blocker issues

Posted by Patrick Hunt <ph...@apache.org>.
Are folks OK to wait on that OWASP issue I documented over the weekend?
afaict we are not affected but it would be good to get another pair of eyes
on it.

Patrick

On Mon, Dec 17, 2018 at 2:55 PM Andor Molnár <an...@apache.org> wrote:

> Hi team,
>
>
> I'm proud to announce that, thanks to the joint effort from the community,
> the 3.5 blockers list has become empty:
>
> "project = ZooKeeper AND resolution = Unresolved AND fixVersion = 3.5.5
> AND priority in (blocker, critical) ORDER BY priority DESC, key ASC"
>
>
> Well... almost. All the blocker issues have gone, but we still have the
> Maven migration to complete before the stable release. If you have some
> free cycles, please join us testing the Maven build on this PR:
>
> https://github.com/apache/zookeeper/pull/708
>
> I hope we can merge it pretty soon.
>
>
> In terms of the builds, the weather at 3.5 branch is quite sunny nowadays:
>
> https://builds.apache.org/view/S-Z/view/ZooKeeper/
>
> The Java 11 build is still having some difficulties, which hopefully I
> can address before the holidays:
>
> https://issues.apache.org/jira/browse/ZOOKEEPER-3204
>
>
> If you happen to know about something which is important from 3.5's
> perspective and missing from the above, please don't hesitate to share.
>
>
> Happy ZooKeeping!
>
> Andor
>
>
>
> On 11/2/18 21:12, Fangmin Lv wrote:
> > Andor,
> >
> > Here is the PR to port ZK-3104 from master to 3.4:
> > https://github.com/apache/zookeeper/pull/685.
> >
> > Fangmin
> >
> > On Fri, Nov 2, 2018 at 11:46 AM Fangmin Lv <lv...@gmail.com> wrote:
> >
> >> Hi Andor,
> >>
> >> Is anyone working on ZK-2778? I can pick it up if there is no one
> working
> >> on it yet.
> >>
> >> I'll open a 3.5 PR for ZK-3104 today.
> >>
> >> Fangmin
> >>
> >> On Fri, Oct 26, 2018 at 3:33 AM Andor Molnar <an...@apache.org> wrote:
> >>
> >>> Hi folks,
> >>>
> >>> You’ve probably noticed lots of update emails coming from Jira. Please
> >>> be aware that we’ve updated a bunch of open blocker/critical 3.5
> >>> tickets to reflect what we discussed in this email thread.
> >>>
> >>> If you open up the following jira filter:
> >>>
> >>> project = ZooKeeper and resolution = Unresolved and fixVersion = 3.5.5
> >>> AND priority in (blocker, critical) ORDER BY priority DESC, key ASC
> >>>
> >>> You’ll see the most up-to-date list of tickets which need to be
> addressed
> >>> before the stable 3.5 release.
> >>>
> >>> Thank you for your efforts to get this done.
> >>>
> >>> Fangmin, ZK-3104 is waiting for backport, but ticket has already been
> >>> resolved. Have you created a separate ticket for the backport or shall
> I
> >>> just reopen it with the right fix versions?
> >>>
> >>> Thanks,
> >>> Andor
> >>>
> >>>
> >>>
> >>>> On 2018. Oct 8., at 12:34, Andor Molnar <an...@apache.org> wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> Let me summarize and give a quick update on the outstanding issues for
> >>>> 3.5 GA:
> >>>>
> >>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
> >>>> - ZOOKEEPER-2778 (Potential server deadlock between follower sync with
> >>>>   leader and follower receiving external connection requests.)
> >>>> - ZOOKEEPER-3021 Migrate project structure to Maven (ongoing)
> >>>> - ZOOKEEPER-925 Docs generation to Maven
> >>>> - ZOOKEEPER-3104 (waiting for backport)
> >>>> - ZOOKEEPER-3125 (waiting for backport PR #647)
> >>>>
> >>>> The 2 Maven related tickets are no-brainers as well as the backports.
> >>>> ZK-2778 has been picked up by Maoling (thanks!) as far as I can see,
> >>>> ZK-1818 is the only one waiting for a volunteer.
> >>>>
> >>>> Please correct me if I’ve missed something.
> >>>>
> >>>> Regards,
> >>>> Andor
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>> On 2018. Sep 28., at 18:32, Tamas Penzes <tamaas@cloudera.com.INVALID
> >
> >>> wrote:
> >>>>> Hi All,
> >>>>>
> >>>>> I would add ZOOKEEPER-3021
> >>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-3021> (Migrate project
> >>>>> structure to Maven build) as a blocker too. Since the migration has
> >>>>> started, it would be good to finish it before releasing ZK 3.5.x GA.
> >>>>>
> >>>>> ZOOKEEPER-925 <https://issues.apache.org/jira/browse/ZOOKEEPER-925>
> >>>>> (replace our Forrest site and documentation generation) might also be a
> >>>>> good idea, since then we could deliver the new Markdown-based
> >>>>> documentation.
> >>>>>
> >>>>> Regards, Tamaas
> >>>>>
> >>>>> On Fri, Sep 14, 2018 at 10:09 AM Fangmin Lv <lv...@gmail.com>
> >>> wrote:
> >>>>>> Oh, sorry for the confusion, I should provide more context.
> >>>>>>
> >>>>>> The leader uses on-disk txn sync with a follower if the peer's zxid
> >>>>>> is not in its in-memory commit log; the code is here: Leader on-disk
> >>>>>> txn sync
> >>>>>> <https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/quorum/LearnerHandler.java#L774>.
> >>>>>> There is a bug where gaps can appear in the txn files, for example
> >>>>>> after a snap sync, so it's possible for the peer to miss txns because
> >>>>>> of this.
> >>>>>> The option to disable it is snapshotSizeFactor
> >>>>>> <https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/ZKDatabase.java#L81>;
> >>>>>> setting it to -1 disables this feature. On 3.5, it would be better to
> >>>>>> have a PR that sets this to -1 by default. That might lead to more SNAP
> >>>>>> syncs, but judging from our prod it doesn't seem to be a big problem
> >>>>>> to me.
> >>>>>>
> >>>>>> I can send out the diff to disable it by default on 3.5 if you guys
> >>> think
> >>>>>> this is the right way to do.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Fangmin
> >>>>>>
> >>>>>> On Thu, Sep 13, 2018 at 1:58 AM Andor Molnar <an...@apache.org>
> >>> wrote:
> >>>>>>> What’s needed to turn it off?
> >>>>>>> Do we need a PR or it’s just a config option?
> >>>>>>> Shall we implement a feature switch for that and turn it off by
> >>> default?
> >>>>>>> Sorry I don’t have too much insight on disk txn sync.
> >>>>>>>
> >>>>>>> Andor
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>> On 2018. Sep 13., at 9:16, Fangmin Lv <lv...@gmail.com>
> wrote:
> >>>>>>>>
> >>>>>>>> And to be clear, ZOOKEEPER-2418 is actually just one case of the
> >>>>>>>> inconsistency that can be caused by on-disk txn sync. As I mentioned
> >>>>>>>> in a newer JIRA, ZOOKEEPER-2846
> >>>>>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-2846>, a snap sync
> >>>>>>>> or txn sync could also leave a gap in the txn file, which is a more
> >>>>>>>> common case that could trigger this issue.
> >>>>>>>>
> >>>>>>>> I would suggest turning off on-disk txn sync by default for now to
> >>>>>>>> avoid this issue. After we finish ZOOKEEPER-3114, we can use that to
> >>>>>>>> validate the on-disk txns during syncing.
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Fangmin
> >>>>>>>>
> >>>>>>>> On Wed, Sep 12, 2018 at 9:55 AM Fangmin Lv <lv...@gmail.com>
> >>>>>> wrote:
> >>>>>>>>> Andor,
> >>>>>>>>>
> >>>>>>>>> ZOOKEEPER-3114 is about adding real-time digest checking to help
> >>>>>>>>> detect inconsistency. It's a new feature with a significant amount of
> >>>>>>>>> code change. I'll start upstreaming it part by part, but I don't
> >>>>>>>>> expect it to be merged in the next few weeks. So yes, it's a
> >>>>>>>>> nice-to-have, but definitely not a blocker for 3.5.
> >>>>>>>>> Thanks,
> >>>>>>>>> Fangmin
> >>>>>>>>>
> >>>>>>>>> On Wed, Sep 12, 2018 at 2:55 AM Andor Molnar <an...@apache.org>
> >>>>>> wrote:
> >>>>>>>>>> Fangmin,
> >>>>>>>>>>
> >>>>>>>>>> Sorry, I just noticed that you want to include the consistency fixes
> >>>>>>>>>> in the stable version, which is fine. Let’s finish the backports and
> >>>>>>>>>> we’ll be done with them.
> >>>>>>>>>>
> >>>>>>>>>> ZOOKEEPER-3114 is essentially a new feature; I wouldn’t block 3.5
> >>>>>>>>>> with that. What do you think?
> >>>>>>>>>>
> >>>>>>>>>> Andor
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>> On 2018. Sep 12., at 11:52, Andor Molnar <an...@apache.org>
> >>> wrote:
> >>>>>>>>>>> Cool, thanks for the clarification.
> >>>>>>>>>>>
> >>>>>>>>>>> The updated list is as follows:
> >>>>>>>>>>>
> >>>>>>>>>>> - ZOOKEEPER-236 (SSL/TLS support for Atomic Broadcast protocol)
> >>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
> >>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock between follower sync
> >>>>>>>>>>>   with leader and follower receiving external connection requests.)
> >>>>>>>>>>>
> >>>>>>>>>>> The following are not critical and not blockers for the stable
> >>>>>>>>>>> release:
> >>>>>>>>>>>
> >>>>>>>>>>> Waiting to be ported to 3.5:
> >>>>>>>>>>> - ZOOKEEPER-3104
> >>>>>>>>>>> - ZOOKEEPER-3125
> >>>>>>>>>>> - ZOOKEEPER-3127
> >>>>>>>>>>>
> >>>>>>>>>>> New feature:
> >>>>>>>>>>> - ZOOKEEPER-3114 (fixes ZOOKEEPER-2184 too)
> >>>>>>>>>>>
> >>>>>>>>>>> Regards,
> >>>>>>>>>>> Andor
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>> On 2018. Sep 12., at 0:42, Fangmin Lv <lv...@gmail.com>
> >>> wrote:
> >>>>>>>>>>>> Hi Andor,
> >>>>>>>>>>>>
> >>>>>>>>>>>> That's the on-disk txn feature, which was disabled internally after
> >>>>>>>>>>>> we found the potential inconsistency issue. The only solution we
> >>>>>>>>>>>> have for now is waiting for the new digest checking feature I
> >>>>>>>>>>>> mentioned in ZOOKEEPER-3114.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I think there are some other critical consistency issues we just
> >>>>>>>>>>>> fixed on master recently: ZOOKEEPER-3104, ZOOKEEPER-3125, and
> >>>>>>>>>>>> ZOOKEEPER-3127. I think we should include those in the official 3.5
> >>>>>>>>>>>> release as well.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks,
> >>>>>>>>>>>> Fangmin
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Tue, Sep 11, 2018 at 11:58 AM Andor Molnár <
> andor@apache.org
> >>>>>>>>>> wrote:
> >>>>>>>>>>>>> Hi Jeelani,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks for letting me know. I'm happy to remove it from the
> >>> list
> >>>>>> to
> >>>>>>>>>> get
> >>>>>>>>>>>>> closer to a stable release. :)
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> What's the feature which can be disabled to avoid data
> >>>>>>> inconsistency?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Andor
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On 09/10/2018 11:33 PM, Mohamed Jeelani wrote:
> >>>>>>>>>>>>>> Thanks Andor for compiling this. Should we be ignoring
> >>>>>>>>>>>>>> ZOOKEEPER-2418 as well? This exists in 3.4 as well and the feature
> >>>>>>>>>>>>>> can be disabled. We are working on a longer term fix for it in 3.6.
> >>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Jeelani
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On 9/10/18, 5:19 AM, "Andor Molnar"
> >>> <andor@cloudera.com.INVALID
> >>>>>>>>>> wrote:
> >>>>>>>>>>>>>> Fine.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I'm happy to ignore 1549, 2846 and 2930. Still we have the list of:
> >>>>>>>>>>>>>> - ZOOKEEPER-236 (SSL/TLS support for Atomic Broadcast protocol)
> >>>>>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
> >>>>>>>>>>>>>> - ZOOKEEPER-2418 (txnlog diff sync can skip sending some
> >>>>>>>>>>>>>>   transactions to followers)
> >>>>>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock between follower sync
> >>>>>>>>>>>>>>   with leader and follower receiving external connection requests.)
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> SSL (ZK-236) is a feature which is essential for the 3.5 release,
> >>>>>>>>>>>>>> hence I wouldn't leave it out or postpone it for the next stable
> >>>>>>>>>>>>>> release. The PR has been out for a long time, so please get on
> >>>>>>>>>>>>>> reviewing it.
> >>>>>>>>>>>>>> The rest are also long outstanding issues which have been found in
> >>>>>>>>>>>>>> the 3.5 branch.
> >>>>>>>>>>>>>> ZK-1818 is something which was found in 3.4 and fixed in 3.4, but
> >>>>>>>>>>>>>> has never been fixed in 3.5. Quite a serious issue if still present.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I think we should at least run some manual testing and see if we
> >>>>>>>>>>>>>> could repro any of these issues before going ahead with a stable
> >>>>>>>>>>>>>> release.
> >>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>> Andor
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Fri, Sep 7, 2018 at 3:24 AM, Michael Han <
> hanm@apache.org>
> >>>>>>>>>> wrote:
> >>>>>>>>>>>>>>> I haven't gone through the entire list, but it looks like lots of
> >>>>>>>>>>>>>>> the JIRA issues listed in this thread, such as ZOOKEEPER-1549 and
> >>>>>>>>>>>>>>> 2846, also affect 3.4 releases. Should we scope these issues out?
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I think historically the single outstanding blocking issue for a
> >>>>>>>>>>>>>>> stable 3.5 release is the reconfig feature and security concerns
> >>>>>>>>>>>>>>> around it (somehow addressed in ZOOKEEPER-2014), and the alpha and
> >>>>>>>>>>>>>>> beta releases were created to stabilize that feature.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>
> https://urldefense.proofpoint.com/v2/url?u=http-3A__zookeeper-2Duser.578899.n2.nabble.com_Zookeeper-2Dwith-2D&d=DwIBaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=Vl4oKanLQehvaulUvoKg8A&m=wqlhnot9c-pQLdkGkccSGNpELUNUnB-wy_h0iA3PRqI&s=_tGtL3nMWtuPrXKXDx27AIWOzyyT7W-CjIVLDFZwT0E&e=
> >>>>>>>>>>>>>>> SSL-release-date-tt7581744.html
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> So it looks like we are in good shape to release. Something
> >>>>>> might
> >>>>>>>>>>>>> worth
> >>>>>>>>>>>>>>> doing to claim the quality of 3.5 is on par with 3.4
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> * Run Jepsen on 3.5 - 3.4 passed the test for the record
> >>>>>>>>>>>>>>>
> >>>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__aphyr.com_posts_291-2Djepsen-2Dzookeeper&d=DwIBaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=Vl4oKanLQehvaulUvoKg8A&m=wqlhnot9c-pQLdkGkccSGNpELUNUnB-wy_h0iA3PRqI&s=VjORkX5s7hrJyl8mW9Q4cfeSWF4qfTdyRjcuAiBt0y4&e=
> >>>>>>>>>>>>>>> * Fix all flaky tests on 3.5 - 3.4 has little or no flaky
> >>> tests
> >>>>>> at
> >>>>>>>>>>>>> all.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Tue, Sep 4, 2018 at 1:48 AM, Andor Molnar
> >>>>>>>>>>>>> <an...@cloudera.com.invalid>
> >>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Thanks Maoling! That would be huge help, I appreciate it.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Andor
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>
> >>>
>

Re: ZooKeeper 3.5 blocker issues

Posted by Andor Molnár <an...@apache.org>.
Hi team,


I'm proud to announce that, thanks to the joint effort of the community,
the 3.5 blockers list has become empty:

"project = ZooKeeper AND resolution = Unresolved AND fixVersion = 3.5.5
AND priority in (blocker, critical) ORDER BY priority DESC, key ASC"


Well... almost. All the blocker issues are gone, but we still have the
Maven migration to complete before the stable release. If you have some
free cycles, please join us in testing the Maven build on this PR:

https://github.com/apache/zookeeper/pull/708

I hope we can merge it pretty soon.


In terms of the builds, the weather on the 3.5 branch is quite sunny nowadays:

https://builds.apache.org/view/S-Z/view/ZooKeeper/

The Java 11 build is still having some difficulties, which hopefully I
can address before the holidays:

https://issues.apache.org/jira/browse/ZOOKEEPER-3204


If you happen to know about something which is important from 3.5's
perspective and missing from the above, please don't hesitate to share.


Happy ZooKeeping!

Andor



On 11/2/18 21:12, Fangmin Lv wrote:
> Andor,
>
> Here is the PR to port ZK-3104 from master to 3.4:
> https://github.com/apache/zookeeper/pull/685.
>
> Fangmin
>
> On Fri, Nov 2, 2018 at 11:46 AM Fangmin Lv <lv...@gmail.com> wrote:
>
>> Hi Andor,
>>
>> Is anyone working on ZK-2778? I can pick it up if there is no one working
>> on it yet.
>>
>> I'll open a 3.5 PR for ZK-3104 today.
>>
>> Fangmin
>>
>> On Fri, Oct 26, 2018 at 3:33 AM Andor Molnar <an...@apache.org> wrote:
>>
>>> Hi folks,
>>>
>>> You’ve probably realised lots of update emails coming from Jira. Please
>>> be aware that we’ve updated a bunch of open blocker/critical 3.5 tickets to
>>> reflect to what we discussed in this email.
>>>
>>> If you open up the following jira filter:
>>>
>>> project = ZooKeeper and resolution = Unresolved and fixVersion = 3.5.5
>>> AND priority in (blocker, critical) ORDER BY priority DESC, key ASC
>>>
>>> You’ll see the most up-to-date list of tickets which need to be addressed
>>> before the stable 3.5 release.
>>>
>>> Thank you for your efforts to get this done.
>>>
>>> Fangmin, ZK-3104 is waiting for backport, but ticket has already been
>>> resolved. Have you created a separate ticket for the backport or shall I
>>> just reopen it with the right fix versions?
>>>
>>> Thanks,
>>> Andor
>>>
>>>
>>>
>>>> On 2018. Oct 8., at 12:34, Andor Molnar <an...@apache.org> wrote:
>>>>
>>>> Hi,
>>>>
>>>> Let me summarize and give a quick update on the outstanding issues for
>>> 3.5 GA:
>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
>>>> - ZOOKEEPER-2778 (Potential server deadlock between follower sync with
>>> leader and follower receiving external connection requests.)
>>>> - ZOOKEEPER-3021 Migrate project structure to Maven (ongoing)
>>>> - ZOOKEEPER-925 Docs generation to Maven
>>>> - ZOOKEEPER-3104 (waiting for backport)
>>>> - ZOOKEEPER-3125 (waiting for backport PR #647)
>>>>
>>>> The 2 Maven related tickets are no-brainers as well as the backports.
>>> ZK-2778 has been picked up by Maoling (thanks!) as far as I can see,
>>> ZK-1818 is the only one waiting for a volunteer.
>>>> Please correct me if I’ve missed something.
>>>>
>>>> Regards,
>>>> Andor
>>>>
>>>>
>>>>
>>>>
>>>>> On 2018. Sep 28., at 18:32, Tamas Penzes <ta...@cloudera.com.INVALID>
>>> wrote:
>>>>> Hi All,
>>>>>
>>>>> I would add ZOOKEEPER-3021
>>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-3021> Migrate project
>>>>> structure to Maven build as a blocker too. Since the migration has
>>> started
>>>>> it would be good to finish before releasing ZK 3.5.x GA.
>>>>>
>>>>> ZOOKEEPER-925 <https://issues.apache.org/jira/browse/ZOOKEEPER-925>
>>> replace
>>>>> our forrest site and documentation generation might also be a good
>>> idea,
>>>>> since then we could deliver the new MarkDown based documentation.
>>>>>
>>>>> Regards, Tamaas
>>>>>
>>>>> On Fri, Sep 14, 2018 at 10:09 AM Fangmin Lv <lv...@gmail.com>
>>> wrote:
>>>>>> Oh, sorry for the confusion, I should provide more context.
>>>>>>
>>>>>> The leader uses on-disk txn sync with a follower if the peer's zxid is
>>>>>> not in its in-memory commit log. The code is here: Leader on disk txn sync
>>>>>> <https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/quorum/LearnerHandler.java#L774>.
>>>>>> There is a bug: the txn files can contain gaps, for example after a
>>>>>> snap sync, so the peer may miss txns.
>>>>>>
>>>>>> The option to disable it is snapshotSizeFactor
>>>>>> <https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/ZKDatabase.java#L81>;
>>>>>> setting it to -1 disables this feature. On 3.5, it would be better to
>>>>>> have a PR that sets this to -1 by default. That may cause more SNAP
>>>>>> syncs, but judging from our prod it doesn't seem to be a big problem.
>>>>>>
>>>>>> I can send out the diff to disable it by default on 3.5 if you guys
>>>>>> think this is the right way to go.
>>>>>>
>>>>>> Thanks,
>>>>>> Fangmin
>>>>>>
>>>>>> On Thu, Sep 13, 2018 at 1:58 AM Andor Molnar <an...@apache.org>
>>> wrote:
>>>>>>> What’s needed to turn it off?
>>>>>>> Do we need a PR or it’s just a config option?
>>>>>>> Shall we implement a feature switch for that and turn it off by
>>> default?
>>>>>>> Sorry I don’t have too much insight on disk txn sync.
>>>>>>>
>>>>>>> Andor
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> On 2018. Sep 13., at 9:16, Fangmin Lv <lv...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> And to be clear, ZOOKEEPER-2418 is actually just one case of the
>>>>>>>> inconsistency which can be caused by on-disk txn sync. As I mentioned
>>>>>>>> in a newer JIRA, ZOOKEEPER-2846
>>>>>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-2846>, a snap sync or
>>>>>>>> txn sync can also leave a gap in the txn file, which is a more common
>>>>>>>> case that can trigger this issue.
>>>>>>>>
>>>>>>>> I would suggest turning off the on-disk txn sync by default for now to
>>>>>>>> avoid this issue. After we finish ZOOKEEPER-3114, we can use that to
>>>>>>>> validate the on-disk txns during syncing.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Fangmin
>>>>>>>>
>>>>>>>> On Wed, Sep 12, 2018 at 9:55 AM Fangmin Lv <lv...@gmail.com>
>>>>>> wrote:
>>>>>>>>> Andor,
>>>>>>>>>
>>>>>>>>> ZOOKEEPER-3114 is about adding real-time digest checking to help
>>>>>>>>> detect inconsistency. It's a new feature with a considerable amount of
>>>>>>>>> code change. I'll start upstreaming it part by part, but I don't
>>>>>>>>> expect it to be merged in the next few weeks. So yes, it's a
>>>>>>>>> nice-to-have, but definitely not a blocker for 3.5.
>>>>>>>>> Thanks,
>>>>>>>>> Fangmin
>>>>>>>>>
>>>>>>>>> On Wed, Sep 12, 2018 at 2:55 AM Andor Molnar <an...@apache.org>
>>>>>> wrote:
>>>>>>>>>> Fangmin,
>>>>>>>>>>
>>>>>>>>>> Sorry, I just noticed that you want to include the consistency
>>> fixes
>>>>>> in
>>>>>>>>>> the stable version which is fine. Let’s finish the backports and
>>>>>> we’ll
>>>>>>> be
>>>>>>>>>> done with them.
>>>>>>>>>>
>>>>>>>>>> ZOOKEEPER-3114 is essentially a new feature, I wouldn’t block 3.5
>>>>>> with
>>>>>>>>>> that. What do you think?
>>>>>>>>>>
>>>>>>>>>> Andor
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> On 2018. Sep 12., at 11:52, Andor Molnar <an...@apache.org>
>>> wrote:
>>>>>>>>>>> Cool, thanks for the clarification.
>>>>>>>>>>>
>>>>>>>>>>> The updated list is as follows:
>>>>>>>>>>>
>>>>>>>>>>> - ZOOKEEPER-236 (SSL/TLS support for Atomic Broadcast protocol)
>>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
>>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock between follower sync
>>>>>> with
>>>>>>>>>> leader and follower receiving external connection requests.)
>>>>>>>>>>> The following are not critical and not blockers for the stable release:
>>>>>>>>>>>
>>>>>>>>>>> Waiting to be ported to 3.5:
>>>>>>>>>>> - ZOOKEEPER-3104
>>>>>>>>>>> - ZOOKEEPER-3125
>>>>>>>>>>> - ZOOKEEPER-3127
>>>>>>>>>>>
>>>>>>>>>>> New feature:
>>>>>>>>>>> - ZOOKEEPER-3114 (fixes ZOOKEEPER-2184 too)
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Andor
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> On 2018. Sep 12., at 0:42, Fangmin Lv <lv...@gmail.com>
>>> wrote:
>>>>>>>>>>>> Hi Andor,
>>>>>>>>>>>>
>>>>>>>>>>>> That's the on-disk txn feature, which we disabled internally
>>>>>>>>>>>> after we found the potential inconsistency issue. The only solution
>>>>>>>>>>>> we have for now is waiting for the new digest checking feature I
>>>>>>>>>>>> mentioned in ZOOKEEPER-3114.
>>>>>>>>>>>>
>>>>>>>>>>>> I think there are some other critical consistency issues we just
>>>>>>>>>>>> fixed on master recently: ZOOKEEPER-3104, ZOOKEEPER-3125 and
>>>>>>>>>>>> ZOOKEEPER-3127. I think we should include those in the official 3.5
>>>>>>>>>>>> release as well.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Fangmin
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Sep 11, 2018 at 11:58 AM Andor Molnár <andor@apache.org
>>>>>>>>>> wrote:
>>>>>>>>>>>>> Hi Jeelani,
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for letting me know. I'm happy to remove it from the
>>> list
>>>>>> to
>>>>>>>>>> get
>>>>>>>>>>>>> closer to a stable release. :)
>>>>>>>>>>>>>
>>>>>>>>>>>>> What's the feature which can be disabled to avoid data
>>>>>>> inconsistency?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Andor
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 09/10/2018 11:33 PM, Mohamed Jeelani wrote:
>>>>>>>>>>>>>> Thanks Andor for compiling this. Should we be ignoring
>>>>>>>>>> ZOOKEEPER-2418 as
>>>>>>>>>>>>> well? This exists in 3.4 as well and the feature can be
>>> disabled.
>>>>>> We
>>>>>>>>>> are
>>>>>>>>>>>>> working on a longer term fix for it in 3.6.
>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Jeelani
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 9/10/18, 5:19 AM, "Andor Molnar"
>>> <andor@cloudera.com.INVALID
>>>>>>>>>> wrote:
>>>>>>>>>>>>>> Fine.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm happy to ignore 1549, 2846 and 2930. Still we have the
>>> list
>>>>>>> of:
>>>>>>>>>>>>>> - ZOOKEEPER-236 (SSL/TLS support for Atomic Broadcast
>>> protocol)
>>>>>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
>>>>>>>>>>>>>> - ZOOKEEPER-2418 (txnlog diff sync can skip sending some
>>>>>>>>>>>>> transactions to
>>>>>>>>>>>>>> followers)
>>>>>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock between follower
>>>>>> sync
>>>>>>>>>>>>> with
>>>>>>>>>>>>>> leader and follower receiving external connection requests.)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> SSL (ZK-236) is a feature which is essential for the 3.5 release,
>>>>>>>>>>>>>> hence I wouldn't leave it out or postpone it to the next stable
>>>>>>>>>>>>>> release. The PR has been out for a long time, so please get on
>>>>>>>>>>>>>> with reviewing it.
>>>>>>>>>>>>>> The rest are also long-outstanding issues which have been found
>>>>>>>>>>>>>> in the 3.5 branch.
>>>>>>>>>>>>>> ZK-1818 is something which was found in 3.4 and fixed in 3.4, but
>>>>>>>>>>>>>> has never been fixed in 3.5. Quite a serious issue if still present.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I think we should at least run some manual testing and see if we
>>>>>>>>>>>>>> could repro any of these issues before going ahead with a stable
>>>>>>>>>>>>>> release.
>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>> Andor
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Sep 7, 2018 at 3:24 AM, Michael Han <ha...@apache.org>
>>>>>>>>>> wrote:
>>>>>>>>>>>>>>> I haven't gone through the entire list, but it looks like a lot of
>>>>>>>>>>>>>>> the JIRA issues listed in this thread, such as ZOOKEEPER-1549 and
>>>>>>>>>>>>>>> 2846, also affect 3.4 releases. Should we scope these issues out?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I think historically the single outstanding blocking issue for a
>>>>>>>>>>>>>>> stable 3.5 release has been the reconfig feature and the security
>>>>>>>>>>>>>>> concerns around it (somehow addressed in ZOOKEEPER-2014), and the
>>>>>>>>>>>>>>> alpha and beta releases were created to stabilize that feature.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> http://zookeeper-user.578899.n2.nabble.com/Zookeeper-with-SSL-release-date-tt7581744.html
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> So it looks like we are in good shape to release. Some things
>>>>>>>>>>>>>>> might be worth doing to claim that the quality of 3.5 is on par
>>>>>>>>>>>>>>> with 3.4:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> * Run Jepsen on 3.5 - 3.4 passed the test, for the record:
>>>>>>>>>>>>>>> https://aphyr.com/posts/291-jepsen-zookeeper
>>>>>>>>>>>>>>> * Fix all flaky tests on 3.5 - 3.4 has few or no flaky tests.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Sep 4, 2018 at 1:48 AM, Andor Molnar
>>>>>>>>>>>>> <an...@cloudera.com.invalid>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks Maoling! That would be huge help, I appreciate it.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Andor
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>
>>>

Re: ZooKeeper 3.5 blocker issues

Posted by Fangmin Lv <lv...@gmail.com>.
Andor,

Here is the PR to port ZK-3104 from master to 3.4:
https://github.com/apache/zookeeper/pull/685.

Fangmin

On Fri, Nov 2, 2018 at 11:46 AM Fangmin Lv <lv...@gmail.com> wrote:

> Hi Andor,
>
> Is anyone working on ZK-2778? I can pick it up if there is no one working
> on it yet.
>
> I'll open a 3.5 PR for ZK-3104 today.
>
> Fangmin
>
> On Fri, Oct 26, 2018 at 3:33 AM Andor Molnar <an...@apache.org> wrote:
>
>> Hi folks,
>>
>> You’ve probably realised lots of update emails coming from Jira. Please
>> be aware that we’ve updated a bunch of open blocker/critical 3.5 tickets to
>> reflect to what we discussed in this email.
>>
>> If you open up the following jira filter:
>>
>> project = ZooKeeper and resolution = Unresolved and fixVersion = 3.5.5
>> AND priority in (blocker, critical) ORDER BY priority DESC, key ASC
>>
>> You’ll see the most up-to-date list of tickets which need to be addressed
>> before the stable 3.5 release.
>>
>> Thank you for your efforts to get this done.
>>
>> Fangmin, ZK-3104 is waiting for backport, but ticket has already been
>> resolved. Have you created a separate ticket for the backport or shall I
>> just reopen it with the right fix versions?
>>
>> Thanks,
>> Andor
>>
>>
>>
>> > On 2018. Oct 8., at 12:34, Andor Molnar <an...@apache.org> wrote:
>> >
>> > Hi,
>> >
>> > Let me summarize and give a quick update on the outstanding issues for
>> 3.5 GA:
>> >
>> > - ZOOKEEPER-1818 (Fix don't care for trunk)
>> > - ZOOKEEPER-2778 (Potential server deadlock between follower sync with
>> leader and follower receiving external connection requests.)
>> > - ZOOKEEPER-3021 Migrate project structure to Maven (ongoing)
>> > - ZOOKEEPER-925 Docs generation to Maven
>> > - ZOOKEEPER-3104 (waiting for backport)
>> > - ZOOKEEPER-3125 (waiting for backport PR #647)
>> >
>> > The 2 Maven related tickets are no-brainers as well as the backports.
>> ZK-2778 has been picked up by Maoling (thanks!) as far as I can see,
>> ZK-1818 is the only one waiting for a volunteer.
>> >
>> > Please correct me if I’ve missed something.
>> >
>> > Regards,
>> > Andor
>> >
>> >
>> >
>> >
>> >> On 2018. Sep 28., at 18:32, Tamas Penzes <ta...@cloudera.com.INVALID>
>> wrote:
>> >>
>> >> Hi All,
>> >>
>> >> I would add ZOOKEEPER-3021
>> >> <https://issues.apache.org/jira/browse/ZOOKEEPER-3021> Migrate project
>> >> structure to Maven build as a blocker too. Since the migration has
>> started
>> >> it would be good to finish before releasing ZK 3.5.x GA.
>> >>
>> >> ZOOKEEPER-925 <https://issues.apache.org/jira/browse/ZOOKEEPER-925>
>> replace
>> >> our forrest site and documentation generation might also be a good
>> idea,
>> >> since then we could deliver the new MarkDown based documentation.
>> >>
>> >> Regards, Tamaas
>> >>
>> >> On Fri, Sep 14, 2018 at 10:09 AM Fangmin Lv <lv...@gmail.com>
>> wrote:
>> >>
>> >>> Oh, sorry for the confusion, I should provide more context.
>> >>>
>> >>> The leader uses on-disk txn sync with a follower if the peer's zxid is
>> >>> not in its in-memory commit log. The code is here: Leader on disk txn sync
>> >>> <https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/quorum/LearnerHandler.java#L774>.
>> >>> There is a bug: the txn files can contain gaps, for example after a
>> >>> snap sync, so the peer may miss txns.
>> >>>
>> >>> The option to disable it is snapshotSizeFactor
>> >>> <https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/ZKDatabase.java#L81>;
>> >>> setting it to -1 disables this feature. On 3.5, it would be better to
>> >>> have a PR that sets this to -1 by default. That may cause more SNAP
>> >>> syncs, but judging from our prod it doesn't seem to be a big problem.
>> >>>
>> >>> I can send out the diff to disable it by default on 3.5 if you guys
>> >>> think this is the right way to go.
>> >>>
>> >>> Thanks,
>> >>> Fangmin
>> >>>
>> >>> On Thu, Sep 13, 2018 at 1:58 AM Andor Molnar <an...@apache.org>
>> wrote:
>> >>>
>> >>>> What’s needed to turn it off?
>> >>>> Do we need a PR or it’s just a config option?
>> >>>> Shall we implement a feature switch for that and turn it off by
>> default?
>> >>>>
>> >>>> Sorry I don’t have too much insight on disk txn sync.
>> >>>>
>> >>>> Andor
>> >>>>
>> >>>>
>> >>>>
>> >>>>> On 2018. Sep 13., at 9:16, Fangmin Lv <lv...@gmail.com> wrote:
>> >>>>>
>> >>>>> And to be clear, ZOOKEEPER-2418 is actually just one case of the
>> >>>>> inconsistency which can be caused by on-disk txn sync. As I mentioned
>> >>>>> in a newer JIRA, ZOOKEEPER-2846
>> >>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-2846>, a snap sync or
>> >>>>> txn sync can also leave a gap in the txn file, which is a more common
>> >>>>> case that can trigger this issue.
>> >>>>>
>> >>>>> I would suggest turning off the on-disk txn sync by default for now to
>> >>>>> avoid this issue. After we finish ZOOKEEPER-3114, we can use that to
>> >>>>> validate the on-disk txns during syncing.
>> >>>>>
>> >>>>> Thanks,
>> >>>>> Fangmin
>> >>>>>
>> >>>>> On Wed, Sep 12, 2018 at 9:55 AM Fangmin Lv <lv...@gmail.com>
>> >>> wrote:
>> >>>>>
>> >>>>>> Andor,
>> >>>>>>
>> >>>>>> ZOOKEEPER-3114 is about adding real-time digest checking to help
>> >>>>>> detect inconsistency. It's a new feature with a considerable amount of
>> >>>>>> code change. I'll start upstreaming it part by part, but I don't
>> >>>>>> expect it to be merged in the next few weeks. So yes, it's a
>> >>>>>> nice-to-have, but definitely not a blocker for 3.5.
>> >>>>>>
>> >>>>>> Thanks,
>> >>>>>> Fangmin
>> >>>>>>
>> >>>>>> On Wed, Sep 12, 2018 at 2:55 AM Andor Molnar <an...@apache.org>
>> >>> wrote:
>> >>>>>>
>> >>>>>>> Fangmin,
>> >>>>>>>
>> >>>>>>> Sorry, I just noticed that you want to include the consistency
>> fixes
>> >>> in
>> >>>>>>> the stable version which is fine. Let’s finish the backports and
>> >>> we’ll
>> >>>> be
>> >>>>>>> done with them.
>> >>>>>>>
>> >>>>>>> ZOOKEEPER-3114 is essentially a new feature, I wouldn’t block 3.5
>> >>> with
>> >>>>>>> that. What do you think?
>> >>>>>>>
>> >>>>>>> Andor
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>> On 2018. Sep 12., at 11:52, Andor Molnar <an...@apache.org>
>> wrote:
>> >>>>>>>>
>> >>>>>>>> Cool, thanks for the clarification.
>> >>>>>>>>
>> >>>>>>>> The updated list is as follows:
>> >>>>>>>>
>> >>>>>>>> - ZOOKEEPER-236 (SSL/TLS support for Atomic Broadcast protocol)
>> >>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
>> >>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock between follower sync
>> >>> with
>> >>>>>>> leader and follower receiving external connection requests.)
>> >>>>>>>>
>> >>>>>>>> The following are not critical and not blockers for the stable release:
>> >>>>>>>>
>> >>>>>>>> Waiting to be ported to 3.5:
>> >>>>>>>> - ZOOKEEPER-3104
>> >>>>>>>> - ZOOKEEPER-3125
>> >>>>>>>> - ZOOKEEPER-3127
>> >>>>>>>>
>> >>>>>>>> New feature:
>> >>>>>>>> - ZOOKEEPER-3114 (fixes ZOOKEEPER-2184 too)
>> >>>>>>>>
>> >>>>>>>> Regards,
>> >>>>>>>> Andor
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>> On 2018. Sep 12., at 0:42, Fangmin Lv <lv...@gmail.com>
>> wrote:
>> >>>>>>>>>
>> >>>>>>>>> Hi Andor,
>> >>>>>>>>>
>> >>>>>>>>> That's the on-disk txn feature, which we disabled internally
>> >>>>>>>>> after we found the potential inconsistency issue. The only solution
>> >>>>>>>>> we have for now is waiting for the new digest checking feature I
>> >>>>>>>>> mentioned in ZOOKEEPER-3114.
>> >>>>>>>>>
>> >>>>>>>>> I think there are some other critical consistency issues we just
>> >>>>>>>>> fixed on master recently: ZOOKEEPER-3104, ZOOKEEPER-3125 and
>> >>>>>>>>> ZOOKEEPER-3127. I think we should include those in the official 3.5
>> >>>>>>>>> release as well.
>> >>>>>>>>>
>> >>>>>>>>> Thanks,
>> >>>>>>>>> Fangmin
>> >>>>>>>>>
>> >>>>>>>>> On Tue, Sep 11, 2018 at 11:58 AM Andor Molnár <andor@apache.org
>> >
>> >>>>>>> wrote:
>> >>>>>>>>>
>> >>>>>>>>>> Hi Jeelani,
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> Thanks for letting me know. I'm happy to remove it from the
>> list
>> >>> to
>> >>>>>>> get
>> >>>>>>>>>> closer to a stable release. :)
>> >>>>>>>>>>
>> >>>>>>>>>> What's the feature which can be disabled to avoid data
>> >>>> inconsistency?
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> Andor
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> On 09/10/2018 11:33 PM, Mohamed Jeelani wrote:
>> >>>>>>>>>>> Thanks Andor for compiling this. Should we be ignoring
>> >>>>>>> ZOOKEEPER-2418 as
>> >>>>>>>>>> well? This exists in 3.4 as well and the feature can be
>> disabled.
>> >>> We
>> >>>>>>> are
>> >>>>>>>>>> working on a longer term fix for it in 3.6.
>> >>>>>>>>>>>
>> >>>>>>>>>>> Regards,
>> >>>>>>>>>>>
>> >>>>>>>>>>> Jeelani
>> >>>>>>>>>>>
>> >>>>>>>>>>> On 9/10/18, 5:19 AM, "Andor Molnar"
>> <andor@cloudera.com.INVALID
>> >>>>
>> >>>>>>> wrote:
>> >>>>>>>>>>>
>> >>>>>>>>>>> Fine.
>> >>>>>>>>>>>
>> >>>>>>>>>>> I'm happy to ignore 1549, 2846 and 2930. Still we have the
>> list
>> >>>> of:
>> >>>>>>>>>>>
>> >>>>>>>>>>> - ZOOKEEPER-236 (SSL/TLS support for Atomic Broadcast
>> protocol)
>> >>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
>> >>>>>>>>>>> - ZOOKEEPER-2418 (txnlog diff sync can skip sending some
>> >>>>>>>>>> transactions to
>> >>>>>>>>>>> followers)
>> >>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock between follower
>> >>> sync
>> >>>>>>>>>> with
>> >>>>>>>>>>> leader and follower receiving external connection requests.)
>> >>>>>>>>>>>
>> >>>>>>>>>>> SSL (ZK-236) is a feature which is essential for the 3.5 release,
>> >>>>>>>>>>> hence I wouldn't leave it out or postpone it to the next stable
>> >>>>>>>>>>> release. The PR has been out for a long time, so please get on
>> >>>>>>>>>>> with reviewing it.
>> >>>>>>>>>>> The rest are also long-outstanding issues which have been found
>> >>>>>>>>>>> in the 3.5 branch.
>> >>>>>>>>>>> ZK-1818 is something which was found in 3.4 and fixed in 3.4, but
>> >>>>>>>>>>> has never been fixed in 3.5. Quite a serious issue if still present.
>> >>>>>>>>>>>
>> >>>>>>>>>>> I think we should at least run some manual testing and see if we
>> >>>>>>>>>>> could repro any of these issues before going ahead with a stable
>> >>>>>>>>>>> release.
>> >>>>>>>>>>>
>> >>>>>>>>>>> Regards,
>> >>>>>>>>>>> Andor
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> On Fri, Sep 7, 2018 at 3:24 AM, Michael Han <ha...@apache.org>
>> >>>>>>> wrote:
>> >>>>>>>>>>>
>> >>>>>>>>>>>> I haven't gone through the entire list, but it looks like a lot of
>> >>>>>>>>>>>> the JIRA issues listed in this thread, such as ZOOKEEPER-1549 and
>> >>>>>>>>>>>> 2846, also affect 3.4 releases. Should we scope these issues out?
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> I think historically the single outstanding blocking issue for a
>> >>>>>>>>>>>> stable 3.5 release has been the reconfig feature and the security
>> >>>>>>>>>>>> concerns around it (somehow addressed in ZOOKEEPER-2014), and the
>> >>>>>>>>>>>> alpha and beta releases were created to stabilize that feature.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> http://zookeeper-user.578899.n2.nabble.com/Zookeeper-with-SSL-release-date-tt7581744.html
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> So it looks like we are in good shape to release. Some things
>> >>>>>>>>>>>> might be worth doing to claim that the quality of 3.5 is on par
>> >>>>>>>>>>>> with 3.4:
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> * Run Jepsen on 3.5 - 3.4 passed the test, for the record:
>> >>>>>>>>>>>> https://aphyr.com/posts/291-jepsen-zookeeper
>> >>>>>>>>>>>> * Fix all flaky tests on 3.5 - 3.4 has few or no flaky tests.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> On Tue, Sep 4, 2018 at 1:48 AM, Andor Molnar
>> >>>>>>>>>> <an...@cloudera.com.invalid>
>> >>>>>>>>>>>> wrote:
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>> Thanks Maoling! That would be huge help, I appreciate it.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> Andor
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>
>> >>>>
>> >
>>
>>