Posted to dev@druid.apache.org by Jihoon Son <ji...@apache.org> on 2018/07/09 22:25:44 UTC

Druid 0.12.2-rc1 vote

Hi all,

We have no open issues or PRs for 0.12.2 (
https://github.com/apache/incubator-druid/milestone/27). The 0.12.2 branch
is already available, and all PRs for 0.12.2 have been merged into it.

Let's vote on releasing RC1. Here is my +1.

This is a non-ASF release.

Best,
Jihoon

Re: Druid 0.12.2-rc1 vote

Posted by Jihoon Son <ji...@apache.org>.
Hi guys,

I think we're ready to release 0.12.2.
I'm closing this vote and creating a new one.

Best,
Jihoon

On Wed, Jul 11, 2018 at 1:43 PM Gian Merlino <gi...@apache.org> wrote:

> Well, it's never good if a WTH?! message actually gets logged. They are
> usually meant to be things that should "never" happen. I am ok with holding
> off 0.12.2-rc1 until this fix is in.

Re: Druid 0.12.2-rc1 vote

Posted by Gian Merlino <gi...@apache.org>.
Well, it's never good if a WTH?! message actually gets logged. They are
usually meant to be things that should "never" happen. I am ok with holding
off 0.12.2-rc1 until this fix is in.

On Wed, Jul 11, 2018 at 1:04 PM Jihoon Son <ji...@apache.org> wrote:

> Thanks everyone for voting.
>
> Unfortunately, I found another bug in Kafka indexing service (
> https://github.com/apache/incubator-druid/issues/5992). I think it's worth
> to include 0.12.2.
> I'm currently working on that issue and can probably finish at least by
> this week.
>
> Can we add it to 0.12.2 and vote again once a patch to fix is merged?
>
> Jihoon

Re: Druid 0.12.2-rc1 vote

Posted by Jihoon Son <ji...@apache.org>.
Thanks everyone for voting.

Unfortunately, I found another bug in the Kafka indexing service (
https://github.com/apache/incubator-druid/issues/5992). I think the fix is
worth including in 0.12.2.
I'm currently working on that issue and can probably finish it by the end of
this week.

Can we add it to 0.12.2 and vote again once a patch for it is merged?

Jihoon

On Wed, Jul 11, 2018 at 10:02 AM Jonathan Wei <jo...@apache.org> wrote:

> +1

Re: Druid 0.12.2-rc1 vote

Posted by Jonathan Wei <jo...@apache.org>.
+1

On Wed, Jul 11, 2018 at 9:44 AM, Gian Merlino <gi...@apache.org> wrote:

> +1 from me too!

Re: Druid 0.12.2-rc1 vote

Posted by Gian Merlino <gi...@apache.org>.
+1 from me too!

On Wed, Jul 11, 2018 at 7:28 AM Charles Allen <cr...@apache.org> wrote:

> That is very helpful, thank you!
>
> +1 for continuing with 0.12.2-RC1
>
> On Tue, Jul 10, 2018 at 6:51 PM Clint Wylie <cl...@imply.io> wrote:
>
> > Heya, sorry for the delay (and missing the sync, i'll try to get better
> > about showing up). I've fixed a handful of coordinator bugs post 0.12.0
> > (and not backported to 0.12.1), some of these issues go far back, some
> > back to when segment assignment priority for different tiers of
> > historicals was introduced, some are just some oddities on the behavior
> > of the balancer that I am unsure when were introduced. This is the
> > complete list of fixes that are currently in 0.12.2 afaik, with a small
> > description (see PRs and associated issues for more details)
> >
> > https://github.com/apache/incubator-druid/pull/5528 fixed an issue that
> > movement did not drop the segment from the server the segment was being
> > moved from (this one goes waaaay back, to batch segment announcements)
> >
> > https://github.com/apache/incubator-druid/pull/5529 changed behavior of
> > drop to use the balancer to choose where to drop segments from, based on
> > behavior observed caused by the issue of 5528
> >
> > https://github.com/apache/incubator-druid/pull/5532 fixes an issue where
> > primary assignment during load rule processing would assign an
> > unavailable segment to every server with capacity until at least 1
> > historical had the segment (and drop it from all the others if they all
> > loaded at the same time), choking load queues from doing useful things
> >
> > https://github.com/apache/incubator-druid/pull/5555 fixed a way for http
> > based coordinator to get stuck loading or dropping segments and a
> > companion PR that fixed a lambda that wasn't friendly to older jvm
> > versions https://github.com/apache/incubator-druid/pull/5591
> >
> > https://github.com/apache/incubator-druid/pull/5888 makes balancing
> > honor a load rule max load queue depth setting to help prevent movement
> > from starving loading
> >
> > https://github.com/apache/incubator-druid/pull/5928 doesn't really fix
> > anything, just does an early return to avoid doing pointless work
> >
> > Additionally, there are a couple of pairs of PRs that are not currently
> > in 0.12.2: https://github.com/druid-io/druid/pull/5927 and
> > https://github.com/apache/incubator-druid/pull/5929 and their respective
> > fixes which have yet to be merged, but have been performing well on our
> > test cluster, https://github.com/apache/incubator-druid/pull/5987 and
> > https://github.com/apache/incubator-druid/pull/5988. One of them makes
> > balancing behave in a way more consistent with expectations by always
> > trying to move maxSegmentsToMove and more correctly tracking what the
> > balancer is doing, and one just adds better logging (without much extra
> > log volume) due to frustrations I had chasing down all these other
> > issues. Both of these were slated for 0.12.2 but were pulled out because
> > of the issues (which the open PRs fix afaict). I would be in favor of
> > sliding them in there, pending review of the fixes, but understand if
> > they won't make the cut since they maybe fall a bit more on the cosmetic
> > side of things. I'm pretty happy of the state of things on our test
> > cluster right now, but without these 4 patches things should still be
> > operating more correctly than they were before, just the differences
> > being with balancing moving somewhere between 0 and max, and less useful
> > logging making future issues (which I have no doubts still lurk) harder
> > to diagnose.
> >
> > Cheers,
> > Clint
> >
> > On Tue, Jul 10, 2018 at 10:30 AM, Charles Allen <cr...@apache.org>
> > wrote:
> >
> > > Brought this up in the dev sync:
> > >
> > > I saw a lot of PRs and fixes for Coordinator segment balancing related
> > > to some regressions that happened in 0.12.x. Is anyone able to give a
> > > rundown of the state of coordinator segment management for the 0.12.2
> > > RC?
> > >
> > > On Tue, Jul 10, 2018 at 10:26 AM Nishant Bangarwa <
> > > nbangarwa@hortonworks.com>
> > > wrote:
> > >
> > > > +1
> > > >
> > > > --
> > > > Nishant Bangarwa
> > > >
> > > > Hortonworks
> > > >
> > > > On 7/10/18, 3:57 AM, "Jihoon Son" <ji...@apache.org> wrote:
> > > >
> > > >     Related thread:
> > > >
> > > > > https://lists.apache.org/thread.html/76755aecfddb1210fcc3f08b1d4631784a8a5eede64d22718c271841@%3Cdev.druid.apache.org%3E
> > > >     .
> > > >
> > > >     Jihoon
> > > >
> > > > >     On Mon, Jul 9, 2018 at 3:25 PM Jihoon Son <ji...@apache.org> wrote:
> > > > >
> > > > >     > Hi all,
> > > > >     >
> > > > >     > We have no open issues and PRs for 0.12.2 (
> > > > >     > https://github.com/apache/incubator-druid/milestone/27). The 0.12.2
> > > > >     > branch is already available and all PRs for 0.12.2 have merged into that
> > > > >     > branch.
> > > >     >
> > > >     > Let's vote on releasing RC1. Here is my +1.
> > > >     >
> > > >     > This is a non-ASF release.
> > > >     >
> > > >     > Best,
> > > >     > Jihoon
> > > >     >
> > > >
> > > >
> > > >
> > >
> >
>

Re: Druid 0.12.2-rc1 vote

Posted by Charles Allen <cr...@apache.org>.
That is very helpful, thank you!

+1 for continuing with 0.12.2-RC1

On Tue, Jul 10, 2018 at 6:51 PM Clint Wylie <cl...@imply.io> wrote:

> Heya, sorry for the delay (and missing the sync, i'll try to get better
> about showing up). I've fixed a handful of coordinator bugs post 0.12.0 (and
> not backported to 0.12.1), some of these issues go far back, some back to
> when segment assignment priority for different tiers of historicals was
> introduced, some are just some oddities on the behavior of the balancer
> that I am unsure when were introduced. This is the complete list of fixes
> that are currently in 0.12.2 afaik, with a small description (see PRs and
> associated issues for more details)
>
> https://github.com/apache/incubator-druid/pull/5528 fixed an issue that
> movement did not drop the segment from the server the segment was being
> moved from (this one goes waaaay back, to batch segment announcements)
>
> https://github.com/apache/incubator-druid/pull/5529 changed behavior of
> drop to use the balancer to choose where to drop segments from, based on
> behavior observed caused by the issue of 5528
>
> https://github.com/apache/incubator-druid/pull/5532 fixes an issue where
> primary assignment during load rule processing would assign an unavailable
> segment to every server with capacity until at least 1 historical had the
> segment (and drop it from all the others if they all loaded at the same
> time), choking load queues from doing useful things
>
> https://github.com/apache/incubator-druid/pull/5555 fixed a way for http
> based coordinator to get stuck loading or dropping segments and a companion
> PR that fixed a lambda that wasn't friendly to older jvm versions
> https://github.com/apache/incubator-druid/pull/5591
>
> https://github.com/apache/incubator-druid/pull/5888 makes balancing honor a
> load rule max load queue depth setting to help prevent movement from
> starving loading
>
> https://github.com/apache/incubator-druid/pull/5928 doesn't really fix
> anything, just does an early return to avoid doing pointless work
>
> Additionally, there are a couple of pairs of PRs that are not currently in
> 0.12.2: https://github.com/druid-io/druid/pull/5927 and
> https://github.com/apache/incubator-druid/pull/5929 and their respective
> fixes which have yet to be merged, but have been performing well on our
> test cluster, https://github.com/apache/incubator-druid/pull/5987 and
> https://github.com/apache/incubator-druid/pull/5988. One of them makes
> balancing behave in a way more consistent with expectations by always
> trying to move maxSegmentsToMove and more correctly tracking what the
> balancer is doing, and one just adds better logging (without much extra log
> volume) due to frustrations I had chasing down all these other issues. Both
> of these were slated for 0.12.2 but were pulled out because of the issues
> (which the open PRs fix afaict). I would be in favor of sliding them in
> there, pending review of the fixes, but understand if they won't make the
> cut since they maybe fall a bit more on the cosmetic side of things. I'm
> pretty happy of the state of things on our test cluster right now, but
> without these 4 patches things should still be operating more correctly
> than they were before, just the differences being with balancing moving
> somewhere between 0 and max, and less useful logging making future issues
> (which I have no doubts still lurk) harder to diagnose.
>
> Cheers,
> Clint
>
> On Tue, Jul 10, 2018 at 10:30 AM, Charles Allen <cr...@apache.org>
> wrote:
>
> > Brought this up in the dev sync:
> >
> > I saw a lot of PRs and fixes for Coordinator segment balancing related to
> > some regressions that happened in 0.12.x . Is anyone able to give a rundown
> > of the state of coordinator segment management for the 0.12.2 RC?
> >
> > On Tue, Jul 10, 2018 at 10:26 AM Nishant Bangarwa <nbangarwa@hortonworks.com> wrote:
> >
> > > +1
> > >
> > > --
> > > Nishant Bangarwa
> > >
> > > Hortonworks
> > >
> > > On 7/10/18, 3:57 AM, "Jihoon Son" <ji...@apache.org> wrote:
> > >
> > >     Related thread:
> > >
> > > https://lists.apache.org/thread.html/76755aecfddb1210fcc3f08b1d4631784a8a5eede64d22718c271841@%3Cdev.druid.apache.org%3E
> > >     .
> > >
> > >     Jihoon
> > >
> > >     On Mon, Jul 9, 2018 at 3:25 PM Jihoon Son <ji...@apache.org> wrote:
> > >
> > >     > Hi all,
> > >     >
> > >     > We have no open issues and PRs for 0.12.2 (
> > >     > https://github.com/apache/incubator-druid/milestone/27). The 0.12.2
> > >     > branch is already available and all PRs for 0.12.2 have merged into that
> > >     > branch.
> > >     >
> > >     > Let's vote on releasing RC1. Here is my +1.
> > >     >
> > >     > This is a non-ASF release.
> > >     >
> > >     > Best,
> > >     > Jihoon
> > >     >
> > >
> > >
> > >
> >
>

Re: Druid 0.12.2-rc1 vote

Posted by Clint Wylie <cl...@imply.io>.
Heya, sorry for the delay (and for missing the sync; I'll try to get better
about showing up). I've fixed a handful of coordinator bugs post 0.12.0 (not
backported to 0.12.1). Some of these issues go far back, some to when segment
assignment priority for different tiers of historicals was introduced, and
some are oddities in the behavior of the balancer whose origin I'm unsure
of. This is the complete list of fixes that are currently in 0.12.2 afaik,
each with a short description (see the PRs and associated issues for more
details):

https://github.com/apache/incubator-druid/pull/5528 fixed an issue where
movement did not drop the segment from the server it was being moved from
(this one goes waaaay back, to batch segment announcements)

https://github.com/apache/incubator-druid/pull/5529 changed drop behavior
to use the balancer to choose where to drop segments from, based on
behavior observed as a result of the issue fixed in 5528

https://github.com/apache/incubator-druid/pull/5532 fixed an issue where
primary assignment during load rule processing would assign an unavailable
segment to every server with capacity until at least 1 historical had the
segment (and drop it from all the others if they all loaded at the same
time), which kept load queues from doing useful work

https://github.com/apache/incubator-druid/pull/5555 fixed a way for the
HTTP-based coordinator to get stuck loading or dropping segments; a companion
PR, https://github.com/apache/incubator-druid/pull/5591, fixed a lambda that
wasn't friendly to older JVM versions

https://github.com/apache/incubator-druid/pull/5888 makes balancing honor a
load rule max load queue depth setting to help prevent movement from
starving loading

https://github.com/apache/incubator-druid/pull/5928 doesn't really fix
anything, just does an early return to avoid doing pointless work
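
To make the primary-assignment issue described for pull/5532 above a bit more concrete, here is a toy Python sketch. It is not Druid code: the function names, the server names, and the simplification that every server has capacity and nothing finishes loading during the simulation are all assumptions for illustration, and the actual fix may work differently.

```python
from typing import Callable, List, Set

# Hypothetical signature for one coordinator run: (servers, queued, loaded) -> queued
Step = Callable[[List[str], Set[str], Set[str]], Set[str]]


def queue_primary_naive(servers: List[str], queued: Set[str], loaded: Set[str]) -> Set[str]:
    """One run that ignores in-flight loads: while no server has finished
    loading the segment, it queues the segment on one more server."""
    if loaded:
        return queued
    for server in servers:
        if server not in queued:
            return queued | {server}
    return queued


def queue_primary_tracking(servers: List[str], queued: Set[str], loaded: Set[str]) -> Set[str]:
    """One run that treats an already queued load as satisfying the primary."""
    if loaded or queued:
        return queued
    return {servers[0]} if servers else queued


def simulate(step: Step, servers: List[str], runs: int) -> Set[str]:
    """Repeat coordinator runs while no load completes; return who has the segment queued."""
    queued: Set[str] = set()
    loaded: Set[str] = set()  # assumption: nothing finishes loading in this window
    for _ in range(runs):
        queued = step(servers, queued, loaded)
    return queued


if __name__ == "__main__":
    historicals = ["h1", "h2", "h3"]
    print(sorted(simulate(queue_primary_naive, historicals, 3)))
    print(sorted(simulate(queue_primary_tracking, historicals, 3)))
```

In this toy model the naive variant ends up with the same segment sitting in every load queue, while the tracking variant queues it once and leaves the other queues free.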

Additionally, there are two pairs of PRs that are not currently in 0.12.2:
https://github.com/druid-io/druid/pull/5927 and
https://github.com/apache/incubator-druid/pull/5929, plus their respective
fixes, https://github.com/apache/incubator-druid/pull/5987 and
https://github.com/apache/incubator-druid/pull/5988, which have yet to be
merged but have been performing well on our test cluster. Of the first two,
one makes balancing behave in a way more consistent with expectations by
always trying to move maxSegmentsToMove and more correctly tracking what the
balancer is doing, and the other just adds better logging (without much extra
log volume) due to frustrations I had chasing down all these other issues.
Both of these were slated for 0.12.2 but were pulled out because of the issues
(which the open PRs fix afaict). I would be in favor of sliding them in
there, pending review of the fixes, but understand if they won't make the
cut since they maybe fall a bit more on the cosmetic side of things. I'm
pretty happy with the state of things on our test cluster right now, but
even without these 4 patches things should still be operating more correctly
than they were before. The only differences would be balancing moving
somewhere between 0 and max, and less useful logging making future issues
(which I have no doubt still lurk) harder to diagnose.
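
As a rough illustration of the "somewhere between 0 and max" versus "always trying to move maxSegmentsToMove" difference, here is a minimal Python sketch. It is only a model of the idea, not the balancer's actual code: the function names, the `try_move` callback, and the segment names are hypothetical.

```python
from typing import Callable, Sequence


def balance_fixed_attempts(
    candidates: Sequence[str],
    try_move: Callable[[str], bool],
    max_segments_to_move: int,
) -> int:
    """Make at most max_segments_to_move attempts; a failed attempt still
    uses up one slot, so the number of moves lands between 0 and max."""
    moved = 0
    for segment in candidates[:max_segments_to_move]:
        if try_move(segment):
            moved += 1
    return moved


def balance_until_budget(
    candidates: Sequence[str],
    try_move: Callable[[str], bool],
    max_segments_to_move: int,
) -> int:
    """Keep trying candidates until max_segments_to_move moves have succeeded
    or the candidates run out."""
    moved = 0
    for segment in candidates:
        if moved >= max_segments_to_move:
            break
        if try_move(segment):
            moved += 1
    return moved


if __name__ == "__main__":
    segments = ["s1", "s2", "s3", "s4", "s5", "s6"]
    movable = {"s2", "s4", "s5", "s6"}  # hypothetical: moves that would succeed
    print(balance_fixed_attempts(segments, lambda s: s in movable, 3))
    print(balance_until_budget(segments, lambda s: s in movable, 3))
```

With the hypothetical inputs above, the first variant spends two of its three attempts on segments that cannot move, while the second keeps going until the budget of three moves is used.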

Cheers,
Clint

On Tue, Jul 10, 2018 at 10:30 AM, Charles Allen <cr...@apache.org> wrote:

> Brought this up in the dev sync:
>
> I saw a lot of PRs and fixes for Coordinator segment balancing related to
> some regressions that happened in 0.12.x . Is anyone able to give a rundown
> of the state of coordinator segment management for the 0.12.2 RC?
>
> On Tue, Jul 10, 2018 at 10:26 AM Nishant Bangarwa <nbangarwa@hortonworks.com> wrote:
>
> > +1
> >
> > --
> > Nishant Bangarwa
> >
> > Hortonworks
> >
> > On 7/10/18, 3:57 AM, "Jihoon Son" <ji...@apache.org> wrote:
> >
> >     Related thread:
> >
> > https://lists.apache.org/thread.html/76755aecfddb1210fcc3f08b1d4631784a8a5eede64d22718c271841@%3Cdev.druid.apache.org%3E
> >     .
> >
> >     Jihoon
> >
> >     On Mon, Jul 9, 2018 at 3:25 PM Jihoon Son <ji...@apache.org> wrote:
> >
> >     > Hi all,
> >     >
> >     > We have no open issues and PRs for 0.12.2 (
> >     > https://github.com/apache/incubator-druid/milestone/27). The 0.12.2
> >     > branch is already available and all PRs for 0.12.2 have merged into that
> >     > branch.
> >     >
> >     > Let's vote on releasing RC1. Here is my +1.
> >     >
> >     > This is a non-ASF release.
> >     >
> >     > Best,
> >     > Jihoon
> >     >
> >
> >
> >
>

Re: Druid 0.12.2-rc1 vote

Posted by Charles Allen <cr...@apache.org>.
Brought this up in the dev sync:

I saw a lot of PRs and fixes for Coordinator segment balancing related to
some regressions that happened in 0.12.x. Is anyone able to give a rundown
of the state of coordinator segment management for the 0.12.2 RC?

On Tue, Jul 10, 2018 at 10:26 AM Nishant Bangarwa <nb...@hortonworks.com>
wrote:

> +1
>
> --
> Nishant Bangarwa
>
> Hortonworks
>
> On 7/10/18, 3:57 AM, "Jihoon Son" <ji...@apache.org> wrote:
>
>     Related thread:
>
> https://lists.apache.org/thread.html/76755aecfddb1210fcc3f08b1d4631784a8a5eede64d22718c271841@%3Cdev.druid.apache.org%3E
>     .
>
>     Jihoon
>
>     On Mon, Jul 9, 2018 at 3:25 PM Jihoon Son <ji...@apache.org> wrote:
>
>     > Hi all,
>     >
>     > We have no open issues and PRs for 0.12.2 (
>     > https://github.com/apache/incubator-druid/milestone/27). The 0.12.2
>     > branch is already available and all PRs for 0.12.2 have merged into that
>     > branch.
>     >
>     > Let's vote on releasing RC1. Here is my +1.
>     >
>     > This is a non-ASF release.
>     >
>     > Best,
>     > Jihoon
>     >
>
>
>

Re: Druid 0.12.2-rc1 vote

Posted by Nishant Bangarwa <nb...@hortonworks.com>.
+1

--
Nishant Bangarwa

Hortonworks

On 7/10/18, 3:57 AM, "Jihoon Son" <ji...@apache.org> wrote:

    Related thread:
    https://lists.apache.org/thread.html/76755aecfddb1210fcc3f08b1d4631784a8a5eede64d22718c271841@%3Cdev.druid.apache.org%3E
    .
    
    Jihoon
    
    On Mon, Jul 9, 2018 at 3:25 PM Jihoon Son <ji...@apache.org> wrote:
    
    > Hi all,
    >
    > We have no open issues and PRs for 0.12.2 (
    > https://github.com/apache/incubator-druid/milestone/27). The 0.12.2
    > branch is already available and all PRs for 0.12.2 have merged into that
    > branch.
    >
    > Let's vote on releasing RC1. Here is my +1.
    >
    > This is a non-ASF release.
    >
    > Best,
    > Jihoon
    >
    


Re: Druid 0.12.2-rc1 vote

Posted by Jihoon Son <ji...@apache.org>.
Related thread:
https://lists.apache.org/thread.html/76755aecfddb1210fcc3f08b1d4631784a8a5eede64d22718c271841@%3Cdev.druid.apache.org%3E
.

Jihoon

On Mon, Jul 9, 2018 at 3:25 PM Jihoon Son <ji...@apache.org> wrote:

> Hi all,
>
> We have no open issues and PRs for 0.12.2 (
> https://github.com/apache/incubator-druid/milestone/27). The 0.12.2
> branch is already available and all PRs for 0.12.2 have merged into that
> branch.
>
> Let's vote on releasing RC1. Here is my +1.
>
> This is a non-ASF release.
>
> Best,
> Jihoon
>