You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@impala.apache.org by Quanlong Huang <hu...@gmail.com> on 2019/02/01 05:00:48 UTC

Re: Move forward branch-2.x

Sure. I think "fine-grained privileges" always introduce small changes in
behaviors, i.e. unprivileged users used to be able to do something but they
can't do so after an upgrade. We accept it since it's reasonable.

There're incompatible changes too in the previous releases:
https://www.cloudera.com/documentation/enterprise/release-notes/topics/impala_incompatible_changes.html.
What we need is to document it well :)

I just moved forward and start GVO job for the patch. Thanks!



On Thu, Jan 31, 2019 at 2:00 AM Bharath Vissapragada <bh...@cloudera.com>
wrote:

> On Wed, Jan 30, 2019 at 12:21 AM Quanlong Huang <hu...@gmail.com>
> wrote:
>
> > I'm afraid the difference between branch-2.x and Cloudera's branch is
> > larger than the difference between branch-2.x and master branch.
> Cloudera's
> > branch already ignored lots of commits, which causes the gap. I've tried
> > cherry-pick from master or Cloudera's branch and found it's much easier
> to
> > pick from master branch.
> > If https://gerrit.cloudera.org/c/12292/ is merged, I can easily pick 40+
> > commits into branch-2.x with few conflicts to resolve!! See the first
> > column in the sheet:
> >
> >
> https://docs.google.com/spreadsheets/d/12h1rTAPS1gm0vhlDGxeOXjnRD7rrOcoqzX4rjRRCyBg
> > The result is here:
> > https://github.com/stiga-huang/incubator-impala/tree/future-2.x
>
>
> Sure whatever works best for you. You are right that the Cloudera branch
> was selective in cherry-picking stuff. We mostly focussed on "fetch
> on-demand metadata" changes and "finer grained privileges" and ignored the
> rest. If that is what you are looking for, it is easier to cherry-pick from
> Cloudera's branch. Otherwise probably better to replay commits from the
> master branch.
>
>
> >
> > We can restart the cherrypick-2.x-and-test Jenkins job. Each time
> > there're conflicts, I'll come to resolve it. If the job keeps running,
> it's
> > possible for branch-2.x to catch up the master branch!
> >
> > Besides, https://gerrit.cloudera.org/c/12292/ is about DESCRIBE behavior
> > in
> > FGP(Fine-grained privileges). I think it's reasonable and not
> > compatibility breaking. Does anyone have more thoughts about this?
> >
>
> Hmm, I see what you are saying.  Definitely helps to include it to make
> future cherry-picks easier.
>
> Technically it is still a behavioral change, especially if someone upgrades
> to a version with this fix and we typically try to avoid that (describe
> that worked before doesn't work after upgrade). I can't speak about why we
> included it in the Cloudera branch since that was an internal decision but
> I don't know if we have any policies here around backporting such stuff
> into older branches. Maybe good to know what others think.
>
> Anyway, I don't feel too strongly about this and since it is blocking your
> work, I removed my -1 on the code review but this is something to keep in
> mind when backporting such patches.
>
>
> >
> > On Wed, Jan 30, 2019 at 9:41 AM Fredy Wijaya <fw...@cloudera.com>
> wrote:
> >
> > > Due to the way we build 2.x where it can't use the pinned versions of
> CDH
> > > dependencies, it may be better to cherry-pick all commits in
> > > https://github.com/cloudera/Impala/tree/cdh5-2.12.0_5.16.0, which also
> > > includes that DESCRIBE commit to avoid further integration issues later
> > on.
> > >
> > > On Tue, Jan 29, 2019 at 5:05 PM Quanlong Huang <
> huangquanlong@gmail.com>
> > > wrote:
> > >
> > > > Hi Bharath,
> > > >
> > > > Thank you a lot for your notice! However, I've gone through the
> commits
> > > of
> > > > cdh branch before and found that this patch is also picked:
> > > >
> > > >
> > >
> >
> https://github.com/cloudera/Impala/commit/f8a318d4f75e22a963b9cf4786ef07d2cd6bd93c
> > > > .
> > > > Is this really a compatibility breaking change?
> > > >
> > > > I'm also concern that the TestDescribeTableResults it introduced is
> too
> > > > strictly that may cause troubles. However, I found two later commits
> > > > (IMPALA-7143 and IMPALA-7144) would fix this. I'm going to
> cherry-pick
> > > > these two and IMPALA-7676 (thanks Fredy's advise too!) right after
> > > > https://gerrit.cloudera.org/c/12292/ is merged.
> > > >
> > > > Please let me know if this will go astray. Thanks!
> > > >
> > > >
> > > > On Wed, Jan 30, 2019 at 12:36 AM Bharath Vissapragada <
> > > > bharathv@cloudera.com>
> > > > wrote:
> > > >
> > > > > On Mon, Jan 28, 2019 at 11:36 PM Quanlong Huang <
> > > huangquanlong@gmail.com
> > > > >
> > > > > wrote:
> > > > >
> > > > > > >>For the first patch, "0b540b025 IMPALA-7128 (part 1) Refactor
> > > > > interfaces
> > > > > > for Db, View, Table, Partition", the cherry-pick conflicts is due
> > to
> > > > the
> > > > > > revert of IMPALA-6479 in 2.x. I'm testing branch-2.x with
> > IMPALA-6479
> > > > > being
> > > > > > picked back. Does anyone know why we revert it? (I also comment
> in
> > > the
> > > > > > JIRA).
> > > > > > >
> > > > > > >There are test failures. I guess it's the reason. Hopefully,
> > > > > > cdh-5.16.1-release already picked up this patch, which provides
> > some
> > > > > > pointers :)
> > > > > >
> > > > > > I fix the test failures and create a review at
> > > > > > https://gerrit.cloudera.org/c/12292/
> > > > > > Waiting for Jenkins maintenance to finish and then run a GVO.
> Hopes
> > > > > someone
> > > > > > can join and have a look!
> > > > > >
> > > > > > On Tue, Jan 29, 2019 at 7:39 AM Quanlong Huang <
> > > > huangquanlong@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > >For the first patch, "0b540b025 IMPALA-7128 (part 1) Refactor
> > > > > interfaces
> > > > > > > for Db, View, Table, Partition", the cherry-pick conflicts is
> due
> > > to
> > > > > the
> > > > > > > revert of IMPALA-6479 in 2.x. I'm testing branch-2.x with
> > > IMPALA-6479
> > > > > > being
> > > > > > > picked back. Does anyone know why we revert it? (I also comment
> > in
> > > > the
> > > > > > > JIRA).
> > > > > >
> > > > >
> > > > > It was reverted because it is a compatibility breaking change. We
> > > > typically
> > > > > try not to introduce such behavioral changes in the same major
> > version
> > > > line
> > > > > as that can cause upgrade issues.
> > > > >
> > > > >
> > > > > > >
> > > > > > > There are test failures. I guess it's the reason. Hopefully,
> > > > > > > cdh-5.16.1-release already picked up this patch, which provides
> > > some
> > > > > > > pointers :)
> > > > > >
> > > > >
> > > > > I work at Cloudera and we've gone through this exercise before. It
> is
> > > > > annoying to resolve the conflicts, so you can reuse our work and
> save
> > > > some
> > > > > time.
> > > > > https://github.com/cloudera/Impala/tree/cdh5-2.12.0_5.16.0
> > > > >
> > > > >
> > > > > > >
> > > > > > > On Mon, Jan 28, 2019 at 10:51 PM Quanlong Huang <
> > > > > huangquanlong@gmail.com
> > > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > >> Yes, there are two discussion threads before that are relative
> > to
> > > > > this.
> > > > > > >> One for stopping the cherrypick-2.x-and-test jenkins job:
> > > > > > >>
> > > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> >
> https://lists.apache.org/thread.html/2b4b62d4c07661b27a5203618cb0425a429f6460f2eb505acbcd26c6@%3Cdev.impala.apache.org%3E
> > > > > > >>
> > > > > > >> The other for removing support for hadoop 2 in master branch:
> > > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> >
> https://lists.apache.org/thread.html/49f9b68ed3d6d2c0fdee16a877b259922545e4824e1233479227a657@%3Cdev.impala.apache.org%3E
> > > > > > >>
> > > > > > >> I'm +1 with the second thread that we only support Hadoop 2 in
> > > > > > branch-2.x
> > > > > > >> and support Hadoop 3 in the master branch to be more focused.
> > I'm
> > > > also
> > > > > > >> agree with Paul's concern. It's such a dilemma that if we skip
> > > some
> > > > > > >> commits, things will be harder and harder as we moving
> forward;
> > if
> > > > we
> > > > > > >> cherry-pick, review, and test the commits one by one,
> branch-2.x
> > > > will
> > > > > > never
> > > > > > >> catch up the master branch, which is an obstacle if someone
> > (like
> > > > me)
> > > > > > wants
> > > > > > >> to backport his/her new patch to branch-2.x but waits too long
> > and
> > > > > > finally
> > > > > > >> fogets details of the patch.
> > > > > > >>
> > > > > > >> I roughly investigated how other systems deal with multiple
> > > > branches.
> > > > > > The
> > > > > > >> efforts to backport a patch could be the same for the original
> > > > patch.
> > > > > > It's
> > > > > > >> not a easy go, so the Hive community declares that
> > > > > > >> "The decision to port a feature from master to branch-1 is at
> > the
> > > > > > >> discretion of the contributor and committer. However no
> features
> > > > that
> > > > > > break
> > > > > > >> backwards compatibility will be accepted on branch-1."
> > > > > > >>
> > > > > > >> I think it's a chance to understand more parts of Impala by
> > > learning
> > > > > and
> > > > > > >> backporting the patches, since they have execellent commit
> > > messages
> > > > > and
> > > > > > >> were strictly reviewed. So I volunteer for the job to move
> > forward
> > > > the
> > > > > > >> branch-2.x. Hopes patch authors could give some pointers when
> > I'm
> > > > > > blocked!
> > > > > > >> I'll try approach (b) first and switch to (a) when (b) becomes
> > > > > > impossible
> > > > > > >> after too many commits are skipped. I'll confirm with the
> author
> > > if
> > > > I
> > > > > > think
> > > > > > >> a patch should be skipped.
> > > > > > >>
> > > > > > >> For the first patch, "0b540b025 IMPALA-7128 (part 1) Refactor
> > > > > interfaces
> > > > > > >> for Db, View, Table, Partition", the cherry-pick conflicts is
> > due
> > > to
> > > > > the
> > > > > > >> revert of IMPALA-6479 in 2.x. I'm testing branch-2.x with
> > > > IMPALA-6479
> > > > > > being
> > > > > > >> picked back. Does anyone know why we revert it? (I also
> comment
> > in
> > > > the
> > > > > > >> JIRA).
> > > > > > >>
> > > > > > >> On Mon, Jan 28, 2019 at 12:43 PM Philip Zeyliger <
> > > > philip@cloudera.com
> > > > > >
> > > > > > >> wrote:
> > > > > > >>
> > > > > > >>> As for Quanlong's question, I think the answer is however the
> > > folks
> > > > > who
> > > > > > >>> want to do the work prefer to do it. As you noticed in the
> CDH
> > > > > > >>> changelists,
> > > > > > >>> Cloudera's distribution has opted for something more like
> > > approach
> > > > > (a),
> > > > > > >>> choosing to backport individual features. For a while, we
> were
> > > > doing
> > > > > > >>> automation for cherry-picking things automatically, and it
> got
> > > > > tedious
> > > > > > >>> enough that we decided to turn it off.
> > > > > > >>>
> > > > > > >>> On Sun, Jan 27, 2019 at 7:37 PM Paul Rogers <
> > > progers@cloudera.com>
> > > > > > >>> wrote:
> > > > > > >>>
> > > > > > >>> > Hi Quanlong,
> > > > > > >>> >
> > > > > > >>> > Thanks for the suggestion. I wonder if there is a third
> > > strategy:
> > > > > > >>> >
> > > > > > >>> > c) Isolate the Hadoop 2.x/3.x differences into
> > clearly-defined
> > > > > driver
> > > > > > >>> > layer so that basically all of 3.x can be applied to the
> 2.x
> > > > > branch.
> > > > > > >>> Said
> > > > > > >>> > another way, a single source base can work against either
> > > Hadoop
> > > > > 2.x
> > > > > > or
> > > > > > >>> > 3.x, with the build (C++) or runtime (Java) choosing the
> > proper
> > > > > > >>> “driver”
> > > > > > >>> > classes.
> > > > > > >>> >
> > > > > > >>>
> > > > > > >>> We had such a layer for a while, where Impala master could be
> > > built
> > > > > > >>> against
> > > > > > >>> either Hadoop3 or Hadoop2. We decided to clean it up in
> commit
> > > > > > >>> e4ae605b083ab536c68552e37ca3c46f6bff4c76.
> > > > > > >>>
> > > > > > >>> commit e4ae605b083ab536c68552e37ca3c46f6bff4c76
> > > > > > >>> Author: Fredy Wijaya <fw...@cloudera.com>
> > > > > > >>> Date:   Thu Jul 12 17:01:13 2018 -0700
> > > > > > >>>
> > > > > > >>>     IMPALA-7295: Remove IMPALA_MINICLUSTER_PROFILE=2
> > > > > > >>>
> > > > > > >>>     This patch removes the use of IMPALA_MINICLUSTER_PROFILE.
> > The
> > > > > code
> > > > > > >>> that
> > > > > > >>>     uses IMPALA_MINICLUSTER_PROFILE=2 is removed and it
> > defaults
> > > to
> > > > > > code
> > > > > > >>> from
> > > > > > >>>     IMPALA_MINICLUSTER_PROFILE=3. In order to reduce having
> too
> > > > many
> > > > > > code
> > > > > > >>>     changes in this patch, there is no code change for the
> > shims.
> > > > The
> > > > > > >>> shims
> > > > > > >>>     for IMPALA_MINICLUSTER_PROFILE=3 automatically become the
> > > > default
> > > > > > >>>     implementation.
> > > > > > >>>
> > > > > > >>>     Testing:
> > > > > > >>>     - Ran core and exhaustive tests
> > > > > > >>>
> > > > > > >>>     Change-Id: Iba4a81165b3d2012dc04d4115454372c41e39f08
> > > > > > >>>     Reviewed-on: http://gerrit.cloudera.org:8080/10940
> > > > > > >>>     Reviewed-by: Impala Public Jenkins <
> > > > > > >>> impala-public-jenkins@cloudera.com>
> > > > > > >>>     Tested-by: Impala Public Jenkins <
> > > > > > impala-public-jenkins@cloudera.com
> > > > > > >>> >
> > > > > > >>>
> > > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: Move forward branch-2.x

Posted by Quanlong Huang <hu...@gmail.com>.
Great thanks to guys that help to review the previous patches!

Now we are at this patch: https://gerrit.cloudera.org/c/12546/. It's a
document patch so I think everyone can help to confirm it. Could anyone
help to have a look?

Thanks,
Quanlong

On Sat, Feb 9, 2019 at 10:44 PM Quanlong Huang <hu...@gmail.com>
wrote:

> Hi all,
>
> We've moved to the next patch: https://gerrit.cloudera.org/c/12345/. It's
> the first step to add LocalCatalog to branch-2.x. Anyone could give it
> a +2? Or anyone has objections for it?
>
> Thanks,
> Quanlong
>
> On Thu, Jan 31, 2019 at 9:00 PM Quanlong Huang <hu...@gmail.com>
> wrote:
>
>> Sure. I think "fine-grained privileges" always introduce small changes in
>> behaviors, i.e. unprivileged users used to be able to do something but they
>> can't do so after an upgrade. We accept it since it's reasonable.
>>
>> There're incompatible changes too in the previous releases:
>> https://www.cloudera.com/documentation/enterprise/release-notes/topics/impala_incompatible_changes.html.
>> What we need is to document it well :)
>>
>> I just moved forward and start GVO job for the patch. Thanks!
>>
>>
>>
>> On Thu, Jan 31, 2019 at 2:00 AM Bharath Vissapragada <
>> bharathv@cloudera.com> wrote:
>>
>>> On Wed, Jan 30, 2019 at 12:21 AM Quanlong Huang <huangquanlong@gmail.com
>>> >
>>> wrote:
>>>
>>> > I'm afraid the difference between branch-2.x and Cloudera's branch is
>>> > larger than the difference between branch-2.x and master branch.
>>> Cloudera's
>>> > branch already ignored lots of commits, which causes the gap. I've
>>> tried
>>> > cherry-pick from master or Cloudera's branch and found it's much
>>> easier to
>>> > pick from master branch.
>>> > If https://gerrit.cloudera.org/c/12292/ is merged, I can easily pick
>>> 40+
>>> > commits into branch-2.x with few conflicts to resolve!! See the first
>>> > column in the sheet:
>>> >
>>> >
>>> https://docs.google.com/spreadsheets/d/12h1rTAPS1gm0vhlDGxeOXjnRD7rrOcoqzX4rjRRCyBg
>>> > The result is here:
>>> > https://github.com/stiga-huang/incubator-impala/tree/future-2.x
>>>
>>>
>>> Sure whatever works best for you. You are right that the Cloudera branch
>>> was selective in cherry-picking stuff. We mostly focussed on "fetch
>>> on-demand metadata" changes and "finer grained privileges" and ignored
>>> the
>>> rest. If that is what you are looking for, it is easier to cherry-pick
>>> from
>>> Cloudera's branch. Otherwise probably better to replay commits from the
>>> master branch.
>>>
>>>
>>> >
>>> > We can restart the cherrypick-2.x-and-test Jenkins job. Each time
>>> > there're conflicts, I'll come to resolve it. If the job keeps running,
>>> it's
>>> > possible for branch-2.x to catch up the master branch!
>>> >
>>> > Besides, https://gerrit.cloudera.org/c/12292/ is about DESCRIBE
>>> behavior
>>> > in
>>> > FGP(Fine-grained privileges). I think it's reasonable and not
>>> > compatibility breaking. Does anyone have more thoughts about this?
>>> >
>>>
>>> Hmm, I see what you are saying.  Definitely helps to include it to make
>>> future cherry-picks easier.
>>>
>>> Technically it is still a behavioral change, especially if someone
>>> upgrades
>>> to a version with this fix and we typically try to avoid that (describe
>>> that worked before doesn't work after upgrade). I can't speak about why
>>> we
>>> included it in the Cloudera branch since that was an internal decision
>>> but
>>> I don't know if we have any policies here around backporting such stuff
>>> into older branches. Maybe good to know what others think.
>>>
>>> Anyway, I don't feel too strongly about this and since it is blocking
>>> your
>>> work, I removed my -1 on the code review but this is something to keep in
>>> mind when backporting such patches.
>>>
>>>
>>> >
>>> > On Wed, Jan 30, 2019 at 9:41 AM Fredy Wijaya <fw...@cloudera.com>
>>> wrote:
>>> >
>>> > > Due to the way we build 2.x where it can't use the pinned versions
>>> of CDH
>>> > > dependencies, it may be better to cherry-pick all commits in
>>> > > https://github.com/cloudera/Impala/tree/cdh5-2.12.0_5.16.0, which
>>> also
>>> > > includes that DESCRIBE commit to avoid further integration issues
>>> later
>>> > on.
>>> > >
>>> > > On Tue, Jan 29, 2019 at 5:05 PM Quanlong Huang <
>>> huangquanlong@gmail.com>
>>> > > wrote:
>>> > >
>>> > > > Hi Bharath,
>>> > > >
>>> > > > Thank you a lot for your notice! However, I've gone through the
>>> commits
>>> > > of
>>> > > > cdh branch before and found that this patch is also picked:
>>> > > >
>>> > > >
>>> > >
>>> >
>>> https://github.com/cloudera/Impala/commit/f8a318d4f75e22a963b9cf4786ef07d2cd6bd93c
>>> > > > .
>>> > > > Is this really a compatibility breaking change?
>>> > > >
>>> > > > I'm also concern that the TestDescribeTableResults it introduced
>>> is too
>>> > > > strictly that may cause troubles. However, I found two later
>>> commits
>>> > > > (IMPALA-7143 and IMPALA-7144) would fix this. I'm going to
>>> cherry-pick
>>> > > > these two and IMPALA-7676 (thanks Fredy's advise too!) right after
>>> > > > https://gerrit.cloudera.org/c/12292/ is merged.
>>> > > >
>>> > > > Please let me know if this will go astray. Thanks!
>>> > > >
>>> > > >
>>> > > > On Wed, Jan 30, 2019 at 12:36 AM Bharath Vissapragada <
>>> > > > bharathv@cloudera.com>
>>> > > > wrote:
>>> > > >
>>> > > > > On Mon, Jan 28, 2019 at 11:36 PM Quanlong Huang <
>>> > > huangquanlong@gmail.com
>>> > > > >
>>> > > > > wrote:
>>> > > > >
>>> > > > > > >>For the first patch, "0b540b025 IMPALA-7128 (part 1) Refactor
>>> > > > > interfaces
>>> > > > > > for Db, View, Table, Partition", the cherry-pick conflicts is
>>> due
>>> > to
>>> > > > the
>>> > > > > > revert of IMPALA-6479 in 2.x. I'm testing branch-2.x with
>>> > IMPALA-6479
>>> > > > > being
>>> > > > > > picked back. Does anyone know why we revert it? (I also
>>> comment in
>>> > > the
>>> > > > > > JIRA).
>>> > > > > > >
>>> > > > > > >There are test failures. I guess it's the reason. Hopefully,
>>> > > > > > cdh-5.16.1-release already picked up this patch, which provides
>>> > some
>>> > > > > > pointers :)
>>> > > > > >
>>> > > > > > I fix the test failures and create a review at
>>> > > > > > https://gerrit.cloudera.org/c/12292/
>>> > > > > > Waiting for Jenkins maintenance to finish and then run a GVO.
>>> Hopes
>>> > > > > someone
>>> > > > > > can join and have a look!
>>> > > > > >
>>> > > > > > On Tue, Jan 29, 2019 at 7:39 AM Quanlong Huang <
>>> > > > huangquanlong@gmail.com>
>>> > > > > > wrote:
>>> > > > > >
>>> > > > > > > >For the first patch, "0b540b025 IMPALA-7128 (part 1)
>>> Refactor
>>> > > > > interfaces
>>> > > > > > > for Db, View, Table, Partition", the cherry-pick conflicts
>>> is due
>>> > > to
>>> > > > > the
>>> > > > > > > revert of IMPALA-6479 in 2.x. I'm testing branch-2.x with
>>> > > IMPALA-6479
>>> > > > > > being
>>> > > > > > > picked back. Does anyone know why we revert it? (I also
>>> comment
>>> > in
>>> > > > the
>>> > > > > > > JIRA).
>>> > > > > >
>>> > > > >
>>> > > > > It was reverted because it is a compatibility breaking change. We
>>> > > > typically
>>> > > > > try not to introduce such behavioral changes in the same major
>>> > version
>>> > > > line
>>> > > > > as that can cause upgrade issues.
>>> > > > >
>>> > > > >
>>> > > > > > >
>>> > > > > > > There are test failures. I guess it's the reason. Hopefully,
>>> > > > > > > cdh-5.16.1-release already picked up this patch, which
>>> provides
>>> > > some
>>> > > > > > > pointers :)
>>> > > > > >
>>> > > > >
>>> > > > > I work at Cloudera and we've gone through this exercise before.
>>> It is
>>> > > > > annoying to resolve the conflicts, so you can reuse our work and
>>> save
>>> > > > some
>>> > > > > time.
>>> > > > > https://github.com/cloudera/Impala/tree/cdh5-2.12.0_5.16.0
>>> > > > >
>>> > > > >
>>> > > > > > >
>>> > > > > > > On Mon, Jan 28, 2019 at 10:51 PM Quanlong Huang <
>>> > > > > huangquanlong@gmail.com
>>> > > > > > >
>>> > > > > > > wrote:
>>> > > > > > >
>>> > > > > > >> Yes, there are two discussion threads before that are
>>> relative
>>> > to
>>> > > > > this.
>>> > > > > > >> One for stopping the cherrypick-2.x-and-test jenkins job:
>>> > > > > > >>
>>> > > > > > >>
>>> > > > > >
>>> > > > >
>>> > > >
>>> > >
>>> >
>>> https://lists.apache.org/thread.html/2b4b62d4c07661b27a5203618cb0425a429f6460f2eb505acbcd26c6@%3Cdev.impala.apache.org%3E
>>> > > > > > >>
>>> > > > > > >> The other for removing support for hadoop 2 in master
>>> branch:
>>> > > > > > >>
>>> > > > > >
>>> > > > >
>>> > > >
>>> > >
>>> >
>>> https://lists.apache.org/thread.html/49f9b68ed3d6d2c0fdee16a877b259922545e4824e1233479227a657@%3Cdev.impala.apache.org%3E
>>> > > > > > >>
>>> > > > > > >> I'm +1 with the second thread that we only support Hadoop 2
>>> in
>>> > > > > > branch-2.x
>>> > > > > > >> and support Hadoop 3 in the master branch to be more
>>> focused.
>>> > I'm
>>> > > > also
>>> > > > > > >> agree with Paul's concern. It's such a dilemma that if we
>>> skip
>>> > > some
>>> > > > > > >> commits, things will be harder and harder as we moving
>>> forward;
>>> > if
>>> > > > we
>>> > > > > > >> cherry-pick, review, and test the commits one by one,
>>> branch-2.x
>>> > > > will
>>> > > > > > never
>>> > > > > > >> catch up the master branch, which is an obstacle if someone
>>> > (like
>>> > > > me)
>>> > > > > > wants
>>> > > > > > >> to backport his/her new patch to branch-2.x but waits too
>>> long
>>> > and
>>> > > > > > finally
>>> > > > > > >> fogets details of the patch.
>>> > > > > > >>
>>> > > > > > >> I roughly investigated how other systems deal with multiple
>>> > > > branches.
>>> > > > > > The
>>> > > > > > >> efforts to backport a patch could be the same for the
>>> original
>>> > > > patch.
>>> > > > > > It's
>>> > > > > > >> not a easy go, so the Hive community declares that
>>> > > > > > >> "The decision to port a feature from master to branch-1 is
>>> at
>>> > the
>>> > > > > > >> discretion of the contributor and committer. However no
>>> features
>>> > > > that
>>> > > > > > break
>>> > > > > > >> backwards compatibility will be accepted on branch-1."
>>> > > > > > >>
>>> > > > > > >> I think it's a chance to understand more parts of Impala by
>>> > > learning
>>> > > > > and
>>> > > > > > >> backporting the patches, since they have execellent commit
>>> > > messages
>>> > > > > and
>>> > > > > > >> were strictly reviewed. So I volunteer for the job to move
>>> > forward
>>> > > > the
>>> > > > > > >> branch-2.x. Hopes patch authors could give some pointers
>>> when
>>> > I'm
>>> > > > > > blocked!
>>> > > > > > >> I'll try approach (b) first and switch to (a) when (b)
>>> becomes
>>> > > > > > impossible
>>> > > > > > >> after too many commits are skipped. I'll confirm with the
>>> author
>>> > > if
>>> > > > I
>>> > > > > > think
>>> > > > > > >> a patch should be skipped.
>>> > > > > > >>
>>> > > > > > >> For the first patch, "0b540b025 IMPALA-7128 (part 1)
>>> Refactor
>>> > > > > interfaces
>>> > > > > > >> for Db, View, Table, Partition", the cherry-pick conflicts
>>> is
>>> > due
>>> > > to
>>> > > > > the
>>> > > > > > >> revert of IMPALA-6479 in 2.x. I'm testing branch-2.x with
>>> > > > IMPALA-6479
>>> > > > > > being
>>> > > > > > >> picked back. Does anyone know why we revert it? (I also
>>> comment
>>> > in
>>> > > > the
>>> > > > > > >> JIRA).
>>> > > > > > >>
>>> > > > > > >> On Mon, Jan 28, 2019 at 12:43 PM Philip Zeyliger <
>>> > > > philip@cloudera.com
>>> > > > > >
>>> > > > > > >> wrote:
>>> > > > > > >>
>>> > > > > > >>> As for Quanlong's question, I think the answer is however
>>> the
>>> > > folks
>>> > > > > who
>>> > > > > > >>> want to do the work prefer to do it. As you noticed in the
>>> CDH
>>> > > > > > >>> changelists,
>>> > > > > > >>> Cloudera's distribution has opted for something more like
>>> > > approach
>>> > > > > (a),
>>> > > > > > >>> choosing to backport individual features. For a while, we
>>> were
>>> > > > doing
>>> > > > > > >>> automation for cherry-picking things automatically, and it
>>> got
>>> > > > > tedious
>>> > > > > > >>> enough that we decided to turn it off.
>>> > > > > > >>>
>>> > > > > > >>> On Sun, Jan 27, 2019 at 7:37 PM Paul Rogers <
>>> > > progers@cloudera.com>
>>> > > > > > >>> wrote:
>>> > > > > > >>>
>>> > > > > > >>> > Hi Quanlong,
>>> > > > > > >>> >
>>> > > > > > >>> > Thanks for the suggestion. I wonder if there is a third
>>> > > strategy:
>>> > > > > > >>> >
>>> > > > > > >>> > c) Isolate the Hadoop 2.x/3.x differences into
>>> > clearly-defined
>>> > > > > driver
>>> > > > > > >>> > layer so that basically all of 3.x can be applied to the
>>> 2.x
>>> > > > > branch.
>>> > > > > > >>> Said
>>> > > > > > >>> > another way, a single source base can work against either
>>> > > Hadoop
>>> > > > > 2.x
>>> > > > > > or
>>> > > > > > >>> > 3.x, with the build (C++) or runtime (Java) choosing the
>>> > proper
>>> > > > > > >>> “driver”
>>> > > > > > >>> > classes.
>>> > > > > > >>> >
>>> > > > > > >>>
>>> > > > > > >>> We had such a layer for a while, where Impala master could
>>> be
>>> > > built
>>> > > > > > >>> against
>>> > > > > > >>> either Hadoop3 or Hadoop2. We decided to clean it up in
>>> commit
>>> > > > > > >>> e4ae605b083ab536c68552e37ca3c46f6bff4c76.
>>> > > > > > >>>
>>> > > > > > >>> commit e4ae605b083ab536c68552e37ca3c46f6bff4c76
>>> > > > > > >>> Author: Fredy Wijaya <fw...@cloudera.com>
>>> > > > > > >>> Date:   Thu Jul 12 17:01:13 2018 -0700
>>> > > > > > >>>
>>> > > > > > >>>     IMPALA-7295: Remove IMPALA_MINICLUSTER_PROFILE=2
>>> > > > > > >>>
>>> > > > > > >>>     This patch removes the use of
>>> IMPALA_MINICLUSTER_PROFILE.
>>> > The
>>> > > > > code
>>> > > > > > >>> that
>>> > > > > > >>>     uses IMPALA_MINICLUSTER_PROFILE=2 is removed and it
>>> > defaults
>>> > > to
>>> > > > > > code
>>> > > > > > >>> from
>>> > > > > > >>>     IMPALA_MINICLUSTER_PROFILE=3. In order to reduce
>>> having too
>>> > > > many
>>> > > > > > code
>>> > > > > > >>>     changes in this patch, there is no code change for the
>>> > shims.
>>> > > > The
>>> > > > > > >>> shims
>>> > > > > > >>>     for IMPALA_MINICLUSTER_PROFILE=3 automatically become
>>> the
>>> > > > default
>>> > > > > > >>>     implementation.
>>> > > > > > >>>
>>> > > > > > >>>     Testing:
>>> > > > > > >>>     - Ran core and exhaustive tests
>>> > > > > > >>>
>>> > > > > > >>>     Change-Id: Iba4a81165b3d2012dc04d4115454372c41e39f08
>>> > > > > > >>>     Reviewed-on: http://gerrit.cloudera.org:8080/10940
>>> > > > > > >>>     Reviewed-by: Impala Public Jenkins <
>>> > > > > > >>> impala-public-jenkins@cloudera.com>
>>> > > > > > >>>     Tested-by: Impala Public Jenkins <
>>> > > > > > impala-public-jenkins@cloudera.com
>>> > > > > > >>> >
>>> > > > > > >>>
>>> > > > > > >>
>>> > > > > >
>>> > > > >
>>> > > >
>>> > >
>>> >
>>>
>>

Re: Move forward branch-2.x

Posted by Quanlong Huang <hu...@gmail.com>.
Hi all,

We've moved to the next patch: https://gerrit.cloudera.org/c/12345/. It's
the first step to add LocalCatalog to branch-2.x. Anyone could give it
a +2? Or anyone has objections for it?

Thanks,
Quanlong

On Thu, Jan 31, 2019 at 9:00 PM Quanlong Huang <hu...@gmail.com>
wrote:

> Sure. I think "fine-grained privileges" always introduce small changes in
> behaviors, i.e. unprivileged users used to be able to do something but they
> can't do so after an upgrade. We accept it since it's reasonable.
>
> There're incompatible changes too in the previous releases:
> https://www.cloudera.com/documentation/enterprise/release-notes/topics/impala_incompatible_changes.html.
> What we need is to document it well :)
>
> I just moved forward and start GVO job for the patch. Thanks!
>
>
>
> On Thu, Jan 31, 2019 at 2:00 AM Bharath Vissapragada <
> bharathv@cloudera.com> wrote:
>
>> On Wed, Jan 30, 2019 at 12:21 AM Quanlong Huang <hu...@gmail.com>
>> wrote:
>>
>> > I'm afraid the difference between branch-2.x and Cloudera's branch is
>> > larger than the difference between branch-2.x and master branch.
>> Cloudera's
>> > branch already ignored lots of commits, which causes the gap. I've tried
>> > cherry-pick from master or Cloudera's branch and found it's much easier
>> to
>> > pick from master branch.
>> > If https://gerrit.cloudera.org/c/12292/ is merged, I can easily pick
>> 40+
>> > commits into branch-2.x with few conflicts to resolve!! See the first
>> > column in the sheet:
>> >
>> >
>> https://docs.google.com/spreadsheets/d/12h1rTAPS1gm0vhlDGxeOXjnRD7rrOcoqzX4rjRRCyBg
>> > The result is here:
>> > https://github.com/stiga-huang/incubator-impala/tree/future-2.x
>>
>>
>> Sure whatever works best for you. You are right that the Cloudera branch
>> was selective in cherry-picking stuff. We mostly focussed on "fetch
>> on-demand metadata" changes and "finer grained privileges" and ignored the
>> rest. If that is what you are looking for, it is easier to cherry-pick
>> from
>> Cloudera's branch. Otherwise probably better to replay commits from the
>> master branch.
>>
>>
>> >
>> > We can restart the cherrypick-2.x-and-test Jenkins job. Each time
>> > there're conflicts, I'll come to resolve it. If the job keeps running,
>> it's
>> > possible for branch-2.x to catch up the master branch!
>> >
>> > Besides, https://gerrit.cloudera.org/c/12292/ is about DESCRIBE
>> behavior
>> > in
>> > FGP(Fine-grained privileges). I think it's reasonable and not
>> > compatibility breaking. Does anyone have more thoughts about this?
>> >
>>
>> Hmm, I see what you are saying.  Definitely helps to include it to make
>> future cherry-picks easier.
>>
>> Technically it is still a behavioral change, especially if someone
>> upgrades
>> to a version with this fix and we typically try to avoid that (describe
>> that worked before doesn't work after upgrade). I can't speak about why we
>> included it in the Cloudera branch since that was an internal decision but
>> I don't know if we have any policies here around backporting such stuff
>> into older branches. Maybe good to know what others think.
>>
>> Anyway, I don't feel too strongly about this and since it is blocking your
>> work, I removed my -1 on the code review but this is something to keep in
>> mind when backporting such patches.
>>
>>
>> >
>> > On Wed, Jan 30, 2019 at 9:41 AM Fredy Wijaya <fw...@cloudera.com>
>> wrote:
>> >
>> > > Due to the way we build 2.x where it can't use the pinned versions of
>> CDH
>> > > dependencies, it may be better to cherry-pick all commits in
>> > > https://github.com/cloudera/Impala/tree/cdh5-2.12.0_5.16.0, which
>> also
>> > > includes that DESCRIBE commit to avoid further integration issues
>> later
>> > on.
>> > >
>> > > On Tue, Jan 29, 2019 at 5:05 PM Quanlong Huang <
>> huangquanlong@gmail.com>
>> > > wrote:
>> > >
>> > > > Hi Bharath,
>> > > >
>> > > > Thank you a lot for your notice! However, I've gone through the
>> commits
>> > > of
>> > > > cdh branch before and found that this patch is also picked:
>> > > >
>> > > >
>> > >
>> >
>> https://github.com/cloudera/Impala/commit/f8a318d4f75e22a963b9cf4786ef07d2cd6bd93c
>> > > > .
>> > > > Is this really a compatibility breaking change?
>> > > >
>> > > > I'm also concern that the TestDescribeTableResults it introduced is
>> too
>> > > > strictly that may cause troubles. However, I found two later commits
>> > > > (IMPALA-7143 and IMPALA-7144) would fix this. I'm going to
>> cherry-pick
>> > > > these two and IMPALA-7676 (thanks Fredy's advise too!) right after
>> > > > https://gerrit.cloudera.org/c/12292/ is merged.
>> > > >
>> > > > Please let me know if this will go astray. Thanks!
>> > > >
>> > > >
>> > > > On Wed, Jan 30, 2019 at 12:36 AM Bharath Vissapragada <
>> > > > bharathv@cloudera.com>
>> > > > wrote:
>> > > >
>> > > > > On Mon, Jan 28, 2019 at 11:36 PM Quanlong Huang <
>> > > huangquanlong@gmail.com
>> > > > >
>> > > > > wrote:
>> > > > >
>> > > > > > >>For the first patch, "0b540b025 IMPALA-7128 (part 1) Refactor
>> > > > > interfaces
>> > > > > > for Db, View, Table, Partition", the cherry-pick conflicts is
>> due
>> > to
>> > > > the
>> > > > > > revert of IMPALA-6479 in 2.x. I'm testing branch-2.x with
>> > IMPALA-6479
>> > > > > being
>> > > > > > picked back. Does anyone know why we revert it? (I also comment
>> in
>> > > the
>> > > > > > JIRA).
>> > > > > > >
>> > > > > > >There are test failures. I guess it's the reason. Hopefully,
>> > > > > > cdh-5.16.1-release already picked up this patch, which provides
>> > some
>> > > > > > pointers :)
>> > > > > >
>> > > > > > I fix the test failures and create a review at
>> > > > > > https://gerrit.cloudera.org/c/12292/
>> > > > > > Waiting for Jenkins maintenance to finish and then run a GVO.
>> Hopes
>> > > > > someone
>> > > > > > can join and have a look!
>> > > > > >
>> > > > > > On Tue, Jan 29, 2019 at 7:39 AM Quanlong Huang <
>> > > > huangquanlong@gmail.com>
>> > > > > > wrote:
>> > > > > >
>> > > > > > > >For the first patch, "0b540b025 IMPALA-7128 (part 1) Refactor
>> > > > > interfaces
>> > > > > > > for Db, View, Table, Partition", the cherry-pick conflicts is
>> due
>> > > to
>> > > > > the
>> > > > > > > revert of IMPALA-6479 in 2.x. I'm testing branch-2.x with
>> > > IMPALA-6479
>> > > > > > being
>> > > > > > > picked back. Does anyone know why we revert it? (I also
>> comment
>> > in
>> > > > the
>> > > > > > > JIRA).
>> > > > > >
>> > > > >
>> > > > > It was reverted because it is a compatibility breaking change. We
>> > > > typically
>> > > > > try not to introduce such behavioral changes in the same major
>> > version
>> > > > line
>> > > > > as that can cause upgrade issues.
>> > > > >
>> > > > >
>> > > > > > >
>> > > > > > > There are test failures. I guess it's the reason. Hopefully,
>> > > > > > > cdh-5.16.1-release already picked up this patch, which
>> provides
>> > > some
>> > > > > > > pointers :)
>> > > > > >
>> > > > >
>> > > > > I work at Cloudera and we've gone through this exercise before.
>> It is
>> > > > > annoying to resolve the conflicts, so you can reuse our work and
>> save
>> > > > some
>> > > > > time.
>> > > > > https://github.com/cloudera/Impala/tree/cdh5-2.12.0_5.16.0
>> > > > >
>> > > > >
>> > > > > > >
>> > > > > > > On Mon, Jan 28, 2019 at 10:51 PM Quanlong Huang <
>> > > > > huangquanlong@gmail.com
>> > > > > > >
>> > > > > > > wrote:
>> > > > > > >
>> > > > > > >> Yes, there are two discussion threads before that are
>> relative
>> > to
>> > > > > this.
>> > > > > > >> One for stopping the cherrypick-2.x-and-test jenkins job:
>> > > > > > >>
>> > > > > > >>
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://lists.apache.org/thread.html/2b4b62d4c07661b27a5203618cb0425a429f6460f2eb505acbcd26c6@%3Cdev.impala.apache.org%3E
>> > > > > > >>
>> > > > > > >> The other for removing support for hadoop 2 in master branch:
>> > > > > > >>
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://lists.apache.org/thread.html/49f9b68ed3d6d2c0fdee16a877b259922545e4824e1233479227a657@%3Cdev.impala.apache.org%3E
>> > > > > > >>
>> > > > > > >> I'm +1 with the second thread that we only support Hadoop 2
>> in
>> > > > > > branch-2.x
>> > > > > > >> and support Hadoop 3 in the master branch to be more focused.
>> > I'm
>> > > > also
>> > > > > > >> agree with Paul's concern. It's such a dilemma that if we
>> skip
>> > > some
>> > > > > > >> commits, things will be harder and harder as we moving
>> forward;
>> > if
>> > > > we
>> > > > > > >> cherry-pick, review, and test the commits one by one,
>> branch-2.x
>> > > > will
>> > > > > > never
>> > > > > > >> catch up the master branch, which is an obstacle if someone
>> > (like
>> > > > me)
>> > > > > > wants
>> > > > > > >> to backport his/her new patch to branch-2.x but waits too
>> long
>> > and
>> > > > > > finally
>> > > > > > >> fogets details of the patch.
>> > > > > > >>
>> > > > > > >> I roughly investigated how other systems deal with multiple
>> > > > branches.
>> > > > > > The
>> > > > > > >> efforts to backport a patch could be the same for the
>> original
>> > > > patch.
>> > > > > > It's
>> > > > > > >> not a easy go, so the Hive community declares that
>> > > > > > >> "The decision to port a feature from master to branch-1 is at
>> > the
>> > > > > > >> discretion of the contributor and committer. However no
>> features
>> > > > that
>> > > > > > break
>> > > > > > >> backwards compatibility will be accepted on branch-1."
>> > > > > > >>
>> > > > > > >> I think it's a chance to understand more parts of Impala by
>> > > learning
>> > > > > and
>> > > > > > >> backporting the patches, since they have execellent commit
>> > > messages
>> > > > > and
>> > > > > > >> were strictly reviewed. So I volunteer for the job to move
>> > forward
>> > > > the
>> > > > > > >> branch-2.x. Hopes patch authors could give some pointers when
>> > I'm
>> > > > > > blocked!
>> > > > > > >> I'll try approach (b) first and switch to (a) when (b)
>> becomes
>> > > > > > impossible
>> > > > > > >> after too many commits are skipped. I'll confirm with the
>> author
>> > > if
>> > > > I
>> > > > > > think
>> > > > > > >> a patch should be skipped.
>> > > > > > >>
>> > > > > > >> For the first patch, "0b540b025 IMPALA-7128 (part 1) Refactor
>> > > > > interfaces
>> > > > > > >> for Db, View, Table, Partition", the cherry-pick conflicts is
>> > due
>> > > to
>> > > > > the
>> > > > > > >> revert of IMPALA-6479 in 2.x. I'm testing branch-2.x with
>> > > > IMPALA-6479
>> > > > > > being
>> > > > > > >> picked back. Does anyone know why we revert it? (I also
>> comment
>> > in
>> > > > the
>> > > > > > >> JIRA).
>> > > > > > >>
>> > > > > > >> On Mon, Jan 28, 2019 at 12:43 PM Philip Zeyliger <
>> > > > philip@cloudera.com
>> > > > > >
>> > > > > > >> wrote:
>> > > > > > >>
>> > > > > > >>> As for Quanlong's question, I think the answer is however
>> the
>> > > folks
>> > > > > who
>> > > > > > >>> want to do the work prefer to do it. As you noticed in the
>> CDH
>> > > > > > >>> changelists,
>> > > > > > >>> Cloudera's distribution has opted for something more like
>> > > approach
>> > > > > (a),
>> > > > > > >>> choosing to backport individual features. For a while, we
>> were
>> > > > doing
>> > > > > > >>> automation for cherry-picking things automatically, and it
>> got
>> > > > > tedious
>> > > > > > >>> enough that we decided to turn it off.
>> > > > > > >>>
>> > > > > > >>> On Sun, Jan 27, 2019 at 7:37 PM Paul Rogers <
>> > > progers@cloudera.com>
>> > > > > > >>> wrote:
>> > > > > > >>>
>> > > > > > >>> > Hi Quanlong,
>> > > > > > >>> >
>> > > > > > >>> > Thanks for the suggestion. I wonder if there is a third
>> > > strategy:
>> > > > > > >>> >
>> > > > > > >>> > c) Isolate the Hadoop 2.x/3.x differences into
>> > clearly-defined
>> > > > > driver
>> > > > > > >>> > layer so that basically all of 3.x can be applied to the
>> 2.x
>> > > > > branch.
>> > > > > > >>> Said
>> > > > > > >>> > another way, a single source base can work against either
>> > > Hadoop
>> > > > > 2.x
>> > > > > > or
>> > > > > > >>> > 3.x, with the build (C++) or runtime (Java) choosing the
>> > proper
>> > > > > > >>> “driver”
>> > > > > > >>> > classes.
>> > > > > > >>> >
>> > > > > > >>>
>> > > > > > >>> We had such a layer for a while, where Impala master could
>> be
>> > > built
>> > > > > > >>> against
>> > > > > > >>> either Hadoop3 or Hadoop2. We decided to clean it up in
>> commit
>> > > > > > >>> e4ae605b083ab536c68552e37ca3c46f6bff4c76.
>> > > > > > >>>
>> > > > > > >>> commit e4ae605b083ab536c68552e37ca3c46f6bff4c76
>> > > > > > >>> Author: Fredy Wijaya <fw...@cloudera.com>
>> > > > > > >>> Date:   Thu Jul 12 17:01:13 2018 -0700
>> > > > > > >>>
>> > > > > > >>>     IMPALA-7295: Remove IMPALA_MINICLUSTER_PROFILE=2
>> > > > > > >>>
>> > > > > > >>>     This patch removes the use of
>> IMPALA_MINICLUSTER_PROFILE.
>> > The
>> > > > > code
>> > > > > > >>> that
>> > > > > > >>>     uses IMPALA_MINICLUSTER_PROFILE=2 is removed and it
>> > defaults
>> > > to
>> > > > > > code
>> > > > > > >>> from
>> > > > > > >>>     IMPALA_MINICLUSTER_PROFILE=3. In order to reduce having
>> too
>> > > > many
>> > > > > > code
>> > > > > > >>>     changes in this patch, there is no code change for the
>> > shims.
>> > > > The
>> > > > > > >>> shims
>> > > > > > >>>     for IMPALA_MINICLUSTER_PROFILE=3 automatically become
>> the
>> > > > default
>> > > > > > >>>     implementation.
>> > > > > > >>>
>> > > > > > >>>     Testing:
>> > > > > > >>>     - Ran core and exhaustive tests
>> > > > > > >>>
>> > > > > > >>>     Change-Id: Iba4a81165b3d2012dc04d4115454372c41e39f08
>> > > > > > >>>     Reviewed-on: http://gerrit.cloudera.org:8080/10940
>> > > > > > >>>     Reviewed-by: Impala Public Jenkins <
>> > > > > > >>> impala-public-jenkins@cloudera.com>
>> > > > > > >>>     Tested-by: Impala Public Jenkins <
>> > > > > > impala-public-jenkins@cloudera.com
>> > > > > > >>> >
>> > > > > > >>>
>> > > > > > >>
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>