Posted to dev@hive.apache.org by Ying Chen <yi...@gmail.com> on 2017/04/18 22:24:32 UTC

Re: Backward incompatible changes

Hello all -

I just want to know how things are going with the 2.2 release. Has there
been a release candidate build?
In addition, what is the list of JIRA issues that have been included? Is it the
same as the result of running a query like this against the Hive JIRA?
project = HIVE AND issuetype in standardIssueTypes() AND fixVersion = 2.2.0
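
For reference, a JQL filter like the one above can be turned into a shareable
search link. The following is only a minimal sketch: it assumes the search URL
shape used by the JIRA link quoted later in this thread
(https://issues.apache.org/jira/issues/?jql=...), and the helper name
jira_search_url is hypothetical. Whether fixVersion = 2.2.0 reflects the final
release contents is still the open question.

```python
from urllib.parse import quote

# Base search URL, taken from the JIRA link quoted later in this thread.
JIRA_SEARCH_BASE = "https://issues.apache.org/jira/issues/?jql="


def jira_search_url(jql: str) -> str:
    """Return a JIRA search link for the given JQL string (hypothetical helper)."""
    # Percent-encode everything, including spaces, '=' and parentheses.
    return JIRA_SEARCH_BASE + quote(jql, safe="")


jql = (
    "project = HIVE AND issuetype in standardIssueTypes() "
    "AND fixVersion = 2.2.0"
)
url = jira_search_url(jql)
print(url)
```

Pasting the printed link into a browser should open the same filter as typing
the JQL into the JIRA issue search box.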

Thanks much.
Ying


On Fri, Mar 24, 2017 at 10:35 PM, Lefty Leverenz <le...@gmail.com>
wrote:

> We'll have to revise some TODOC2.2 labels in JIRA, as well as fix versions.
>
> -- Lefty
>
>
> On Fri, Mar 24, 2017 at 2:19 PM, Ashutosh Chauhan <ha...@apache.org>
> wrote:
>
> > Sounds good to me. I have created fix versions 3.0 and 2.3 in JIRA. We can
> > use 2.3 for the next release from branch-2.
> >
> > On Fri, Mar 24, 2017 at 10:46 AM, Owen O'Malley <om...@apache.org>
> > wrote:
> >
> > > All,
> > >
> > > I'm glad to see a plan to release master before the incompatible changes.
> > > It will be great to have a release that uses the separated-out ORC code
> > > base.
> > >
> > > As I had previously discussed on list, I've been working on building a
> > > proposed 2.2 branch, which builds on 2.1.1 and cherry-picks a bunch of
> > > changes. The list of patches was picked to synergize with the efforts of
> > > the QA team at Hortonworks. You can look at the patches on
> > > https://github.com/omalley/hive/tree/branch-2.2 .
> > >
> > > The branch is still pretty rough, but I don't expect many large code
> > > changes. I'd like to propose that we take my branch as branch-2.2 and set
> > > up Pengcheng's branch as branch-2.3. We should also consider the packaging
> > > changes (shading Protobuf, Guava, and Kryo) that would make integration
> > > with Spark easier in branch-2.3.
> > >
> > > Thanks,
> > >    Owen
> > >
> > >
> > > On Thu, Mar 23, 2017 at 11:15 PM, Ashutosh Chauhan <hashutosh@apache.org>
> > > wrote:
> > >
> > > > So, I just pushed branch-2 from the current tip.
> > > > Now, let's get the 2.2 release out as soon as possible.
> > > >
> > > > Thanks,
> > > > Ashutosh
> > > >
> > > > On Thu, Mar 23, 2017 at 9:46 PM, Ashutosh Chauhan <hashutosh@apache.org>
> > > > wrote:
> > > >
> > > > > Cutting a branch should not slow down a 2.2 release. If anything, this
> > > > > should help in achieving stabilization faster for the release, since the
> > > > > branch won't get any new potentially destabilizing changes but only
> > > > > patches to fix existing known issues.
> > > > >
> > > > > On Thu, Mar 23, 2017 at 8:22 AM, Eugene Koifman <ekoifman@hortonworks.com>
> > > > > wrote:
> > > > >
> > > > >> +1 to make a release first
> > > > >>
> > > > > >> On 3/22/17, 2:06 PM, "Sergey Shelukhin" <se...@hortonworks.com> wrote:
> > > > >>
> > > > > >>     Hmm.. should we release these first, and then cut branch-2?
> > > > > >>     Otherwise during the releases, the patches for 2.2/2.3 will need to go to
> > > > > >>     3 (4?) places (master, branch-2, branch-2.2, branch-2.3?).
> > > > > >>     There’s no rush to cut the branch if everything in 2.2/2.3 has to go to
> > > > > >>     3.0 anyway.
> > > > >>
> > > > > >>     On 17/3/22, 13:53, "Pengcheng Xiong" <px...@apache.org> wrote:
> > > > >>
> > > > > >>     >I would like to work as the Release Manager if possible. As Owen points
> > > > > >>     >out, he is working on 2.2 and I will work on 2.3. Thanks.
> > > > >>     >
> > > > > >>     >On Wed, Mar 22, 2017 at 1:32 PM, Ashutosh Chauhan <hashutosh@apache.org>
> > > > > >>     >wrote:
> > > > >>     >
> > > > > >>     >> Unless there is more feedback, I plan to cut branch-2 in a day or two from
> > > > > >>     >> current master. As multiple people have suggested on this thread, we should
> > > > > >>     >> do a 2.2 release soon. Currently there are 177 issues
> > > > > >>     >> <https://issues.apache.org/jira/issues/?jql=project%20%3D%20HIVE%20AND%20resolution%20%3D%20Unresolved%20AND%20cf%5B12310320%5D%20%3D%202.2.0%20ORDER%20BY%20priority%20DESC>
> > > > > >>     >> targeted for the 2.2 release. We can use branch-2 to land these patches and for
> > > > > >>     >> additional stabilization efforts. Any volunteers for Release Manager to drive
> > > > > >>     >> the 2.2 release?
> > > > >>     >>
> > > > >>     >> Thanks,
> > > > >>     >> Ashutosh
> > > > >>     >>
> > > > > >>     >> On Fri, Mar 10, 2017 at 4:23 PM, Ashutosh Chauhan <hashutosh@apache.org>
> > > > > >>     >> wrote:
> > > > >>     >>
> > > > > >>     >> > I hear what you are saying. Let's begin with the three concerns:
> > > > > >>     >> >
> > > > > >>     >> > - How will we keep the community motivated on fixing both master and
> > > > > >>     >> > branch-2?
> > > > > >>     >> > Until we do a stable release from master, stable releases can come only
> > > > > >>     >> > from branch-2. If a contributor wants to see their fix reach users on a
> > > > > >>     >> > stable line quickly, they would have to have a fix on branch-2. Also, a
> > > > > >>     >> > release manager can pick whatever fixes she wants, so even if a contributor
> > > > > >>     >> > doesn't commit a fix on branch-2, a release manager who wants to do a release
> > > > > >>     >> > containing a set of fixes can always do so.
> > > > >>     >> >
> > > > > >>     >> > - *Harder cherry-picks between master and branch-2*.
> > > > > >>     >> > That is certainly possible. But the hope is that we keep branch-2 stable,
> > > > > >>     >> > so we don't backport large features which may run into this issue. Smaller,
> > > > > >>     >> > focused bug-fix backports should be possible.
> > > > >>     >> >
> > > > >>     >> >
> > > > > >>     >> > - *Removal of MR2 on the master branch*.
> > > > > >>     >> > This is something I personally would like to see. But the exact timing of it
> > > > > >>     >> > will be decided by the community. I am certainly not saying that as soon as
> > > > > >>     >> > branch-2 is created, we should remove MR2 on master.
> > > > >>     >> >
> > > > > >>     >> > I would also say that in the end the ASF is a volunteer organization; we can't
> > > > > >>     >> > force people to adopt one branch or another. It's up to the contributors what
> > > > > >>     >> > JIRAs they work on and when and where they commit them.
> > > > > >>     >> > By not creating a branch-2, the only thing we can guarantee is that the rate of
> > > > > >>     >> > development on master will remain slow, because we don't want to start making
> > > > > >>     >> > backward incompatible changes without explicitly acknowledging that.
> > > > >>     >> >
> > > > >>     >> > Thanks,
> > > > >>     >> > Ashutosh
> > > > >>     >> >
> > > > > >>     >> > On Thu, Mar 9, 2017 at 12:01 PM, Sergio Pena <se...@cloudera.com>
> > > > > >>     >> > wrote:
> > > > >>     >> >
> > > > >>     >> >> Hey Ashutosh, thanks for soliciting feedback on this.
> > > > >>     >> >>
> > > > > >>     >> >> I like the idea you're proposing; maintaining compatibility and at the same
> > > > > >>     >> >> time adding newer features to Hive consumes a lot of development time and
> > > > > >>     >> >> effort.
> > > > >>     >> >>
> > > > > >>     >> >> However, I think some users and companies have just started to use the Hive 2.x
> > > > > >>     >> >> branch as their main major upgrade of Hive
> > > > > >>     >> >> (possibly due to waiting for stabilization and testing upgrades), but
> > > > > >>     >> >> cutting this major branch that has just 1 year of life
> > > > > >>     >> >> might make us look like we will forget about the quality of Hive 2.x as we
> > > > > >>     >> >> did with branch-1.
> > > > >>     >> >>
> > > > > >>     >> >> The latest Hive 1.x version was 1.2, and its development stopped because of new
> > > > > >>     >> >> features on Hive 2.x.
> > > > > >>     >> >> The latest Hive 2.x version is 2.1, and we want to create Hive 3.x because of
> > > > > >>     >> >> newer features and incompatibilities.
> > > > > >>     >> >> Will Hive 3.x have the same future after 3.1 is released?
> > > > >>     >> >>
> > > > > >>     >> >> I'm also concerned about these three things:
> > > > > >>     >> >>
> > > > > >>     >> >>    - *Branch-2 quality commitment*.
> > > > > >>     >> >>    How will we keep the community motivated on fixing both master and
> > > > > >>     >> >>    branch-2?
> > > > > >>     >> >>    - *Harder cherry-picks between master and branch-2*.
> > > > > >>     >> >>    Because master will be incompatible by nature, cherry-picks to
> > > > > >>     >> >>    branch-2 will be harder.
> > > > > >>     >> >>    - *Removal of MR2 on the master branch*.
> > > > > >>     >> >>    This was marked as deprecated just last year, but MR2 is still an engine
> > > > > >>     >> >>    that is used by several users.
> > > > >>     >> >>
> > > > > >>     >> >> I accept that the end of life of major versions will come at some point,
> > > > > >>     >> >> and these concerns will expire,
> > > > > >>     >> >> but Hive 2.x is kind of young, isn't it?
> > > > >>     >> >>
> > > > > >>     >> >> Should we try to stabilize the Hive 2.x line first, and have a few more
> > > > > >>     >> >> releases before starting to work on Hive 3.0?
> > > > > >>     >> >> Should we add more test coverage to the Hive Jenkins jobs to validate Hive 2.x
> > > > > >>     >> >> quality?
> > > > > >>     >> >> Should we agree on a date for when we will drop community support for
> > > > > >>     >> >> Hive versions, to let users know about this?
> > > > >>     >> >>
> > > > > >>     >> >> Again, I like your proposal, but I'm afraid that users who just upgraded to
> > > > > >>     >> >> 2.x won't have any more features and improvements
> > > > > >>     >> >> because they will be developed on 3.0.
> > > > >>     >> >>
> > > > >>     >> >> - Sergio
> > > > >>     >> >>
> > > > >>     >> >>
> > > > >>     >> >>
> > > > > >>     >> >> On Mon, Mar 6, 2017 at 1:24 PM, Ashutosh Chauhan <ashutosh.chauhan@gmail.com>
> > > > > >>     >> >> wrote:
> > > > >>     >> >>
> > > > > >>     >> >> > It helps shed debt because devs can now refactor without fear of
> > > > > >>     >> >> > breaking some rarely used features. It helps add features faster
> > > > > >>     >> >> > because a codebase that is lean and easier to reason about makes it
> > > > > >>     >> >> > much easier to add new features.
> > > > >>     >> >> >
> > > > > >>     >> >> > More importantly though, it also helps users because we are setting
> > > > > >>     >> >> > expectations about what the dev community will deliver. They can expect
> > > > > >>     >> >> > future releases of 2.x to be backward compatible. At the same time,
> > > > > >>     >> >> > whenever they decide to upgrade they only need to test their application
> > > > > >>     >> >> > once against 3.x, as opposed to facing continuous breakage of one form or
> > > > > >>     >> >> > another if we continue to make incompatible changes in master without
> > > > > >>     >> >> > branching for 2.x.
> > > > >>     >> >> >
> > > > >>     >> >> > Thanks,
> > > > >>     >> >> > Ashutosh
> > > > >>     >> >> >
> > > > > >>     >> >> > On Sat, Mar 4, 2017 at 10:19 AM, Edward Capriolo <edlinuxguru@gmail.com>
> > > > > >>     >> >> > wrote:
> > > > >>     >> >> >
> > > > >>     >> >> > > Also i dont follow how we remove
> > > > >>     >> >> > >
> > > > > >>     >> >> > > On Saturday, March 4, 2017, Edward Capriolo <ed...@gmail.com>
> > > > > >>     >> >> > > wrote:
> > > > >>     >> >> > >
> > > > >>     >> >> > > >
> > > > >>     >> >> > > >
> > > > > >>     >> >> > > > On Fri, Mar 3, 2017 at 8:46 PM, Thejas Nair <thejas.nair@gmail.com> wrote:
> > > > >>     >> >> > > >
> > > > >>     >> >> > > >> +1
> > > > > >>     >> >> > > >> There are some features that are incomplete and that I would not recommend
> > > > > >>     >> >> > > >> for any real production use. The 'legacy authorization mode' is a great
> > > > > >>     >> >> > > >> example of that -
> > > > > >>     >> >> > > >> https://cwiki.apache.org/confluence/display/Hive/Hive+Default+Authorization+-+Legacy+Mode
> > > > > >>     >> >> > > >> . It is an inherently insecure mode that nobody should be using.
> > > > >>     >> >> > > >>
> > > > > >>     >> >> > > >> There is also potential to clean up the Thrift API. However, there are
> > > > > >>     >> >> > > >> many users of this API, so we would need to go the route of deprecating it
> > > > > >>     >> >> > > >> and then removing it after a couple of releases or so.
> > > > >>     >> >> > > >>
> > > > > >>     >> >> > > >> I am sure there are many other candidates. We will have to evaluate each
> > > > > >>     >> >> > > >> of those features on the risk/benefit of keeping them and arrive at a
> > > > > >>     >> >> > > >> decision.
> > > > >>     >> >> > > >>
> > > > > >>     >> >> > > >> Also, +1 on getting a 2.2 release out before we branch.
> > > > >>     >> >> > > >>
> > > > >>     >> >> > > >>
> > > > >>     >> >> > > >>
> > > > > >>     >> >> > > >> On Fri, Mar 3, 2017 at 1:50 PM, Ashutosh Chauhan <hashutosh@apache.org>
> > > > > >>     >> >> > > >> wrote:
> > > > >>     >> >> > > >>
> > > > >>     >> >> > > >> > Hi all,
> > > > >>     >> >> > > >> >
> > > > > >>     >> >> > > >> > The Hive project has come a long way. With widespread adoption also come
> > > > > >>     >> >> > > >> > expectations: expectations of being backward compatible and not breaking
> > > > > >>     >> >> > > >> > things. However, that doesn't come free of cost and results in a lot of
> > > > > >>     >> >> > > >> > legacy code which can't be refactored without fear of breaking things. As a
> > > > > >>     >> >> > > >> > result, the project has accumulated a lot of debt over time. At the same
> > > > > >>     >> >> > > >> > time, there are also a lot of features which have seen little uptake. We
> > > > > >>     >> >> > > >> > may want to drop some of those.
> > > > >>     >> >> > > >> >
> > > > > >>     >> >> > > >> > In order to move forward and shed that debt, we may need a major version
> > > > > >>     >> >> > > >> > release which allows us to make backward incompatible changes and drop
> > > > > >>     >> >> > > >> > rarely used features. At the same time, there are lots of users who are
> > > > > >>     >> >> > > >> > consuming the currently released 2.1, 2.2 branches and are expected to stay
> > > > > >>     >> >> > > >> > on them for some time. So, I propose that we create branch-2 from the
> > > > > >>     >> >> > > >> > current tip, do future 2.x releases from that branch, and keep it backward
> > > > > >>     >> >> > > >> > compatible. This will allow devs to land breaking changes on master and
> > > > > >>     >> >> > > >> > pave the way to release Hive 3.0 in the future.
> > > > >>     >> >> > > >> >
> > > > > >>     >> >> > > >> > Of course, each specific incompatible change and feature drop, even on
> > > > > >>     >> >> > > >> > master, needs to be evaluated on its own merit on the corresponding JIRA.
> > > > > >>     >> >> > > >> > This email is just a solicitation of feedback for creating branch-2 and
> > > > > >>     >> >> > > >> > allowing breaking changes in master. Thoughts?
> > > > >>     >> >> > > >> >
> > > > >>     >> >> > > >> > Thanks,
> > > > >>     >> >> > > >> > Ashutosh
> > > > >>     >> >> > > >> >
> > > > >>     >> >> > > >>
> > > > >>     >> >> > > >
> > > > > >>     >> >> > > > One of the challenges of the developers conducting the risk-benefit
> > > > > >>     >> >> > > > analysis is that the developers are mostly focused on new features,
> > > > > >>     >> >> > > > but there are deployments of Hive that are 5+ years old, and people who
> > > > > >>     >> >> > > > rely on the features are not on the mailing list.
> > > > >>     >> >> > > >
> > > > >>     >> >> > > > For example I developed and use this frequently:
> > > > >>     >> >> > > >
> > > > > >>     >> >> > > > https://community.hortonworks.com/articles/8861/apache-hive-groovy-udf-examples.html
> > > > >>     >> >> > > >
> > > > > >>     >> >> > > > My career went away from Hive for a while. I was quite surprised to find
> > > > > >>     >> >> > > > out that, in the cli->beeline move, it was more or less decided not to
> > > > > >>     >> >> > > > port it. I learned of this the first time I was forced to work in a
> > > > > >>     >> >> > > > Hive-server-only environment and it did not work.
> > > > >>     >> >> > > >
> > > > > >>     >> >> > > > Now I have to go and spend time adding this back so I don't have to
> > > > > >>     >> >> > > > work around it not being there.
> > > > >>     >> >> > > >
> > > > > >>     >> >> > > > What we should continue doing is making code that is modular. We need
> > > > > >>     >> >> > > > to break hard dependencies, like ThriftSerde or OrcSerde being "native"
> > > > > >>     >> >> > > > and having to be linked to the metastore, and move them out into proper
> > > > > >>     >> >> > > > submodules. There is too much code that only works for one implementation
> > > > > >>     >> >> > > > of a serde, etc.
> > > > >>     >> >> > > >
> > > > >>     >> >> > > >
> > > > >>     >> >> > > >
> > > > >>     >> >> > >
> > > > > >>     >> >> > > I would like a timeline to understand this. It sounds as if master is
> > > > > >>     >> >> > > not releasable currently, so already broken in a way. We make a branch
> > > > > >>     >> >> > > and aggressively break it more?
> > > > >>     >> >> > >
> > > > > >>     >> >> > > I'm not following how this branching policy makes adding features
> > > > > >>     >> >> > > faster or how it helps shed debt faster.
> > > > >>     >> >> > >
> > > > >>     >> >> > >
> > > > >>     >> >> > > --
> > > > > >>     >> >> > > Sorry this was sent from mobile. Will do less grammar and spell check
> > > > > >>     >> >> > > than usual.
> > > > >>     >> >> > >
> > > > >>     >> >> >
> > > > >>     >> >>
> > > > >>     >> >
> > > > >>     >> >
> > > > >>     >>
> > > > >>
> > > > >>
> > > > >>
> > > > >>
> > > > >
> > > >
> > >
> >
>