Posted to dev@hive.apache.org by Alexander Pivovarov <ap...@gmail.com> on 2015/04/28 20:02:03 UTC

Do we still support hadoop-1.2.x API (-Phadoop-1)?

Hi Everyone

I tried to compile the latest hive with the hadoop-1 profile.
It failed because TestLazySimpleFast (164) uses Text.copyBytes(), which is
a hadoop-2.x API.

So, which hadoop API should we use in hive? old hadoop-1.x or new
hadoop-2.x?
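
For illustration only, here is a minimal sketch of how a test could avoid the
hadoop-2.x-only call. It assumes the usual Text contract (getBytes() returns
the backing array, of which only the first getLength() bytes are valid).
FakeText and copyBytesCompat are hypothetical names invented for this sketch
so that it compiles without Hadoop on the classpath; this is not the fix that
was applied in Hive.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class CopyBytesCompat {

    // Hypothetical stand-in for org.apache.hadoop.io.Text. It exposes only the
    // two accessors that exist in both hadoop-1.x and hadoop-2.x.
    static final class FakeText {
        private final byte[] buffer;
        private final int length;

        FakeText(byte[] buffer, int length) {
            this.buffer = buffer;
            this.length = length;
        }

        // Backing array; may be longer than the valid data.
        byte[] getBytes() { return buffer; }

        // Number of valid bytes at the start of the backing array.
        int getLength() { return length; }
    }

    // Copies exactly the valid bytes, using only the two accessors above.
    static byte[] copyBytesCompat(FakeText text) {
        return Arrays.copyOf(text.getBytes(), text.getLength());
    }

    public static void main(String[] args) {
        byte[] backing = {'h', 'i', 'v', 'e', 0, 0, 0, 0};
        byte[] copy = copyBytesCompat(new FakeText(backing, 4));
        System.out.println(new String(copy, StandardCharsets.UTF_8));
    }
}
```

With a real Text instance the same Arrays.copyOf(t.getBytes(), t.getLength())
expression should compile against either Hadoop line, at the cost of not using
the newer convenience method.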

Alex

Re: Do we still support hadoop-1.2.x API (-Phadoop-1)?

Posted by Szehon Ho <sz...@cloudera.com>.
Yes, I agree that this thread should be tagged [DISCUSS]; it is a big topic
and more people might want to comment on it.

> The question isn't whether there are people running Hadoop 1.x, it is
> whether those people are likely to install a new version of Hive running on
> their old Hadoop cluster.

Yes, the question is whether users want to run the latest Hive version on
Hadoop 1.x clusters.

> would enable the rest of the Hive community to move forward and take
> advantage of the powerful new features in Hadoop 2.x.

We can still take advantage of powerful Hadoop 2 features without removing
support for Hadoop 1, as we have been doing for some time via the Shim layer
(e.g., HDFS encryption and HDFS extended ACLs are both in this category).  The
question here seems to be the code complexity the Shim layer adds to Hive.
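
As a rough sketch of the shim idea discussed here: callers program against one
interface, and a loader picks an implementation based on the Hadoop major
version. All names below (HadoopShimsSketch, Hadoop1Shims, Hadoop2Shims,
forVersion, hasNativeCopyBytes) are hypothetical and chosen for illustration;
they are not Hive's actual shim classes or methods.

```java
public class ShimSketch {

    // Hypothetical shim interface: one method per capability that differs
    // between Hadoop lines.
    interface HadoopShimsSketch {
        // True if Text.copyBytes() can be called directly on this Hadoop line.
        boolean hasNativeCopyBytes();
    }

    // Hadoop 1.x: the API is missing, so callers need a fallback path.
    static final class Hadoop1Shims implements HadoopShimsSketch {
        @Override
        public boolean hasNativeCopyBytes() { return false; }
    }

    // Hadoop 2.x: the API is available.
    static final class Hadoop2Shims implements HadoopShimsSketch {
        @Override
        public boolean hasNativeCopyBytes() { return true; }
    }

    // Selects a shim from a version string such as "1.2.1" or "2.6.0".
    static HadoopShimsSketch forVersion(String hadoopVersion) {
        String major = hadoopVersion.split("\\.")[0];
        switch (major) {
            case "1":
                return new Hadoop1Shims();
            case "2":
                return new Hadoop2Shims();
            default:
                throw new IllegalArgumentException(
                        "Unsupported Hadoop version: " + hadoopVersion);
        }
    }

    public static void main(String[] args) {
        System.out.println(forVersion("1.2.1").hasNativeCopyBytes());
        System.out.println(forVersion("2.6.0").hasNativeCopyBytes());
    }
}
```

The complexity cost is visible even at this size: every capability that
differs between the two lines needs a method on the interface and one
implementation per supported Hadoop line.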

I am not arguing to keep support for Hadoop-1 indefinitely in newer
versions of Hive and keep the complexity forever, but I think it's reasonable
to give users fair warning via one full release cycle in which Hadoop 1 is
formally deprecated, instead of removing it immediately in the next release,
especially as Hadoop 2 has been GA for only a year and a few months.  Thoughts?

Thanks
Szehon


On Tue, Apr 28, 2015 at 9:19 PM, Lefty Leverenz <le...@gmail.com>
wrote:

> This thread needs [DISCUSS] in the subject.
>
> -- Lefty
>
> On Tue, Apr 28, 2015 at 10:58 PM, Owen O'Malley <om...@apache.org>
> wrote:
>
> > On Tue, Apr 28, 2015 at 5:25 PM, Szehon Ho <sz...@cloudera.com> wrote:
> >
> > > Hadoop 2 has been GA for a little over a year, there is still a fairly
> > > significant user base that uses hadoop-1 and would not be happy with this
> > > change.
> >
> >
> > The question isn't whether there are people running Hadoop 1.x, it is
> > whether those people are likely to install a new version of Hive running on
> > their old Hadoop cluster. As a point of reference, CDH 4 shipped Hadoop 2.0
> > and Hive 0.10 and HDP 2.0 shipped Hadoop 2.0 and Hive 0.12.
> >
> >
> > > Perhaps we can declare it deprecated in some future release (perhaps 1.3),
> > > then another release to formally remove it, as was done in HBase.
> >
> >
> > Are you interested in managing a Hadoop 1.x compatible version of Hive?
> > Maybe we should call the new release Hive 2.0 and enable you to maintain
> > the Hive 1.x branch with backwards compatibility with Hadoop 1.x. That
> > would enable the rest of the Hive community to move forward and take
> > advantage of the powerful new features in Hadoop 2.x.
> >
> > .. Owen
> >
>

Re: Do we still support hadoop-1.2.x API (-Phadoop-1)?

Posted by Lefty Leverenz <le...@gmail.com>.
This thread needs [DISCUSS] in the subject.

-- Lefty

On Tue, Apr 28, 2015 at 10:58 PM, Owen O'Malley <om...@apache.org> wrote:

> On Tue, Apr 28, 2015 at 5:25 PM, Szehon Ho <sz...@cloudera.com> wrote:
>
> > Hadoop 2 has been GA for a little over a year, there is still a fairly
> > significant user base that uses hadoop-1 and would not be happy with this
> > change.
>
>
> The question isn't whether there are people running Hadoop 1.x, it is
> whether those people are likely to install a new version of Hive running on
> their old Hadoop cluster. As a point of reference, CDH 4 shipped Hadoop 2.0
> and Hive 0.10 and HDP 2.0 shipped Hadoop 2.0 and Hive 0.12.
>
>
> > Perhaps we can declare it deprecated in some future release (perhaps 1.3),
> > then another release to formally remove it, as was done in HBase.
>
>
> Are you interested in managing a Hadoop 1.x compatible version of Hive?
> Maybe we should call the new release Hive 2.0 and enable you to maintain
> the Hive 1.x branch with backwards compatibility with Hadoop 1.x. That
> would enable the rest of the Hive community to move forward and take
> advantage of the powerful new features in Hadoop 2.x.
>
> .. Owen
>

Re: Do we still support hadoop-1.2.x API (-Phadoop-1)?

Posted by Owen O'Malley <om...@apache.org>.
On Tue, Apr 28, 2015 at 5:25 PM, Szehon Ho <sz...@cloudera.com> wrote:

> Hadoop 2 has been GA for a little over a year, there is still a fairly
> significant user base that uses hadoop-1 and would not be happy with this
> change.


The question isn't whether there are people running Hadoop 1.x, it is
whether those people are likely to install a new version of Hive running on
their old Hadoop cluster. As a point of reference, CDH 4 shipped Hadoop 2.0
and Hive 0.10 and HDP 2.0 shipped Hadoop 2.0 and Hive 0.12.


> Perhaps we can declare it deprecated in some future release (perhaps 1.3),
> then another release to formally remove it, as was done in HBase.


Are you interested in managing a Hadoop 1.x compatible version of Hive?
Maybe we should call the new release Hive 2.0 and enable you to maintain
the Hive 1.x branch with backwards compatibility with Hadoop 1.x. That
would enable the rest of the Hive community to move forward and take
advantage of the powerful new features in Hadoop 2.x.

.. Owen

Re: Do we still support hadoop-1.2.x API (-Phadoop-1)?

Posted by Szehon Ho <sz...@cloudera.com>.
Hadoop 2 has been GA for a little over a year, and there is still a fairly
significant user base that uses hadoop-1 and would not be happy with this
change.  It should be removed at some point, but I'm not in favor of
removing it in the next release, which would be too soon.

Perhaps we can declare it deprecated in some future release (perhaps 1.3),
then use another release to formally remove it, as was done in HBase.  HBase
did the formal removal in a major release (1.0), which is a lot cleaner; I'm
not sure we have that luxury now that Hive 1.0 is forked.

Thanks,
Szehon

On Tue, Apr 28, 2015 at 5:03 PM, Prasanth Jayachandran <
pjayachandran@hortonworks.com> wrote:

> I recently filed 5 issues (HIVE-10430, 10431, 10442, 10443, 10444) related
> to build breakage of hadoop-1. There could potentially be more breakage.
> Some patches were added to reduce the number of file systems calls to
> improve performance but with supporting hadoop-1 we cannot directly use
> such APIs. Also we are not enforcing hadoop-1 build checks in hive QA to
> make sure every commit comes out clean on hadoop-1 and hadoop-2. I think it
> will be good if we can focus only hadoop-2. Not only it will simplify
> development but also will reduce the shims layer.
>
> Thanks
> Prasanth
>
>
> > On Apr 28, 2015, at 4:40 PM, Owen O'Malley <om...@apache.org> wrote:
> >
> > It has been three years since Hadoop 2.0.0 was first released and I believe
> > that the vast majority of users that want to run Hive 1.x have moved over
> > to Hadoop 2.x already. It will dramatically simplify Hive development if we
> > drop backwards compatibility with the old Hadoop 1.x line.
> >
> > Thanks,
> >   Owen
>
>

Re: Do we still support hadoop-1.2.x API (-Phadoop-1)?

Posted by Prasanth Jayachandran <pj...@hortonworks.com>.
I recently filed 5 issues (HIVE-10430, 10431, 10442, 10443, 10444) related
to build breakage on hadoop-1. There could potentially be more breakage.
Some patches were added to reduce the number of file system calls and
improve performance, but while we support hadoop-1 we cannot use such APIs
directly. Also, we are not enforcing hadoop-1 build checks in hive QA to
make sure every commit comes out clean on both hadoop-1 and hadoop-2. I think
it would be good if we could focus only on hadoop-2. Not only would it
simplify development, it would also reduce the shims layer.

Thanks
Prasanth


> On Apr 28, 2015, at 4:40 PM, Owen O'Malley <om...@apache.org> wrote:
> 
> It has been three years since Hadoop 2.0.0 was first released and I believe
> that the vast majority of users that want to run Hive 1.x have moved over
> to Hadoop 2.x already. It will dramatically simplify Hive development if we
> drop backwards compatibility with the old Hadoop 1.x line.
> 
> Thanks,
>   Owen


Re: Do we still support hadoop-1.2.x API (-Phadoop-1)?

Posted by Owen O'Malley <om...@apache.org>.
It has been three years since Hadoop 2.0.0 was first released and I believe
that the vast majority of users that want to run Hive 1.x have moved over
to Hadoop 2.x already. It will dramatically simplify Hive development if we
drop backwards compatibility with the old Hadoop 1.x line.

Thanks,
   Owen

Re: Do we still support hadoop-1.2.x API (-Phadoop-1)?

Posted by Ashutosh Chauhan <ha...@apache.org>.
I think it's time to discuss dropping support for the Hadoop-1 line. What
do folks think about Hive-1.2 being the last release supporting the Hadoop-1
line?

Thanks,
Ashutosh

On Tue, Apr 28, 2015 at 11:02 AM, Alexander Pivovarov <ap...@gmail.com>
wrote:

> Hi Everyone
>
> I tried to compile the latest hive with hadoop-1 profile.
> It failed because TestLazySimpleFast (164) uses Text.copyBytes() which is
> hadoop-2.x API
>
> So, which hadoop API should we use in hive? old hadoop-1.x or new
> hadoop-2.x?
>
> Alex
>