Posted to dev@reef.apache.org by Sergiy Matusevych <se...@gmail.com> on 2017/03/27 17:34:54 UTC

Upgrade to Hadoop 2.7.3 for 0.16?

Hi fellow REEF developers,

Our build currently requires Hadoop 2.6. This is fine for all things REEF,
except the new Unmanaged AM feature (plus the related functionality, like
REEF-on-REEF, Spark integration, etc.), which requires YARN 2.7.3 or higher
to work properly. Do we want to upgrade our build to Hadoop 2.7.3 in 0.16
to be in sync with the YARN server version that we need?

What do you guys think?
-- Sergiy.

Re: Upgrade to Hadoop 2.7.3 for 0.16?

Posted by Byung-Gon Chun <bg...@gmail.com>.
Thanks, Rogan.

Any benefit in moving from 2.6 to 2.7.0?

Thanks.
-Gon

On Thu, Jul 6, 2017 at 6:27 AM, Rogan Carr <ro...@gmail.com> wrote:

> Hi Gon,
>
> > Any compatibility problem with HDInsight?
>
> The latest version of Hadoop on Windows HDInsight is 2.7.0, so 2.7.3 will
> be incompatible.
>
> Best,
> Rogan



-- 
Byung-Gon Chun

Re: Upgrade to Hadoop 2.7.3 for 0.16?

Posted by Rogan Carr <ro...@gmail.com>.
Hi Gon,

> Any compatibility problem with HDInsight?

The latest version of Hadoop on Windows HDInsight is 2.7.0, so 2.7.3 will
be incompatible.

Best,
Rogan


On Mon, Jul 3, 2017 at 10:22 PM, Byung-Gon Chun <bg...@gmail.com> wrote:

> The 0.16 release is blocked by .NET test failures.
>
> Since our release schedule's delayed anyway, how about moving to hadoop
> version 2.7.3 for the 0.16 release? :)
> Any compatibility problem with HDInsight?
>
> Thanks!
> -Gon

Re: Upgrade to Hadoop 2.7.3 for 0.16?

Posted by Byung-Gon Chun <bg...@gmail.com>.
The 0.16 release is blocked by .NET test failures.

Since our release schedule's delayed anyway, how about moving to hadoop
version 2.7.3 for the 0.16 release? :)
Any compatibility problem with HDInsight?

Thanks!
-Gon

On Wed, Mar 29, 2017 at 1:31 PM, Markus Weimer <ma...@weimo.de> wrote:

> Awesome! Can you file JIRAs for this, blocked by the 0.16 release?
>
> Thanks!
>
> Markus



-- 
Byung-Gon Chun

Re: Upgrade to Hadoop 2.7.3 for 0.16?

Posted by Markus Weimer <ma...@weimo.de>.
Awesome! Can you file JIRAs for this, blocked by the 0.16 release?

Thanks!

Markus

On Tue, Mar 28, 2017 at 7:32 PM, Anupam <an...@gmail.com> wrote:
> Once we move to version 2.7+ we can also deprecate
> Org.Apache.REEF.Client.Yarn.LegacyJobResourceUploader after implementing
> Org.Apache.REEF.IO.FileSystem.Hadoop.HadoopFileSystem.GetFileStatus.

Re: Upgrade to Hadoop 2.7.3 for 0.16?

Posted by Anupam <an...@gmail.com>.
Once we move to version 2.7+ we can also deprecate
Org.Apache.REEF.Client.Yarn.LegacyJobResourceUploader after implementing
Org.Apache.REEF.IO.FileSystem.Hadoop.HadoopFileSystem.GetFileStatus.



On 28 March 2017 at 10:29, Markus Weimer <ma...@weimo.de> wrote:

> Hi Sergiy,
>
> That is odd. We have `<hadoop.version>2.6.0</hadoop.version>` in our
> pom.xml. Maybe you fell victim to some odd escaping issue?
>
> Markus



-- 
Anupam
Bellevue, WA
Ph: +1 (425)-777-5570

Re: Upgrade to Hadoop 2.7.3 for 0.16?

Posted by Markus Weimer <ma...@weimo.de>.
Hi Sergiy,

That is odd. We have `<hadoop.version>2.6.0</hadoop.version>` in our
pom.xml. Maybe you fell victim to some odd escaping issue?

Markus

On Mon, Mar 27, 2017 at 11:40 PM, Sergiy Matusevych
<se...@gmail.com> wrote:
> Hi Markus,
>
> That sounds like a very good idea! I currently have some problems running
> maven like this (it does not like the dot at "hadoop.version" property
> name), but I'll keep trying tomorrow. Worst case, we'll have to use some
> other property name.
>
> P.S. Either way, we *must* document the behavior in the README. I'll do
> that once I figure out the exact mvn command line.
>
> Cheers,
> Sergiy.

Re: Upgrade to Hadoop 2.7.3 for 0.16?

Posted by Sergiy Matusevych <se...@gmail.com>.
Hi Markus,

That sounds like a very good idea! I currently have some problems running
maven like this (it does not like the dot at "hadoop.version" property
name), but I'll keep trying tomorrow. Worst case, we'll have to use some
other property name.

P.S. Either way, we *must* document the behavior in the README. I'll do
that once I figure out the exact mvn command line.

Cheers,
Sergiy.

On Mon, Mar 27, 2017 at 9:35 PM, Markus Weimer <ma...@weimo.de> wrote:

> On Mon, Mar 27, 2017 at 8:13 PM, Sergiy Matusevych
> <se...@gmail.com> wrote:
> > In other words, it would be much easier to run and debug our apps on cisl
> > cluster if we compile against hadoop 2.7.3; that would also make Unmanaged
> > AM mode available without caveats.
>
> Can't we add this to the readme and suggest people compile with
> `-Dhadoop.version=2.7.3` if they are interested in these new features? We
>
> Markus
>

Re: Upgrade to Hadoop 2.7.3 for 0.16?

Posted by Markus Weimer <ma...@weimo.de>.
On Mon, Mar 27, 2017 at 8:13 PM, Sergiy Matusevych
<se...@gmail.com> wrote:
> In other words, it would be much easier to run and debug our apps on cisl
> cluster if we compile against hadoop 2.7.3; that would also make Unmanaged
> AM mode available without caveats.

Can't we add this to the readme and suggest people compile with
`-Dhadoop.version=2.7.3` if they are interested in these new features? We
can then require 2.7.3 for the next release.

Markus
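
For reference, a minimal sketch of the invocation suggested above. It assumes
the build reads the `hadoop.version` property from pom.xml, as this thread
indicates; the `clean install` goals and the variable names are illustrative
only. Quoting the whole `-D` argument is one way to keep a shell from
splitting it at the dot, though the thread does not confirm what caused the
problem reported elsewhere in this discussion.

```shell
# Hypothetical variable names; hadoop.version is the property named in the thread.
HADOOP_VERSION="2.7.3"
PROP="-Dhadoop.version=${HADOOP_VERSION}"

# Print the command rather than running it, so the sketch has no side effects.
# Remove the leading `echo` to start the build for real.
echo mvn clean install "${PROP}"
```

Keeping the version in one variable makes it easy to try a different Hadoop
release without editing pom.xml.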

Re: Upgrade to Hadoop 2.7.3 for 0.16?

Posted by Sergiy Matusevych <se...@gmail.com>.
Hi Markus, Gon,

These are very good questions. Here's what I observe:

1) REEF (including Unmanaged AM functionality) compiles with Hadoop 2.6
jars just fine.

2) All unit tests pass in local mode regardless of the Hadoop version that
REEF is compiled against.

3) When compiled against Hadoop 2.6 (i.e. the current default), REEF Java
apps and unit tests run on YARN only when Hadoop 2.6 jars are in the
classpath. Note that it is still OK to run YARN 2.7+ RM, NM, and other
services; we just need the 2.6 jars available to our app at runtime on all
nodes that run REEF Java parts.

3a) Item #3 is why Todd's application did not run on cisl-slave-002. It was
quite hard to figure out - the app fails at runtime because it cannot find
some method that was available in Hadoop 2.6 but not in 2.7.

4) Our CISL cluster runs Hadoop 2.7.1. If we compile REEF against Hadoop
2.7.3, everything works fine on YARN.

5) I don't know what version of YARN HDInsight uses - I'll ask tomorrow.
What really matters, though, is what version of Hadoop jars we have
available to the apps on HDI.


In other words, it would be much easier to run and debug our apps on cisl
cluster if we compile against hadoop 2.7.3; that would also make Unmanaged
AM mode available without caveats.

Alternatively, we can keep compiling against Hadoop 2.6, but then we have to
write a warning in the README that the same Hadoop 2.6 jars must be
available in the classpath on all YARN nodes, and that a YARN 2.7.3+ RM is
required for the Unmanaged AM mode.

Cheers,
Sergiy.
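
The classpath caveat in items 3 and 3a could be checked before launching a
job. The sketch below is an illustration only: it assumes that agreement on
the major.minor part of the version is what matters, which is consistent with
items 3 and 4 but is not stated outright in this thread. The function and
variable names are hypothetical; on a real node the runtime value could be
read from the first line of `hadoop version`.

```shell
# Keep only the first two dot-separated fields, e.g. 2.7.3 -> 2.7
major_minor() {
  printf '%s\n' "$1" | cut -d. -f1,2
}

# Succeeds when both versions belong to the same major.minor line.
same_hadoop_line() {
  [ "$(major_minor "$1")" = "$(major_minor "$2")" ]
}

BUILD_VERSION="2.6.0"    # version REEF was compiled against (example value)
RUNTIME_VERSION="2.7.1"  # version of the jars found on the node (example value)

if same_hadoop_line "$BUILD_VERSION" "$RUNTIME_VERSION"; then
  RESULT="match"
else
  RESULT="mismatch"
fi
echo "build=$BUILD_VERSION runtime=$RUNTIME_VERSION: $RESULT"
```

With the example values this reports a mismatch, the situation described in
item 3a; a build against 2.7.3 compared with jars from 2.7.1 would pass.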



On Mon, Mar 27, 2017 at 5:56 PM, Byung-Gon Chun <bg...@gmail.com> wrote:

> What version of YARN does HDInsight use? Is it fine to upgrade to YARN
> 2.7.3?
>
> If not, for now we can state that Unmanaged AM (REEF-as-a-library) requires
> YARN 2.7.3 or higher in README.

Re: Upgrade to Hadoop 2.7.3 for 0.16?

Posted by Byung-Gon Chun <bg...@gmail.com>.
What version of YARN does HDInsight use? Is it fine to upgrade to YARN
2.7.3?

If not, for now we can state that Unmanaged AM (REEF-as-a-library) requires
YARN 2.7.3 or higher in README.



On Tue, Mar 28, 2017 at 2:34 AM, Sergiy Matusevych <
sergiy.matusevych@gmail.com> wrote:

> Hi fellow REEF developers,
>
> Our build currently requires Hadoop 2.6. This is fine for all things REEF,
> except the new Unmanaged AM feature (plus the related functionality, like
> REEF-on-REEF, Spark integration, etc), that requires YARN 2.7.3 or higher
> to work properly. Do we want to upgrade our build to Hadoop 2.7.3 in 0.16
> to be in sync with the YARN server version that we need?
>
> What do you guys think?
> -- Sergiy.
>



-- 
Byung-Gon Chun

Re: Upgrade to Hadoop 2.7.3 for 0.16?

Posted by Markus Weimer <ma...@weimo.de>.
Does the code not compile with 2.6? -- Markus

On Mon, Mar 27, 2017 at 10:34 AM, Sergiy Matusevych
<se...@gmail.com> wrote:
> Hi fellow REEF developers,
>
> Our build currently requires Hadoop 2.6. This is fine for all things REEF,
> except the new Unmanaged AM feature (plus the related functionality, like
> REEF-on-REEF, Spark integration, etc), that requires YARN 2.7.3 or higher
> to work properly. Do we want to upgrade our build to Hadoop 2.7.3 in 0.16
> to be in sync with the YARN server version that we need?
>
> What do you guys think?
> -- Sergiy.