Posted to dev@bigtop.apache.org by "MrAsanjar ." <af...@gmail.com> on 2016/01/28 05:05:04 UTC

A Bigtop build question

Do the artifacts in the Maven "*local*" repository remain persistent between
Bigtop component builds (e.g. Hadoop, Spark, Hive)?

This question is relevant to any Bigtop project (e.g. Spark, HBase,
Hive) with a build dependency on Hadoop libraries.

For a Power8 (or any non-x86) build, these projects require the Power8
Hadoop libraries to be copied to the "*local*" Maven repository prior to
the build, because the public Maven repository hosts only the x86 version
of the Hadoop artifacts.

For example, before building Spark for Power, the Bigtop Hadoop build for
Power must first be copied to the local Maven repository.

However, evidence suggests that either the "install" goal in Hadoop's
do-component-build file isn't working as designed :) or the Maven local
repository is not persistent.

mvn $ANT_OPTS $BUNDLE_SNAPPY -Pdist -Pnative -Psrc -Dtar ${MAVEN_OPTS}
install "$@"
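If the install goal did run, the Hadoop artifacts should be visible under the
local repository. A quick way to check, as a sketch (the 2.7.1 version below is
illustrative, not necessarily what Bigtop builds):

```shell
# Compute where Maven's "install" goal places an artifact in the local
# repository: ~/.m2/repository/<group with dots as slashes>/<artifact>/<version>
artifact_dir() {
  group_path=$(printf '%s' "$1" | tr . /)
  printf '%s\n' "${HOME}/.m2/repository/${group_path}/$2/$3"
}

# Did the Power build of hadoop-common land in the local repo?
# (2.7.1 is an illustrative version -- use the version Bigtop actually builds.)
ls "$(artifact_dir org.apache.hadoop hadoop-common 2.7.1)" 2>/dev/null \
  || echo "hadoop-common 2.7.1 is not in the local repository"
```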

Re: A Bigtop build question

Posted by "MrAsanjar ." <af...@gmail.com>.
Jay, Hadoop also ships native libraries with JNI wrappers that are
platform-specific.
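This is easy to verify: the JNI pieces are per-architecture ELF objects, while
the jars themselves are portable. A small sketch (the library path is
illustrative; adjust to wherever your Hadoop build unpacks):

```shell
# Print the target architecture of a native shared object, e.g. "x86-64"
# for an Intel build vs "PowerPC" for a Power8 build. Plain jar files
# carry no such tag -- only the JNI .so objects are architecture-bound.
native_arch() {
  file -L "$1" | grep -oE 'x86-64|PowerPC|aarch64' | head -n1
}

# Illustrative path; an x86 libhadoop.so cannot be loaded by a Power8 JVM.
native_arch /usr/lib/hadoop/lib/native/libhadoop.so
```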


On Wed, Jan 27, 2016 at 10:21 PM, Jay Vyas <ja...@gmail.com>
wrote:

> Maven is building jar files...
>
> Shouldn't those "just work" on Power's JVM the way any jar would?

Re: A Bigtop build question

Posted by Jay Vyas <ja...@gmail.com>.
Maven is building jar files...

Shouldn't those "just work" on Power's JVM the way any jar would?


Re: A Bigtop build question

Posted by "MrAsanjar ." <af...@gmail.com>.
Excellent.
Have you also considered setting up a private Maven repository in S3 for
the Bigtop distribution?
It would be useful not only for the current issue but also for development.
As a big data application developer, I would rather have my MapReduce or
Spark application built against Bigtop distribution artifacts.
I could help with building such a repository; we could start with Power as
a PoC.


Re: A Bigtop build question

Posted by Ashish Singh <as...@hortonworks.com>.
Sure Cos, I will provide the patch for Hadoop and ZooKeeper to begin with.

@Asanjar,

With respect to mvn install, are you saying that Maven is not installing
the artifacts into the local ~/.m2 cache on Power machines? Or am I not
understanding your question properly?

I hope that on the Power machine all the components are being built
sequentially and on the same machine.
In that case the ~/.m2 cache artifacts can be consumed for downstream
build dependencies.

If components are built on different Power machines/containers, the local
~/.m2 cache has to be shared across build machines via some sharing
mechanism.
Another approach is to upload the artifacts to Nexus/Artifactory, so that
downstream components can consume them directly from Nexus/Artifactory
when they are not present in the local ~/.m2 cache.
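The Nexus/Artifactory route can be sketched roughly as below. The repository
id and URL are placeholders, not a real Bigtop endpoint;
`altDeploymentRepository` is the stock maven-deploy-plugin 2.x override
(id::layout::url syntax):

```shell
# Deploy freshly built artifacts to a shared repository manager so that
# downstream component builds on other machines/containers can resolve
# them. The repository id and URL below are placeholders.
NEXUS_URL="http://nexus.example.com/repository/bigtop-snapshots"

# Only meaningful from inside a Maven project directory.
if command -v mvn >/dev/null 2>&1 && [ -f pom.xml ]; then
  mvn deploy -DskipTests \
    -DaltDeploymentRepository="bigtop-snapshots::default::${NEXUS_URL}"
fi
```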

Anyway, I am working on the deploy part, so that components can deploy
their artifacts to Nexus/Artifactory along with the builds.

~Ashish





Re: A Bigtop build question

Posted by Konstantin Boudnik <co...@apache.org>.
On Wed, Jan 27, 2016 at 11:12PM, MrAsanjar . wrote:
> Cos, regardless of this Power issue, wouldn't you agree that Bigtop, as a
> distribution, should first consume the artifacts built by the distribution?

I would think so, yes. In fact, everyone on this list is thinking exactly
this. We have had these discussions a couple of times, and even an attempt
or two to implement it. Let me explain further down.

> Anyway, how about this as a workaround: could we add a "./gradlew
> install-hadoop" step after the Hadoop build and before the other
> components? For example:
> 
>   docker run -v `pwd`:/ws bigtop/slaves:trunk-fedora-22-ppc64le bash -l -c
> 'cd /ws ; ./gradlew hadoop-deb'
> 
> *./gradlew install-hadoop*
> 
>  docker run -v `pwd`:/ws bigtop/slaves:trunk-fedora-22-ppc64le bash -l -c
> 'cd /ws ; ./gradlew spark-deb'
>  docker run -v `pwd`:/ws bigtop/slaves:trunk-fedora-22-ppc64le bash -l -c
> 'cd /ws ; ./gradlew hbase-deb'

This would require a pass-through of tasks not specific to the top-level
build all the way down to the packages. And it isn't enough to just install
the artifact: you would need to mount the shared local .m2 directory into
all subsequent container runs. In the case of our CI we will actually have
to deploy the artifacts into a Nexus server, so that other parts of the
stack build can take advantage of them. That leads to a need to preserve
the order of build execution (which we can do with the directed-graph
feature of the build, but not in separate Docker containers).
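A rough sketch of the shared-.m2 variant of the workaround, assuming the
containerized build runs as root so its local repository is /root/.m2 (adjust
the mount target otherwise):

```shell
# Mount one host-side Maven repository into every build container so that
# artifacts installed by the Hadoop build are visible to the later Spark
# and HBase builds. Image tag is the one used in this thread.
M2="${HOME}/.m2"
IMG="bigtop/slaves:trunk-fedora-22-ppc64le"
mkdir -p "${M2}"

if command -v docker >/dev/null 2>&1 \
    && docker image inspect "${IMG}" >/dev/null 2>&1; then
  for target in hadoop-deb spark-deb hbase-deb; do
    docker run -v "$(pwd)":/ws -v "${M2}":/root/.m2 "${IMG}" \
      bash -l -c "cd /ws ; ./gradlew ${target}"
  done
fi
```

This covers persistence only; ordering across separate containers still needs
the build-order handling described above.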

I expect to see a patch for a more transparent install/deploy mechanism for
Hadoop and ZK sometime tomorrow. It has been discussed earlier today with
someone who wants to contribute this piece of code into Bigtop.

Cos


Re: A Bigtop build question

Posted by "MrAsanjar ." <af...@gmail.com>.
Cos, regardless of this Power issue, wouldn't you agree that Bigtop, as a
distribution, should first consume the artifacts built by the distribution?

Anyway, how about this as a workaround: could we add a "./gradlew
install-hadoop" step after the Hadoop build and before the other
components? For example:

  docker run -v `pwd`:/ws bigtop/slaves:trunk-fedora-22-ppc64le bash -l -c
'cd /ws ; ./gradlew hadoop-deb'

*./gradlew install-hadoop*

 docker run -v `pwd`:/ws bigtop/slaves:trunk-fedora-22-ppc64le bash -l -c
'cd /ws ; ./gradlew spark-deb'
 docker run -v `pwd`:/ws bigtop/slaves:trunk-fedora-22-ppc64le bash -l -c
'cd /ws ; ./gradlew hbase-deb'
.....
.....
.....


Re: A Bigtop build question

Posted by Konstantin Boudnik <co...@apache.org>.

If you're building this with Bigtop's current CI, then it is possible that
~/.m2 isn't persisted between two different containers. However, if you're
running component builds on the same machine, the local repo artifacts
should be persisted.

Cos