Posted to common-user@hadoop.apache.org by Demai Ni <ni...@gmail.com> on 2015/06/01 22:37:40 UTC

a non-commercial distribution of hadoop ecosystem?

Hi guys,

I have been doing some research/POC work using Hadoop. Normally, I either
use Homebrew on a Mac for a single-node installation, or use CDH (Cloudera)
for a small 3~4 node Linux cluster.

My question is: besides the commercial distributions such as CDH (Cloudera),
HDP (Hortonworks), and others like MapR and IBM, is there a distribution that
is NOT owned by a company? I am looking for something simple for cluster
configuration/installation of multiple components: HDFS, YARN, ZooKeeper,
Hive, HBase, maybe Spark. Surely a well-experienced person (not me)
can build the distribution from Apache releases. However, I am more
interested in building applications on top of it, and hope to find a
distribution that packages them together.

BTW, I don't need the latest releases that the commercial distributions
offer. I am also looking into the ODP (the Open Data Platform), but that
project has been kind of quiet since the initial February announcement.

Thanks.

 Demai

Re: a non-commercial distribution of hadoop ecosystem?

Posted by Evans Ye <ev...@apache.org>.
Our company is also using Bigtop to build our private Hadoop
distribution.
A nice thing about Bigtop is that you can patch the source code, add or
modify features you'd like to have, and then leverage Bigtop's packaging and
testing framework to produce production-ready packages, so that cluster admins
can easily upgrade the cluster either to fix bugs or to introduce new features.
We're currently upgrading our production Hadoop to Bigtop 1.0, which
already includes some of our private patches. The distribution looks
promising since we're backed by Bigtop's CI and testing framework.




Re: a non-commercial distribution of hadoop ecosystem?

Posted by Demai Ni <ni...@gmail.com>.
Andrew,

Great to hear that you are also using Bigtop. I will surely try it out to
replace my (slightly) old CDH cluster. :-)

cheers

Demai

On Mon, Jun 1, 2015 at 5:29 PM, Andrew Purtell <an...@gmail.com>
wrote:

> Bigtop, in a nutshell, is a non-commercial multi-stakeholder Apache
> project that produces a build framework that takes as input source from
> Hadoop and related big data projects and produces as output OS native
> packages for installation and management - certainly, a distribution of the
> Hadoop ecosystem - coupled with a suite of integration tests for ensuring
> the distribution components are working well together, coupled with a suite
> of Puppet scripts for post-deploy configuration management. It's a rather
> large nutshell. (Smile)  Bigtop distribution packages are supported by
> Cask's Coopr (coopr.io) and I think to some extent by Ambari (haven't
> tried it).
>
> I've personally used Bigtop for years to produce several custom Hadoop
> distributions. For this purpose it is a great tool.
>
> Please mail user@bigtop.apache.org if you would like to know more, we'd
> love to talk with you.
>
>
> On Jun 2, 2015, at 7:16 AM, Demai Ni <ni...@gmail.com> wrote:
>
> Chris and Roman,
>
> many thanks for the quick response.  I will take a look at Bigtop.
> Actually, I had heard about it, but thought it was an installation framework
> rather than a Hadoop distribution. Now I am looking at the Bigtop 0.7.0
> Hadoop instructions, which will probably work fine for my needs. I appreciate
> the pointer.
>
> Roman, I will ping you off list about ODP. I was hoping ODP would be the one
> for me. Well, in reality, it is owned by a few companies, but at least not by
> ONE company. :-)  That is fine with me, as long as ODP is open to be used by
> others. I am just having trouble finding documentation/installation info for
> ODP. Maybe I should google harder? :-)
>
> Demai
>
>
> On Mon, Jun 1, 2015 at 1:46 PM, Roman Shaposhnik <rv...@apache.org> wrote:
>
>> On Mon, Jun 1, 2015 at 1:37 PM, Demai Ni <ni...@gmail.com> wrote:
>> > My question is besides the commercial distributions: CDH(Cloudera)  , HDP
>> > (Horton work), and others like Mapr, IBM... Is there a distribution that is
>> > NOT owned by a company?  I am looking for something simple for cluster
>> > configuration/installation for multiple components: hdfs, yarn, zookeeper,
>> > hive, hbase, maybe Spark. Surely, for a well-experience person(not me),
>> > he/she can build the distribution from Apache releases. Well, I am more
>> > interested on building application on top of it, and hopefully to find one
>> > packed them together.
>>
>> Apache Bigtop (CCed) aims at delivering a 100% open and
>> community-driven distribution of big data management technologies
>> around Apache Hadoop. Same as, for example, what Debian is trying
>> to do for Linux.
>>
>> > BTW, I don't need the latest releases like other commercial distribution
>> > offered.  I am also looking into the ODP(the open data platform), but that
>> > project is kind of quiet after the initial Feb announcement.
>>
>> Feel free to ping me off list if you want more details on ODP.
>>
>> Thanks,
>> Roman.
>>
>
>

Re: a non-commercial distribution of hadoop ecosystem?

Posted by Andrew Purtell <an...@gmail.com>.
Bigtop, in a nutshell, is a non-commercial multi-stakeholder Apache project that produces a build framework that takes as input source from Hadoop and related big data projects and produces as output OS native packages for installation and management - certainly, a distribution of the Hadoop ecosystem - coupled with a suite of integration tests for ensuring the distribution components are working well together, coupled with a suite of Puppet scripts for post-deploy configuration management. It's a rather large nutshell. (Smile)  Bigtop distribution packages are supported by Cask's Coopr (coopr.io) and I think to some extent by Ambari (haven't tried it).

I've personally used Bigtop for years to produce several custom Hadoop distributions. For this purpose it is a great tool. 

Please mail user@bigtop.apache.org if you would like to know more, we'd love to talk with you.


> On Jun 2, 2015, at 7:16 AM, Demai Ni <ni...@gmail.com> wrote:
> 
> Chris and Roman,
> 
> many thanks for the quick response.  I will take a look at Bigtop. Actually, I had heard about it, but thought it was an installation framework rather than a Hadoop distribution. Now I am looking at the Bigtop 0.7.0 Hadoop instructions, which will probably work fine for my needs. I appreciate the pointer.
> 
> Roman, I will ping you off list about ODP. I was hoping ODP would be the one for me. Well, in reality, it is owned by a few companies, but at least not by ONE company. :-)  That is fine with me, as long as ODP is open to be used by others. I am just having trouble finding documentation/installation info for ODP. Maybe I should google harder? :-)
> 
> Demai 
> 
> 
>> On Mon, Jun 1, 2015 at 1:46 PM, Roman Shaposhnik <rv...@apache.org> wrote:
>> On Mon, Jun 1, 2015 at 1:37 PM, Demai Ni <ni...@gmail.com> wrote:
>> > My question is besides the commercial distributions: CDH(Cloudera)  , HDP
>> > (Horton work), and others like Mapr, IBM... Is there a distribution that is
>> > NOT owned by a company?  I am looking for something simple for cluster
>> > configuration/installation for multiple components: hdfs, yarn, zookeeper,
>> > hive, hbase, maybe Spark. Surely, for a well-experience person(not me),
>> > he/she can build the distribution from Apache releases. Well, I am more
>> > interested on building application on top of it, and hopefully to find one
>> > packed them together.
>> 
>> Apache Bigtop (CCed) aims at delivering a 100% open and
>> community-driven distribution of big data management technologies
>> around Apache Hadoop. Same as, for example, what Debian is trying
>> to do for Linux.
>> 
>> > BTW, I don't need the latest releases like other commercial distribution
>> > offered.  I am also looking into the ODP(the open data platform), but that
>> > project is kind of quiet after the initial Feb announcement.
>> 
>> Feel free to ping me off list if you want more details on ODP.
>> 
>> Thanks,
>> Roman.
> 

Re: a non-commerial distribution of hadoop ecosystem?

Posted by Andrew Purtell <an...@gmail.com>.
Bigtop, in a nutshell, is a non-commercial multi-stakeholder Apache project that produces a build framework that takes as input source from Hadoop and related big data projects and produces as output OS native packages for installation and management - certainly, a distribution of the Hadoop ecosystem - coupled with a suite of integration tests for ensuring the distribution components are working well together, coupled with a suite of Puppet scripts for post-deploy configuration management. It's a rather large nutshell. (Smile)  Bigtop distribution packages are supported by Cask's Coopr (coopr.io) and I think to some extent by Ambari (haven't tried it).

I've personally used Bigtop for years to produce several custom Hadoop distributions. For this purpose it is a great tool. 

Please mail user@bigtop.apache.org if you would like to know more, we'd love to talk with you.


> On Jun 2, 2015, at 7:16 AM, Demai Ni <ni...@gmail.com> wrote:
> 
> Chris and Roman,
> 
> many thanks for the quick response.  I will take a look at bigtop. Actually, I heard about it, but thought it is a installation framework, instead of a hadoop distribution. Now I am looking at the BigTop 0.7.0 hadoop instruction, which probably will work fine for my needs. Appreciate the pointer.
> 
> Roman, I will ping you off list for ODP. I was hoping ODP will be the one for me. Well, in reality, it is owned by a few companies, at least not by ONE company. :-)  It is fine with me, as long as ODP is open to be used by others. I am just having trouble to find document/installation info of the ODP. maybe I should google harder? :-)
> 
> Demai 
> 
> 
>> On Mon, Jun 1, 2015 at 1:46 PM, Roman Shaposhnik <rv...@apache.org> wrote:
>> On Mon, Jun 1, 2015 at 1:37 PM, Demai Ni <ni...@gmail.com> wrote:
>> > My question is besides the commercial distributions: CDH(Cloudera)  , HDP
>> > (Horton work), and others like Mapr, IBM... Is there a distribution that is
>> > NOT owned by a company?  I am looking for something simple for cluster
>> > configuration/installation for multiple components: hdfs, yarn, zookeeper,
>> > hive, hbase, maybe Spark. Surely, for a well-experience person(not me),
>> > he/she can build the distribution from Apache releases. Well, I am more
>> > interested on building application on top of it, and hopefully to find one
>> > packed them together.
>> 
>> Apache Bigtop (CCed) aims at delivering a 100% open and
>> community-driven distribution of big data management technologies
>> around Apache Hadoop. Same as, for example, what Debian is trying
>> to do for Linux.
>> 
>> > BTW, I don't need the latest releases like other commercial distribution
>> > offered.  I am also looking into the ODP(the open data platform), but that
>> > project is kind of quiet after the initial Feb announcement.
>> 
>> Feel free to ping me off list if you want more details on ODP.
>> 
>> Thanks,
>> Roman.
> 

Re: a non-commerial distribution of hadoop ecosystem?

Posted by Andrew Purtell <an...@gmail.com>.
Bigtop, in a nutshell, is a non-commercial multi-stakeholder Apache project that produces a build framework that takes as input source from Hadoop and related big data projects and produces as output OS native packages for installation and management - certainly, a distribution of the Hadoop ecosystem - coupled with a suite of integration tests for ensuring the distribution components are working well together, coupled with a suite of Puppet scripts for post-deploy configuration management. It's a rather large nutshell. (Smile)  Bigtop distribution packages are supported by Cask's Coopr (coopr.io) and I think to some extent by Ambari (haven't tried it).

I've personally used Bigtop for years to produce several custom Hadoop distributions. For this purpose it is a great tool. 

Please mail user@bigtop.apache.org if you would like to know more, we'd love to talk with you.


> On Jun 2, 2015, at 7:16 AM, Demai Ni <ni...@gmail.com> wrote:
> 
> Chris and Roman,
> 
> many thanks for the quick response.  I will take a look at bigtop. Actually, I heard about it, but thought it is a installation framework, instead of a hadoop distribution. Now I am looking at the BigTop 0.7.0 hadoop instruction, which probably will work fine for my needs. Appreciate the pointer.
> 
> Roman, I will ping you off list for ODP. I was hoping ODP will be the one for me. Well, in reality, it is owned by a few companies, at least not by ONE company. :-)  It is fine with me, as long as ODP is open to be used by others. I am just having trouble to find document/installation info of the ODP. maybe I should google harder? :-)
> 
> Demai 
> 
> 
>> On Mon, Jun 1, 2015 at 1:46 PM, Roman Shaposhnik <rv...@apache.org> wrote:
>> On Mon, Jun 1, 2015 at 1:37 PM, Demai Ni <ni...@gmail.com> wrote:
>> > My question is besides the commercial distributions: CDH(Cloudera)  , HDP
>> > (Horton work), and others like Mapr, IBM... Is there a distribution that is
>> > NOT owned by a company?  I am looking for something simple for cluster
>> > configuration/installation for multiple components: hdfs, yarn, zookeeper,
>> > hive, hbase, maybe Spark. Surely, for a well-experience person(not me),
>> > he/she can build the distribution from Apache releases. Well, I am more
>> > interested on building application on top of it, and hopefully to find one
>> > packed them together.
>> 
>> Apache Bigtop (CCed) aims at delivering a 100% open and
>> community-driven distribution of big data management technologies
>> around Apache Hadoop. Same as, for example, what Debian is trying
>> to do for Linux.
>> 
>> > BTW, I don't need the latest releases like other commercial distribution
>> > offered.  I am also looking into the ODP(the open data platform), but that
>> > project is kind of quiet after the initial Feb announcement.
>> 
>> Feel free to ping me off list if you want more details on ODP.
>> 
>> Thanks,
>> Roman.
> 

Re: a non-commerial distribution of hadoop ecosystem?

Posted by Andrew Purtell <an...@gmail.com>.
Bigtop, in a nutshell, is a non-commercial multi-stakeholder Apache project that produces a build framework that takes as input source from Hadoop and related big data projects and produces as output OS native packages for installation and management - certainly, a distribution of the Hadoop ecosystem - coupled with a suite of integration tests for ensuring the distribution components are working well together, coupled with a suite of Puppet scripts for post-deploy configuration management. It's a rather large nutshell. (Smile)  Bigtop distribution packages are supported by Cask's Coopr (coopr.io) and I think to some extent by Ambari (haven't tried it).

I've personally used Bigtop for years to produce several custom Hadoop distributions. For this purpose it is a great tool. 

Please mail user@bigtop.apache.org if you would like to know more, we'd love to talk with you.


> On Jun 2, 2015, at 7:16 AM, Demai Ni <ni...@gmail.com> wrote:
> 
> Chris and Roman,
> 
> many thanks for the quick response.  I will take a look at bigtop. Actually, I heard about it, but thought it is a installation framework, instead of a hadoop distribution. Now I am looking at the BigTop 0.7.0 hadoop instruction, which probably will work fine for my needs. Appreciate the pointer.
> 
> Roman, I will ping you off list for ODP. I was hoping ODP will be the one for me. Well, in reality, it is owned by a few companies, at least not by ONE company. :-)  It is fine with me, as long as ODP is open to be used by others. I am just having trouble to find document/installation info of the ODP. maybe I should google harder? :-)
> 
> Demai 
> 
> 
>> On Mon, Jun 1, 2015 at 1:46 PM, Roman Shaposhnik <rv...@apache.org> wrote:
>> On Mon, Jun 1, 2015 at 1:37 PM, Demai Ni <ni...@gmail.com> wrote:
>> > My question is besides the commercial distributions: CDH(Cloudera)  , HDP
>> > (Horton work), and others like Mapr, IBM... Is there a distribution that is
>> > NOT owned by a company?  I am looking for something simple for cluster
>> > configuration/installation for multiple components: hdfs, yarn, zookeeper,
>> > hive, hbase, maybe Spark. Surely, for a well-experience person(not me),
>> > he/she can build the distribution from Apache releases. Well, I am more
>> > interested on building application on top of it, and hopefully to find one
>> > packed them together.
>> 
>> Apache Bigtop (CCed) aims at delivering a 100% open and
>> community-driven distribution of big data management technologies
>> around Apache Hadoop. Same as, for example, what Debian is trying
>> to do for Linux.
>> 
>> > BTW, I don't need the latest releases like other commercial distribution
>> > offered.  I am also looking into the ODP(the open data platform), but that
>> > project is kind of quiet after the initial Feb announcement.
>> 
>> Feel free to ping me off list if you want more details on ODP.
>> 
>> Thanks,
>> Roman.
> 

Re: a non-commerial distribution of hadoop ecosystem?

Posted by Demai Ni <ni...@gmail.com>.
Chris and Roman,

Many thanks for the quick response.  I will take a look at Bigtop.
Actually, I had heard about it, but thought it was an installation framework
rather than a Hadoop distribution. Now I am looking at the Bigtop 0.7.0
Hadoop instructions, which will probably work fine for my needs. I appreciate
the pointer.

Roman, I will ping you off list about ODP. I was hoping ODP would be the one
for me. Well, in reality, it is owned by a few companies, but at least not by
ONE company. :-)  That is fine with me, as long as ODP is open for others to
use. I am just having trouble finding documentation/installation info for
ODP. Maybe I should google harder? :-)

Demai


On Mon, Jun 1, 2015 at 1:46 PM, Roman Shaposhnik <rv...@apache.org> wrote:

> On Mon, Jun 1, 2015 at 1:37 PM, Demai Ni <ni...@gmail.com> wrote:
> > My question is besides the commercial distributions: CDH(Cloudera)  , HDP
> > (Horton work), and others like Mapr, IBM... Is there a distribution that is
> > NOT owned by a company?  I am looking for something simple for cluster
> > configuration/installation for multiple components: hdfs, yarn, zookeeper,
> > hive, hbase, maybe Spark. Surely, for a well-experience person(not me),
> > he/she can build the distribution from Apache releases. Well, I am more
> > interested on building application on top of it, and hopefully to find one
> > packed them together.
>
> Apache Bigtop (CCed) aims at delivering a 100% open and
> community-driven distribution of big data management technologies
> around Apache Hadoop. Same as, for example, what Debian is trying
> to do for Linux.
>
> > BTW, I don't need the latest releases like other commercial distribution
> > offered.  I am also looking into the ODP(the open data platform), but that
> > project is kind of quiet after the initial Feb announcement.
>
> Feel free to ping me off list if you want more details on ODP.
>
> Thanks,
> Roman.
>

Re: a non-commerial distribution of hadoop ecosystem?

Posted by jay vyas <ja...@gmail.com>.
I'll reiterate part of what Roman said :)

Apache Bigtop IS the upstream open source Hadoop distribution :)  It's the
only open distro out there that is actually contributed to and built by
people who are also committers/contributors to the sister projects (HBase,
Hadoop, Spark, and so on).

Engineers at Cloudera, Rackspace, Pivotal, WanDisco, Red Hat, Hortonworks,
and many other companies regularly contribute to and advise on its
development. Although the big vendors have a love/hate relationship with it,
it really is the most integrated and robust fully open Hadoop distribution
out there.

And it's super easy to test out and play with.  Just join the mailing list
and ask us where to get started; you can have a working custom Hadoop
cluster with Spark or Hadoop or HBase running on multiple VMs in a matter of
minutes.

We also curate reference implementations of full-stack applications, with
idiomatic unit testing / build / data lifecycling, that you can use to build
"real world" big data applications as well.

See you on #bigtop on IRC or on the mailing list!  We're happy to help you
get started.





On Mon, Jun 1, 2015 at 4:46 PM, Roman Shaposhnik <rv...@apache.org> wrote:

> On Mon, Jun 1, 2015 at 1:37 PM, Demai Ni <ni...@gmail.com> wrote:
> > My question is besides the commercial distributions: CDH(Cloudera)  , HDP
> > (Horton work), and others like Mapr, IBM... Is there a distribution that is
> > NOT owned by a company?  I am looking for something simple for cluster
> > configuration/installation for multiple components: hdfs, yarn, zookeeper,
> > hive, hbase, maybe Spark. Surely, for a well-experience person(not me),
> > he/she can build the distribution from Apache releases. Well, I am more
> > interested on building application on top of it, and hopefully to find one
> > packed them together.
>
> Apache Bigtop (CCed) aims at delivering a 100% open and
> community-driven distribution of big data management technologies
> around Apache Hadoop. Same as, for example, what Debian is trying
> to do for Linux.
>
> > BTW, I don't need the latest releases like other commercial distribution
> > offered.  I am also looking into the ODP(the open data platform), but that
> > project is kind of quiet after the initial Feb announcement.
>
> Feel free to ping me off list if you want more details on ODP.
>
> Thanks,
> Roman.
>



-- 
jay vyas

Re: a non-commerial distribution of hadoop ecosystem?

Posted by Roman Shaposhnik <rv...@apache.org>.
On Mon, Jun 1, 2015 at 1:37 PM, Demai Ni <ni...@gmail.com> wrote:
> My question is besides the commercial distributions: CDH(Cloudera)  , HDP
> (Horton work), and others like Mapr, IBM... Is there a distribution that is
> NOT owned by a company?  I am looking for something simple for cluster
> configuration/installation for multiple components: hdfs, yarn, zookeeper,
> hive, hbase, maybe Spark. Surely, for a well-experience person(not me),
> he/she can build the distribution from Apache releases. Well, I am more
> interested on building application on top of it, and hopefully to find one
> packed them together.

Apache Bigtop (CCed) aims at delivering a 100% open and
community-driven distribution of big data management technologies
around Apache Hadoop. Same as, for example, what Debian is trying
to do for Linux.

> BTW, I don't need the latest releases like other commercial distribution
> offered.  I am also looking into the ODP(the open data platform), but that
> project is kind of quiet after the initial Feb announcement.

Feel free to ping me off list if you want more details on ODP.

Thanks,
Roman.

Re: a non-commerial distribution of hadoop ecosystem?

Posted by Chris Nauroth <cn...@hortonworks.com>.
Hello Demai,

Apache Bigtop is a project that tests and publishes rpm and deb packages for Hadoop ecosystem components.  They'll have more details on their own site.

http://bigtop.apache.org/

Would this suit your needs?

--Chris Nauroth

From: Demai Ni <ni...@gmail.com>
Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Date: Monday, June 1, 2015 at 1:37 PM
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Subject: a non-commerial distribution of hadoop ecosystem?

hi, Guys,

I have been doing some research/POC using hadoop system. Normally, I either use homebrew on mac for single node installation, or use CDH(Cloudera) for a 3~4 nodes small linux cluster.

My question is besides the commercial distributions: CDH(Cloudera)  , HDP (Horton work), and others like Mapr, IBM... Is there a distribution that is NOT owned by a company?  I am looking for something simple for cluster configuration/installation for multiple components: hdfs, yarn, zookeeper, hive, hbase, maybe Spark. Surely, for a well-experience person(not me), he/she can build the distribution from Apache releases. Well, I am more interested on building application on top of it, and hopefully to find one packed them together.

BTW, I don't need the latest releases like other commercial distribution offered.  I am also looking into the ODP(the open data platform), but that project is kind of quiet after the initial Feb announcement.

Thanks.

 Demai
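
Editorial sketch: since the reply above describes Bigtop as publishing rpm and deb packages, installing the components from the original question would likely come down to one native package-manager call per host. The Python below only builds that command line; it does not run it. The package names (hadoop, zookeeper, hive, hbase) and the assumption that a Bigtop package repository is already configured on the host are illustrative guesses, not something stated in this thread, so check the Bigtop site for the names used by a given release.

```python
import shlex

# Native install command prefix for each package format.
# Assumption: a Bigtop repository is already configured on the host.
INSTALLERS = {
    "deb": ["apt-get", "install", "-y"],
    "rpm": ["yum", "install", "-y"],
}


def install_command(pkg_format, components):
    """Return the argv list that would install the given packages."""
    if pkg_format not in INSTALLERS:
        raise ValueError(f"unsupported package format: {pkg_format}")
    if not components:
        raise ValueError("no components requested")
    return INSTALLERS[pkg_format] + list(components)


if __name__ == "__main__":
    # Hypothetical component selection, taken from the original question.
    wanted = ["hadoop", "zookeeper", "hive", "hbase"]
    print(shlex.join(install_command("deb", wanted)))
```

Returning an argv list rather than a shell string keeps the sketch safe to pass to subprocess.run without shell quoting concerns, should someone want to execute it.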

Re: a non-commerial distribution of hadoop ecosystem?

Posted by Chris Nauroth <cn...@hortonworks.com>.
Hello Demai,

Apache Bigtop is a project that tests and publishes rpm and deb packages for Hadoop ecosystem components.  They'll have more details on their own site.

http://bigtop.apache.org/

Would this suit your needs?

--Chris Nauroth

From: Demai Ni <ni...@gmail.com>>
Reply-To: "user@hadoop.apache.org<ma...@hadoop.apache.org>" <us...@hadoop.apache.org>>
Date: Monday, June 1, 2015 at 1:37 PM
To: "user@hadoop.apache.org<ma...@hadoop.apache.org>" <us...@hadoop.apache.org>>
Subject: a non-commerial distribution of hadoop ecosystem?

hi, Guys,

I have been doing some research/POC using hadoop system. Normally, I either use homebrew on mac for single node installation, or use CDH(Cloudera) for a 3~4 nodes small linux cluster.

My question is besides the commercial distributions: CDH(Cloudera)  , HDP (Horton work), and others like Mapr, IBM... Is there a distribution that is NOT owned by a company?  I am looking for something simple for cluster configuration/installation for multiple components: hdfs, yarn, zookeeper, hive, hbase, maybe Spark. Surely, for a well-experience person(not me), he/she can build the distribution from Apache releases. Well, I am more interested on building application on top of it, and hopefully to find one packed them together.

BTW, I don't need the latest releases like other commercial distribution offered.  I am also looking into the ODP(the open data platform), but that project is kind of quiet after the initial Feb announcement.

Thanks.

 Demai

Re: a non-commerial distribution of hadoop ecosystem?

Posted by Roman Shaposhnik <rv...@apache.org>.
On Mon, Jun 1, 2015 at 1:37 PM, Demai Ni <ni...@gmail.com> wrote:
> My question is besides the commercial distributions: CDH(Cloudera)  , HDP
> (Horton work), and others like Mapr, IBM... Is there a distribution that is
> NOT owned by a company?  I am looking for something simple for cluster
> configuration/installation for multiple components: hdfs, yarn, zookeeper,
> hive, hbase, maybe Spark. Surely, for a well-experience person(not me),
> he/she can build the distribution from Apache releases. Well, I am more
> interested on building application on top of it, and hopefully to find one
> packed them together.

Apache Bigtop (CCed) aims at delivering a 100% open and
community-driven distribution of big data management technologies
around Apache Hadoop. Same as, for example, what Debian is trying
to do for Linux.

> BTW, I don't need the latest releases like other commercial distribution
> offered.  I am also looking into the ODP(the open data platform), but that
> project is kind of quiet after the initial Feb announcement.

Feel free to ping me off list if you want more details on ODP.

Thanks,
Roman.
