Posted to user@bigtop.apache.org by David Fryer <df...@gmail.com> on 2014/07/16 16:39:05 UTC

New to Bigtop, where to start?

Hi Bigtop!

I'm looking to use bigtop to help set up a small hadoop cluster. I'm
currently messing about with the hadoop tarball and all of the associated
xml files, and I don't really have the time or expertise to get it up and
working.

Jay suggested that bigtop may be a good solution, so I've decided to give
it a shot. Unfortunately, documentation is fairly sparse and I'm not quite
sure where to start. I've cloned the github repo and used the startup.sh
script found in bigtop/bigtop-deploy/vm/vagrant-puppet to set up a virtual
cluster, but I am unsure how to apply this to physical machines. I'm also
not quite sure how to get hadoop and hdfs up and working.

Any help would be appreciated!

Thanks,
David Fryer

Re: New to Bigtop, where to start?

Posted by David Fryer <df...@gmail.com>.
Thanks Mark, each machine now runs in pseudo-distributed mode!


On Wed, Jul 16, 2014 at 12:56 PM, Mark Grover <gr...@gmail.com>
wrote:

> The 'hadoop' package just delivers the hadoop common bits but no init
> scripts to start the services, and no convenience artifacts that deploy
> configuration for, say, starting a hadoop pseudo-distributed cluster. For
> all practical purposes, you are going to need the hadoop-hdfs and
> hadoop-mapreduce packages, which deliver the bits for HDFS and MR.
> However, even that may not be enough; you likely need init scripts to be
> installed for starting and stopping the services related to HDFS and MR.
> So, depending on whether you are installing Hadoop on a fully-distributed
> cluster or a pseudo-distributed cluster, you may need to install one or
> more services (and hence packages) like resource manager, node manager,
> namenode and datanode on the node(s). Then, you will have to deploy the
> configuration yourself. We have default configuration installed by
> packages, but you definitely need to add some entries to make it work for
> a fully-distributed cluster, e.g. adding the name of the namenode host to
> the configuration of the datanodes. If you are using just a
> pseudo-distributed cluster, you can install the pseudo-distributed
> configuration package (which has all the necessary dependencies, so
> installing that and nothing else should be enough) and you will get an
> out-of-the-box experience.
>
> FYI, if you run
> yum list 'hadoop*'
> you will get a list of all hadoop-related packages that are available to
> be installed.
>
>
>
> On Wed, Jul 16, 2014 at 9:39 AM, David Fryer <df...@gmail.com> wrote:
>
>> Is it necessary to install the whole hadoop stack?
>>
>>
>> On Wed, Jul 16, 2014 at 12:37 PM, David Fryer <df...@gmail.com>
>> wrote:
>>
>>> The only output from that is:
>>> hadoop-2.0.5.1-1.el6.x86_64
>>>
>>> -David
>>>
>>>
>>> On Wed, Jul 16, 2014 at 12:34 PM, Mark Grover <ma...@apache.org> wrote:
>>>
>>>> Possibly. Can you check which packages related to hadoop you have
>>>> installed?
>>>>
>>>> rpm -qa | grep hadoop
>>>>
>>>>
>>>> On Wed, Jul 16, 2014 at 9:28 AM, David Fryer <df...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Mark,
>>>>> I'm trying to follow those instructions on a CentOS 6 machine, and
>>>>> after running "yum install hadoop\*", I can't find anything related to
>>>>> hadoop in /etc/init.d. Is there something I'm missing?
>>>>>
>>>>> -David
>>>>>
>>>>>
>>>>> On Wed, Jul 16, 2014 at 11:34 AM, Mark Grover <ma...@apache.org> wrote:
>>>>>
>>>>>> Welcome, David.
>>>>>>
>>>>>> For physical machines, I personally always use instructions like
>>>>>> these:
>>>>>>
>>>>>> https://cwiki.apache.org/confluence/display/BIGTOP/How+to+install+Hadoop+distribution+from+Bigtop+0.6.0
>>>>>>
>>>>>> These are for Bigtop 0.6.0; the latest Bigtop release is 0.7.0, but
>>>>>> we don't have a page for that, unfortunately (we should, and if you
>>>>>> could help with that, it'd be much appreciated!). We are tying up
>>>>>> loose ends for Bigtop 0.8, so we hope to release it soon.
>>>>>>
>>>>>> Mark
>>>>>>
>>>>>>
>>>>>> On Wed, Jul 16, 2014 at 8:20 AM, jay vyas <
>>>>>> jayunit100.apache@gmail.com> wrote:
>>>>>>
>>>>>>> One more note: by "look at the csv file" above I meant "edit it so
>>>>>>> that it reflects your environment".
>>>>>>>
>>>>>>> Make sure to read the puppet README file as well, under
>>>>>>> bigtop-deploy/puppet.
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Jul 16, 2014 at 11:15 AM, jay vyas <
>>>>>>> jayunit100.apache@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi David,
>>>>>>>>
>>>>>>>> Glad to hear the vagrant stuff worked for you. Now the next step
>>>>>>>> will be to port it to bare metal, like you say.
>>>>>>>>
>>>>>>>> The Vagrantfile does two things:
>>>>>>>>
>>>>>>>> 1) It creates a shared folder for all machines.
>>>>>>>> 2) It spins up CentOS boxes.
>>>>>>>>
>>>>>>>> So in the "real world" you will obviously need to set up ssh
>>>>>>>> between the machines to start.
>>>>>>>> After that, roughly, you will need to do the following:
>>>>>>>>
>>>>>>>> - clone bigtop onto each of your machines
>>>>>>>> - install puppet 2.x on each of the machines
>>>>>>>> - look at the csv file created in the vagrant provisioner, and read
>>>>>>>> the puppet README file (in bigtop-deploy)
>>>>>>>> - run puppet apply on the head node
>>>>>>>> Once that works,
>>>>>>>> - run puppet apply on each slave.
>>>>>>>> Now, on any node that you use as a client (I just use the master
>>>>>>>> usually), you can yum install your favorite ecosystem components:
>>>>>>>> yum install -y pig mahout
>>>>>>>>
>>>>>>>> And you have a working hadoop cluster.
>>>>>>>>
>>>>>>>> One idea, as I know you're on the east coast: if your company is
>>>>>>>> interested in hosting/sponsoring a bigtop meetup, we could possibly
>>>>>>>> bring some folks from the boston / nyc area together to walk through
>>>>>>>> building a bigtop cluster on bare metal. Let us know if you have any
>>>>>>>> other questions. These directions are admittedly a little rough.
>>>>>>>>
>>>>>>>> Also, once you get this working, you can help us update the wiki
>>>>>>>> pages.
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> jay vyas
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> jay vyas
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
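
To make jay's bare-metal steps (quoted above) concrete, here is a rough
sketch of one node's commands. The puppet invocation and the site.csv
location follow the bigtop-deploy/puppet README of that era, and the
hostname and storage dirs are invented, so treat this as an assumption to
check against your checkout rather than a recipe:

# On each node (head node first); paths per the bigtop-deploy/puppet README:
git clone https://github.com/apache/bigtop.git
cd bigtop
# Describe the cluster -- the hostname and dirs below are hypothetical:
cat > bigtop-deploy/puppet/config/site.csv <<'EOF'
hadoop_head_node,head.example.com
hadoop_storage_dirs,/data/1,/data/2
EOF
puppet apply -d --modulepath=bigtop-deploy/puppet/modules \
    bigtop-deploy/puppet/manifests/site.pp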

Re: New to Bigtop, where to start?

Posted by Sean Mackrory <ma...@gmail.com>.
You can find details of this problem here, with one solution highlighted:
https://issues.apache.org/jira/browse/HDFS-107. In a Bigtop deployment you
will find that file under /var/lib/hadoop-hdfs/.
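
A minimal way to check for the namespaceID mismatch in question (see
Sean's fuller explanation in the next message): the exact subtree under
/var/lib/hadoop-hdfs depends on your dfs.*.dir settings, so this just
searches for the file.

# Run on the NameNode and on a DataNode, then compare the namespaceID lines:
find /var/lib/hadoop-hdfs -name VERSION -exec grep -H namespaceID {} \;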



Re: New to Bigtop, where to start?

Posted by Sean Mackrory <ma...@gmail.com>.
I suspect your problem is that all your DataNodes are already initialized
with the namespace-id of the pseudo-distributed instances they originally
connected to. When a DataNode first connects to its NameNode, it gets this
ID, and if you ever re-format the NameNode or just create a new NameNode,
the DataNode won't play nice with the new one. The fix will cause you to
lose all your data (but if you lose all your NameNodes permanently, you've
pretty much lost it anyway): delete the old namespace-id so that when you
restart the DataNode, it will connect to the new NameNode as part of a new
cluster / filesystem. IIRC, you can do this by simply deleting the file
containing this ID and then running 'service hadoop-hdfs-datanode
restart'. Let me look up which file that is...
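
A sketch of that recovery on one DataNode, assuming its old block data
really is disposable and that the stale ID lives in a VERSION file under
the default Bigtop data directory (double-check the path before deleting
anything):

service hadoop-hdfs-datanode stop
find /var/lib/hadoop-hdfs -name VERSION    # locate the DataNode's copy of the ID
rm /var/lib/hadoop-hdfs/cache/hdfs/dfs/data/current/VERSION    # hypothetical path; use what find printed
service hadoop-hdfs-datanode start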



Re: New to Bigtop, where to start?

Posted by David Fryer <df...@gmail.com>.
Hi Sean,
I now have each machine running in pseudo-distributed mode, but when I try
to run in distributed mode, I get an exception saying that there are 0
datanodes running. Any suggestions? I've modified core-site.xml to reflect
what the cluster is supposed to look like.

-David
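
Not from the thread, but a quick way to confirm that symptom from the
NameNode's side, assuming the client configuration on that node points at
the cluster:

sudo -u hdfs hdfs dfsadmin -report
# The report shows how many DataNodes registered; 0 means none did, and
# their logs under /var/log/hadoop-hdfs/ usually say why.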



Re: New to Bigtop, where to start?

Posted by Sean Mackrory <ma...@gmail.com>.
It might be easiest to get it working on a single node and then, once
you're familiar with the Bigtop packages and related files, try it on a
cluster. On a single node, you can do "yum install hadoop-conf-pseudo",
then format the namenode with "service hadoop-hdfs-namenode init", and then
start all of Hadoop: "for service in hadoop-hdfs-namenode
hadoop-hdfs-secondarynamenode hadoop-hdfs-datanode
hadoop-yarn-resourcemanager hadoop-yarn-nodemanager; do service $service
start; done". That should give you an idea of how Bigtop deploys stuff and
what packages you need. hadoop-conf-pseudo will install all the packages
that provide the init scripts and libraries required for every role, and a
working single-node configuration. You would want to install those roles on
different machines (e.g. NameNode and ResourceManager on one, DataNode and
NodeManager on all the others), and then edit the configuration files in
/etc/hadoop/conf on each node accordingly so the datanodes know which
namenode to connect to, etc.
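
Sean's sequence, collected into one copy-pasteable block (run as root on a
CentOS 6 box with the Bigtop repository configured):

yum install -y hadoop-conf-pseudo
service hadoop-hdfs-namenode init    # formats the NameNode
for service in hadoop-hdfs-namenode hadoop-hdfs-secondarynamenode \
    hadoop-hdfs-datanode hadoop-yarn-resourcemanager hadoop-yarn-nodemanager; do
  service $service start
done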



Re: New to Bigtop, where to start?

Posted by Mark Grover <gr...@gmail.com>.
The 'hadoop' package just delivers the hadoop common bits but no init
scripts to start the services, and no convenience artifacts that deploy
configuration for, say, starting a hadoop pseudo-distributed cluster. For
all practical purposes, you are going to need the hadoop-hdfs and
hadoop-mapreduce packages, which deliver the bits for HDFS and MR.
However, even that may not be enough; you likely need init scripts to be
installed for starting and stopping the services related to HDFS and MR.
So, depending on whether you are installing Hadoop on a fully-distributed
cluster or a pseudo-distributed cluster, you may need to install one or
more services (and hence packages) like resource manager, node manager,
namenode and datanode on the node(s). Then, you will have to deploy the
configuration yourself. We have default configuration installed by
packages, but you definitely need to add some entries to make it work for
a fully-distributed cluster, e.g. adding the name of the namenode host to
the configuration of the datanodes. If you are using just a
pseudo-distributed cluster, you can install the pseudo-distributed
configuration package (which has all the necessary dependencies, so
installing that and nothing else should be enough) and you will get an
out-of-the-box experience.

FYI, if you run
yum list 'hadoop*'
you will get a list of all hadoop-related packages that are available to
be installed.
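
To illustrate the fully-distributed entry Mark mentions: every node's
/etc/hadoop/conf/core-site.xml needs to name the NameNode host, along the
lines of the snippet below. "namenode.example.com" is a placeholder, 8020
is the conventional NameNode RPC port, and a real file keeps whatever
other properties your configuration already carries.

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://namenode.example.com:8020</value>
</property>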



Re: New to Bigtop, where to start?

Posted by David Fryer <df...@gmail.com>.
Is it necessary to install the whole hadoop stack?



Re: New to Bigtop, where to start?

Posted by David Fryer <df...@gmail.com>.
The only output from that is:
hadoop-2.0.5.1-1.el6.x86_64

-David



Re: New to Bigtop, where to start?

Posted by Mark Grover <ma...@apache.org>.
Possibly. Can you check which packages related to hadoop you have installed?

rpm -qa | grep hadoop



Re: New to Bigtop, where to start?

Posted by David Fryer <df...@gmail.com>.
Hi Mark,
I'm trying to follow those instructions on a CentOS 6 machine, and after
running "yum install hadoop\*", I can't find anything related to hadoop in
/etc/init.d. Is there something I'm missing?

-David



Re: New to Bigtop, where to start?

Posted by Mark Grover <ma...@apache.org>.
Welcome, David.

For physical machines, I personally always use instructions like these:
https://cwiki.apache.org/confluence/display/BIGTOP/How+to+install+Hadoop+distribution+from+Bigtop+0.6.0

These are for Bigtop 0.6.0; the latest Bigtop release is 0.7.0, but we
don't have a page for that unfortunately (we should, and if you could help
with that, it'd be much appreciated!). We are tying up loose ends for
Bigtop 0.8, so we hope to release it soon.

Mark



Re: New to Bigtop, where to start?

Posted by jay vyas <ja...@gmail.com>.
One more note: by "look at the csv file" above, I meant "edit it so that
it reflects your environment."

Make sure to read the puppet README file as well, under
bigtop-deploy/puppet.
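
For a rough idea of what that means in practice, the csv is just key,value
lines. The entries below are assumptions from memory, not the definitive
format -- the README is the authority on the actual key names:

hadoop_head_node,master.mycluster.example.com
hadoop_storage_dirs,/data/1,/data/2
bigtop_yumrepo_uri,http://mirror.example.com/bigtop/centos6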


-- 
jay vyas

Re: New to Bigtop, where to start?

Posted by jay vyas <ja...@gmail.com>.
Hi David.

Glad to hear the Vagrant stuff worked for you.  Now, the next step will be
to port it to bare metal, like you say.

The Vagrantfile does two things:

1) It creates a shared folder for all machines.
2) It spins up CentOS boxes.


So in the "real world" you will obviously need to set up SSH between the
machines first.
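
For example, a minimal way to do the SSH setup (the user and host names
here are placeholders, not anything Bigtop-specific):

ssh-keygen -t rsa          # on the head node; accept the defaults
ssh-copy-id user@slave1    # repeat for every slave
ssh user@slave1            # confirm passwordless login works
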
After that, roughly, you will need to do the following:

- clone Bigtop onto each of your machines
- install Puppet 2.x on each of the machines
- look at the csv file created by the vagrant provisioner, and read the
puppet README file (in bigtop-deploy)
- run puppet apply on the head node (a sketch of the command is below)
Once that works,
- run puppet apply on each slave.
Now, on any node that you use as a client (I usually just use the master),
you can yum install your favorite ecosystem components:
yum install -y pig mahout

And you have a working Hadoop cluster.
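
As a concrete sketch of the puppet apply step (the module and manifest
paths are assumptions based on the repo layout -- the README under
bigtop-deploy/puppet is the authority), from the top of the bigtop
checkout you would run something like:

puppet apply -d \
  --modulepath=bigtop-deploy/puppet/modules \
  bigtop-deploy/puppet/manifests/site.pp

Run that on the head node first, confirm the daemons come up, then repeat
it on each slave.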

One idea, since I know you're on the east coast: if your company is
interested in hosting/sponsoring a Bigtop meetup, we could possibly bring
some folks from the Boston/NYC area together to walk through building a
Bigtop cluster on bare metal.  Let us know if you have any other
questions.  These directions are admittedly a little rough.

Also, once you get this working, you can help us update the wiki pages.

-- 
jay vyas