Posted to common-commits@hadoop.apache.org by Apache Wiki <wi...@apache.org> on 2006/09/05 10:06:47 UTC

[Lucene-hadoop Wiki] Trivial Update of "GettingStartedWithHadoop" by SameerParanjpye

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.

The following page has been changed by SameerParanjpye:
http://wiki.apache.org/lucene-hadoop/GettingStartedWithHadoop

------------------------------------------------------------------------------
= Downloading and installing Hadoop =
- Hadoop can be downloaded from [http://www.apache.org/dyn/closer.cgi/lucene/hadoop/ Download]. To install Hadoop, untar the tar file in your install directory. The directory structure would then look like installdir/hadoop-[version]/. All the scripts to run Hadoop are in hadoop-[version]/bin. I will refer to this directory as hadoop/bin from now on.
+ Hadoop can be downloaded from [http://www.apache.org/dyn/closer.cgi/lucene/hadoop/ here]. To install Hadoop, untar the tar file in your install directory. The directory structure would then look like installdir/hadoop-[version]/. All the scripts to run Hadoop are in hadoop-[version]/bin. I will refer to this directory as hadoop/bin from now on.
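
As a rough illustration of the install step, here is a minimal shell sketch. The function name install_hadoop and both of its arguments are my own placeholders, not part of Hadoop; it only assumes a tar that auto-detects compression on extraction (GNU tar and bsdtar both do).

```shell
# Hypothetical helper (not shipped with Hadoop): unpack a downloaded
# release tarball into an install directory.
# Usage: install_hadoop <path-to-tarball> <install-dir>
install_hadoop() {
  tarball=$1
  install_dir=$2

  # Create the install directory if it does not exist yet.
  mkdir -p "$install_dir" || return 1

  # -x extracts, -f names the archive, -C switches to the target directory.
  # Compression (if any) is detected from the archive itself.
  tar -xf "$tarball" -C "$install_dir" || return 1

  # The startup scripts should now be under <install-dir>/hadoop-[version]/bin.
  ls -d "$install_dir"/hadoop-*/bin
}
```

For example, install_hadoop followed by the path of the tarball you downloaded and the path of your installdir should print the resulting hadoop-[version]/bin directory.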

= Starting Hadoop using Hadoop scripts =
This section explains how to set up a Hadoop cluster running Hadoop DFS and Hadoop MapReduce. The startup scripts are in hadoop/bin. The slaves file in hadoop/conf lists all the slave nodes that will join the DFS and MapReduce cluster. Edit the slaves file to add nodes to your cluster. You need to edit it only on the machines on which you plan to run the JobTracker and NameNode. If you want to run a single-node cluster, you do not have to edit the slaves file. Next, edit the file hadoop-env.sh in the hadoop/conf directory and make sure JAVA_HOME is set correctly. You can change the other environment variables as per your requirements. HADOOP_HOME is determined automatically, depending on where you run your Hadoop scripts from.
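
The two configuration edits described above could be scripted roughly as follows. This is only a sketch: configure_hadoop is a hypothetical helper of mine, the hostnames and Java path you pass in are your own values, and it assumes the slaves file holds one hostname per line and that hadoop-env.sh is a shell script sourced by the startup scripts.

```shell
# Hypothetical helper (not shipped with Hadoop): write conf/slaves and
# set JAVA_HOME in conf/hadoop-env.sh.
# Usage: configure_hadoop <hadoop-dir> <java-home> [slave-host ...]
configure_hadoop() {
  hadoop_dir=$1
  java_home=$2
  shift 2
  conf_dir="$hadoop_dir/conf"

  # Assumption: one slave hostname per line. When no hosts are given
  # (single-node cluster), the existing slaves file is left untouched.
  if [ "$#" -gt 0 ]; then
    printf '%s\n' "$@" > "$conf_dir/slaves" || return 1
  fi

  # Append the export so it takes effect after any earlier setting in the file.
  printf 'export JAVA_HOME="%s"\n' "$java_home" >> "$conf_dir/hadoop-env.sh"
}
```

Run it only on the machines that will host the JobTracker and NameNode; on a single-node cluster, call it with just the Hadoop directory and the Java path.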