Posted to user@hadoop.apache.org by John Lilley <jo...@redpoint.net> on 2013/06/01 00:18:57 UTC

RE: built hadoop! please help with next steps?

Sandy,
Thanks for all of the tips; I will try this over the weekend.  Regarding the last question, I am still trying to get the source loaded into Eclipse in a manner that facilitates easier browsing, symbol search, editing, etc.  Perhaps I am just missing some obvious FAQ?  This is leading up to modifying and debugging the "shell" ApplicationMaster sample.  This page:
http://stackoverflow.com/questions/11007423/developing-testing-and-debugging-hadoop-map-reduce-jobs-with-eclipse
looks promising as a Hadoop-in-Eclipse strategy, but it is over a year old and I'm not sure if it applies to Hadoop 2.0 and YARN.
John

From: Sandy Ryza [mailto:sandy.ryza@cloudera.com]
Sent: Friday, May 31, 2013 12:13 PM
To: user@hadoop.apache.org
Subject: Re: built hadoop! please help with next steps?

Hi John,

Here's how I deploy/debug Hadoop locally:
To build and tar Hadoop:

  mvn clean package -Pdist -Dtar -DskipTests=true

The tar will be located in the project directory under hadoop-dist/target/.  I untar it into my deploy directory.
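
For example (a sketch only; the tarball name depends on the version you built, and ~/hadoop-deploy is just an illustrative path):

  # Untar the freshly built distribution into a deploy directory.
  mkdir -p ~/hadoop-deploy
  tar -xzf hadoop-dist/target/hadoop-*.tar.gz -C ~/hadoop-deploy
  # The glob assumes a single untarred hadoop-<version> directory.
  cd ~/hadoop-deploy/hadoop-*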

I then copy these scripts into the same directory:

hadoop-dev-env.sh:
---
#!/bin/bash
export HADOOP_DEV_HOME=`pwd`
export HADOOP_MAPRED_HOME=${HADOOP_DEV_HOME}
export HADOOP_COMMON_HOME=${HADOOP_DEV_HOME}
export HADOOP_HDFS_HOME=${HADOOP_DEV_HOME}
export YARN_HOME=${HADOOP_DEV_HOME}
export HADOOP_CONF_DIR=${HADOOP_DEV_HOME}/etc/hadoop

hadoop-dev-setup.sh:
---
#!/bin/bash
source ./hadoop-dev-env.sh
bin/hadoop namenode -format

hadoop-dev.sh:
---
#!/bin/bash
# $1 should be "start" or "stop"
source ./hadoop-dev-env.sh
sbin/hadoop-daemon.sh $1 namenode
sbin/hadoop-daemon.sh $1 datanode
sbin/yarn-daemon.sh $1 resourcemanager
sbin/yarn-daemon.sh $1 nodemanager
sbin/mr-jobhistory-daemon.sh $1 historyserver
sbin/httpfs.sh $1

I copy all the files in <deploy directory>/conf into my conf directory, <deploy directory>/etc/hadoop.  The advantage of using a directory that's not the /conf directory is that it won't be overwritten the next time you untar a new build.  Lastly, I copy the minimal site configuration into the conf files.  For the sake of brevity, I won't include the properties in full xml format, but here are the ones I set:

yarn-site.xml:
  yarn.nodemanager.aux-services = mapreduce.shuffle
  yarn.nodemanager.aux-services.mapreduce.shuffle.class = org.apache.hadoop.mapred.ShuffleHandler
  yarn.resourcemanager.scheduler.class = org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler
mapred-site.xml:
  mapreduce.framework.name = yarn
core-site.xml:
  fs.default.name = hdfs://localhost:9000
hdfs-site.xml:
  dfs.replication = 1
  dfs.permissions = false
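
Each name = value pair above goes into its file as a <property> element inside the top-level <configuration> element.  For example, mapred-site.xml would look like:

  <configuration>
    <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
    </property>
  </configuration>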

Then, to format HDFS and start our cluster, we can simply do:
./hadoop-dev-setup.sh
./hadoop-dev.sh start
To stop it:
./hadoop-dev.sh stop

Once I have this set up, for quicker iteration, I have some scripts that build submodules (sometimes all of mapreduce, sometimes just the resourcemanager) and copy the updated jars into my setup.
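
Such a script might look like the following (a sketch only; the source path, module, and jar destination are assumptions to adapt to your own tree, and it assumes you've run mvn install at the top level once so module dependencies are in your local repo):

  #!/bin/bash
  # Rebuild just the resourcemanager and refresh its jar in the deploy directory.
  cd ~/src/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
  mvn package -DskipTests
  cp target/hadoop-yarn-server-resourcemanager-*.jar ~/hadoop-deploy/hadoop-*/share/hadoop/yarn/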

Regarding your last question, are you saying that you were able to load it into Eclipse already, and want tips on the best way to browse within it?  Or that you're trying to get the source loaded into Eclipse?

Hope that helps!
Sandy
On Thu, May 30, 2013 at 9:32 AM, John Lilley <jo...@redpoint.net> wrote:
Thanks for helping me to build Hadoop!  I'm through compiling and installing the maven plugins into Eclipse.  I could use some pointers for the next steps I want to take, which are:

* Deploy the simplest "development only" cluster (single node?) and learn how to debug within it.  I read about the "local runner" configuration here (http://wiki.apache.org/hadoop/HowToDebugMapReducePrograms); does that still apply to MR2/YARN?  It seems like an old page; perhaps there is a newer FAQ?

* Build and run the ApplicationMaster "shell" sample, and use that as a starting point for a custom AM.  I would much appreciate any advice on getting the edit/build/debug cycle ironed out for an AM.

* Set up the Hadoop source for easier browsing and learning (Eclipse load?).  What is typically done to make for easy browsing of referenced classes/methods by name?

Thanks
John



RE: built hadoop! please help with next steps?

Posted by John Lilley <jo...@redpoint.net>.
Answered my own question.  The Eclipse install that ships with CentOS 6 (via yum) seems to have this problem; a direct download of Eclipse for Java EE works fine.
John


From: John Lilley [mailto:john.lilley@redpoint.net]
Sent: Monday, June 03, 2013 5:49 PM
To: user@hadoop.apache.org; Deepak Vohra
Subject: RE: built hadoop! please help with next steps?

I am getting errors trying to install m2e… has anyone else encountered this?
Cannot complete the install because one or more required items could not be found.
  Software being installed: m2e - Maven Integration for Eclipse 1.4.0.20130601-0314 (org.eclipse.m2e.feature.feature.group 1.4.0.20130601-0314)
  Missing requirement: Maven POM XML Editor 1.4.0.20130601-0314 (org.eclipse.m2e.editor.xml 1.4.0.20130601-0314) requires 'bundle org.eclipse.wst.xml.ui 0.0.0' but it could not be found
  Cannot satisfy dependency:
    From: m2e - Maven Integration for Eclipse 1.4.0.20130601-0314 (org.eclipse.m2e.feature.feature.group 1.4.0.20130601-0314)
    To: org.eclipse.m2e.editor.xml [1.4.0.20130601-0314]

From: Deepak Vohra [mailto:dvohra09@yahoo.com]
Sent: Monday, June 03, 2013 4:12 PM
To: user@hadoop.apache.org
Subject: Re: built hadoop! please help with next steps?

John

The following patch is related to the issue cited.

https://issues.apache.org/jira/browse/HADOOP-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

thanks,
Deepak

________________________________
From: John Lilley <jo...@redpoint.net>
To: user@hadoop.apache.org
Sent: Monday, June 3, 2013 1:51 PM
Subject: RE: built hadoop! please help with next steps?

I’ve followed the instructions in BUILDING.txt, generated the eclipse projects, and imported the maven-generated projects using File -> Import -> General -> Existing Projects into Workspace… and they all appear.  However, the Problems window shows:
Project 'hadoop-streaming' is missing required source folder: '/home/jlilley/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/conf'
Any idea what this means or how to fix it?
John

From: Sandy Ryza [mailto:sandy.ryza@cloudera.com]
Sent: Friday, May 31, 2013 4:23 PM
To: user@hadoop.apache.org
Subject: Re: built hadoop! please help with next steps?

I've been successful with importing all the leaf-level maven projects as "Existing Maven Projects" using the eclipse maven plugin.  I've also gotten things to work without the eclipse maven plugin, with some combination of mvn eclipse:eclipse, pointing Eclipse to the m2 repo, and using the directory with the top pom.xml as my eclipse workspace directory.
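
The eclipse:eclipse route looks roughly like this (a sketch; the download flags are optional conveniences of the maven-eclipse-plugin):

  # from the top of the hadoop source tree
  mvn install -DskipTests
  mvn eclipse:eclipse -DdownloadSources=true -DdownloadJavadocs=true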

RE: built hadoop! please help with next steps?

Posted by John Lilley <jo...@redpoint.net>.
Answered my own question.  The Eclipse installs with Centos6 (or with yum) seems to have this problem.  A direct download of Eclipse for Java EE works fine.
John


From: John Lilley [mailto:john.lilley@redpoint.net]
Sent: Monday, June 03, 2013 5:49 PM
To: user@hadoop.apache.org; Deepak Vohra
Subject: RE: built hadoop! please help with next steps?

I am getting errors trying to install m2e… has anyone else encountered this?
Cannot complete the install because one or more required items could not be found.
  Software being installed: m2e - Maven Integration for Eclipse 1.4.0.20130601-0314 (org.eclipse.m2e.feature.feature.group 1.4.0.20130601-0314)
  Missing requirement: Maven POM XML Editor 1.4.0.20130601-0314 (org.eclipse.m2e.editor.xml 1.4.0.20130601-0314) requires 'bundle org.eclipse.wst.xml.ui 0.0.0' but it could not be found
  Cannot satisfy dependency:
    From: m2e - Maven Integration for Eclipse 1.4.0.20130601-0314 (org.eclipse.m2e.feature.feature.group 1.4.0.20130601-0314)
    To: org.eclipse.m2e.editor.xml [1.4.0.20130601-0314]

From: Deepak Vohra [mailto:dvohra09@yahoo.com]
Sent: Monday, June 03, 2013 4:12 PM
To: user@hadoop.apache.org<ma...@hadoop.apache.org>
Subject: Re: built hadoop! please help with next steps?

John

The following patch is related to the issue cited.

https://issues.apache.org/jira/browse/HADOOP-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

thanks,
Deepak

________________________________
From: John Lilley <jo...@redpoint.net>>
To: "user@hadoop.apache.org<ma...@hadoop.apache.org>" <us...@hadoop.apache.org>>
Sent: Monday, June 3, 2013 1:51 PM
Subject: RE: built hadoop! please help with next steps?

I’ve followed the instructions in BUILDING.txt, generated the eclipse projects and imported the eclipse projects generated by maven using File -> Import -> General -> Existing project into workspace…
And they all appear.  However, the problems window shows:
Project 'hadoop-streaming' is missing required source folder: '/home/jlilley/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/conf'
Any idea what this means or how to fix it?
John

From: Sandy Ryza [mailto:sandy.ryza@cloudera.com]
Sent: Friday, May 31, 2013 4:23 PM
To: user@hadoop.apache.org<ma...@hadoop.apache.org>
Subject: Re: built hadoop! please help with next steps?

I've been successful with importing all the leaf-level maven projects as "Existing Maven Projects" using the eclipse maven plugin.  I've also gotten things to work without the eclipse maven plugin with some combination of mvn eclipse:eclipse, pointing to the m2repo, and the directory with the top pom.xml as my eclipse workspace directory.

On Fri, May 31, 2013 at 3:18 PM, John Lilley <jo...@redpoint.net>> wrote:
Sandy,
Thanks for all of the tips, I will try this over the weekend.   Regarding the last question, I am still trying to get the source loaded into Eclipse in a manner that facilitates easier browsing, symbol search, editing, etc.  Perhaps I am just missing some obvious FAQ?  This is leading up to modifying and debugging the “shell” ApplicationMaster sample.  This page:
http://stackoverflow.com/questions/11007423/developing-testing-and-debugging-hadoop-map-reduce-jobs-with-eclipse
looks promising as a Hadoop-in-Eclipse strategy, but it is over a year old and I’m not sure if it applies to Hadoop 2.0 and YARN.
John

From: Sandy Ryza [mailto:sandy.ryza@cloudera.com<ma...@cloudera.com>]
Sent: Friday, May 31, 2013 12:13 PM
To: user@hadoop.apache.org<ma...@hadoop.apache.org>
Subject: Re: built hadoop! please help with next steps?

Hi John,

Here's how I deploy/debug Hadoop locally:
To build and tar Hadoop:

  mvn clean package -Pdist -Dtar -DskipTests=true

The tar will be located in the project directory under hadoop-dist/target/.  I untar it into my deploy directory.

I then copy these scripts into the same directory:

hadoop-dev-env.sh:
---
#!/bin/bash
export HADOOP_DEV_HOME=`pwd`
export HADOOP_MAPRED_HOME=${HADOOP_DEV_HOME}
export HADOOP_COMMON_HOME=${HADOOP_DEV_HOME}
export HADOOP_HDFS_HOME=${HADOOP_DEV_HOME}
export YARN_HOME=${HADOOP_DEV_HOME}
export HADOOP_CONF_DIR=${HADOOP_DEV_HOME}/etc/hadoop

hadoop-dev-setup.sh:
---
#!/bin/bash
source ./hadoop-dev-env.sh
bin/hadoop namenode -format

hadoop-dev.sh:
---
source hadoop-dev-env.sh
sbin/hadoop-daemon.sh $1 namenode
sbin/hadoop-daemon.sh $1 datanode
sbin/yarn-daemon.sh $1 resourcemanager
sbin/yarn-daemon.sh $1 nodemanager
sbin/mr-jobhistory-daemon.sh $1 historyserver
sbin/httpfs.sh $1

I copy all the files in <deploy directory>/conf into my conf directory, <deploy directory>/etc/hadoop, and then copy the minimal site configuration into .  The advantage of using a directory that's not the /conf directory is that it won't be overwritten the next time you untar a new build.  Lastly, I copy the minimal site configuration into the conf files.  For the sake of brevity, I won't include the properties in full xml format, but here are the ones I set:

yarn-site.xml:
  yarn.nodemanager.aux-services = mapreduce.shuffle
  yarn.nodemanager.aux-services.mapreduce.shuffle.class = org.apache.hadoop.mapred.ShuffleHandler
  yarn.resourcemanager.scheduler.class = org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler
mapred-site.xml:
  mapreduce.framework.name<http://mapreduce.framework.name/> = yarn
core-site.xml:
  fs.default.name<http://fs.default.name/> = hdfs://localhost:9000
hdfs-site.xml:
  dfs.replication = 1
  dfs.permissions = false

Then, to format HDFS and start our cluster, we can simply do:
./hadoop-dev-setup.sh
./hadoop-dev.sh start
To stop it:
./hadoop-dev.sh stop

Once I have this set up, for quicker iteration, I have some scripts that build submodules (sometimes all of mapreduce, sometimes just the resourcemanager) and copy the updated jars into my setup.

Regarding your last question, are you saying that you were able to load it into Eclipse already, and want tips on the best way to browse within it?  Or that you're trying to get the source loaded into Eclipse?

Hope that helps!
Sandy
On Thu, May 30, 2013 at 9:32 AM, John Lilley <jo...@redpoint.net>> wrote:
Thanks for help me to build Hadoop!  I’m through compile and install of maven plugins into Eclipse.  I could use some pointers for next steps I want to take, which are:
•         Deploy the simplest “development only” cluster (single node?) and learn how to debug within it.  I read about the “local runner” configuration here (http://wiki.apache.org/hadoop/HowToDebugMapReducePrograms), does that still apply to MR2/YARN?  It seems like an old page; perhaps there is a newer FAQ?
•         Build and run the ApplicationMaster “shell” sample, and use that as a starting point for a customer AM.  I would much appreciate any advice on getting the edit/build/debug cycle ironed out for an AM.
•         Setup Hadoop source for easier browsing and learning (Eclipse load?).  What is typically done to make for easy browsing of referenced classes/methods by name?

Thanks
John





RE: built hadoop! please help with next steps?

Posted by John Lilley <jo...@redpoint.net>.
Answered my own question.  The Eclipse installs with Centos6 (or with yum) seems to have this problem.  A direct download of Eclipse for Java EE works fine.
John


From: John Lilley [mailto:john.lilley@redpoint.net]
Sent: Monday, June 03, 2013 5:49 PM
To: user@hadoop.apache.org; Deepak Vohra
Subject: RE: built hadoop! please help with next steps?

I am getting errors trying to install m2e… has anyone else encountered this?
Cannot complete the install because one or more required items could not be found.
  Software being installed: m2e - Maven Integration for Eclipse 1.4.0.20130601-0314 (org.eclipse.m2e.feature.feature.group 1.4.0.20130601-0314)
  Missing requirement: Maven POM XML Editor 1.4.0.20130601-0314 (org.eclipse.m2e.editor.xml 1.4.0.20130601-0314) requires 'bundle org.eclipse.wst.xml.ui 0.0.0' but it could not be found
  Cannot satisfy dependency:
    From: m2e - Maven Integration for Eclipse 1.4.0.20130601-0314 (org.eclipse.m2e.feature.feature.group 1.4.0.20130601-0314)
    To: org.eclipse.m2e.editor.xml [1.4.0.20130601-0314]

From: Deepak Vohra [mailto:dvohra09@yahoo.com]
Sent: Monday, June 03, 2013 4:12 PM
To: user@hadoop.apache.org<ma...@hadoop.apache.org>
Subject: Re: built hadoop! please help with next steps?

John

The following patch is related to the issue cited.

https://issues.apache.org/jira/browse/HADOOP-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

thanks,
Deepak

________________________________
From: John Lilley <jo...@redpoint.net>>
To: "user@hadoop.apache.org<ma...@hadoop.apache.org>" <us...@hadoop.apache.org>>
Sent: Monday, June 3, 2013 1:51 PM
Subject: RE: built hadoop! please help with next steps?

I’ve followed the instructions in BUILDING.txt, generated the eclipse projects and imported the eclipse projects generated by maven using File -> Import -> General -> Existing project into workspace…
And they all appear.  However, the problems window shows:
Project 'hadoop-streaming' is missing required source folder: '/home/jlilley/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/conf'
Any idea what this means or how to fix it?
John

From: Sandy Ryza [mailto:sandy.ryza@cloudera.com]
Sent: Friday, May 31, 2013 4:23 PM
To: user@hadoop.apache.org<ma...@hadoop.apache.org>
Subject: Re: built hadoop! please help with next steps?

I've been successful with importing all the leaf-level maven projects as "Existing Maven Projects" using the eclipse maven plugin.  I've also gotten things to work without the eclipse maven plugin with some combination of mvn eclipse:eclipse, pointing to the m2repo, and the directory with the top pom.xml as my eclipse workspace directory.

On Fri, May 31, 2013 at 3:18 PM, John Lilley <jo...@redpoint.net>> wrote:
Sandy,
Thanks for all of the tips, I will try this over the weekend.   Regarding the last question, I am still trying to get the source loaded into Eclipse in a manner that facilitates easier browsing, symbol search, editing, etc.  Perhaps I am just missing some obvious FAQ?  This is leading up to modifying and debugging the “shell” ApplicationMaster sample.  This page:
http://stackoverflow.com/questions/11007423/developing-testing-and-debugging-hadoop-map-reduce-jobs-with-eclipse
looks promising as a Hadoop-in-Eclipse strategy, but it is over a year old and I’m not sure if it applies to Hadoop 2.0 and YARN.
John

From: Sandy Ryza [mailto:sandy.ryza@cloudera.com<ma...@cloudera.com>]
Sent: Friday, May 31, 2013 12:13 PM
To: user@hadoop.apache.org<ma...@hadoop.apache.org>
Subject: Re: built hadoop! please help with next steps?

Hi John,

Here's how I deploy/debug Hadoop locally:
To build and tar Hadoop:

  mvn clean package -Pdist -Dtar -DskipTests=true

The tar will be located in the project directory under hadoop-dist/target/.  I untar it into my deploy directory.

I then copy these scripts into the same directory:

hadoop-dev-env.sh:
---
#!/bin/bash
export HADOOP_DEV_HOME=`pwd`
export HADOOP_MAPRED_HOME=${HADOOP_DEV_HOME}
export HADOOP_COMMON_HOME=${HADOOP_DEV_HOME}
export HADOOP_HDFS_HOME=${HADOOP_DEV_HOME}
export YARN_HOME=${HADOOP_DEV_HOME}
export HADOOP_CONF_DIR=${HADOOP_DEV_HOME}/etc/hadoop

hadoop-dev-setup.sh:
---
#!/bin/bash
source ./hadoop-dev-env.sh
bin/hadoop namenode -format

hadoop-dev.sh:
---
source hadoop-dev-env.sh
sbin/hadoop-daemon.sh $1 namenode
sbin/hadoop-daemon.sh $1 datanode
sbin/yarn-daemon.sh $1 resourcemanager
sbin/yarn-daemon.sh $1 nodemanager
sbin/mr-jobhistory-daemon.sh $1 historyserver
sbin/httpfs.sh $1

I copy all the files in <deploy directory>/conf into my conf directory, <deploy directory>/etc/hadoop, and then copy the minimal site configuration into .  The advantage of using a directory that's not the /conf directory is that it won't be overwritten the next time you untar a new build.  Lastly, I copy the minimal site configuration into the conf files.  For the sake of brevity, I won't include the properties in full xml format, but here are the ones I set:

yarn-site.xml:
  yarn.nodemanager.aux-services = mapreduce.shuffle
  yarn.nodemanager.aux-services.mapreduce.shuffle.class = org.apache.hadoop.mapred.ShuffleHandler
  yarn.resourcemanager.scheduler.class = org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler
mapred-site.xml:
  mapreduce.framework.name<http://mapreduce.framework.name/> = yarn
core-site.xml:
  fs.default.name<http://fs.default.name/> = hdfs://localhost:9000
hdfs-site.xml:
  dfs.replication = 1
  dfs.permissions = false

Then, to format HDFS and start our cluster, we can simply do:
./hadoop-dev-setup.sh
./hadoop-dev.sh start
To stop it:
./hadoop-dev.sh stop

Once I have this set up, for quicker iteration, I have some scripts that build submodules (sometimes all of mapreduce, sometimes just the resourcemanager) and copy the updated jars into my setup.

Regarding your last question, are you saying that you were able to load it into Eclipse already, and want tips on the best way to browse within it?  Or that you're trying to get the source loaded into Eclipse?

Hope that helps!
Sandy
On Thu, May 30, 2013 at 9:32 AM, John Lilley <jo...@redpoint.net>> wrote:
Thanks for help me to build Hadoop!  I’m through compile and install of maven plugins into Eclipse.  I could use some pointers for next steps I want to take, which are:
•         Deploy the simplest “development only” cluster (single node?) and learn how to debug within it.  I read about the “local runner” configuration here (http://wiki.apache.org/hadoop/HowToDebugMapReducePrograms), does that still apply to MR2/YARN?  It seems like an old page; perhaps there is a newer FAQ?
•         Build and run the ApplicationMaster “shell” sample, and use that as a starting point for a customer AM.  I would much appreciate any advice on getting the edit/build/debug cycle ironed out for an AM.
•         Setup Hadoop source for easier browsing and learning (Eclipse load?).  What is typically done to make for easy browsing of referenced classes/methods by name?

Thanks
John





RE: built hadoop! please help with next steps?

Posted by John Lilley <jo...@redpoint.net>.
Answered my own question.  The Eclipse installs with Centos6 (or with yum) seems to have this problem.  A direct download of Eclipse for Java EE works fine.
John


From: John Lilley [mailto:john.lilley@redpoint.net]
Sent: Monday, June 03, 2013 5:49 PM
To: user@hadoop.apache.org; Deepak Vohra
Subject: RE: built hadoop! please help with next steps?

I am getting errors trying to install m2e… has anyone else encountered this?
Cannot complete the install because one or more required items could not be found.
  Software being installed: m2e - Maven Integration for Eclipse 1.4.0.20130601-0314 (org.eclipse.m2e.feature.feature.group 1.4.0.20130601-0314)
  Missing requirement: Maven POM XML Editor 1.4.0.20130601-0314 (org.eclipse.m2e.editor.xml 1.4.0.20130601-0314) requires 'bundle org.eclipse.wst.xml.ui 0.0.0' but it could not be found
  Cannot satisfy dependency:
    From: m2e - Maven Integration for Eclipse 1.4.0.20130601-0314 (org.eclipse.m2e.feature.feature.group 1.4.0.20130601-0314)
    To: org.eclipse.m2e.editor.xml [1.4.0.20130601-0314]

From: Deepak Vohra [mailto:dvohra09@yahoo.com]
Sent: Monday, June 03, 2013 4:12 PM
To: user@hadoop.apache.org<ma...@hadoop.apache.org>
Subject: Re: built hadoop! please help with next steps?

John

The following patch is related to the issue cited.

https://issues.apache.org/jira/browse/HADOOP-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

thanks,
Deepak

________________________________
From: John Lilley <jo...@redpoint.net>>
To: "user@hadoop.apache.org<ma...@hadoop.apache.org>" <us...@hadoop.apache.org>>
Sent: Monday, June 3, 2013 1:51 PM
Subject: RE: built hadoop! please help with next steps?

I’ve followed the instructions in BUILDING.txt, generated the eclipse projects and imported the eclipse projects generated by maven using File -> Import -> General -> Existing project into workspace…
And they all appear.  However, the problems window shows:
Project 'hadoop-streaming' is missing required source folder: '/home/jlilley/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/conf'
Any idea what this means or how to fix it?
John

From: Sandy Ryza [mailto:sandy.ryza@cloudera.com]
Sent: Friday, May 31, 2013 4:23 PM
To: user@hadoop.apache.org<ma...@hadoop.apache.org>
Subject: Re: built hadoop! please help with next steps?

I've been successful with importing all the leaf-level maven projects as "Existing Maven Projects" using the eclipse maven plugin.  I've also gotten things to work without the eclipse maven plugin with some combination of mvn eclipse:eclipse, pointing to the m2repo, and the directory with the top pom.xml as my eclipse workspace directory.

On Fri, May 31, 2013 at 3:18 PM, John Lilley <jo...@redpoint.net>> wrote:
Sandy,
Thanks for all of the tips, I will try this over the weekend.   Regarding the last question, I am still trying to get the source loaded into Eclipse in a manner that facilitates easier browsing, symbol search, editing, etc.  Perhaps I am just missing some obvious FAQ?  This is leading up to modifying and debugging the “shell” ApplicationMaster sample.  This page:
http://stackoverflow.com/questions/11007423/developing-testing-and-debugging-hadoop-map-reduce-jobs-with-eclipse
looks promising as a Hadoop-in-Eclipse strategy, but it is over a year old and I’m not sure if it applies to Hadoop 2.0 and YARN.
John

From: Sandy Ryza [mailto:sandy.ryza@cloudera.com<ma...@cloudera.com>]
Sent: Friday, May 31, 2013 12:13 PM
To: user@hadoop.apache.org<ma...@hadoop.apache.org>
Subject: Re: built hadoop! please help with next steps?

Hi John,

Here's how I deploy/debug Hadoop locally:
To build and tar Hadoop:

  mvn clean package -Pdist -Dtar -DskipTests=true

The tar will be located in the project directory under hadoop-dist/target/.  I untar it into my deploy directory.

I then copy these scripts into the same directory:

hadoop-dev-env.sh:
---
#!/bin/bash
export HADOOP_DEV_HOME=`pwd`
export HADOOP_MAPRED_HOME=${HADOOP_DEV_HOME}
export HADOOP_COMMON_HOME=${HADOOP_DEV_HOME}
export HADOOP_HDFS_HOME=${HADOOP_DEV_HOME}
export YARN_HOME=${HADOOP_DEV_HOME}
export HADOOP_CONF_DIR=${HADOOP_DEV_HOME}/etc/hadoop

hadoop-dev-setup.sh:
---
#!/bin/bash
source ./hadoop-dev-env.sh
bin/hadoop namenode -format

hadoop-dev.sh:
---
source hadoop-dev-env.sh
sbin/hadoop-daemon.sh $1 namenode
sbin/hadoop-daemon.sh $1 datanode
sbin/yarn-daemon.sh $1 resourcemanager
sbin/yarn-daemon.sh $1 nodemanager
sbin/mr-jobhistory-daemon.sh $1 historyserver
sbin/httpfs.sh $1

I copy all the files in <deploy directory>/conf into my conf directory, <deploy directory>/etc/hadoop, and then copy the minimal site configuration into .  The advantage of using a directory that's not the /conf directory is that it won't be overwritten the next time you untar a new build.  Lastly, I copy the minimal site configuration into the conf files.  For the sake of brevity, I won't include the properties in full xml format, but here are the ones I set:

yarn-site.xml:
  yarn.nodemanager.aux-services = mapreduce.shuffle
  yarn.nodemanager.aux-services.mapreduce.shuffle.class = org.apache.hadoop.mapred.ShuffleHandler
  yarn.resourcemanager.scheduler.class = org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler
mapred-site.xml:
  mapreduce.framework.name<http://mapreduce.framework.name/> = yarn
core-site.xml:
  fs.default.name<http://fs.default.name/> = hdfs://localhost:9000
hdfs-site.xml:
  dfs.replication = 1
  dfs.permissions = false

Then, to format HDFS and start our cluster, we can simply do:
./hadoop-dev-setup.sh
./hadoop-dev.sh start
To stop it:
./hadoop-dev.sh stop

Once I have this set up, for quicker iteration, I have some scripts that build submodules (sometimes all of mapreduce, sometimes just the resourcemanager) and copy the updated jars into my setup.

Regarding your last question, are you saying that you were able to load it into Eclipse already, and want tips on the best way to browse within it?  Or that you're trying to get the source loaded into Eclipse?

Hope that helps!
Sandy
On Thu, May 30, 2013 at 9:32 AM, John Lilley <jo...@redpoint.net>> wrote:
Thanks for help me to build Hadoop!  I’m through compile and install of maven plugins into Eclipse.  I could use some pointers for next steps I want to take, which are:
•         Deploy the simplest “development only” cluster (single node?) and learn how to debug within it.  I read about the “local runner” configuration here (http://wiki.apache.org/hadoop/HowToDebugMapReducePrograms), does that still apply to MR2/YARN?  It seems like an old page; perhaps there is a newer FAQ?
•         Build and run the ApplicationMaster “shell” sample, and use that as a starting point for a customer AM.  I would much appreciate any advice on getting the edit/build/debug cycle ironed out for an AM.
•         Setup Hadoop source for easier browsing and learning (Eclipse load?).  What is typically done to make for easy browsing of referenced classes/methods by name?

Thanks
John





RE: built hadoop! please help with next steps?

Posted by John Lilley <jo...@redpoint.net>.
I am getting errors trying to install m2e… has anyone else encountered this?
Cannot complete the install because one or more required items could not be found.
  Software being installed: m2e - Maven Integration for Eclipse 1.4.0.20130601-0314 (org.eclipse.m2e.feature.feature.group 1.4.0.20130601-0314)
  Missing requirement: Maven POM XML Editor 1.4.0.20130601-0314 (org.eclipse.m2e.editor.xml 1.4.0.20130601-0314) requires 'bundle org.eclipse.wst.xml.ui 0.0.0' but it could not be found
  Cannot satisfy dependency:
    From: m2e - Maven Integration for Eclipse 1.4.0.20130601-0314 (org.eclipse.m2e.feature.feature.group 1.4.0.20130601-0314)
    To: org.eclipse.m2e.editor.xml [1.4.0.20130601-0314]

From: Deepak Vohra [mailto:dvohra09@yahoo.com]
Sent: Monday, June 03, 2013 4:12 PM
To: user@hadoop.apache.org
Subject: Re: built hadoop! please help with next steps?

John

The following patch is related to the issue cited.

https://issues.apache.org/jira/browse/HADOOP-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

thanks,
Deepak

________________________________
From: John Lilley <jo...@redpoint.net>>
To: "user@hadoop.apache.org<ma...@hadoop.apache.org>" <us...@hadoop.apache.org>>
Sent: Monday, June 3, 2013 1:51 PM
Subject: RE: built hadoop! please help with next steps?

I’ve followed the instructions in BUILDING.txt, generated the eclipse projects and imported the eclipse projects generated by maven using File -> Import -> General -> Existing project into workspace…
And they all appear.  However, the problems window shows:
Project 'hadoop-streaming' is missing required source folder: '/home/jlilley/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/conf'
Any idea what this means or how to fix it?
John

From: Sandy Ryza [mailto:sandy.ryza@cloudera.com]
Sent: Friday, May 31, 2013 4:23 PM
To: user@hadoop.apache.org<ma...@hadoop.apache.org>
Subject: Re: built hadoop! please help with next steps?

I've been successful with importing all the leaf-level maven projects as "Existing Maven Projects" using the eclipse maven plugin.  I've also gotten things to work without the eclipse maven plugin with some combination of mvn eclipse:eclipse, pointing to the m2repo, and the directory with the top pom.xml as my eclipse workspace directory.

On Fri, May 31, 2013 at 3:18 PM, John Lilley <jo...@redpoint.net>> wrote:
Sandy,
Thanks for all of the tips, I will try this over the weekend.   Regarding the last question, I am still trying to get the source loaded into Eclipse in a manner that facilitates easier browsing, symbol search, editing, etc.  Perhaps I am just missing some obvious FAQ?  This is leading up to modifying and debugging the “shell” ApplicationMaster sample.  This page:
http://stackoverflow.com/questions/11007423/developing-testing-and-debugging-hadoop-map-reduce-jobs-with-eclipse
looks promising as a Hadoop-in-Eclipse strategy, but it is over a year old and I’m not sure if it applies to Hadoop 2.0 and YARN.
John

From: Sandy Ryza [mailto:sandy.ryza@cloudera.com<ma...@cloudera.com>]
Sent: Friday, May 31, 2013 12:13 PM
To: user@hadoop.apache.org<ma...@hadoop.apache.org>
Subject: Re: built hadoop! please help with next steps?

Hi John,

Here's how I deploy/debug Hadoop locally:
To build and tar Hadoop:

  mvn clean package -Pdist -Dtar -DskipTests=true

The tar will be located in the project directory under hadoop-dist/target/.  I untar it into my deploy directory.

I then copy these scripts into the same directory:

hadoop-dev-env.sh:
---
#!/bin/bash
export HADOOP_DEV_HOME=`pwd`
export HADOOP_MAPRED_HOME=${HADOOP_DEV_HOME}
export HADOOP_COMMON_HOME=${HADOOP_DEV_HOME}
export HADOOP_HDFS_HOME=${HADOOP_DEV_HOME}
export YARN_HOME=${HADOOP_DEV_HOME}
export HADOOP_CONF_DIR=${HADOOP_DEV_HOME}/etc/hadoop

hadoop-dev-setup.sh:
---
#!/bin/bash
source ./hadoop-dev-env.sh
bin/hadoop namenode -format

hadoop-dev.sh:
---
source hadoop-dev-env.sh
sbin/hadoop-daemon.sh $1 namenode
sbin/hadoop-daemon.sh $1 datanode
sbin/yarn-daemon.sh $1 resourcemanager
sbin/yarn-daemon.sh $1 nodemanager
sbin/mr-jobhistory-daemon.sh $1 historyserver
sbin/httpfs.sh $1

I copy all the files in <deploy directory>/conf into my conf directory, <deploy directory>/etc/hadoop, and then copy the minimal site configuration into .  The advantage of using a directory that's not the /conf directory is that it won't be overwritten the next time you untar a new build.  Lastly, I copy the minimal site configuration into the conf files.  For the sake of brevity, I won't include the properties in full xml format, but here are the ones I set:

yarn-site.xml:
  yarn.nodemanager.aux-services = mapreduce.shuffle
  yarn.nodemanager.aux-services.mapreduce.shuffle.class = org.apache.hadoop.mapred.ShuffleHandler
  yarn.resourcemanager.scheduler.class = org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler
mapred-site.xml:
  mapreduce.framework.name<http://mapreduce.framework.name/> = yarn
core-site.xml:
  fs.default.name<http://fs.default.name/> = hdfs://localhost:9000
hdfs-site.xml:
  dfs.replication = 1
  dfs.permissions = false

Then, to format HDFS and start our cluster, we can simply do:
./hadoop-dev-setup.sh
./hadoop-dev.sh start
To stop it:
./hadoop-dev.sh stop

Once I have this set up, for quicker iteration, I have some scripts that build submodules (sometimes all of mapreduce, sometimes just the resourcemanager) and copy the updated jars into my setup.

Regarding your last question, are you saying that you were able to load it into Eclipse already, and want tips on the best way to browse within it?  Or that you're trying to get the source loaded into Eclipse?

Hope that helps!
Sandy
On Thu, May 30, 2013 at 9:32 AM, John Lilley <jo...@redpoint.net>> wrote:
Thanks for help me to build Hadoop!  I’m through compile and install of maven plugins into Eclipse.  I could use some pointers for next steps I want to take, which are:
•         Deploy the simplest “development only” cluster (single node?) and learn how to debug within it.  I read about the “local runner” configuration here (http://wiki.apache.org/hadoop/HowToDebugMapReducePrograms), does that still apply to MR2/YARN?  It seems like an old page; perhaps there is a newer FAQ?
•         Build and run the ApplicationMaster “shell” sample, and use that as a starting point for a customer AM.  I would much appreciate any advice on getting the edit/build/debug cycle ironed out for an AM.
•         Setup Hadoop source for easier browsing and learning (Eclipse load?).  What is typically done to make for easy browsing of referenced classes/methods by name?

Thanks
John





RE: built hadoop! please help with next steps?

Posted by John Lilley <jo...@redpoint.net>.
I am getting errors trying to install m2e… has anyone else encountered this?
Cannot complete the install because one or more required items could not be found.
  Software being installed: m2e - Maven Integration for Eclipse 1.4.0.20130601-0314 (org.eclipse.m2e.feature.feature.group 1.4.0.20130601-0314)
  Missing requirement: Maven POM XML Editor 1.4.0.20130601-0314 (org.eclipse.m2e.editor.xml 1.4.0.20130601-0314) requires 'bundle org.eclipse.wst.xml.ui 0.0.0' but it could not be found
  Cannot satisfy dependency:
    From: m2e - Maven Integration for Eclipse 1.4.0.20130601-0314 (org.eclipse.m2e.feature.feature.group 1.4.0.20130601-0314)
    To: org.eclipse.m2e.editor.xml [1.4.0.20130601-0314]

From: Deepak Vohra [mailto:dvohra09@yahoo.com]
Sent: Monday, June 03, 2013 4:12 PM
To: user@hadoop.apache.org
Subject: Re: built hadoop! please help with next steps?

John

The following patch is related to the issue cited.

https://issues.apache.org/jira/browse/HADOOP-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

thanks,
Deepak

________________________________
From: John Lilley <jo...@redpoint.net>>
To: "user@hadoop.apache.org<ma...@hadoop.apache.org>" <us...@hadoop.apache.org>>
Sent: Monday, June 3, 2013 1:51 PM
Subject: RE: built hadoop! please help with next steps?

I’ve followed the instructions in BUILDING.txt, generated the eclipse projects and imported the eclipse projects generated by maven using File -> Import -> General -> Existing project into workspace…
And they all appear.  However, the problems window shows:
Project 'hadoop-streaming' is missing required source folder: '/home/jlilley/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/conf'
Any idea what this means or how to fix it?
John

From: Sandy Ryza [mailto:sandy.ryza@cloudera.com]
Sent: Friday, May 31, 2013 4:23 PM
To: user@hadoop.apache.org<ma...@hadoop.apache.org>
Subject: Re: built hadoop! please help with next steps?

I've been successful with importing all the leaf-level maven projects as "Existing Maven Projects" using the eclipse maven plugin.  I've also gotten things to work without the eclipse maven plugin with some combination of mvn eclipse:eclipse, pointing to the m2repo, and the directory with the top pom.xml as my eclipse workspace directory.

On Fri, May 31, 2013 at 3:18 PM, John Lilley <jo...@redpoint.net>> wrote:
Sandy,
Thanks for all of the tips, I will try this over the weekend.   Regarding the last question, I am still trying to get the source loaded into Eclipse in a manner that facilitates easier browsing, symbol search, editing, etc.  Perhaps I am just missing some obvious FAQ?  This is leading up to modifying and debugging the “shell” ApplicationMaster sample.  This page:
http://stackoverflow.com/questions/11007423/developing-testing-and-debugging-hadoop-map-reduce-jobs-with-eclipse
looks promising as a Hadoop-in-Eclipse strategy, but it is over a year old and I’m not sure if it applies to Hadoop 2.0 and YARN.
John

From: Sandy Ryza [mailto:sandy.ryza@cloudera.com<ma...@cloudera.com>]
Sent: Friday, May 31, 2013 12:13 PM
To: user@hadoop.apache.org<ma...@hadoop.apache.org>
Subject: Re: built hadoop! please help with next steps?

Hi John,

Here's how I deploy/debug Hadoop locally:
To build and tar Hadoop:

  mvn clean package -Pdist -Dtar -DskipTests=true

The tar will be located in the project directory under hadoop-dist/target/.  I untar it into my deploy directory.

I then copy these scripts into the same directory:

hadoop-dev-env.sh:
---
#!/bin/bash
export HADOOP_DEV_HOME=`pwd`
export HADOOP_MAPRED_HOME=${HADOOP_DEV_HOME}
export HADOOP_COMMON_HOME=${HADOOP_DEV_HOME}
export HADOOP_HDFS_HOME=${HADOOP_DEV_HOME}
export YARN_HOME=${HADOOP_DEV_HOME}
export HADOOP_CONF_DIR=${HADOOP_DEV_HOME}/etc/hadoop

hadoop-dev-setup.sh:
---
#!/bin/bash
source ./hadoop-dev-env.sh
bin/hadoop namenode -format

hadoop-dev.sh:
---
source hadoop-dev-env.sh
sbin/hadoop-daemon.sh $1 namenode
sbin/hadoop-daemon.sh $1 datanode
sbin/yarn-daemon.sh $1 resourcemanager
sbin/yarn-daemon.sh $1 nodemanager
sbin/mr-jobhistory-daemon.sh $1 historyserver
sbin/httpfs.sh $1

I copy all the files in <deploy directory>/conf into my conf directory, <deploy directory>/etc/hadoop, and then copy the minimal site configuration into .  The advantage of using a directory that's not the /conf directory is that it won't be overwritten the next time you untar a new build.  Lastly, I copy the minimal site configuration into the conf files.  For the sake of brevity, I won't include the properties in full xml format, but here are the ones I set:

yarn-site.xml:
  yarn.nodemanager.aux-services = mapreduce.shuffle
  yarn.nodemanager.aux-services.mapreduce.shuffle.class = org.apache.hadoop.mapred.ShuffleHandler
  yarn.resourcemanager.scheduler.class = org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler
mapred-site.xml:
  mapreduce.framework.name<http://mapreduce.framework.name/> = yarn
core-site.xml:
  fs.default.name<http://fs.default.name/> = hdfs://localhost:9000
hdfs-site.xml:
  dfs.replication = 1
  dfs.permissions = false

Then, to format HDFS and start our cluster, we can simply do:
./hadoop-dev-setup.sh
./hadoop-dev.sh start
To stop it:
./hadoop-dev.sh stop

Once I have this set up, for quicker iteration, I have some scripts that build submodules (sometimes all of mapreduce, sometimes just the resourcemanager) and copy the updated jars into my setup.

Regarding your last question, are you saying that you were able to load it into Eclipse already, and want tips on the best way to browse within it?  Or that you're trying to get the source loaded into Eclipse?

Hope that helps!
Sandy
On Thu, May 30, 2013 at 9:32 AM, John Lilley <jo...@redpoint.net>> wrote:
Thanks for help me to build Hadoop!  I’m through compile and install of maven plugins into Eclipse.  I could use some pointers for next steps I want to take, which are:
•         Deploy the simplest “development only” cluster (single node?) and learn how to debug within it.  I read about the “local runner” configuration here (http://wiki.apache.org/hadoop/HowToDebugMapReducePrograms), does that still apply to MR2/YARN?  It seems like an old page; perhaps there is a newer FAQ?
•         Build and run the ApplicationMaster “shell” sample, and use that as a starting point for a customer AM.  I would much appreciate any advice on getting the edit/build/debug cycle ironed out for an AM.
•         Setup Hadoop source for easier browsing and learning (Eclipse load?).  What is typically done to make for easy browsing of referenced classes/methods by name?

Thanks
John





RE: built hadoop! please help with next steps?

Posted by John Lilley <jo...@redpoint.net>.
I am getting errors trying to install m2e… has anyone else encountered this?
Cannot complete the install because one or more required items could not be found.
  Software being installed: m2e - Maven Integration for Eclipse 1.4.0.20130601-0314 (org.eclipse.m2e.feature.feature.group 1.4.0.20130601-0314)
  Missing requirement: Maven POM XML Editor 1.4.0.20130601-0314 (org.eclipse.m2e.editor.xml 1.4.0.20130601-0314) requires 'bundle org.eclipse.wst.xml.ui 0.0.0' but it could not be found
  Cannot satisfy dependency:
    From: m2e - Maven Integration for Eclipse 1.4.0.20130601-0314 (org.eclipse.m2e.feature.feature.group 1.4.0.20130601-0314)
    To: org.eclipse.m2e.editor.xml [1.4.0.20130601-0314]

From: Deepak Vohra [mailto:dvohra09@yahoo.com]
Sent: Monday, June 03, 2013 4:12 PM
To: user@hadoop.apache.org
Subject: Re: built hadoop! please help with next steps?

John

The following patch is related to the issue cited.

https://issues.apache.org/jira/browse/HADOOP-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

thanks,
Deepak

________________________________
From: John Lilley <jo...@redpoint.net>>
To: "user@hadoop.apache.org<ma...@hadoop.apache.org>" <us...@hadoop.apache.org>>
Sent: Monday, June 3, 2013 1:51 PM
Subject: RE: built hadoop! please help with next steps?

I’ve followed the instructions in BUILDING.txt, generated the eclipse projects and imported the eclipse projects generated by maven using File -> Import -> General -> Existing project into workspace…
And they all appear.  However, the problems window shows:
Project 'hadoop-streaming' is missing required source folder: '/home/jlilley/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/conf'
Any idea what this means or how to fix it?
John

From: Sandy Ryza [mailto:sandy.ryza@cloudera.com]
Sent: Friday, May 31, 2013 4:23 PM
To: user@hadoop.apache.org<ma...@hadoop.apache.org>
Subject: Re: built hadoop! please help with next steps?

I've been successful with importing all the leaf-level maven projects as "Existing Maven Projects" using the eclipse maven plugin.  I've also gotten things to work without the eclipse maven plugin with some combination of mvn eclipse:eclipse, pointing to the m2repo, and the directory with the top pom.xml as my eclipse workspace directory.

On Fri, May 31, 2013 at 3:18 PM, John Lilley <jo...@redpoint.net>> wrote:
Sandy,
Thanks for all of the tips, I will try this over the weekend.   Regarding the last question, I am still trying to get the source loaded into Eclipse in a manner that facilitates easier browsing, symbol search, editing, etc.  Perhaps I am just missing some obvious FAQ?  This is leading up to modifying and debugging the “shell” ApplicationMaster sample.  This page:
http://stackoverflow.com/questions/11007423/developing-testing-and-debugging-hadoop-map-reduce-jobs-with-eclipse
looks promising as a Hadoop-in-Eclipse strategy, but it is over a year old and I’m not sure if it applies to Hadoop 2.0 and YARN.
John

From: Sandy Ryza [mailto:sandy.ryza@cloudera.com<ma...@cloudera.com>]
Sent: Friday, May 31, 2013 12:13 PM
To: user@hadoop.apache.org<ma...@hadoop.apache.org>
Subject: Re: built hadoop! please help with next steps?

Hi John,

Here's how I deploy/debug Hadoop locally:
To build and tar Hadoop:

  mvn clean package -Pdist -Dtar -DskipTests=true

The tar will be located in the project directory under hadoop-dist/target/.  I untar it into my deploy directory.

I then copy these scripts into the same directory:

hadoop-dev-env.sh:
---
#!/bin/bash
export HADOOP_DEV_HOME=`pwd`
export HADOOP_MAPRED_HOME=${HADOOP_DEV_HOME}
export HADOOP_COMMON_HOME=${HADOOP_DEV_HOME}
export HADOOP_HDFS_HOME=${HADOOP_DEV_HOME}
export YARN_HOME=${HADOOP_DEV_HOME}
export HADOOP_CONF_DIR=${HADOOP_DEV_HOME}/etc/hadoop

hadoop-dev-setup.sh:
---
#!/bin/bash
source ./hadoop-dev-env.sh
bin/hadoop namenode -format

hadoop-dev.sh:
---
source hadoop-dev-env.sh
sbin/hadoop-daemon.sh $1 namenode
sbin/hadoop-daemon.sh $1 datanode
sbin/yarn-daemon.sh $1 resourcemanager
sbin/yarn-daemon.sh $1 nodemanager
sbin/mr-jobhistory-daemon.sh $1 historyserver
sbin/httpfs.sh $1

I copy all the files in <deploy directory>/conf into my conf directory, <deploy directory>/etc/hadoop, and then copy the minimal site configuration into .  The advantage of using a directory that's not the /conf directory is that it won't be overwritten the next time you untar a new build.  Lastly, I copy the minimal site configuration into the conf files.  For the sake of brevity, I won't include the properties in full xml format, but here are the ones I set:

yarn-site.xml:
  yarn.nodemanager.aux-services = mapreduce.shuffle
  yarn.nodemanager.aux-services.mapreduce.shuffle.class = org.apache.hadoop.mapred.ShuffleHandler
  yarn.resourcemanager.scheduler.class = org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler
mapred-site.xml:
  mapreduce.framework.name<http://mapreduce.framework.name/> = yarn
core-site.xml:
  fs.default.name<http://fs.default.name/> = hdfs://localhost:9000
hdfs-site.xml:
  dfs.replication = 1
  dfs.permissions = false

Then, to format HDFS and start our cluster, we can simply do:
./hadoop-dev-setup.sh
./hadoop-dev.sh start
To stop it:
./hadoop-dev.sh stop

Once I have this set up, for quicker iteration, I have some scripts that build submodules (sometimes all of mapreduce, sometimes just the resourcemanager) and copy the updated jars into my setup.

Regarding your last question, are you saying that you were able to load it into Eclipse already, and want tips on the best way to browse within it?  Or that you're trying to get the source loaded into Eclipse?

Hope that helps!
Sandy
On Thu, May 30, 2013 at 9:32 AM, John Lilley <jo...@redpoint.net>> wrote:
Thanks for help me to build Hadoop!  I’m through compile and install of maven plugins into Eclipse.  I could use some pointers for next steps I want to take, which are:
•         Deploy the simplest “development only” cluster (single node?) and learn how to debug within it.  I read about the “local runner” configuration here (http://wiki.apache.org/hadoop/HowToDebugMapReducePrograms), does that still apply to MR2/YARN?  It seems like an old page; perhaps there is a newer FAQ?
•         Build and run the ApplicationMaster “shell” sample, and use that as a starting point for a customer AM.  I would much appreciate any advice on getting the edit/build/debug cycle ironed out for an AM.
•         Setup Hadoop source for easier browsing and learning (Eclipse load?).  What is typically done to make for easy browsing of referenced classes/methods by name?

Thanks
John





RE: built hadoop! please help with next steps?

Posted by John Lilley <jo...@redpoint.net>.
I am getting errors trying to install m2e… has anyone else encountered this?
Cannot complete the install because one or more required items could not be found.
  Software being installed: m2e - Maven Integration for Eclipse 1.4.0.20130601-0314 (org.eclipse.m2e.feature.feature.group 1.4.0.20130601-0314)
  Missing requirement: Maven POM XML Editor 1.4.0.20130601-0314 (org.eclipse.m2e.editor.xml 1.4.0.20130601-0314) requires 'bundle org.eclipse.wst.xml.ui 0.0.0' but it could not be found
  Cannot satisfy dependency:
    From: m2e - Maven Integration for Eclipse 1.4.0.20130601-0314 (org.eclipse.m2e.feature.feature.group 1.4.0.20130601-0314)
    To: org.eclipse.m2e.editor.xml [1.4.0.20130601-0314]

From: Deepak Vohra [mailto:dvohra09@yahoo.com]
Sent: Monday, June 03, 2013 4:12 PM
To: user@hadoop.apache.org
Subject: Re: built hadoop! please help with next steps?

John

The following patch is related to the issue cited.

https://issues.apache.org/jira/browse/HADOOP-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

thanks,
Deepak

________________________________
From: John Lilley <jo...@redpoint.net>>
To: "user@hadoop.apache.org<ma...@hadoop.apache.org>" <us...@hadoop.apache.org>>
Sent: Monday, June 3, 2013 1:51 PM
Subject: RE: built hadoop! please help with next steps?

I’ve followed the instructions in BUILDING.txt, generated the eclipse projects and imported the eclipse projects generated by maven using File -> Import -> General -> Existing project into workspace…
And they all appear.  However, the problems window shows:
Project 'hadoop-streaming' is missing required source folder: '/home/jlilley/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/conf'
Any idea what this means or how to fix it?
John

From: Sandy Ryza [mailto:sandy.ryza@cloudera.com]
Sent: Friday, May 31, 2013 4:23 PM
To: user@hadoop.apache.org<ma...@hadoop.apache.org>
Subject: Re: built hadoop! please help with next steps?

I've been successful with importing all the leaf-level maven projects as "Existing Maven Projects" using the eclipse maven plugin.  I've also gotten things to work without the eclipse maven plugin with some combination of mvn eclipse:eclipse, pointing to the m2repo, and the directory with the top pom.xml as my eclipse workspace directory.

On Fri, May 31, 2013 at 3:18 PM, John Lilley <jo...@redpoint.net>> wrote:
Sandy,
Thanks for all of the tips, I will try this over the weekend.   Regarding the last question, I am still trying to get the source loaded into Eclipse in a manner that facilitates easier browsing, symbol search, editing, etc.  Perhaps I am just missing some obvious FAQ?  This is leading up to modifying and debugging the “shell” ApplicationMaster sample.  This page:
http://stackoverflow.com/questions/11007423/developing-testing-and-debugging-hadoop-map-reduce-jobs-with-eclipse
looks promising as a Hadoop-in-Eclipse strategy, but it is over a year old and I’m not sure if it applies to Hadoop 2.0 and YARN.
John

From: Sandy Ryza [mailto:sandy.ryza@cloudera.com<ma...@cloudera.com>]
Sent: Friday, May 31, 2013 12:13 PM
To: user@hadoop.apache.org<ma...@hadoop.apache.org>
Subject: Re: built hadoop! please help with next steps?

Hi John,

Here's how I deploy/debug Hadoop locally:
To build and tar Hadoop:

  mvn clean package -Pdist -Dtar -DskipTests=true

The tar will be located in the project directory under hadoop-dist/target/.  I untar it into my deploy directory.

I then copy these scripts into the same directory:

hadoop-dev-env.sh:
---
#!/bin/bash
export HADOOP_DEV_HOME=`pwd`
export HADOOP_MAPRED_HOME=${HADOOP_DEV_HOME}
export HADOOP_COMMON_HOME=${HADOOP_DEV_HOME}
export HADOOP_HDFS_HOME=${HADOOP_DEV_HOME}
export YARN_HOME=${HADOOP_DEV_HOME}
export HADOOP_CONF_DIR=${HADOOP_DEV_HOME}/etc/hadoop

hadoop-dev-setup.sh:
---
#!/bin/bash
source ./hadoop-dev-env.sh
bin/hadoop namenode -format

hadoop-dev.sh:
---
source hadoop-dev-env.sh
sbin/hadoop-daemon.sh $1 namenode
sbin/hadoop-daemon.sh $1 datanode
sbin/yarn-daemon.sh $1 resourcemanager
sbin/yarn-daemon.sh $1 nodemanager
sbin/mr-jobhistory-daemon.sh $1 historyserver
sbin/httpfs.sh $1

I copy all the files in <deploy directory>/conf into my conf directory, <deploy directory>/etc/hadoop.  The advantage of using a directory that's not the /conf directory is that it won't be overwritten the next time you untar a new build.  Lastly, I copy the minimal site configuration into the conf files.  For the sake of brevity, I won't include the properties in full xml format, but here are the ones I set:

yarn-site.xml:
  yarn.nodemanager.aux-services = mapreduce.shuffle
  yarn.nodemanager.aux-services.mapreduce.shuffle.class = org.apache.hadoop.mapred.ShuffleHandler
  yarn.resourcemanager.scheduler.class = org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler
mapred-site.xml:
  mapreduce.framework.name = yarn
core-site.xml:
  fs.default.name = hdfs://localhost:9000
hdfs-site.xml:
  dfs.replication = 1
  dfs.permissions = false
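
Expanded into the full XML form these files actually use, one of them would look roughly like this, written from the deploy directory (core-site.xml shown; the others follow the same name/value pattern):

cat > etc/hadoop/core-site.xml <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF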

Then, to format HDFS and start our cluster, we can simply do:
./hadoop-dev-setup.sh
./hadoop-dev.sh start
To stop it:
./hadoop-dev.sh stop

Once I have this set up, for quicker iteration, I have some scripts that build submodules (sometimes all of mapreduce, sometimes just the resourcemanager) and copy the updated jars into my setup.
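
As an illustration only (the module path and jar destination are guesses at the layout, not the actual scripts), such a helper could be as small as:

cd ~/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
mvn install -DskipTests
cp target/hadoop-yarn-server-resourcemanager-*.jar ~/hadoop-deploy/share/hadoop/yarn/
# Bounce just the affected daemon:
~/hadoop-deploy/sbin/yarn-daemon.sh stop resourcemanager
~/hadoop-deploy/sbin/yarn-daemon.sh start resourcemanager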

Regarding your last question, are you saying that you were able to load it into Eclipse already, and want tips on the best way to browse within it?  Or that you're trying to get the source loaded into Eclipse?

Hope that helps!
Sandy
On Thu, May 30, 2013 at 9:32 AM, John Lilley <jo...@redpoint.net> wrote:
Thanks for helping me to build Hadoop!  I’m through compile and install of maven plugins into Eclipse.  I could use some pointers for the next steps I want to take, which are:
•         Deploy the simplest “development only” cluster (single node?) and learn how to debug within it.  I read about the “local runner” configuration here (http://wiki.apache.org/hadoop/HowToDebugMapReducePrograms), does that still apply to MR2/YARN?  It seems like an old page; perhaps there is a newer FAQ?
•         Build and run the ApplicationMaster “shell” sample, and use that as a starting point for a custom AM.  I would much appreciate any advice on getting the edit/build/debug cycle ironed out for an AM.
•         Set up the Hadoop source for easier browsing and learning (Eclipse load?).  What is typically done to make for easy browsing of referenced classes/methods by name?

Thanks
John





Re: built hadoop! please help with next steps?

Posted by Deepak Vohra <dv...@yahoo.com>.
John

The following patch is related to the issue cited.

https://issues.apache.org/jira/browse/HADOOP-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel


thanks,
Deepak
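
For completeness, applying a patch attached to a JIRA like that one usually looks something like the sketch below; the attachment filename is a placeholder, so take the real name from the issue's attachments list.

cd ~/hadoop                                # root of the source checkout
patch -p0 --dry-run < HADOOP-9489.patch    # verify it applies cleanly first
patch -p0 < HADOOP-9489.patch
mvn eclipse:eclipse                        # regenerate the Eclipse project files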


RE: built hadoop! please help with next steps?

Posted by John Lilley <jo...@redpoint.net>.
I've followed the instructions in BUILDING.txt, generated the Eclipse projects with Maven, and imported them using File -> Import -> General -> Existing project into workspace...
And they all appear.  However, the problems window shows:
Project 'hadoop-streaming' is missing required source folder: '/home/jlilley/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/conf'
Any idea what this means or how to fix it?
John

Re: built hadoop! please help with next steps?

Posted by Sandy Ryza <sa...@cloudera.com>.
I've been successful with importing all the leaf-level maven projects as
"Existing Maven Projects" using the eclipse maven plugin.  I've also gotten
things to work without the eclipse maven plugin with some combination of
mvn eclipse:eclipse, pointing to the m2repo, and the directory with the top
pom.xml as my eclipse workspace directory.


On Fri, May 31, 2013 at 3:18 PM, John Lilley <jo...@redpoint.net>wrote:

>  Sandy,****
>
> Thanks for all of the tips, I will try this over the weekend.   Regarding
> the last question, I am still trying to get the source loaded into Eclipse
> in a manner that facilitates easier browsing, symbol search, editing, etc.
> Perhaps I am just missing some obvious FAQ?  This is leading up to
> modifying and debugging the “shell” ApplicationMaster sample.  This page:*
> ***
>
>
> http://stackoverflow.com/questions/11007423/developing-testing-and-debugging-hadoop-map-reduce-jobs-with-eclipse
> ****
>
> looks promising as a Hadoop-in-Eclipse strategy, but it is over a year old
> and I’m not sure if it applies to Hadoop 2.0 and YARN.****
>
> John****
>
> ** **
>
> *From:* Sandy Ryza [mailto:sandy.ryza@cloudera.com]
> *Sent:* Friday, May 31, 2013 12:13 PM
> *To:* user@hadoop.apache.org
> *Subject:* Re: built hadoop! please help with next steps?****
>
> ** **
>
> Hi John,****
>
> ** **
>
> Here's how I deploy/debug Hadoop locally:****
>
> To build and tar Hadoop:****
>
> ** **
>
>   mvn clean package -Pdist -Dtar -DskipTests=true****
>
> ** **
>
> The tar will be located in the project directory under
> hadoop-dist/target/.  I untar it into my deploy directory.****
>
> ** **
>
> I then copy these scripts into the same directory:****
>
> ** **
>
> hadoop-dev-env.sh:****
>
> ---****
>
> #!/bin/bash****
>
> export HADOOP_DEV_HOME=`pwd`****
>
> export HADOOP_MAPRED_HOME=${HADOOP_DEV_HOME}****
>
> export HADOOP_COMMON_HOME=${HADOOP_DEV_HOME}****
>
> export HADOOP_HDFS_HOME=${HADOOP_DEV_HOME}****
>
> export YARN_HOME=${HADOOP_DEV_HOME}****
>
> export HADOOP_CONF_DIR=${HADOOP_DEV_HOME}/etc/hadoop****
>
> ** **
>
> hadoop-dev-setup.sh:****
>
> ---****
>
> #!/bin/bash****
>
> source ./hadoop-dev-env.sh****
>
> bin/hadoop namenode -format****
>
> ** **
>
> hadoop-dev.sh:****
>
> ---****
>
> source hadoop-dev-env.sh****
>
> sbin/hadoop-daemon.sh $1 namenode****
>
> sbin/hadoop-daemon.sh $1 datanode****
>
> sbin/yarn-daemon.sh $1 resourcemanager****
>
> sbin/yarn-daemon.sh $1 nodemanager****
>
> sbin/mr-jobhistory-daemon.sh $1 historyserver****
>
> sbin/httpfs.sh $1****
>
> ** **
>
> I copy all the files in <deploy directory>/conf into my conf directory,
> <deploy directory>/etc/hadoop, and then copy the minimal site configuration
> into .  The advantage of using a directory that's not the /conf directory
> is that it won't be overwritten the next time you untar a new build.
>  Lastly, I copy the minimal site configuration into the conf files.  For
> the sake of brevity, I won't include the properties in full xml format, but
> here are the ones I set:****
>
> ** **
>
> yarn-site.xml:****
>
>   yarn.nodemanager.aux-services = mapreduce.shuffle****
>
>   yarn.nodemanager.aux-services.mapreduce.shuffle.class
> = org.apache.hadoop.mapred.ShuffleHandler****
>
>   yarn.resourcemanager.scheduler.class
> = org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler
> ****
>
> mapred-site.xml:****
>
>   mapreduce.framework.name = yarn****
>
> core-site.xml:****
>
>   fs.default.name = hdfs://localhost:9000****
>
> hdfs-site.xml:****
>
>   dfs.replication = 1****
>
>   dfs.permissions = false****
>
> ** **
>
> Then, to format HDFS and start our cluster, we can simply do:****
>
> ./hadoop-dev-setup.sh****
>
> ./hadoop-dev.sh start****
>
> To stop it:****
>
> ./hadoop-dev.sh stop****
>
> ** **
>
> Once I have this set up, for quicker iteration, I have some scripts that
> build submodules (sometimes all of mapreduce, sometimes just the
> resourcemanager) and copy the updated jars into my setup.****
>
> ** **
>
> Regarding your last question, are you saying that you were able to load it
> into Eclipse already, and want tips on the best way to browse within it?
>  Or that you're trying to get the source loaded into Eclipse?****
>
> ** **
>
> Hope that helps!****
>
> Sandy****
>
> On Thu, May 30, 2013 at 9:32 AM, John Lilley <jo...@redpoint.net>
> wrote:****
>
> Thanks for help me to build Hadoop!  I’m through compile and install of
> maven plugins into Eclipse.  I could use some pointers for next steps I
> want to take, which are:****
>
> ·         Deploy the simplest “development only” cluster (single node?)
> and learn how to debug within it.  I read about the “local runner”
> configuration here (
> http://wiki.apache.org/hadoop/HowToDebugMapReducePrograms), does that
> still apply to MR2/YARN?  It seems like an old page; perhaps there is a
> newer FAQ?****
>
> ·         Build and run the ApplicationMaster “shell” sample, and use
> that as a starting point for a customer AM.  I would much appreciate any
> advice on getting the edit/build/debug cycle ironed out for an AM.****
>
> ·         Setup Hadoop source for easier browsing and learning (Eclipse
> load?).  What is typically done to make for easy browsing of referenced
> classes/methods by name?****
>
>  ****
>
> Thanks****
>
> John****
>
>  ****
>
> ** **
>

Re: built hadoop! please help with next steps?

Posted by Sandy Ryza <sa...@cloudera.com>.
I've been successful with importing all the leaf-level maven projects as
"Existing Maven Projects" using the eclipse maven plugin.  I've also gotten
things to work without the eclipse maven plugin with some combination of
mvn eclipse:eclipse, pointing to the m2repo, and the directory with the top
pom.xml as my eclipse workspace directory.


On Fri, May 31, 2013 at 3:18 PM, John Lilley <jo...@redpoint.net>wrote:

>  Sandy,****
>
> Thanks for all of the tips, I will try this over the weekend.   Regarding
> the last question, I am still trying to get the source loaded into Eclipse
> in a manner that facilitates easier browsing, symbol search, editing, etc.
> Perhaps I am just missing some obvious FAQ?  This is leading up to
> modifying and debugging the “shell” ApplicationMaster sample.  This page:*
> ***
>
>
> http://stackoverflow.com/questions/11007423/developing-testing-and-debugging-hadoop-map-reduce-jobs-with-eclipse
> ****
>
> looks promising as a Hadoop-in-Eclipse strategy, but it is over a year old
> and I’m not sure if it applies to Hadoop 2.0 and YARN.****
>
> John****
>
> ** **
>
> *From:* Sandy Ryza [mailto:sandy.ryza@cloudera.com]
> *Sent:* Friday, May 31, 2013 12:13 PM
> *To:* user@hadoop.apache.org
> *Subject:* Re: built hadoop! please help with next steps?****
>
> ** **
>
> Hi John,****
>
> ** **
>
> Here's how I deploy/debug Hadoop locally:****
>
> To build and tar Hadoop:****
>
> ** **
>
>   mvn clean package -Pdist -Dtar -DskipTests=true****
>
> ** **
>
> The tar will be located in the project directory under
> hadoop-dist/target/.  I untar it into my deploy directory.****
>
> ** **
>
> I then copy these scripts into the same directory:****
>
> ** **
>
> hadoop-dev-env.sh:****
>
> ---****
>
> #!/bin/bash****
>
> export HADOOP_DEV_HOME=`pwd`****
>
> export HADOOP_MAPRED_HOME=${HADOOP_DEV_HOME}****
>
> export HADOOP_COMMON_HOME=${HADOOP_DEV_HOME}****
>
> export HADOOP_HDFS_HOME=${HADOOP_DEV_HOME}****
>
> export YARN_HOME=${HADOOP_DEV_HOME}****
>
> export HADOOP_CONF_DIR=${HADOOP_DEV_HOME}/etc/hadoop****
>
> ** **
>
> hadoop-dev-setup.sh:****
>
> ---****
>
> #!/bin/bash****
>
> source ./hadoop-dev-env.sh****
>
> bin/hadoop namenode -format****
>
> ** **
>
> hadoop-dev.sh:****
>
> ---****
>
> source hadoop-dev-env.sh****
>
> sbin/hadoop-daemon.sh $1 namenode****
>
> sbin/hadoop-daemon.sh $1 datanode****
>
> sbin/yarn-daemon.sh $1 resourcemanager****
>
> sbin/yarn-daemon.sh $1 nodemanager****
>
> sbin/mr-jobhistory-daemon.sh $1 historyserver****
>
> sbin/httpfs.sh $1****
>
> ** **
>
> I copy all the files in <deploy directory>/conf into my conf directory,
> <deploy directory>/etc/hadoop, and then copy the minimal site configuration
> into .  The advantage of using a directory that's not the /conf directory
> is that it won't be overwritten the next time you untar a new build.
>  Lastly, I copy the minimal site configuration into the conf files.  For
> the sake of brevity, I won't include the properties in full xml format, but
> here are the ones I set:****
>
> ** **
>
> yarn-site.xml:
>   yarn.nodemanager.aux-services = mapreduce.shuffle
>   yarn.nodemanager.aux-services.mapreduce.shuffle.class = org.apache.hadoop.mapred.ShuffleHandler
>   yarn.resourcemanager.scheduler.class = org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler
> mapred-site.xml:
>   mapreduce.framework.name = yarn
> core-site.xml:
>   fs.default.name = hdfs://localhost:9000
> hdfs-site.xml:
>   dfs.replication = 1
>   dfs.permissions = false
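>
> As a sketch, each of those key/value pairs goes into the corresponding
> file in the standard Hadoop configuration format, e.g. core-site.xml:
>
>   <?xml version="1.0"?>
>   <configuration>
>     <property>
>       <name>fs.default.name</name>
>       <value>hdfs://localhost:9000</value>
>     </property>
>   </configuration>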
>
> Then, to format HDFS and start our cluster, we can simply do:
>
>   ./hadoop-dev-setup.sh
>   ./hadoop-dev.sh start
>
> To stop it:
>
>   ./hadoop-dev.sh stop
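>
> A quick way to verify that the daemons came up (assuming a JDK on the
> PATH) is jps, whose output should include NameNode, DataNode,
> ResourceManager, NodeManager, and JobHistoryServer:
>
>   jps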
>
> Once I have this set up, for quicker iteration, I have some scripts that
> build submodules (sometimes all of mapreduce, sometimes just the
> resourcemanager) and copy the updated jars into my setup.
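>
> A minimal sketch of such a script (module path, version, and deploy
> location are illustrative):
>
>   #!/bin/bash
>   # Rebuild only the resourcemanager module, skipping tests.
>   mvn install -DskipTests \
>     -pl hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
>   # Copy the fresh jar over the one in the deploy tree.
>   cp hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/target/hadoop-yarn-server-resourcemanager-*.jar \
>     ~/hadoop-deploy/share/hadoop/yarn/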
>
> Regarding your last question, are you saying that you were able to load it
> into Eclipse already, and want tips on the best way to browse within it?
> Or that you're trying to get the source loaded into Eclipse?
>
> Hope that helps!
> Sandy
>
> On Thu, May 30, 2013 at 9:32 AM, John Lilley <jo...@redpoint.net>
> wrote:
>
> Thanks for helping me to build Hadoop!  I’m through compile and install of
> maven plugins into Eclipse.  I could use some pointers for next steps I
> want to take, which are:
>
> ·  Deploy the simplest “development only” cluster (single node?) and learn
>    how to debug within it.  I read about the “local runner” configuration
>    here (http://wiki.apache.org/hadoop/HowToDebugMapReducePrograms), but
>    does that still apply to MR2/YARN?  It seems like an old page; perhaps
>    there is a newer FAQ?
>
> ·  Build and run the ApplicationMaster “shell” sample, and use that as a
>    starting point for a custom AM.  I would much appreciate any advice on
>    getting the edit/build/debug cycle ironed out for an AM (see the launch
>    sketch below).
>
> ·  Set up the Hadoop source for easier browsing and learning (Eclipse
>    load?).  What is typically done to make for easy browsing of referenced
>    classes/methods by name?
>
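> For reference, launching the stock distributed shell sample from the
> deploy directory looks something like this (jar path and version are
> illustrative):
>
>   bin/hadoop jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.0.4-alpha.jar \
>     org.apache.hadoop.yarn.applications.distributedshell.Client \
>     -jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.0.4-alpha.jar \
>     -shell_command date -num_containers 1
>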
> Thanks
> John
>
