Posted to common-user@hadoop.apache.org by KrzyCube <yu...@gmail.com> on 2007/06/21 11:08:25 UTC

How to Start Hadoop Cluster from source code in Eclipse

Hi, all:

I am using Eclipse to view the Hadoop source code, and I want to trace through it to see how it works. I wrote a little code that calls the FS client, but once the call reaches the RPC object I cannot step any deeper.

So I just want to start a cluster from the source code that I am holding in Eclipse now.
I browsed the start-*.sh scripts and found that they start several processes, such as the namenode, datanode, and secondarynamenode. I just don't know how to set that up.

Or is there any way to attach to a running process, the way gdb does while debugging C code?

Has anybody used Eclipse to debug this source code? Please give me some tips.

                                                                                     
Thanks .
                                                                                      
KrzyCube
-- 
View this message in context: http://www.nabble.com/How-to-Start-Hadoop-Cluster-from-source-code-in-Eclipse-tf3957457.html#a11229322
Sent from the Hadoop Users mailing list archive at Nabble.com.


Re: How to Start Hadoop Cluster from source code in Eclipse

Posted by KrzyCube <yu...@gmail.com>.
Finally, I got it started successfully, with the NameNode and one DataNode both running on localhost.

My configuration steps were:

1. Extract the code from the tar.gz; I got version hadoop-0.12.3.
2. In Eclipse, create a new project from the ant file "build.xml" in the source code folder.
3. Try to compile. (You may have to configure the Java compiler version in the project properties or the Eclipse preferences. I just enabled Java 6.0 on my Ubuntu 7.04.)
4. If that went well, find NameNode.java, configure it as a Java Application, and try to run it.
5. If there are log4j errors like "can not find log appender", it is probably a "conf" problem. I fixed this by adding the "Hadoop/conf" folder as a source folder. In Eclipse this is easy:
find the conf folder in the source explorer tree view, then right-click
-> Build Path -> "Use as Source Folder".

6. Rebuild and try to run again. Now there may be an exception like "NameNode has not been formatted".
7. Add the "-format" argument to the run configuration once; it formats the namenode. Then remove the argument again.
8. Then I made a few other configuration changes: export HADOOP_HOME in hadoop-env.sh
    and point it at the source code path.
    Configure hadoop-site.xml just as the Hadoop wiki says: host, ports, and paths such as dfs.name.dir. Here I just gave it the path that the format step generated, something like
"*/workspace/Hadoop/filesystem/name". (A sketch of such a config appears after this list.)

9. Rebuild and retry. Then you run into the "webapps not found in classpath" error that I mentioned in my last post. Just copying webapps into the Hadoop/bin folder is not enough; that only causes another strange exception.

10. After tracing some code, I found that when the HTTP server is created it looks for webapps under /src/webapps. Yes, it is there, but that does not work. I copied "Hadoop/src/webapps" to "Hadoop/src/java/webapps", refreshed the tree view in Eclipse, found the webapps folder under java/, and did right-click -> Build Path -> Include.
Now the webapps folder gets copied to whichever build output folder we set, Hadoop/bin or Hadoop/build; I kept the first as the default.

11. Try again: the NameNode started. Cheers!
12. Configure DataNode.java as a Java Application and run it; it started too. Cheers again!
13. Then I set some breakpoints in the source files and wrote some code that calls the FSShell from another computer. Wonderful: the breakpoints were hit on the server side. (A sketch of such a client is below.)
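
For step 13, the exact client code does not matter much. As a minimal sketch only (the class name, host, port and path are placeholders, I am using the plain FileSystem API as a stand-in for FSShell, and details of the API differ a little between Hadoop versions), something like this is enough to go through the RPC layer and hit breakpoints set in the NameNode:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Hypothetical client for poking the NameNode from another machine.
    // fs.default.name must match the host:port the NameNode started from
    // Eclipse is listening on (the same value as in hadoop-site.xml).
    public class DebugClient {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.default.name", "namenode-host:9000");   // placeholder host:port

        FileSystem fs = FileSystem.get(conf);                 // talks to the NameNode over RPC
        Path p = new Path("/tmp/eclipse-debug-test");
        System.out.println("mkdirs: " + fs.mkdirs(p));        // should stop at NameNode breakpoints
        System.out.println("exists: " + fs.exists(p));
      }
    }

Run it with the Hadoop jars and a conf folder on its classpath, with fs.default.name pointing at the machine where the NameNode is running inside Eclipse.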
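
And for step 8, I have not pasted my real hadoop-site.xml; below is only a minimal sketch of a single-node config of that era. The paths and port are placeholders, and the dfs.data.dir entry for the DataNode is an extra assumption of mine (step 8 itself only mentions dfs.name.dir; hadoop-default.xml supplies defaults for anything left out):

    <?xml version="1.0"?>
    <configuration>
      <property>
        <name>fs.default.name</name>
        <value>localhost:9000</value>                               <!-- namenode host:port -->
      </property>
      <property>
        <name>dfs.name.dir</name>
        <value>/path/to/workspace/Hadoop/filesystem/name</value>    <!-- the dir that -format created -->
      </property>
      <property>
        <name>dfs.data.dir</name>
        <value>/path/to/workspace/Hadoop/filesystem/data</value>    <!-- storage for the DataNode -->
      </property>
    </configuration>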
-------------------------------------------------------------------------------------------------------------------------

After all that, I still have a couple of questions:

1. If I want to start the JobTracker etc., do I just do the same as for the DataNode?
2. And how can I start a cluster with several datanodes? Do I need some scripts?

And thanks for your replies, guys; they really helped me a lot. Thanks.
                                                                                         
KrzyCube






-- 
View this message in context: http://www.nabble.com/How-to-Start-Hadoop-Cluster-from-source-code-in-Eclipse-tf3957457.html#a11247363
Sent from the Hadoop Users mailing list archive at Nabble.com.


Re: How to Start Hadoop Cluster from source code in Eclipse

Posted by KrzyCube <yu...@gmail.com>.
I took the steps below:

1. Created a new project from the existing ant file "build.xml".
2. Tried to compile the project; that went fine.
3. Found NameNode.java and configured it as a Java Application to run.
4. It told me the NameNode was not formatted, so I ran it once with the -format argument.
5. Then there were exceptions saying "webapps" was not found in the classpath.
6. So I tried to configure the src/webapps folder via Build Path -> "Use as Source Folder".
7. Built the project again, but I could not find webapps in the build output path.
8. So I just copied "webapps" into the bin/ path, since my build output path is Hadoop/bin.
9. Then I get exceptions like these:
----------------------------------------------------------------------------------------------------------------------
07/06/22 12:42:22 INFO dfs.StateChange: STATE* Network topology has 0 racks and 0 datanodes
07/06/22 12:42:22 INFO dfs.StateChange: STATE* UnderReplicatedBlocks has 0 blocks
07/06/22 12:42:22 INFO util.Credential: Checking Resource aliases
07/06/22 12:42:22 INFO http.HttpServer: Version Jetty/5.1.4
07/06/22 12:42:22 INFO util.Container: Started HttpContext[/static,/static]
07/06/22 12:42:23 INFO util.Container: Started org.mortbay.jetty.servlet.WebApplicationHandler@1ec6696
07/06/22 12:42:23 INFO http.SocketListener: Started SocketListener on 0.0.0.0:50070
07/06/22 12:42:23 ERROR dfs.NameNode: java.io.IOException: Problem starting http server
	at org.apache.hadoop.mapred.StatusHttpServer.start(StatusHttpServer.java:211)
	at org.apache.hadoop.dfs.FSNamesystem.<init>(FSNamesystem.java:274)
	at org.apache.hadoop.dfs.NameNode.init(NameNode.java:178)
	at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:195)
	at org.apache.hadoop.dfs.NameNode.main(NameNode.java:728)
Caused by: org.mortbay.util.MultiException[java.lang.ClassNotFoundException: org.apache.hadoop.dfs.dfshealth_jsp, java.lang.ClassNotFoundException: org.apache.hadoop.dfs.nn_005fbrowsedfscontent_jsp]
	at org.mortbay.http.HttpServer.doStart(HttpServer.java:731)
	at org.mortbay.util.Container.start(Container.java:72)
	at org.apache.hadoop.mapred.StatusHttpServer.start(StatusHttpServer.java:188)
	... 4 more
-------------------------------------------------------------------------------------------------------------------------
I have tried configuring this and that, over and over, but this exception is still there.
What might be causing it?

                                                                     Thanks a lot
                                                     KrzyCube



-- 
View this message in context: http://www.nabble.com/How-to-Start-Hadoop-Cluster-from-source-code-in-Eclipse-tf3957457.html#a11246246
Sent from the Hadoop Users mailing list archive at Nabble.com.


Re: How to Start Hadoop Cluster from source code in Eclipse

Posted by Konstantin Shvachko <sh...@yahoo-inc.com>.
I run an entire one-node cluster in Eclipse by just executing main() (from the Run or Debug menus) for each component.
You need to configure Eclipse correctly in order to do that. Can you compile the whole thing under Eclipse?
NameNode example:
= Open NameNode.java in the editor.
= Run / Run
= New Java Application -> this creates an entry named NameNode under "Java Application"
= Select NameNode, go to the Arguments tab and enter the following under "VM Arguments":
  -Dhadoop.log.dir=./logs
  -Xmx500m
  -ea
    The first one is required and can point to your log directory; the other two are optional.
= Go to the "Classpath" tab and add the "hadoop/build" path under "User entries" via
    Advanced / New Folder / select "hadoop/build"
That should be it, if the default classpath is configured correctly and if I am not forgetting anything.
Let me know if that helped; I'll send you screenshots of my configuration if not.

--Konstantin




RE: How to Start Hadoop Cluster from source code in Eclipse

Posted by "Mahajan, Neeraj" <ne...@ebay.com>.
There are two separate issues you are asking about here:
1. How to modify/add to the Hadoop code and execute the changes -
Eclipse is just an IDE; it doesn't matter whether you use Eclipse or some other editor.
I have been using Eclipse. What I do is modify the code in Eclipse and then run "ant jar" in the root folder of Hadoop (you could also configure this to work directly from Eclipse). This regenerates the jars and puts them in the build/ folder. Now you can either copy these jars into the Hadoop root folder (removing "dev" from their names) so that they replace the original jars, or modify the scripts in bin/ to point to the newly generated jars. A rough sketch follows.
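
For illustration only, the loop looks roughly like this; the exact jar names depend on the version and the build, so check what "ant jar" actually leaves in build/ before copying anything:

    # from the root folder of the hadoop source tree
    ant jar

    # the regenerated jars land in build/ and carry "dev" in their names
    ls build/*.jar

    # either copy them over the released jars in the root folder, then rename
    # them (dropping the "dev" part) so the bin/ scripts pick them up unchanged...
    cp build/hadoop-*.jar .

    # ...or instead edit the classpath in the bin/ scripts to point at build/.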

2. How to debug using an IDE -
This page gives a high-level intro to debugging Hadoop:
http://wiki.apache.org/lucene-hadoop/HowToDebugMapReducePrograms
The way I see it, there are two ways you can debug Hadoop programs: run Hadoop in local mode and debug in-process in the IDE, or run Hadoop in distributed mode and remote-debug with the IDE.

The first way is easy. At the end of the bin/hadoop script there is an exec command; put an echo command there instead and run your program. You can then see what parameters the script passes when starting Hadoop. Use these same parameters in the IDE and you can debug Hadoop. Remember to change the conf files so that Hadoop runs in local mode. To be more specific, you will have to set the program arguments and the VM arguments, and add a classpath entry pointing to the conf folder. A sketch of the script change follows.
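
As a sketch of that change (the variable names here are illustrative; use whatever the last line of your copy of bin/hadoop actually says):

    # last line of bin/hadoop, originally something along the lines of:
    #   exec "$JAVA" $JAVA_HEAP_MAX $HADOOP_OPTS -classpath "$CLASSPATH" $CLASS "$@"
    # change exec to echo so the full java command line is printed instead of run:
    echo "$JAVA" $JAVA_HEAP_MAX $HADOOP_OPTS -classpath "$CLASSPATH" $CLASS "$@"

Then run something like "bin/hadoop namenode" and copy the printed -D options, classpath and main class into your Eclipse launch configuration.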

The second method is complicated. You will have to modify the scripts and add some extra params like "-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=<port>" to the java command. Specify a <port> of your choice. On a server where you run both the namenode and the jobtracker there will be a conflict, since the same port would be specified for both, so you will have to do some intelligent scripting to take care of that. Once the java processes start, you can attach the Eclipse debugger to that machine's <port> and set breakpoints. Up to this point you can debug everything that happens before the map-reduce tasks. Map-reduce tasks run in separate processes; to debug them you will have to figure things out yourself. A sketch of the script change follows.
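
As a sketch of where such flags could go (my suggestion, not the only way): either edit the java command in bin/hadoop directly, or, if your hadoop-env.sh and bin/ scripts honour HADOOP_OPTS, set it there, using a different port per daemon to avoid the conflict mentioned above:

    # e.g. in conf/hadoop-env.sh on the namenode machine
    # (pick another port, say 8001, for the jobtracker)
    export HADOOP_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8000"

Once the daemon is up, attach from Eclipse with a "Remote Java Application" debug configuration pointed at that host and port.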

The best way is to debug using the first approach (as the above link says). I think that approach lets you fix any map-reduce related problems, and for the other, purely distributed kinds of problems you can follow the second approach.

~ Neeraj
