Posted to dev@hama.apache.org by "Edward J. Yoon" <ed...@apache.org> on 2011/06/08 13:36:12 UTC

Re: how to debug bsp jobs in hama

> Maybe Hama can add more debugging features, for example something
> similar to org.apache.hadoop.mapred.IsolationRunner.

Yes, that's a good idea.

Please feel free to file a JIRA issue and a patch!

On Wed, Jun 8, 2011 at 3:36 PM, Fancy Yin <yf...@gmail.com> wrote:
> Hi, Edward:
>
> Thank you for your suggestion!
>
> I noticed the issue:
> https://issues.apache.org/jira/browse/HAMA-374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
>
> So it is the same as in Hadoop! I will try it!
>
> Maybe Hama can add more debugging features, for example something
> similar to org.apache.hadoop.mapred.IsolationRunner.
>
> B.R.
> Fancy
>
> 2011/6/8, Edward J. Yoon <ed...@apache.org>:
>> Hi,
>>
>> There's a thread-based 'LocalBSPRunner'. It can be useful for debugging.
>>
>> On Wed, Jun 8, 2011 at 11:10 AM, Fancy Yin <yf...@gmail.com> wrote:
>>> Hello:
>>>
>>> I am new to distributed systems.
>>>
>>> When I tried to write some experimental
>>> code with Hama, I ran into one problem: how to debug a BSP job in a Hama
>>> pseudo-distributed environment.
>>>
>>> From the code, I know that the GroomServer child task is spawned like this:
>>>
>>> this.process = Runtime.getRuntime().exec(args, null, dir);
>>>
>>> This is a child process. In NetBeans, I run the job as the main process.
>>> For the main process, I can set breakpoints and debug it in any way.
>>> But for the child process, what should I do if I want to debug it?
>>> How can I find out whether there is a mistake in my BSP job?
>>>
>>> Or maybe there is another way to debug it outside the pseudo-distributed
>>> environment?
>>>
>>> My development environment is pseudo-distributed: the
>>> BSPMasterRunner and GroomServerRunner processes are on the same
>>> computer.
>>>
>>> The following is my process snapshot when I run my bsp job.
>>>
>>> 5087 HRegionServer
>>> 25763 GroomServer$Child
>>> 16258 Main
>>> 22918 BSPMasterRunner
>>> 4653 TaskTracker
>>> 4412 SecondaryNameNode
>>> 4839 QuorumPeerMain
>>> 4490 JobTracker
>>> 25930 Jps
>>> 25694 Main
>>> 4933 HMaster
>>> 4247 DataNode
>>> 4060 NameNode
>>> 23055 GroomServerRunner
>>>
>>> I would appreciate it very much if you could give me some suggestions.
>>>
>>> B.R.
>>> Fancy
>>>
>>
>>
>>
>> --
>> Best Regards, Edward J. Yoon
>> @eddieyoon
>>
>



-- 
Best Regards, Edward J. Yoon
@eddieyoon

Re: how to debug bsp jobs in hama

Posted by Fancy Yin <yf...@gmail.com>.
I will have a look at how org.apache.hadoop.mapred.IsolationRunner runs
in Hadoop.

Re: how to debug bsp jobs in hama

Posted by Thomas Jungblut <th...@googlemail.com>.
Hey,

I'd like to quote Aaron Kimball here:

> Maybe I was jumping the gun on reading your description; you mentioned the
> LocalJobRunner, so I thought you were off-the-bat proposing to build another
> *JobRunner. Since we already have three job runners (the usual one, the
> local one, and the isolation runner), all of which have their own quirks,
> idiosyncrasies and bugs, I would be nervous about yet another one of these
> which will have its own slightly deviant semantics, and hope that we could
> reuse a lot of the existing task deployment code.
>
https://issues.apache.org/jira/browse/MAPREDUCE-1220?focusedCommentId=12780926&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12780926

I'm not too sure that we need an IsolationRunner.
If we do need one, we should extend the LocalJobRunner to run with a given
Configuration, either as XML or as an object. (That's the main reason why
there is an isolation runner, isn't it? It can also run only MAP or only
REDUCE tasks, but that obviously does not transfer to BSP.)
So my approach would be to add another constructor that takes a
configuration and let the IsolationRunner class use it. That class would be
responsible for parsing the XML and the extra setup needed, so we can avoid
duplicate code.

BUT this new runner won't give you the pseudo-distributed "feeling", so you
can equally well use the LocalBSPRunner. In my experience this is enough to
build a distributed application, as long as you don't violate Hadoop's RPC
serialization rules, for example by not implementing the read/write methods
correctly when you provide your own model classes.
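
To make the read/write point concrete, here is a minimal sketch of a model
class whose serialization is symmetric. It does not depend on Hadoop: the
nested Writable interface is only a stand-in that mirrors the two methods of
org.apache.hadoop.io.Writable, and VertexMessage with its fields is a
hypothetical example class, not something from Hama. The rule it illustrates
is that readFields must read exactly the fields that write wrote, in the same
order and with the same types.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

class WritableRoundTrip {

  // Stand-in for org.apache.hadoop.io.Writable (same two method signatures),
  // declared here only so the sketch compiles without Hadoop on the classpath.
  interface Writable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
  }

  // Hypothetical model class used purely for illustration.
  static class VertexMessage implements Writable {
    int vertexId;
    double value;
    String label = "";

    // A no-arg constructor lets a framework create the object first
    // and fill it afterwards through readFields.
    VertexMessage() {}

    VertexMessage(int vertexId, double value, String label) {
      this.vertexId = vertexId;
      this.value = value;
      this.label = label;
    }

    @Override
    public void write(DataOutput out) throws IOException {
      out.writeInt(vertexId);
      out.writeDouble(value);
      out.writeUTF(label);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
      // Same order and same types as in write().
      vertexId = in.readInt();
      value = in.readDouble();
      label = in.readUTF();
    }
  }

  // Serializes the message to bytes and reads it back into a fresh instance.
  static VertexMessage roundTrip(VertexMessage original) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      original.write(out);
    }
    VertexMessage copy = new VertexMessage();
    try (DataInputStream in =
        new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
      copy.readFields(in);
    }
    return copy;
  }

  public static void main(String[] args) throws IOException {
    VertexMessage copy = roundTrip(new VertexMessage(7, 0.25, "rank"));
    System.out.println(copy.vertexId + " " + copy.value + " " + copy.label);
  }
}
```

A round-trip check like roundTrip() is cheap to put into a unit test, and it
tends to catch swapped or forgotten fields before the class is ever sent
between peers.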

I am +1 for a more testable version of a BSP cluster like the one used in the
test cases; it is called MiniBspCluster, isn't it?
I'm sure this would help Ashish with his GSoC project, too.
However, this will run several JVMs, so you have to attach your debugger to
the one you need.
But this is the case in every distributed application, so I'm not sure how
to address it.
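
As a rough illustration of attaching a debugger to one specific child JVM: a
JVM started with the standard JDWP agent option listens on a socket that an
IDE can connect to as a remote debugger. The sketch below only assembles such
a command line. How Hama actually builds the child arguments, and whether it
offers a setting for extra child JVM options, is not shown in this thread, so
the class name DebuggableChildCommand, the main class example.ChildTask, the
task argument and the port 8000 are all assumptions made for the example.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DebuggableChildCommand {

  // Builds the standard JDWP agent option. With suspend=y the child JVM
  // waits at startup until a debugger attaches to the given port.
  static String jdwpOption(int port, boolean suspend) {
    return "-agentlib:jdwp=transport=dt_socket,server=y,suspend="
        + (suspend ? "y" : "n") + ",address=" + port;
  }

  // Assembles a command line for a child JVM with the debug agent enabled.
  // mainClass and taskArgs are placeholders for whatever the parent launches.
  static List<String> buildCommand(
      String mainClass, int debugPort, boolean suspend, List<String> taskArgs) {
    List<String> cmd = new ArrayList<>();
    cmd.add(System.getProperty("java.home")
        + File.separator + "bin" + File.separator + "java");
    cmd.add(jdwpOption(debugPort, suspend));
    cmd.add("-cp");
    cmd.add(System.getProperty("java.class.path"));
    cmd.add(mainClass);
    cmd.addAll(taskArgs);
    return cmd;
  }

  public static void main(String[] args) {
    // Hypothetical main class and task argument, for illustration only.
    List<String> cmd =
        buildCommand("example.ChildTask", 8000, true, Arrays.asList("task_0001"));
    // The list could be handed to Runtime.exec or ProcessBuilder;
    // here it is only printed.
    System.out.println(String.join(" ", cmd));
  }
}
```

With several children on one machine, each one needs its own port, and
suspend=y is only practical when a single task is being debugged, because
every suspended JVM blocks until something attaches to it.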

Sorry for the wall of text.
Best Regards,
Thomas

-- 
Thomas Jungblut
Berlin

mobile: 0170-3081070

business: thomas.jungblut@testberichte.de
private: thomas.jungblut@gmail.com