Posted to common-user@hadoop.apache.org by Jason Venner <ja...@attributor.com> on 2008/02/28 04:07:03 UTC

More HOD: is there any way to get HOD to copy back all of the log files to the submit node?

I have found that HOD writes a series of log files to directories on the 
virtual cluster master, if you specify log directories.
The interesting part is figuring out which machine was the virtual 
cluster master when you have a decent-sized pool of machines.

It would be so nice if HOD could copy this back to the submission node.

I suppose we could set up syslog, but then we would have to configure 
the syslogd on each of the submit hosts to accept messages from any 
compute node...


-- 
Jason Venner
Attributor - Publish with Confidence <http://www.attributor.com/>
Attributor is hiring Hadoop Wranglers, contact if interested


Re: More HOD: is there any way to get HOD to copy back all of the log files to the submit node?

Posted by Jason Venner <ja...@attributor.com>.
You are my hero of the day, Hemanth; these 3 things will make our life 
so much simpler.

Not knowing Torque at all, and trying to deal with two systems that are 
essentially unknown to us, has made the learning curve quite steep.


Hemanth Yamijala wrote:
> Jason Venner wrote:
>> I have found that HOD writes a series of log files to directories on 
>> the virtual cluster master, if you specify log directories.
>> The interesting part is figuring out which machine was the virtual 
>> cluster master, if you have a decent sized pool of machines.
>>
> Can you explain what you mean by 'virtual cluster master' ? I guess 
> you could mean the node which ran the 'ringmaster' process - the 
> master process in a hod virtual cluster, or the hadoop 'jobtracker'. 
> Both this information is stored by hod for an allocated cluster. You 
> could retrieve it by using
> hod -o list,
> and
> hod -o "info cluster-dir"
>
> If the cluster is deallocated, you can get the ringmaster node still 
> by using torque commands:
>
> qstat (to get torque job ids)
> and
> qstat -n <jobid>
>
> This will print a list of nodes, the first one is the ringmaster.
>
>> It would be so nice if HOD could copy this back to the submission node.
>>
>> I suppose we could configure syslog up but then we have to configure 
>> the syslogd on each of the submit hosts to accept from any compute 
>> node...
>
> Syslog configuration has a bug which will be addressed in Hadoop 0.16.1.
>
> Thanks
> hemanth
>

Re: More HOD: is there any way to get HOD to copy back all of the log files to the submit node?

Posted by Hemanth Yamijala <yh...@yahoo-inc.com>.
Jason Venner wrote:
> I have found that HOD writes a series of log files to directories on 
> the virtual cluster master, if you specify log directories.
> The interesting part is figuring out which machine was the virtual 
> cluster master, if you have a decent sized pool of machines.
>
Can you explain what you mean by 'virtual cluster master'? I guess you 
could mean the node which ran the 'ringmaster' process (the master 
process in a HOD virtual cluster) or the Hadoop 'jobtracker'. Both 
pieces of information are stored by HOD for an allocated cluster. You 
can retrieve them by using
hod -o list
and
hod -o "info cluster-dir"

If the cluster is deallocated, you can still get the ringmaster node by 
using Torque commands:

qstat (to get torque job ids)
and
qstat -n <jobid>

This will print a list of nodes; the first one is the ringmaster.

> It would be so nice if HOD could copy this back to the submission node.
>
> I suppose we could configure syslog up but then we have to configure 
> the syslogd on each of the submit hosts to accept from any compute 
> node...

Syslog configuration has a bug which will be addressed in Hadoop 0.16.1.

Thanks
hemanth


Re: More HOD: is there any way to get HOD to copy back all of the log files to the submit node?

Posted by Vinod KV <vi...@yahoo-inc.com>.
Jason Venner wrote:
> I have found that HOD writes a series of log files to directories on 
> the virtual cluster master, if you specify log directories.
> The interesting part is figuring out which machine was the virtual 
> cluster master, if you have a decent sized pool of machines.
>

In a Torque setup, HOD starts the "virtual cluster master", called the 
ringmaster, on the first node in the node list returned by Torque. So 
running "qstat -f torque-jobid" or, better, "qstat -an jobid" will show 
you the list of nodes allocated for the current HOD allocation, the 
first of them being the ringmaster.
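
To illustrate the lookup, here is a minimal Python sketch that pulls the 
first host out of "qstat -f" output. It is only a sketch under stated 
assumptions: that the node list appears in an exec_host attribute made 
of host/slot entries joined by "+", and that long values are wrapped 
onto tab-indented continuation lines. The sample text, job id, and host 
names are made up, so check them against what your Torque version 
actually prints.

```python
import re


def exec_hosts(qstat_f_output):
    """Return the distinct hosts named in exec_host, in listed order."""
    # Assumption: long attribute values are wrapped onto continuation
    # lines that begin with a tab, so join those back together first.
    unwrapped = qstat_f_output.replace("\n\t", "")
    match = re.search(r"^\s*exec_host = (\S+)", unwrapped, flags=re.MULTILINE)
    if match is None:
        return []
    hosts = []
    # Assumption: entries look like "host/slot" and are joined by "+".
    for entry in match.group(1).split("+"):
        host = entry.split("/", 1)[0]
        if host and host not in hosts:
            hosts.append(host)
    return hosts


def ringmaster_host(qstat_f_output):
    """Return the first allocated host, or None if there is no node list."""
    hosts = exec_hosts(qstat_f_output)
    return hosts[0] if hosts else None


# Made-up sample standing in for the output of "qstat -f <jobid>".
SAMPLE = (
    "Job Id: 42.head.example.com\n"
    "    Job_Name = HOD\n"
    "    exec_host = node07/0+node03/0\n"
    "\t+node11/0\n"
    "    job_state = R\n"
)

if __name__ == "__main__":
    print(ringmaster_host(SAMPLE))
```

The parsing is kept separate from running qstat itself, so the text can 
come from a live command or from a saved file.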

> It would be so nice if HOD could copy this back to the submission node.
>

HOD currently doesn't copy any logs, neither the master's nor the 
hodrings', to the submission node. Ideally, one should be able to figure 
out anything wrong just by looking at the error codes.
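
Until HOD does this itself, one manual workaround is to pull the log 
directory from the ringmaster node with scp once you know its host name. 
The Python sketch below only builds the command line. It is not a HOD 
feature, the function name is hypothetical, and the host name and both 
paths are placeholders for whatever log directory you configured.

```python
import shlex


def build_scp_command(ringmaster, remote_log_dir, local_dir):
    """Build an scp argument list that copies a remote directory tree."""
    # "-r" makes scp copy the directory recursively.
    return ["scp", "-r", "{}:{}".format(ringmaster, remote_log_dir), local_dir]


if __name__ == "__main__":
    # Placeholder host and paths; substitute your own values.
    cmd = build_scp_command("node07", "/var/log/hod", "./hod-logs")
    # Print the command rather than running it, so it can be reviewed.
    print(" ".join(shlex.quote(part) for part in cmd))
```

Passing the resulting list to subprocess.run would execute the copy, 
assuming the submit node can reach the compute node over ssh.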

> I suppose we could configure syslog up but then we have to configure 
> the syslogd on each of the submit hosts to accept from any compute 
> node...
>
>

Yeah, that would be the case when using syslogd. But a better way would 
be to set up syslog-ng, which would need setup at only one centralized 
location, and let HOD write all logs to that. JFYI, there is one issue 
still to be resolved ( HADOOP-2809 
<https://issues.apache.org/jira/browse/HADOOP-2809> ); once it is done 
you should be able to log to syslog/syslog-ng.

Hope that helps,

Thanks,
Vinod