Posted to mapreduce-user@hadoop.apache.org by "Agarwal, Nikhil" <Ni...@netapp.com> on 2013/05/03 10:14:01 UTC

Which data sets were processed by each tasktracker?

Hi,

I have a 3-node cluster, with the JobTracker running on one machine and TaskTrackers on the other two. Instead of using HDFS, I have written my own FileSystem implementation. I am able to run a MapReduce job on this cluster, but I cannot tell from the logs or the TaskTracker UI which data sets were processed by each of the two slaves.

Can you please tell me some way to find out what exactly each of my TaskTrackers did during the job execution? I am using the Hadoop-1.0.4 source code.

Thanks & Regards,
Nikhil

Re: Which data sets were processed by each tasktracker?

Posted by Harsh J <ha...@cloudera.com>.
You probably need to be using a release that includes
https://issues.apache.org/jira/browse/MAPREDUCE-3678. It prints the
input split to the task logs, so you can always see what each task
processed (as long as the input split type, such as FileSplit,
produces an intelligible toString() output).
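If upgrading is not an option, a similar effect can be had by logging the split from the job code itself: in the new (org.apache.hadoop.mapreduce) API, the mapper's Context exposes getInputSplit(). The class below is a hypothetical sketch, not part of the original thread; it assumes the custom FileSystem hands out splits whose toString() says something useful (FileSplit, for instance, prints path, offset, and length).

```java
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical mapper that records which split each task was handed.
// The message ends up in the task's log, viewable per-task in the
// TaskTracker web UI, so you can map tasks back to data sets.
public class SplitLoggingMapper
    extends Mapper<LongWritable, Text, Text, LongWritable> {

  @Override
  protected void setup(Context context)
      throws IOException, InterruptedException {
    InputSplit split = context.getInputSplit();
    // Usefulness depends on the split type's toString() implementation.
    System.err.println("Processing split: " + split);
  }

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    // ... normal map logic ...
  }
}
```

Grepping the per-task logs for "Processing split:" afterwards would then give the task-to-data mapping across both slaves.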

On Fri, May 3, 2013 at 1:44 PM, Agarwal, Nikhil
<Ni...@netapp.com> wrote:
> Hi,
>
> I have a 3-node cluster, with JobTracker running on one machine and
> TaskTrackers on other two. Instead of using HDFS, I have written my own
> FileSystem implementation. I am able to run a MapReduce job on this cluster
> but I am not able to make out from logs or TaskTracker UI, which data sets
> were exactly processed by each of the two slaves.
>
> Can you please tell me some way to find out what exactly did each of my
> tasktracker do during the entire job execution? I am using Hadoop-1.0.4
> source code.
>
> Thanks & Regards,
> Nikhil



--
Harsh J


