Posted to mapreduce-user@hadoop.apache.org by Jonathan Coveney <jc...@gmail.com> on 2011/05/10 17:35:39 UTC

Is there a way to see what file killed a mapper?

I have a basic job that is dying, I think, on one badly compressed file. Is
there a way to see what file it is choking on? Via the job tracker I can
find the mapper that is dying but I cannot find a record of the file that it
died on.

Thank you for your help

RE: Is there a way to see what file killed a mapper?

Posted by "GOEKE, MATTHEW [AG/1000]" <ma...@monsanto.com>.
Someone might have a more graceful method of determining this, but I've
found that outputting that kind of data to counters is the most effective
way. Otherwise you could write to stderr or stdout, but then you would
need to mine the log data on each node to figure it out.
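As a sketch of the counter idea, assuming a Hadoop Streaming job (for a plain Java job you would call context.getCounter() instead): Streaming exposes job configuration properties as environment variables with dots replaced by underscores, so the script can read map_input_file and increment a counter through the reporter:counter stderr protocol. The word-count body below is purely illustrative, and note that a truly corrupt compressed file may fail in the framework before any records reach your script:

```python
import os
import sys

def run_mapper(lines, input_file, err=sys.stderr):
    """Emit (word, 1) pairs; on a per-record failure, charge a counter
    named after the input file so it shows up in the job tracker UI."""
    for line in lines:
        try:
            for word in line.split():
                yield word, 1
        except Exception:
            # Streaming counter protocol: writing this line to stderr
            # increments the counter BadInputFiles/<file>.
            err.write("reporter:counter:BadInputFiles,%s,1\n" % input_file)
            raise

if __name__ == "__main__":
    # Hadoop Streaming exports map.input.file as the env var map_input_file.
    current_file = os.environ.get("map_input_file", "unknown")
    for word, count in run_mapper(sys.stdin, current_file):
        print("%s\t%d" % (word, count))
```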

 

Matt

 

From: Jonathan Coveney [mailto:jcoveney@gmail.com] 
Sent: Tuesday, May 10, 2011 10:36 AM
To: mapreduce-user@hadoop.apache.org
Subject: Is there a way to see what file killed a mapper?

 

I have a basic job that is dying, I think, on one badly compressed file.
Is there a way to see what file it is choking on? Via the job tracker I
can find the mapper that is dying but I cannot find a record of the file
that it died on.

 

Thank you for your help


Re: Is there a way to see what file killed a mapper?

Posted by Jonathan Coveney <jc...@gmail.com>.
Thanks, these are quite useful.

2011/5/10 Amar Kamat <am...@yahoo-inc.com>

>  MapReduce updates the task's configuration and sets 'map.input.file' to
> point to the file on which the task intends to work. In the new MapReduce
> API, it is renamed to 'mapreduce.map.input.file'. You can print the value
> corresponding to 'map.input.file'. Similarly, 'map.input.start' points to
> the start offset in the input file, while 'map.input.length' gives the
> total size of the data to be read.
> Amar
>
>
> On 5/10/11 9:05 PM, "Jonathan Coveney" <jc...@gmail.com> wrote:
>
> I have a basic job that is dying, I think, on one badly compressed file. Is
> there a way to see what file it is choking on? Via the job tracker I can
> find the mapper that is dying but I cannot find a record of the file that it
> died on.
>
> Thank you for your help
>
>

Re: Is there a way to see what file killed a mapper?

Posted by Amar Kamat <am...@yahoo-inc.com>.
MapReduce updates the task's configuration and sets 'map.input.file' to point to the file on which the task intends to work. In the new MapReduce API, it is renamed to 'mapreduce.map.input.file'. You can print the value corresponding to 'map.input.file'. Similarly, 'map.input.start' points to the start offset in the input file, while 'map.input.length' gives the total size of the data to be read.
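In a Streaming job the same properties are visible to the script as environment variables, with dots replaced by underscores. A small sketch of reading them (the default values here are my own placeholders, not anything Hadoop guarantees):

```python
import os
import sys

def split_info(env):
    """Return (file, start, length) for the current map task's input split.

    Hadoop Streaming exports job configuration properties as environment
    variables with dots replaced by underscores, so the old-API property
    map.input.file appears as map_input_file (new API:
    mapreduce_map_input_file).
    """
    return (
        env.get("map_input_file", "unknown"),
        int(env.get("map_input_start", "0")),
        int(env.get("map_input_length", "0")),
    )

if __name__ == "__main__":
    name, start, length = split_info(os.environ)
    # Goes to the task's stderr log, viewable from the job tracker UI.
    sys.stderr.write("split: %s start=%d length=%d\n" % (name, start, length))
```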
Amar


On 5/10/11 9:05 PM, "Jonathan Coveney" <jc...@gmail.com> wrote:

I have a basic job that is dying, I think, on one badly compressed file. Is there a way to see what file it is choking on? Via the job tracker I can find the mapper that is dying but I cannot find a record of the file that it died on.

Thank you for your help