Posted to common-user@hadoop.apache.org by Siddharth Karandikar <si...@gmail.com> on 2010/06/29 17:11:47 UTC

newbie - job failing at reduce

Hi All,

I am new to Hadoop, but by reading the online docs and other resources I
have moved ahead and am now trying to run a cluster of 3 nodes.
Before doing this, I tried my program in standalone and pseudo-distributed
modes, and it works fine there.

Now the issue that I am facing: the map phase works correctly, but while
reducing I see the following error on one of the nodes -

2010-06-29 14:35:01,848 WARN org.apache.hadoop.mapred.TaskTracker:
getMapOutput(attempt_201006291958_0001_m_000008_0,0) failed :
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
taskTracker/jobcache/job_201006291958_0001/attempt_201006291958_0001_m_000008_0/output/file.out.index
in any of the configured local directories

Let's say this is on Node1. But there is no directory named
'taskTracker/jobcache/job_201006291958_0001/attempt_201006291958_0001_m_000008_0'
under /tmp/mapred/local/taskTracker/ on Node1. Interestingly, this
directory is available on Node2 (or Node3). I have tried running the job
multiple times, but it always fails while reducing, with the same error.

I have configured /tmp/mapred/local on each node via mapred-site.xml.
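A quick way to check on any one node whether the job-cache directory was ever created under the configured mapred.local.dir — a minimal sketch, assuming the /tmp/mapred/local path from the configuration pasted below:

```shell
# Look for job-cache entries under mapred.local.dir on this node.
# The path matches the mapred.local.dir value from the posted config.
dir=/tmp/mapred/local/taskTracker/jobcache
if [ -d "$dir" ]; then
  found="present: $(ls "$dir" | wc -l) job dirs"
else
  found="missing on this node"
fi
echo "$found"
```

Running this on each node shows which TaskTrackers actually wrote map output locally.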

I really don't understand why the mappers are misplacing these files. Or
am I missing something in the configuration?

If someone wants to look at the configurations, I have pasted them below.

Thanks,
Siddharth


Configurations
==========

conf/core-site.xml
---------------------------

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.2.115/</value>
  </property>
</configuration>


conf/hdfs-site.xml
--------------------------
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.2.115</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/home/siddharth/hdfs/data</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/home/siddharth/hdfs/name</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>

conf/mapred-site.xml
------------------------------
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>192.168.2.115:8021</value>
  </property>
  <property>
    <name>mapred.local.dir</name>
    <value>/tmp/mapred/local</value>
    <final>true</final>
  </property>
  <property>
    <name>mapred.system.dir</name>
    <value>hdfs://192.168.2.115/maperdsystem</value>
    <final>true</final>
  </property>
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>4</value>
    <final>true</final>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>4</value>
    <final>true</final>
  </property>
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx512m</value>
    <!-- Not marked as final so jobs can include JVM debugging options -->
  </property>
</configuration>

Re: newbie - job failing at reduce

Posted by Siddharth Karandikar <si...@gmail.com>.
I am running with 10240 now and the jobs appear to work fine. I need to
confirm this by reverting to 1024 and seeing the jobs fail.  :)

Thanks!
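For anyone hitting the same thing, a sketch of the commands involved. The 10240 value matches the one used above; the limits.conf entries assume the daemons run under a user named 'hadoop', which is an assumption here, not something stated in the thread:

```shell
# Show the current soft limit on open file descriptors.
ulimit -n

# Raise it for this shell session (cannot exceed the hard limit);
# Hadoop daemons started from this shell inherit the new value.
ulimit -n 10240 2>/dev/null || echo "could not raise limit in this shell"

# To persist the change across logins, add entries like these to
# /etc/security/limits.conf ('hadoop' is a placeholder user name):
#   hadoop  soft  nofile  10240
#   hadoop  hard  nofile  10240
```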



On Wed, Jun 30, 2010 at 10:59 PM, Siddharth Karandikar
<si...@gmail.com> wrote:
> Yeah. Looks like it's set to 1024 right now. I'll change that to, say,
> 10 times more and run the setup again.
> Thanks Ken!
>
> - Siddharth
>
> On Wed, Jun 30, 2010 at 10:28 PM, Ken Goodhope <ke...@gmail.com> wrote:
>> Have you increased your file handle limits?  You can check this with a
>> 'ulimit -n' call.   If you are still at 1024, then you will want to increase
>> the limit to something quite a bit higher.
>>
>> On Wed, Jun 30, 2010 at 9:40 AM, Siddharth Karandikar <
>> siddharth.karandikar@gmail.com> wrote:
>>
>>> Yeah. SSH is working as described in the docs. Even the directory
>>> mentioned for 'mapred.local.dir' has enough space.
>>>
>>> - Siddharth
>>>
>>> On Wed, Jun 30, 2010 at 10:01 PM, Chris Collord <cc...@lanl.gov> wrote:
>>> > Interesting that the reduce phase makes it that far before failing!
>>> > Are you able to SSH (without a password) into the failing node?  Any
>>> > possible folder permissions issues?
>>> > ~Chris
>>> >
>>> > On 06/30/2010 10:26 AM, Siddharth Karandikar wrote:
>>> >>
>>> >> Hey Chris,
>>> >> Thanks for your inputs. I have tried most of this already, but I will
>>> >> surely go through the tutorial you pointed out. Maybe I will get some
>>> >> hint there.
>>> >>
>>> >> Interestingly, while experimenting with it more, I noticed that with a
>>> >> small input file (~50 MB) the job works perfectly fine. If I give it
>>> >> bigger input, it starts hanging at the reduce tasks. The map phase
>>> >> always finishes 100%.
>>> >>
>>> >> - Siddharth
>>> >>
>>> >>
>>> >> On Wed, Jun 30, 2010 at 9:11 PM, Chris Collord<cc...@lanl.gov>
>>>  wrote:
>>> >>
>>> >>>
>>> >>> Hi Siddharth,
>>> >>> I'm VERY new to this myself, but here are a few thoughts (since nobody
>>> >>> else is responding!).
>>> >>> - You might want to set dfs.replication to 2. I have read that clusters
>>> >>> of fewer than 8 nodes should use a replication factor of 2, while 8+
>>> >>> node clusters use 3. This may make your cluster work, but it won't fix
>>> >>> your problem.
>>> >>> - Run "bin/hadoop dfsadmin -report" with the Hadoop cluster running and
>>> >>> see what it shows for your failing node.
>>> >>> - Check your logs/ folder for "datanode" logs and see if there's
>>> >>> anything useful in there before the error you're getting.
>>> >>> - You might try reformatting your HDFS if you don't have anything
>>> >>> important in there: "bin/hadoop namenode -format". (Note: this has
>>> >>> caused problems for me in the past with namenode IDs; see the bottom of
>>> >>> the link to Michael Noll's tutorial if that happens.)
>>> >>>
>>> >>> You should check out Michael Noll's tutorial for all the little details:
>>> >>>
>>> http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Multi-Node_Cluster%29
>>> >>>
>>> >>> Let me know if anything helps!
>>> >>> ~Chris
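The cluster-health check suggested in the list above can be scripted; a guarded sketch (the guard is an addition, so the snippet is harmless on machines without Hadoop on the PATH):

```shell
# Print per-datanode capacity/liveness when a Hadoop install is on PATH.
if command -v hadoop >/dev/null 2>&1; then
  report=$(hadoop dfsadmin -report 2>&1)
else
  report="hadoop not on PATH; run this on a cluster node"
fi
echo "$report"
```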
>>> >>>
>>> >>>
>>> >>>
>>> >>> On 06/30/2010 04:02 AM, Siddharth Karandikar wrote:
>>> >>>
>>> >>>>
>>> >>>> Anyone?
>>> >>> --
>>> >>> ------------------------------
>>> >>> Chris Collord, ACS-PO 9/80 A
>>> >>> ------------------------------

Re: Hadoop Shutdown Hook

Posted by Jeff Hammerbacher <ha...@cloudera.com>.
Hey Arv,

CDH2 and CDH3 both have HADOOP-4829: see
http://archive.cloudera.com/cdh/2/hadoop-0.20.1+169.89.releasenotes.html and
http://archive.cloudera.com/cdh/3/hadoop-0.20.2+320.releasenotes.html.

Alternatively, you can download an Apache 0.21 release candidate at
http://people.apache.org/~tomwhite/hadoop-0.21.0-candidate-0/.

Regards,
Jeff

On Mon, Jul 5, 2010 at 6:59 AM, Arv Mistry <ar...@kindsight.net> wrote:

> Hi folks,
>
> I need to be able to override the Hadoop shutdown hook. The following jira
> https://issues.apache.org/jira/browse/HADOOP-4829 describes the fix as
> being implemented in 0.21.0. When will that release be available?
>
> In the meantime, is there a workaround? The Jira describes using Java
> reflection, but I'm not sure how that was done; any details would be
> appreciated.
>
> Cheers Arv
>
>

Hadoop Shutdown Hook

Posted by Arv Mistry <ar...@kindsight.net>.
Hi folks,

I need to be able to override the Hadoop shutdown hook. The following jira https://issues.apache.org/jira/browse/HADOOP-4829 describes the fix as being implemented in 0.21.0. When will that release be available?

In the meantime, is there a workaround? The Jira describes using Java reflection, but I'm not sure how that was done; any details would be appreciated.

Cheers Arv


Re: newbie - job failing at reduce

Posted by Siddharth Karandikar <si...@gmail.com>.
Yeah. Looks like it's set to 1024 right now. I'll change that to, say,
10 times more and run the setup again.
Thanks Ken!

- Siddharth

On Wed, Jun 30, 2010 at 10:28 PM, Ken Goodhope <ke...@gmail.com> wrote:
> Have you increased your file handle limits?  You can check this with a
> 'ulimit -n' call.   If you are still at 1024, then you will want to increase
> the limit to something quite a bit higher.

Re: newbie - job failing at reduce

Posted by Ken Goodhope <ke...@gmail.com>.
Have you increased your file handle limits?  You can check this with a
'ulimit -n' call.   If you are still at 1024, then you will want to increase
the limit to something quite a bit higher.

On Wed, Jun 30, 2010 at 9:40 AM, Siddharth Karandikar <
siddharth.karandikar@gmail.com> wrote:

>> Yeah. SSH is working as described in the docs. Even the directory
>> mentioned for 'mapred.local.dir' has enough space.
>
> - Siddharth

Re: newbie - job failing at reduce

Posted by Siddharth Karandikar <si...@gmail.com>.
Yeah. SSH is working as described in the docs. Even the directory
mentioned for 'mapred.local.dir' has enough space.

- Siddharth

On Wed, Jun 30, 2010 at 10:01 PM, Chris Collord <cc...@lanl.gov> wrote:
> Interesting that the reduce phase makes it that far before failing!
> Are you able to SSH (without a password) into the failing node?  Any
> possible folder permissions issues?
> ~Chris
>
> On 06/30/2010 10:26 AM, Siddharth Karandikar wrote:
>>
>> Hey Chris,
>> Thanks for your inputs. I have tried most of the stuff, but will
>> surely go though tutorial you have pointed out. May be I will get some
>> hint there.
>>
>> Interestingly, while experimenting with it more, I noticed that, if
>> small size input file is there (50MBs) the job works perfectly fine.
>> If I give bigger input, it starts hanging @ reduce tasks. Map phase
>> always finishes 100%.
>>
>> - Siddharth
>>
>>
>> On Wed, Jun 30, 2010 at 9:11 PM, Chris Collord<cc...@lanl.gov>  wrote:
>>
>>>
>>> Hi Siddharth,
>>> I'm VERY new to this myself, but here are a few thoughts (since nobody
>>> else
>>> is responding!).
>>> -You might want to set dfs.replication to 2.  I have read that for
>>> clusters
>>> <  8, you should have replication set to 2 machines.  8+ node clusters
>>> use 3.
>>>  This may make your cluster work, but it won't fix your problem.
>>> -Run a "bin/hadoop dfsadmin -report" with the hadoop cluster running and
>>> see
>>> what it shows for your failing node.
>>> -Check your logs/ folder for "datanode" logs and see if there's anything
>>> useful in there before the error you're getting.
>>> -You might try reformatting your hdfs, if you don't have anything
>>> important
>>> in there.  "bin/hadoop namenode -format".  (Note: this has caused
>>> problems
>>> for me in the past with namenode ID's, see the bottom on the link for
>>> Michael Noll's tutorial if that happens)
>>>
>>> You should check out Michael Noll's tutorial for all the little details:
>>>
>>> http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Multi-Node_Cluster%29
>>>
>>> Let me know if anything helps!
>>> ~Chris
>>>
>>>
>>>
>>> On 06/30/2010 04:02 AM, Siddharth Karandikar wrote:
>>>
>>>>
>>>> Anyone?
>>>>
>>>>
>>>> On Tue, Jun 29, 2010 at 8:41 PM, Siddharth Karandikar
>>>> <si...@gmail.com>    wrote:
>>>>
>>>>
>>>>>
>>>>> Hi All,
>>>>>
>>>>> I am new to Hadoop, but by reading online docs and other resource, I
>>>>> have moved ahead and now trying to run a cluster of 3 nodes.
>>>>> Before doing this, tried my program on standalone and pseudo systems
>>>>> and thats working fine.
>>>>>
>>>>> Now the issue that I am facing - mapping phase works correctly. While
>>>>> doing reduce, I am seeing following error on one of the nodes -
>>>>>
>>>>> 2010-06-29 14:35:01,848 WARN org.apache.hadoop.mapred.TaskTracker:
>>>>> getMapOutput(attempt_201006291958_0001_m_000008_0,0) failed :
>>>>> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
>>>>>
>>>>>
>>>>> taskTracker/jobcache/job_201006291958_0001/attempt_201006291958_0001_m_000008_0/output/file.out.index
>>>>> in any of the configured local directories
>>>>>
>>>>> Lets say this is @ Node1. But there is no such directory named
>>>>>
>>>>>
>>>>> 'taskTracker/jobcache/job_201006291958_0001/attempt_201006291958_0001_m_000008_0'
>>>>> under /tmp/mapred/local/taskTracker/ on Node1. Interestingly, this
>>>>> directory is available on Node2 (or Node3). Tried running the job
>>>>> multiple times, but its always failing while reducing. Same error.
>>>>>
>>>>> I have configured /tmp/mapred/local on each node from mapred-site.xml.
>>>>>
>>>>> I really don't understand why mappers are misplacing these files? Or
>>>>> am I missing something in configuration?
>>>>>
>>>>> If someone wants to look @ configurations, I have pasted that below.
>>>>>
>>>>> Thanks,
>>>>> Siddharth
>>>>>
>>>>>
>>>>> Configurations
>>>>> ==========
>>>>>
>>>>> conf/core-site.xml
>>>>> ---------------------------
>>>>>
>>>>> <?xml version="1.0"?>
>>>>> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>>>> <configuration>
>>>>>  <property>
>>>>>    <name>fs.default.name</name>
>>>>>    <value>hdfs://192.168.2.115/</value>
>>>>>  </property>
>>>>> </configuration>
>>>>>
>>>>>
>>>>> conf/hdfs-site.xml
>>>>> --------------------------
>>>>> <?xml version="1.0"?>
>>>>> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>>>> <configuration>
>>>>>  <property>
>>>>>    <name>fs.default.name</name>
>>>>>    <value>hdfs://192.168.2.115</value>
>>>>>  </property>
>>>>>  <property>
>>>>>    <name>dfs.data.dir</name>
>>>>>    <value>/home/siddharth/hdfs/data</value>
>>>>>  </property>
>>>>>  <property>
>>>>>    <name>dfs.name.dir</name>
>>>>>    <value>/home/siddharth/hdfs/name</value>
>>>>>  </property>
>>>>>  <property>
>>>>>    <name>dfs.replication</name>
>>>>>    <value>3</value>
>>>>>  </property>
>>>>> </configuration>
>>>>>
>>>>> conf/mapred-site.xml
>>>>> ------------------------------
>>>>> <?xml version="1.0"?>
>>>>> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>>>> <configuration>
>>>>>  <property>
>>>>>    <name>mapred.job.tracker</name>
>>>>>    <value>192.168.2.115:8021</value>
>>>>>  </property>
>>>>>  <property>
>>>>>    <name>mapred.local.dir</name>
>>>>>    <value>/tmp/mapred/local</value>
>>>>>    <final>true</final>
>>>>>  </property>
>>>>>  <property>
>>>>>    <name>mapred.system.dir</name>
>>>>>    <value>hdfs://192.168.2.115/maperdsystem</value>
>>>>>    <final>true</final>
>>>>>  </property>
>>>>>  <property>
>>>>>    <name>mapred.tasktracker.map.tasks.maximum</name>
>>>>>    <value>4</value>
>>>>>    <final>true</final>
>>>>>  </property>
>>>>>  <property>
>>>>>    <name>mapred.tasktracker.reduce.tasks.maximum</name>
>>>>>    <value>4</value>
>>>>>    <final>true</final>
>>>>>  </property>
>>>>>  <property>
>>>>>    <name>mapred.child.java.opts</name>
>>>>>    <value>-Xmx512m</value>
>>>>>    <!-- Not marked as final so jobs can include JVM debugging options
>>>>> -->
>>>>>  </property>
>>>>> </configuration>
>>>>>
>>>>>
>>>>>
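[Editor's aside, not from the thread: keeping mapred.local.dir under /tmp means periodic /tmp cleanup or a reboot can silently remove map outputs mid-job, which produces exactly this kind of "could not find ... in any of the configured local directories" failure. A safer sketch, using a path outside /tmp (the exact directory below is illustrative, and mapred.local.dir accepts a comma-separated list, one entry per local disk):]

  <property>
    <name>mapred.local.dir</name>
    <!-- Comma-separated list of local dirs; /home/siddharth/mapred/local
         is an illustrative path outside /tmp -->
    <value>/home/siddharth/mapred/local</value>
    <final>true</final>
  </property>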
>>>
>>> --
>>> ------------------------------
>>> Chris Collord, ACS-PO 9/80 A
>>> ------------------------------
>>>
>>>
>>>
>
>
> --
> ------------------------------
> Chris Collord, ACS-PO 9/80 A
> ------------------------------
>
>

Re: newbie - job failing at reduce

Posted by Siddharth Karandikar <si...@gmail.com>.
Hey Chris,
Thanks for your inputs. I have tried most of those things, but I will
surely go through the tutorial you pointed out. Maybe I will find a
hint there.

Interestingly, while experimenting with it more, I noticed that the job
works perfectly with a small input file (around 50 MB). With a bigger
input it starts hanging in the reduce tasks. The map phase always
finishes 100%.

- Siddharth
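[Editor's note: a quick script along these lines can check, on a given node, whether a map attempt's output directory actually exists under mapred.local.dir. This is only a sketch: the attempt ID and /tmp/mapred/local come from this thread, and the taskTracker/jobcache layout is the one the error message shows.]

```shell
#!/bin/sh
# check_attempt_dir ATTEMPT_ID LOCAL_DIR...
# For each configured mapred.local.dir, report whether the given map
# attempt's output directory is present (layout as in the error above).
check_attempt_dir() {
    attempt="$1"; shift
    # Derive the job ID from the attempt ID, e.g.
    # attempt_201006291958_0001_m_000008_0 -> job_201006291958_0001
    job=$(printf '%s' "$attempt" | sed 's/^attempt_/job_/; s/_[mr]_[0-9]*_[0-9]*$//')
    for dir in "$@"; do
        path="$dir/taskTracker/jobcache/$job/$attempt/output"
        if [ -d "$path" ]; then
            echo "FOUND   $path"
        else
            echo "MISSING $path"
        fi
    done
}

check_attempt_dir attempt_201006291958_0001_m_000008_0 /tmp/mapred/local
```

Running it on each of the three nodes (for example over ssh) would show which TaskTracker actually holds the attempt's output.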


On Wed, Jun 30, 2010 at 9:11 PM, Chris Collord <cc...@lanl.gov> wrote:
> Hi Siddharth,
> I'm VERY new to this myself, but here are a few thoughts (since nobody else
> is responding!).
> - You might want to set dfs.replication to 2. I have read that clusters of
> fewer than 8 nodes should use a replication factor of 2, and 8+ node
> clusters should use 3. This may make your cluster healthier, but it won't
> fix your problem.
> - Run "bin/hadoop dfsadmin -report" with the Hadoop cluster running and see
> what it shows for your failing node.
> - Check your logs/ folder for "datanode" logs and see if there is anything
> useful in there before the error you are getting.
> - You might try reformatting your HDFS if you don't have anything important
> in there: "bin/hadoop namenode -format". (Note: this has caused problems
> for me in the past with namenode IDs; see the bottom of Michael Noll's
> tutorial, linked below, if that happens.)
>
> You should check out Michael Noll's tutorial for all the little details:
> http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Multi-Node_Cluster%29
>
> Let me know if anything helps!
> ~Chris
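[Editor's note: to speed up the log check suggested above, a grep along these lines pulls the last DiskErrorException hits out of the daemon logs. The logs/ directory under the Hadoop install is an assumption; point the argument wherever your logs actually live.]

```shell
#!/bin/sh
# scan_logs LOGDIR -- show the last few DiskErrorException lines per log file.
scan_logs() {
    logdir="${1:-logs}"
    for f in "$logdir"/*.log; do
        [ -f "$f" ] || continue
        if grep -q 'DiskErrorException' "$f"; then
            echo "== $f"
            grep 'DiskErrorException' "$f" | tail -n 3
        fi
    done
}

# e.g. run from $HADOOP_HOME on the failing node:
scan_logs logs
```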
>
>
>
> On 06/30/2010 04:02 AM, Siddharth Karandikar wrote:
>>
>> Anyone?
>>
>>
>> On Tue, Jun 29, 2010 at 8:41 PM, Siddharth Karandikar
>> <si...@gmail.com>  wrote:
>>
>>>
>>> [original message and configurations snipped; quoted in full above]
>
>
> --
> ------------------------------
> Chris Collord, ACS-PO 9/80 A
> ------------------------------
>
>

Re: newbie - job failing at reduce

Posted by Siddharth Karandikar <si...@gmail.com>.
Anyone?


On Tue, Jun 29, 2010 at 8:41 PM, Siddharth Karandikar
<si...@gmail.com> wrote:
> [original message and configurations snipped; quoted in full above]