Posted to user@pig.apache.org by Shrikrishna Shrin <kr...@cooliris.com> on 2009/08/10 21:59:31 UTC

Jobs failing during map phase due to low memory

Hi,

I haven't seen this before, but nightly jobs failed over the weekend because
of memory issues. The weird part is that the jobs failed during the map phase
(at about 98% complete).

The task tracker for the failed map jobs shows the following errors:

Task attempt_200908100026_0065_m_000002_0 failed to report status for
602 seconds. Killing!
Task attempt_200908100026_0065_m_000002_1 failed to report status for
603 seconds. Killing!

The logs indicate memory to be the issue:

2009-08-10 11:53:37.829 INFO [Low Memory Detector]
org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
called (Collection threshold exceeded) init = 5439488(5312K) used =
287290336(280556K) committed = 363593728(355072K) max =
536870912(524288K)

2009-08-10 11:53:43.522 INFO [Low Memory Detector]
org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
called (Collection threshold exceeded) init = 5439488(5312K) used =
350217672(342009K) committed = 422510592(412608K) max =
536870912(524288K)

2009-08-10 11:53:45.290 INFO [Low Memory Detector]
org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
called (Usage threshold exceeded) init = 5439488(5312K) used =
376781240(367950K) committed = 422510592(412608K) max =
536870912(524288K)

2009-08-10 11:53:45.290 INFO [Low Memory Detector]
org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
called (Collection threshold exceeded) init = 5439488(5312K) used =
380504752(371586K) committed = 456720384(446016K) max =
536870912(524288K)

2009-08-10 11:53:46.752 INFO [Low Memory Detector]
org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
called (Collection threshold exceeded) init = 5439488(5312K) used =
401755464(392339K) committed = 482344960(471040K) max =
536870912(524288K)

2009-08-10 11:53:50.599 INFO [Low Memory Detector]
org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
called (Collection threshold exceeded) init = 5439488(5312K) used =
443763584(433362K) committed = 527171584(514816K) max =
536870912(524288K)

2009-08-10 11:53:54.686 INFO [Low Memory Detector]
org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
called (Collection threshold exceeded) init = 5439488(5312K) used =
491575560(480054K) committed = 536870912(524288K) max =
536870912(524288K)

2009-08-10 11:53:56.414 INFO [Low Memory Detector]
org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
called (Collection threshold exceeded) init = 5439488(5312K) used =
514928920(502860K) committed = 536870912(524288K) max =
536870912(524288K)

2009-08-10 11:53:57.553 INFO [Low Memory Detector]
org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
called (Collection threshold exceeded) init = 5439488(5312K) used =
520781832(508576K) committed = 536870912(524288K) max =
536870912(524288K)

2009-08-10 11:53:58.747 INFO [Low Memory Detector]
org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
called (Collection threshold exceeded) init = 5439488(5312K) used =
526636552(514293K) committed = 536870912(524288K) max =
536870912(524288K)

2009-08-10 11:53:59.935 INFO [Low Memory Detector]
org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
called (Collection threshold exceeded) init = 5439488(5312K) used =
532493568(520013K) committed = 536870912(524288K) max =
536870912(524288K)

2009-08-10 11:54:01.158 INFO [Low Memory Detector]
org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
called (Collection threshold exceeded) init = 5439488(5312K) used =
536870904(524287K) committed = 536870912(524288K) max =
536870912(524288K)

2009-08-10 11:54:02.389 INFO [Low Memory Detector]
org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
called (Collection threshold exceeded) init = 5439488(5312K) used =
536870904(524287K) committed = 536870912(524288K) max =
536870912(524288K)

2009-08-10 11:54:03.778 INFO [Low Memory Detector]
org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
called (Collection threshold exceeded) init = 5439488(5312K) used =
489852536(478371K) committed = 536870912(524288K) max =
536870912(524288K)

2009-08-10 12:03:40.298 WARN [Comm thread for
attempt_200908100026_0065_m_000077_1]
org.apache.hadoop.mapred.TaskRunner - Parent died.  Exiting
attempt_200908100026_0065_m_000077_1

I have seen this before when jobs fail in the reduce phase, but this is the
first time I have noticed jobs failing during the map phase. Surprisingly,
jobs that load and process much more data ran successfully, but when I reran
the ones that failed, they failed again. Some of the jobs that failed do
nothing more than load, filter, and write out the filtered data. This leads
me to believe that the problem is more specific than I had originally
thought. Any pointers on what the issue might be would be extremely helpful.

Thanks,

Krishna
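
The low memory handler entries in the message above are easier to read as a
percentage of the heap limit. The following is a minimal sketch, not part of
Pig or Hadoop, that assumes the entries are copied as plain text separated by
blank lines, exactly as they appear in this post. The names heap_usage,
TS_RE and USAGE_RE are made up for this illustration.

```python
import re

# Timestamp at the start of each entry, e.g. "2009-08-10 11:53:37.829".
TS_RE = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}")

# The "used" and "max" byte counts as printed in the entries above.
USAGE_RE = re.compile(
    r"used = (\d+)\(\d+K\) committed = \d+\(\d+K\) max = (\d+)\(\d+K\)"
)


def heap_usage(log_text):
    """Return (timestamp, used_bytes, max_bytes, percent_used) per entry.

    Assumes entries are separated by blank lines; entries without the
    memory fields (for example the "Parent died" warning) are skipped.
    """
    rows = []
    for entry in re.split(r"\n\s*\n", log_text.strip()):
        # The archive wraps each entry over several lines; undo the wrapping.
        flat = " ".join(entry.split())
        ts = TS_RE.match(flat)
        usage = USAGE_RE.search(flat)
        if ts and usage:
            used, max_bytes = int(usage.group(1)), int(usage.group(2))
            percent = round(100.0 * used / max_bytes, 1)
            rows.append((ts.group(0), used, max_bytes, percent))
    return rows


if __name__ == "__main__":
    # Two entries copied from the log in this post.
    sample = """\
2009-08-10 11:53:37.829 INFO [Low Memory Detector]
org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
called (Collection threshold exceeded) init = 5439488(5312K) used =
287290336(280556K) committed = 363593728(355072K) max =
536870912(524288K)

2009-08-10 11:54:01.158 INFO [Low Memory Detector]
org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
called (Collection threshold exceeded) init = 5439488(5312K) used =
536870904(524287K) committed = 536870912(524288K) max =
536870912(524288K)
"""
    for row in heap_usage(sample):
        print(row)
```

Run against the entries above, this shows used heap climbing from roughly
54% of the 512 MB limit (524288K) to essentially all of it in under 25
seconds, about ten minutes before the attempt is killed for not reporting
status.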

Re: Jobs failing during map phase due to low memory

Posted by Dmitriy Ryaboy <dv...@cloudera.com>.
Krishna,
Any chance you can find a small subset of your input data that this is
reproducible on, and send that along with the script?
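
One possible way to start looking for such a subset, following the earlier
suggestion in this thread that a mapper may not be seeing an end of record for
a long time, is to list the longest records in a local copy of the input. This
is only a sketch: it assumes newline-delimited input, and the function name
longest_records and the file name in the usage comment are hypothetical.

```python
import heapq


def longest_records(lines, top=5):
    """Return up to `top` (length_in_bytes, line_number) pairs, longest first.

    `lines` is any iterable of byte strings, for example a file opened in
    binary mode. Line numbers start at 1.
    """
    sized = ((len(line), number) for number, line in enumerate(lines, start=1))
    return heapq.nlargest(top, sized)


if __name__ == "__main__":
    # Hypothetical usage on a local copy of one input file:
    #     with open("input-sample.log", "rb") as f:
    #         print(longest_records(f))
    demo = [b"a\n", b"bbbbbbbb\n", b"cc\n"]
    print(longest_records(demo, top=2))
```

A record that is far longer than the rest would be a candidate for the small
reproducing input; if nothing stands out, the malformed-record explanation
becomes less likely.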



On Mon, Aug 10, 2009 at 5:21 PM, Shrikrishna Shrin <kr...@cooliris.com> wrote:
> Dmitriy,
>
> I don't think that is the issue, because the same logs were processed
> successfully using other scripts. I copied and pasted the logs below, and it
> looks like Hadoop was unable to read from or write to /tmp for some reason.
>
> Eg: Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR
> 2100: hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229 does
> not exist.
>
> Should I try modifying some property in hadoop-site.xml?
>
> Thanks,
>
> Krishna
>
>
> *FULL ERROR LOG:*
>
> ERROR 2998: Unhandled internal error. Task
> attempt_200908101519_0005_m_000002_0 failed to report status for 602
> seconds. Killing!
> java.lang.Exception: Task attempt_200908101519_0005_m_000002_0 failed to
> report status for 602 seconds. Killing!
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getErrorMessages(Launcher.java:230)
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:179)
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:204)
>    at
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
>    at
> org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:767)
>    at org.apache.pig.PigServer.execute(PigServer.java:760)
>    at org.apache.pig.PigServer.access$100(PigServer.java:89)
>    at org.apache.pig.PigServer$Graph.execute(PigServer.java:931)
>    at org.apache.pig.PigServer.executeBatch(PigServer.java:243)
>    at
> org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:112)
>    at
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
>    at
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:140)
>    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
>    at org.apache.pig.Main.main(Main.java:384)
> ERROR 2998: Unhandled internal error.
> org.apache.pig.backend.executionengine.ExecException: ERROR 2100:
> hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp1635201062 does not
> exist.
>    at
> org.apache.pig.backend.executionengine.PigSlicer.validate(PigSlicer.java:126)
>    at
> org.apache.pig.impl.io.ValidatingInputFileSpec.validate(ValidatingInputFileSpec.java:59)
>    at
> org.apache.pig.impl.io.ValidatingInputFileSpec.<init>(ValidatingInputFileSpec.java:44)
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:228)
>    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
>    at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:370)
>    at
> org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
>    at
> org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
>    at java.lang.Thread.run(Thread.java:619)
>
> java.lang.Exception: org.apache.pig.backend.executionengine.ExecException:
> ERROR 2100: hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp1635201062
> does not exist.
>    at
> org.apache.pig.backend.executionengine.PigSlicer.validate(PigSlicer.java:126)
>    at
> org.apache.pig.impl.io.ValidatingInputFileSpec.validate(ValidatingInputFileSpec.java:59)
>    at
> org.apache.pig.impl.io.ValidatingInputFileSpec.<init>(ValidatingInputFileSpec.java:44)
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:228)
>    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
>    at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:370)
>    at
> org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
>    at
> org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
>    at java.lang.Thread.run(Thread.java:619)
>
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:170)
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:204)
>    at
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
>    at
> org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:767)
>    at org.apache.pig.PigServer.execute(PigServer.java:760)
>    at org.apache.pig.PigServer.access$100(PigServer.java:89)
>    at org.apache.pig.PigServer$Graph.execute(PigServer.java:931)
>    at org.apache.pig.PigServer.executeBatch(PigServer.java:243)
>    at
> org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:112)
>    at
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
>    at
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:140)
>    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
>    at org.apache.pig.Main.main(Main.java:384)
> ERROR 2998: Unhandled internal error.
> org.apache.pig.backend.executionengine.ExecException: ERROR 2100:
> hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229 does not
> exist.
>    at
> org.apache.pig.backend.executionengine.PigSlicer.validate(PigSlicer.java:126)
>    at
> org.apache.pig.impl.io.ValidatingInputFileSpec.validate(ValidatingInputFileSpec.java:59)
>    at
> org.apache.pig.impl.io.ValidatingInputFileSpec.<init>(ValidatingInputFileSpec.java:44)
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:228)
>    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
>    at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:370)
>    at
> org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
>    at
> org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
>    at java.lang.Thread.run(Thread.java:619)
>
> java.lang.Exception: org.apache.pig.backend.executionengine.ExecException:
> ERROR 2100: hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229
> does not exist.
>    at
> org.apache.pig.backend.executionengine.PigSlicer.validate(PigSlicer.java:126)
>    at
> org.apache.pig.impl.io.ValidatingInputFileSpec.validate(ValidatingInputFileSpec.java:59)
>    at
> org.apache.pig.impl.io.ValidatingInputFileSpec.<init>(ValidatingInputFileSpec.java:44)
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:228)
>    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
>    at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:370)
>    at
> org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
>    at
> org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
>    at java.lang.Thread.run(Thread.java:619)
>
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:170)
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:204)
>    at
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
>    at
> org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:767)
>    at org.apache.pig.PigServer.execute(PigServer.java:760)
>    at org.apache.pig.PigServer.access$100(PigServer.java:89)
>    at org.apache.pig.PigServer$Graph.execute(PigServer.java:931)
>    at org.apache.pig.PigServer.executeBatch(PigServer.java:243)
>    at
> org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:112)
>    at
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
>    at
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:140)
>    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
>    at org.apache.pig.Main.main(Main.java:384)
> ERROR 2998: Unhandled internal error.
> org.apache.pig.backend.executionengine.ExecException: ERROR 2100:
> hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229 does not
> exist.
>    at
> org.apache.pig.backend.executionengine.PigSlicer.validate(PigSlicer.java:126)
>    at
> org.apache.pig.impl.io.ValidatingInputFileSpec.validate(ValidatingInputFileSpec.java:59)
>    at
> org.apache.pig.impl.io.ValidatingInputFileSpec.<init>(ValidatingInputFileSpec.java:44)
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:228)
>    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
>    at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:370)
>    at
> org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
>    at
> org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
>    at java.lang.Thread.run(Thread.java:619)
>
> java.lang.Exception: org.apache.pig.backend.executionengine.ExecException:
> ERROR 2100: hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229
> does not exist.
>    at
> org.apache.pig.backend.executionengine.PigSlicer.validate(PigSlicer.java:126)
>    at
> org.apache.pig.impl.io.ValidatingInputFileSpec.validate(ValidatingInputFileSpec.java:59)
>    at
> org.apache.pig.impl.io.ValidatingInputFileSpec.<init>(ValidatingInputFileSpec.java:44)
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:228)
>    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
>    at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:370)
>    at
> org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
>    at
> org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
>    at java.lang.Thread.run(Thread.java:619)
>
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:170)
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:204)
>    at
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
>    at
> org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:767)
>    at org.apache.pig.PigServer.execute(PigServer.java:760)
>    at org.apache.pig.PigServer.access$100(PigServer.java:89)
>    at org.apache.pig.PigServer$Graph.execute(PigServer.java:931)
>    at org.apache.pig.PigServer.executeBatch(PigServer.java:243)
>    at
> org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:112)
>    at
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
>    at
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:140)
>    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
>    at org.apache.pig.Main.main(Main.java:384)
> ERROR 2056: Cannot create exception from empty string.
> org.apache.pig.backend.executionengine.ExecException: ERROR 2997: Unable to
> recreate exception from backed error: Task
> attempt_200908101519_0005_m_000002_0 failed to report status for 602
> seconds. Killing!
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getErrorMessages(Launcher.java:234)
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:179)
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:204)
>    at
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
>    at
> org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:767)
>    at org.apache.pig.PigServer.execute(PigServer.java:760)
>    at org.apache.pig.PigServer.access$100(PigServer.java:89)
>    at org.apache.pig.PigServer$Graph.execute(PigServer.java:931)
>    at org.apache.pig.PigServer.executeBatch(PigServer.java:243)
>    at
> org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:112)
>    at
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
>    at
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:140)
>    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
>    at org.apache.pig.Main.main(Main.java:384)
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2056:
> Cannot create exception from empty string.
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getExceptionFromStrings(Launcher.java:509)
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getExceptionFromString(Launcher.java:323)
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getErrorMessages(Launcher.java:226)
>    ... 13 more
> ERROR 2056: Cannot create exception from empty string.
> org.apache.pig.backend.executionengine.ExecException: ERROR 2997: Unable to
> recreate exception from backed error: Task
> attempt_200908101519_0005_m_000002_0 failed to report status for 602
> seconds. Killing!
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getErrorMessages(Launcher.java:234)
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:179)
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:204)
>    at
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
>    at
> org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:767)
>    at org.apache.pig.PigServer.execute(PigServer.java:760)
>    at org.apache.pig.PigServer.access$100(PigServer.java:89)
>    at org.apache.pig.PigServer$Graph.execute(PigServer.java:931)
>    at org.apache.pig.PigServer.executeBatch(PigServer.java:243)
>    at
> org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:112)
>    at
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
>    at
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:140)
>    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
>    at org.apache.pig.Main.main(Main.java:384)
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2056:
> Cannot create exception from empty string.
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getExceptionFromStrings(Launcher.java:509)
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getExceptionFromString(Launcher.java:323)
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getErrorMessages(Launcher.java:226)
>    ... 13 more
> ERROR 2056: Cannot create exception from empty string.
> org.apache.pig.backend.executionengine.ExecException: ERROR 2997: Unable to
> recreate exception from backed error: Task
> attempt_200908101519_0005_m_000002_0 failed to report status for 602
> seconds. Killing!
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getErrorMessages(Launcher.java:234)
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:179)
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:204)
>    at
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
>    at
> org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:767)
>    at org.apache.pig.PigServer.execute(PigServer.java:760)
>    at org.apache.pig.PigServer.access$100(PigServer.java:89)
>    at org.apache.pig.PigServer$Graph.execute(PigServer.java:931)
>    at org.apache.pig.PigServer.executeBatch(PigServer.java:243)
>    at
> org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:112)
>    at
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
>    at
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:140)
>    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
>    at org.apache.pig.Main.main(Main.java:384)
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2056:
> Cannot create exception from empty string.
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getExceptionFromStrings(Launcher.java:509)
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getExceptionFromString(Launcher.java:323)
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getErrorMessages(Launcher.java:226)
>    ... 13 more
> ERROR 2100: hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229
> does not exist.
> org.apache.pig.backend.executionengine.ExecException: ERROR 2997: Unable to
> recreate exception from backend error:
> org.apache.pig.backend.executionengine.ExecException: ERROR 2100:
> hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229 does not
> exist.
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:174)
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:204)
>    at
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
>    at
> org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:767)
>    at org.apache.pig.PigServer.execute(PigServer.java:760)
>    at org.apache.pig.PigServer.access$100(PigServer.java:89)
>    at org.apache.pig.PigServer$Graph.execute(PigServer.java:931)
>    at org.apache.pig.PigServer.executeBatch(PigServer.java:243)
>    at
> org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:112)
>    at
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
>    at
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:140)
>    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
>    at org.apache.pig.Main.main(Main.java:384)
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2100:
> hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229 does not
> exist.
>    at
> org.apache.pig.backend.executionengine.PigSlicer.validate(PigSlicer.java:126)
>    at
> org.apache.pig.impl.io.ValidatingInputFileSpec.validate(ValidatingInputFileSpec.java:59)
>    at
> org.apache.pig.impl.io.ValidatingInputFileSpec.<init>(ValidatingInputFileSpec.java:44)
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:228)
>    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
>    at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:370)
>    at
> org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
>    at
> org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
> ERROR 2100: hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229
> does not exist.
> org.apache.pig.backend.executionengine.ExecException: ERROR 2997: Unable to
> recreate exception from backend error:
> org.apache.pig.backend.executionengine.ExecException: ERROR 2100:
> hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229 does not
> exist.
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:174)
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:204)
>    at
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
>    at
> org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:767)
>    at org.apache.pig.PigServer.execute(PigServer.java:760)
>    at org.apache.pig.PigServer.access$100(PigServer.java:89)
>    at org.apache.pig.PigServer$Graph.execute(PigServer.java:931)
>    at org.apache.pig.PigServer.executeBatch(PigServer.java:243)
>    at
> org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:112)
>    at
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
>    at
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:140)
>    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
>    at org.apache.pig.Main.main(Main.java:384)
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2100:
> hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229 does not
> exist.
>    at
> org.apache.pig.backend.executionengine.PigSlicer.validate(PigSlicer.java:126)
>    at
> org.apache.pig.impl.io.ValidatingInputFileSpec.validate(ValidatingInputFileSpec.java:59)
>    at
> org.apache.pig.impl.io.ValidatingInputFileSpec.<init>(ValidatingInputFileSpec.java:44)
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:228)
>    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
>    at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:370)
>    at
> org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
>    at
> org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
> ERROR 2100: hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229
> does not exist.
> org.apache.pig.backend.executionengine.ExecException: ERROR 2997: Unable to
> recreate exception from backend error:
> org.apache.pig.backend.executionengine.ExecException: ERROR 2100:
> hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229 does not
> exist.
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:174)
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:204)
>    at
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
>    at
> org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:767)
>    at org.apache.pig.PigServer.execute(PigServer.java:760)
>    at org.apache.pig.PigServer.access$100(PigServer.java:89)
>    at org.apache.pig.PigServer$Graph.execute(PigServer.java:931)
>    at org.apache.pig.PigServer.executeBatch(PigServer.java:243)
>    at
> org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:112)
>    at
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
>    at
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:140)
>    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
>    at org.apache.pig.Main.main(Main.java:384)
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2100:
> hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229 does not
> exist.
>    at
> org.apache.pig.backend.executionengine.PigSlicer.validate(PigSlicer.java:126)
>    at
> org.apache.pig.impl.io.ValidatingInputFileSpec.validate(ValidatingInputFileSpec.java:59)
>    at
> org.apache.pig.impl.io.ValidatingInputFileSpec.<init>(ValidatingInputFileSpec.java:44)
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:228)
>    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
>    at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:370)
>    at
> org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
>    at
> org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
>
>
>
> On Mon, Aug 10, 2009 at 1:09 PM, Dmitriy Ryaboy <dv...@cloudera.com> wrote:
>
>> Krishna,
>> Is it possible that the data you are reading in is malformed in such a
>> way that a mapper doesn't see an end of record for a very long time,
>> and keeps reading your input? Did any other jobs that read the same
>> input but perform different operations, succeed?
>>
>> -Dmitriy
>>
>> On Mon, Aug 10, 2009 at 12:59 PM, Shrikrishna Shrin <kr...@cooliris.com> wrote:
>> > Hi,
>> >
>> > I haven't seen this before, but nightly jobs failed over the weekend
>> > because of memory issues. The weird part is that the jobs failed during
>> > the map phase (at about 98% complete).
>> >
>> > The task tracker for the failed map jobs shows the following errors:
>> >
>> > Task attempt_200908100026_0065_m_000002_0 failed to report status for
>> > 602 seconds. Killing!
>> > Task attempt_200908100026_0065_m_000002_1 failed to report status for
>> > 603 seconds. Killing!
>> >
>> > The logs indicate memory to be the issue:
>> >
>> > 2009-08-10 11:53:37.829 INFO [Low Memory Detector]
>> > org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
>> > called (Collection threshold exceeded) init = 5439488(5312K) used =
>> > 287290336(280556K) committed = 363593728(355072K) max =
>> > 536870912(524288K)
>> >
>> > 2009-08-10 11:53:43.522 INFO [Low Memory Detector]
>> > org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
>> > called (Collection threshold exceeded) init = 5439488(5312K) used =
>> > 350217672(342009K) committed = 422510592(412608K) max =
>> > 536870912(524288K)
>> >
>> > 2009-08-10 11:53:45.290 INFO [Low Memory Detector]
>> > org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
>> > called (Usage threshold exceeded) init = 5439488(5312K) used =
>> > 376781240(367950K) committed = 422510592(412608K) max =
>> > 536870912(524288K)
>> >
>> > 2009-08-10 11:53:45.290 INFO [Low Memory Detector]
>> > org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
>> > called (Collection threshold exceeded) init = 5439488(5312K) used =
>> > 380504752(371586K) committed = 456720384(446016K) max =
>> > 536870912(524288K)
>> >
>> > 2009-08-10 11:53:46.752 INFO [Low Memory Detector]
>> > org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
>> > called (Collection threshold exceeded) init = 5439488(5312K) used =
>> > 401755464(392339K) committed = 482344960(471040K) max =
>> > 536870912(524288K)
>> >
>> > 2009-08-10 11:53:50.599 INFO [Low Memory Detector]
>> > org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
>> > called (Collection threshold exceeded) init = 5439488(5312K) used =
>> > 443763584(433362K) committed = 527171584(514816K) max =
>> > 536870912(524288K)
>> >
>> > 2009-08-10 11:53:54.686 INFO [Low Memory Detector]
>> > org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
>> > called (Collection threshold exceeded) init = 5439488(5312K) used =
>> > 491575560(480054K) committed = 536870912(524288K) max =
>> > 536870912(524288K)
>> >
>> > 2009-08-10 11:53:56.414 INFO [Low Memory Detector]
>> > org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
>> > called (Collection threshold exceeded) init = 5439488(5312K) used =
>> > 514928920(502860K) committed = 536870912(524288K) max =
>> > 536870912(524288K)
>> >
>> > 2009-08-10 11:53:57.553 INFO [Low Memory Detector]
>> > org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
>> > called (Collection threshold exceeded) init = 5439488(5312K) used =
>> > 520781832(508576K) committed = 536870912(524288K) max =
>> > 536870912(524288K)
>> >
>> > 2009-08-10 11:53:58.747 INFO [Low Memory Detector]
>> > org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
>> > called (Collection threshold exceeded) init = 5439488(5312K) used =
>> > 526636552(514293K) committed = 536870912(524288K) max =
>> > 536870912(524288K)
>> >
>> > 2009-08-10 11:53:59.935 INFO [Low Memory Detector]
>> > org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
>> > called (Collection threshold exceeded) init = 5439488(5312K) used =
>> > 532493568(520013K) committed = 536870912(524288K) max =
>> > 536870912(524288K)
>> >
>> > 2009-08-10 11:54:01.158 INFO [Low Memory Detector]
>> > org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
>> > called (Collection threshold exceeded) init = 5439488(5312K) used =
>> > 536870904(524287K) committed = 536870912(524288K) max =
>> > 536870912(524288K)
>> >
>> > 2009-08-10 11:54:02.389 INFO [Low Memory Detector]
>> > org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
>> > called (Collection threshold exceeded) init = 5439488(5312K) used =
>> > 536870904(524287K) committed = 536870912(524288K) max =
>> > 536870912(524288K)
>> >
>> > 2009-08-10 11:54:03.778 INFO [Low Memory Detector]
>> > org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
>> > called (Collection threshold exceeded) init = 5439488(5312K) used =
>> > 489852536(478371K) committed = 536870912(524288K) max =
>> > 536870912(524288K)
>> >
>> > 2009-08-10 12:03:40.298 WARN [Comm thread for
>> > attempt_200908100026_0065_m_000077_1]
>> > org.apache.hadoop.mapred.TaskRunner - Parent died.  Exiting
>> > attempt_200908100026_0065_m_000077_1
>> >
>> > I have seen this before when jobs fail on the reduce phase but this is the
>> > first time I am noticing jobs failing during the map phase. Surprisingly,
>> > jobs that load and process much more data ran successfully but when I tried
>> > running the ones that failed, they failed again. Some of the jobs that
>> > failed do nothing more than load, filter and write out the filtered data.
>> > This leads me to believe that the problem is more specific than I had
>> > originally thought. Any pointers on what the issue might be will be
>> > extremely helpful.
>> >
>> > Thanks,
>> >
>> > Krishna
>> >
>>
>
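
The low memory handler messages quoted above are easier to compare once the used and max figures are reduced to a percentage. A minimal sketch in Python, assuming the message layout shown in the log (used = N(NK) committed = N(NK) max = N(NK)); the function name heap_usage and the regular expression are illustrative and not part of Pig or Hadoop:

```python
import re

# Matches the heap figures printed by the low memory handler, e.g.
# "used = 287290336(280556K) committed = 363593728(355072K) max = 536870912(524288K)".
USAGE = re.compile(
    r"used\s*=\s*(\d+)\(\d+K\)\s*committed\s*=\s*(\d+)\(\d+K\)\s*max\s*=\s*(\d+)\(\d+K\)"
)


def heap_usage(log_text):
    """Return (used_bytes, max_bytes, percent_used) for each handler message."""
    # Drop mail quote markers and re-join wrapped lines before matching.
    flat = " ".join(line.lstrip("> ").strip() for line in log_text.splitlines())
    rows = []
    for used, _committed, maximum in USAGE.findall(flat):
        used, maximum = int(used), int(maximum)
        rows.append((used, maximum, round(100.0 * used / maximum, 1)))
    return rows


# Two entries copied from the quoted log (first entry and the 11:54:01 entry).
sample = """
>> > called (Collection threshold exceeded) init = 5439488(5312K) used =
>> > 287290336(280556K) committed = 363593728(355072K) max =
>> > 536870912(524288K)
>> > called (Collection threshold exceeded) init = 5439488(5312K) used =
>> > 536870904(524287K) committed = 536870912(524288K) max =
>> > 536870912(524288K)
"""

for used, maximum, pct in heap_usage(sample):
    print(f"{used} of {maximum} bytes ({pct}%)")
```

Read this way, the quoted log shows usage climbing from roughly half of the 512 MB maximum to essentially all of it in under half a minute.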

Fwd: Amazon Elastic MapReduce Now Supports Apache Pig

Posted by Alan Gates <ga...@yahoo-inc.com>.

Begin forwarded message:

> From: "Sirota, Peter" <si...@amazon.com>
> Date: August 10, 2009 6:17:26 PM PDT
> To: "pig-user@hadoop.apache.org" <pi...@hadoop.apache.org>
> Subject: Amazon Elastic MapReduce Now Supports Apache Pig
> Reply-To: pig-user@hadoop.apache.org
>
> Dear Pig Users,
>
> We are excited to announce that Amazon Elastic MapReduce now  
> supports Apache Pig - making the service even more compelling for  
> large data set processing and analytics. Apache Pig is a platform  
> for analyzing large data sets that consists of a high-level language  
> for expressing data analysis programs, coupled with infrastructure  
> for evaluating these programs.
>
> Learn more in the Pig and Amazon Elastic MapReduce tutorial (http://developer.amazonwebservices.com/connect/entry.jspa?externalID=2729&categoryID=269)
>
> Watch a video (http://s3.amazonaws.com/awsVideos/AmazonElasticMapReduce/ElasticMapReduce-PigTutorial.html)
>
> Use a sample Apache log processing application (http://developer.amazonwebservices.com/connect/entry.jspa?externalID=2728&categoryID=263)
>
>
> Sincerely,
> The Amazon Elastic MapReduce Team


Amazon Elastic MapReduce Now Supports Apache Pig

Posted by "Sirota, Peter" <si...@amazon.com>.
Dear Pig Users,

We are excited to announce that Amazon Elastic MapReduce now supports Apache Pig - making the service even more compelling for large data set processing and analytics. Apache Pig is a platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating these programs.

Learn more in the Pig and Amazon Elastic MapReduce tutorial (http://developer.amazonwebservices.com/connect/entry.jspa?externalID=2729&categoryID=269) 

Watch a video (http://s3.amazonaws.com/awsVideos/AmazonElasticMapReduce/ElasticMapReduce-PigTutorial.html)

Use a sample Apache log processing application (http://developer.amazonwebservices.com/connect/entry.jspa?externalID=2728&categoryID=263) 


Sincerely,
The Amazon Elastic MapReduce Team

Re: Jobs failing during map phase due to low memory

Posted by Shrikrishna Shrin <kr...@cooliris.com>.
Dmitriy,

I don't think that is the issue, because the same logs were processed
successfully by other scripts. I have pasted the full error log below; it
looks like Hadoop was unable to read from or write to /tmp for some reason.

E.g.: Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR
2100: hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229 does
not exist.

Should I try modifying some property in hadoop-site.xml?

Thanks,

Krishna


*FULL ERROR LOG:*

ERROR 2998: Unhandled internal error. Task
attempt_200908101519_0005_m_000002_0 failed to report status for 602
seconds. Killing!
java.lang.Exception: Task attempt_200908101519_0005_m_000002_0 failed to
report status for 602 seconds. Killing!
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getErrorMessages(Launcher.java:230)
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:179)
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:204)
    at
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
    at
org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:767)
    at org.apache.pig.PigServer.execute(PigServer.java:760)
    at org.apache.pig.PigServer.access$100(PigServer.java:89)
    at org.apache.pig.PigServer$Graph.execute(PigServer.java:931)
    at org.apache.pig.PigServer.executeBatch(PigServer.java:243)
    at
org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:112)
    at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
    at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:140)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
    at org.apache.pig.Main.main(Main.java:384)
ERROR 2998: Unhandled internal error.
org.apache.pig.backend.executionengine.ExecException: ERROR 2100:
hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp1635201062 does not
exist.
    at
org.apache.pig.backend.executionengine.PigSlicer.validate(PigSlicer.java:126)
    at
org.apache.pig.impl.io.ValidatingInputFileSpec.validate(ValidatingInputFileSpec.java:59)
    at
org.apache.pig.impl.io.ValidatingInputFileSpec.<init>(ValidatingInputFileSpec.java:44)
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:228)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
    at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:370)
    at
org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
    at
org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
    at java.lang.Thread.run(Thread.java:619)

java.lang.Exception: org.apache.pig.backend.executionengine.ExecException:
ERROR 2100: hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp1635201062
does not exist.
    at
org.apache.pig.backend.executionengine.PigSlicer.validate(PigSlicer.java:126)
    at
org.apache.pig.impl.io.ValidatingInputFileSpec.validate(ValidatingInputFileSpec.java:59)
    at
org.apache.pig.impl.io.ValidatingInputFileSpec.<init>(ValidatingInputFileSpec.java:44)
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:228)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
    at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:370)
    at
org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
    at
org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
    at java.lang.Thread.run(Thread.java:619)

    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:170)
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:204)
    at
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
    at
org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:767)
    at org.apache.pig.PigServer.execute(PigServer.java:760)
    at org.apache.pig.PigServer.access$100(PigServer.java:89)
    at org.apache.pig.PigServer$Graph.execute(PigServer.java:931)
    at org.apache.pig.PigServer.executeBatch(PigServer.java:243)
    at
org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:112)
    at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
    at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:140)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
    at org.apache.pig.Main.main(Main.java:384)
ERROR 2998: Unhandled internal error.
org.apache.pig.backend.executionengine.ExecException: ERROR 2100:
hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229 does not
exist.
    at
org.apache.pig.backend.executionengine.PigSlicer.validate(PigSlicer.java:126)
    at
org.apache.pig.impl.io.ValidatingInputFileSpec.validate(ValidatingInputFileSpec.java:59)
    at
org.apache.pig.impl.io.ValidatingInputFileSpec.<init>(ValidatingInputFileSpec.java:44)
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:228)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
    at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:370)
    at
org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
    at
org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
    at java.lang.Thread.run(Thread.java:619)

java.lang.Exception: org.apache.pig.backend.executionengine.ExecException:
ERROR 2100: hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229
does not exist.
    at
org.apache.pig.backend.executionengine.PigSlicer.validate(PigSlicer.java:126)
    at
org.apache.pig.impl.io.ValidatingInputFileSpec.validate(ValidatingInputFileSpec.java:59)
    at
org.apache.pig.impl.io.ValidatingInputFileSpec.<init>(ValidatingInputFileSpec.java:44)
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:228)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
    at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:370)
    at
org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
    at
org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
    at java.lang.Thread.run(Thread.java:619)

    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:170)
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:204)
    at
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
    at
org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:767)
    at org.apache.pig.PigServer.execute(PigServer.java:760)
    at org.apache.pig.PigServer.access$100(PigServer.java:89)
    at org.apache.pig.PigServer$Graph.execute(PigServer.java:931)
    at org.apache.pig.PigServer.executeBatch(PigServer.java:243)
    at
org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:112)
    at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
    at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:140)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
    at org.apache.pig.Main.main(Main.java:384)
ERROR 2998: Unhandled internal error.
org.apache.pig.backend.executionengine.ExecException: ERROR 2100:
hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229 does not
exist.
    at
org.apache.pig.backend.executionengine.PigSlicer.validate(PigSlicer.java:126)
    at
org.apache.pig.impl.io.ValidatingInputFileSpec.validate(ValidatingInputFileSpec.java:59)
    at
org.apache.pig.impl.io.ValidatingInputFileSpec.<init>(ValidatingInputFileSpec.java:44)
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:228)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
    at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:370)
    at
org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
    at
org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
    at java.lang.Thread.run(Thread.java:619)

java.lang.Exception: org.apache.pig.backend.executionengine.ExecException:
ERROR 2100: hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229
does not exist.
    at
org.apache.pig.backend.executionengine.PigSlicer.validate(PigSlicer.java:126)
    at
org.apache.pig.impl.io.ValidatingInputFileSpec.validate(ValidatingInputFileSpec.java:59)
    at
org.apache.pig.impl.io.ValidatingInputFileSpec.<init>(ValidatingInputFileSpec.java:44)
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:228)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
    at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:370)
    at
org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
    at
org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
    at java.lang.Thread.run(Thread.java:619)

    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:170)
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:204)
    at
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
    at
org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:767)
    at org.apache.pig.PigServer.execute(PigServer.java:760)
    at org.apache.pig.PigServer.access$100(PigServer.java:89)
    at org.apache.pig.PigServer$Graph.execute(PigServer.java:931)
    at org.apache.pig.PigServer.executeBatch(PigServer.java:243)
    at
org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:112)
    at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
    at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:140)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
    at org.apache.pig.Main.main(Main.java:384)
ERROR 2056: Cannot create exception from empty string.
org.apache.pig.backend.executionengine.ExecException: ERROR 2997: Unable to
recreate exception from backed error: Task
attempt_200908101519_0005_m_000002_0 failed to report status for 602
seconds. Killing!
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getErrorMessages(Launcher.java:234)
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:179)
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:204)
    at
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
    at
org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:767)
    at org.apache.pig.PigServer.execute(PigServer.java:760)
    at org.apache.pig.PigServer.access$100(PigServer.java:89)
    at org.apache.pig.PigServer$Graph.execute(PigServer.java:931)
    at org.apache.pig.PigServer.executeBatch(PigServer.java:243)
    at
org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:112)
    at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
    at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:140)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
    at org.apache.pig.Main.main(Main.java:384)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2056:
Cannot create exception from empty string.
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getExceptionFromStrings(Launcher.java:509)
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getExceptionFromString(Launcher.java:323)
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getErrorMessages(Launcher.java:226)
    ... 13 more
ERROR 2056: Cannot create exception from empty string.
org.apache.pig.backend.executionengine.ExecException: ERROR 2997: Unable to
recreate exception from backed error: Task
attempt_200908101519_0005_m_000002_0 failed to report status for 602
seconds. Killing!
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getErrorMessages(Launcher.java:234)
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:179)
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:204)
    at
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
    at
org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:767)
    at org.apache.pig.PigServer.execute(PigServer.java:760)
    at org.apache.pig.PigServer.access$100(PigServer.java:89)
    at org.apache.pig.PigServer$Graph.execute(PigServer.java:931)
    at org.apache.pig.PigServer.executeBatch(PigServer.java:243)
    at
org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:112)
    at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
    at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:140)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
    at org.apache.pig.Main.main(Main.java:384)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2056:
Cannot create exception from empty string.
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getExceptionFromStrings(Launcher.java:509)
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getExceptionFromString(Launcher.java:323)
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getErrorMessages(Launcher.java:226)
    ... 13 more
ERROR 2056: Cannot create exception from empty string.
org.apache.pig.backend.executionengine.ExecException: ERROR 2997: Unable to
recreate exception from backed error: Task
attempt_200908101519_0005_m_000002_0 failed to report status for 602
seconds. Killing!
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getErrorMessages(Launcher.java:234)
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:179)
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:204)
    at
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
    at
org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:767)
    at org.apache.pig.PigServer.execute(PigServer.java:760)
    at org.apache.pig.PigServer.access$100(PigServer.java:89)
    at org.apache.pig.PigServer$Graph.execute(PigServer.java:931)
    at org.apache.pig.PigServer.executeBatch(PigServer.java:243)
    at
org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:112)
    at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
    at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:140)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
    at org.apache.pig.Main.main(Main.java:384)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2056:
Cannot create exception from empty string.
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getExceptionFromStrings(Launcher.java:509)
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getExceptionFromString(Launcher.java:323)
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getErrorMessages(Launcher.java:226)
    ... 13 more
ERROR 2100: hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229
does not exist.
org.apache.pig.backend.executionengine.ExecException: ERROR 2997: Unable to
recreate exception from backend error:
org.apache.pig.backend.executionengine.ExecException: ERROR 2100:
hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229 does not
exist.
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:174)
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:204)
    at
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
    at
org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:767)
    at org.apache.pig.PigServer.execute(PigServer.java:760)
    at org.apache.pig.PigServer.access$100(PigServer.java:89)
    at org.apache.pig.PigServer$Graph.execute(PigServer.java:931)
    at org.apache.pig.PigServer.executeBatch(PigServer.java:243)
    at
org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:112)
    at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
    at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:140)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
    at org.apache.pig.Main.main(Main.java:384)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2100:
hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229 does not
exist.
    at
org.apache.pig.backend.executionengine.PigSlicer.validate(PigSlicer.java:126)
    at
org.apache.pig.impl.io.ValidatingInputFileSpec.validate(ValidatingInputFileSpec.java:59)
    at
org.apache.pig.impl.io.ValidatingInputFileSpec.<init>(ValidatingInputFileSpec.java:44)
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:228)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
    at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:370)
    at
org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
    at
org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
ERROR 2100: hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229
does not exist.
org.apache.pig.backend.executionengine.ExecException: ERROR 2997: Unable to
recreate exception from backend error:
org.apache.pig.backend.executionengine.ExecException: ERROR 2100:
hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229 does not
exist.
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:174)
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:204)
    at
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
    at
org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:767)
    at org.apache.pig.PigServer.execute(PigServer.java:760)
    at org.apache.pig.PigServer.access$100(PigServer.java:89)
    at org.apache.pig.PigServer$Graph.execute(PigServer.java:931)
    at org.apache.pig.PigServer.executeBatch(PigServer.java:243)
    at
org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:112)
    at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
    at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:140)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
    at org.apache.pig.Main.main(Main.java:384)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2100:
hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229 does not
exist.
    at
org.apache.pig.backend.executionengine.PigSlicer.validate(PigSlicer.java:126)
    at
org.apache.pig.impl.io.ValidatingInputFileSpec.validate(ValidatingInputFileSpec.java:59)
    at
org.apache.pig.impl.io.ValidatingInputFileSpec.<init>(ValidatingInputFileSpec.java:44)
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:228)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
    at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:370)
    at
org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
    at
org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
ERROR 2100: hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229
does not exist.
org.apache.pig.backend.executionengine.ExecException: ERROR 2997: Unable to
recreate exception from backend error:
org.apache.pig.backend.executionengine.ExecException: ERROR 2100:
hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229 does not
exist.
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:174)
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:204)
    at
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
    at
org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:767)
    at org.apache.pig.PigServer.execute(PigServer.java:760)
    at org.apache.pig.PigServer.access$100(PigServer.java:89)
    at org.apache.pig.PigServer$Graph.execute(PigServer.java:931)
    at org.apache.pig.PigServer.executeBatch(PigServer.java:243)
    at
org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:112)
    at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
    at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:140)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
    at org.apache.pig.Main.main(Main.java:384)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2100:
hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229 does not
exist.
    at
org.apache.pig.backend.executionengine.PigSlicer.validate(PigSlicer.java:126)
    at
org.apache.pig.impl.io.ValidatingInputFileSpec.validate(ValidatingInputFileSpec.java:59)
    at
org.apache.pig.impl.io.ValidatingInputFileSpec.<init>(ValidatingInputFileSpec.java:44)
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:228)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
    at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:370)
    at
org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
    at
org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
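
The client log above repeats the same few errors many times. A minimal sketch in Python for boiling such a paste down to distinct error codes and missing paths, assuming the "ERROR <code>:" prefix and the "does not exist" wording seen above; count_error_codes and missing_paths are hypothetical helper names, not Pig APIs:

```python
import re
from collections import Counter

# Pig prefixes these errors with "ERROR <code>:". The message text may wrap
# onto following lines, so only the numeric code is counted here.
ERROR_CODE = re.compile(r"\bERROR (\d{4}):")


def count_error_codes(log_text):
    """Count how often each error code appears in a pasted log."""
    return Counter(ERROR_CODE.findall(log_text))


def missing_paths(log_text):
    """Return the distinct HDFS paths reported as 'does not exist'."""
    flat = " ".join(log_text.split())  # undo mail line wrapping
    return sorted(set(re.findall(r"(hdfs://\S+) does not exist", flat)))


# Short excerpt of the log above, with the stack frames left out.
sample = """ERROR 2998: Unhandled internal error.
org.apache.pig.backend.executionengine.ExecException: ERROR 2100:
hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp1635201062 does not
exist.
ERROR 2998: Unhandled internal error.
org.apache.pig.backend.executionengine.ExecException: ERROR 2100:
hdfs://server.domain.com:xxxxx/tmp/temp-946912307/tmp480709229 does not
exist.
"""

for code, count in sorted(count_error_codes(sample).items()):
    print(code, count)
for path in missing_paths(sample):
    print(path)
```

Applied to the whole log, this leaves four codes (2998, 2100, 2997 and 2056) and two temporary paths, which is easier to reason about than the repeated stack traces.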



On Mon, Aug 10, 2009 at 1:09 PM, Dmitriy Ryaboy <dv...@cloudera.com>wrote:

> Krishna,
> Is it possible that the data you are reading in is malformed in such a
> way that a mapper doesn't see an end of record for a very long time,
> and keeps reading your input? Did any other jobs that read the same
> input but perform different operations, succeed?
>
> -Dmitriy
>
> On Mon, Aug 10, 2009 at 12:59 PM, Shrikrishna Shrin<kr...@cooliris.com>
> wrote:
> > Hi,
> >
> > I haven't seen this before but nightly jobs failed over the weekend
> because
> > due to memory issues. The weird part is the jobs failed during the map
> phase
> > (at about ~98% complete).
> >
> > The task tracker for the failed map jobs shows the following errors:
> >
> > Task attempt_200908100026_0065_m_000002_0 failed to report status for
> > 602 seconds. Killing!
> > Task attempt_200908100026_0065_m_000002_1 failed to report status for
> > 603 seconds. Killing!
> >
> > The logs indicate memory to be the issue:
> >
> > 2009-08-10 11:53:37.829 INFO [Low Memory Detector]
> > org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
> > called (Collection threshold exceeded) init = 5439488(5312K) used =
> > 287290336(280556K) committed = 363593728(355072K) max =
> > 536870912(524288K)
> >
> > 2009-08-10 11:53:43.522 INFO [Low Memory Detector]
> > org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
> > called (Collection threshold exceeded) init = 5439488(5312K) used =
> > 350217672(342009K) committed = 422510592(412608K) max =
> > 536870912(524288K)
> >
> > 2009-08-10 11:53:45.290 INFO [Low Memory Detector]
> > org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
> > called (Usage threshold exceeded) init = 5439488(5312K) used =
> > 376781240(367950K) committed = 422510592(412608K) max =
> > 536870912(524288K)
> >
> > 2009-08-10 11:53:45.290 INFO [Low Memory Detector]
> > org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
> > called (Collection threshold exceeded) init = 5439488(5312K) used =
> > 380504752(371586K) committed = 456720384(446016K) max =
> > 536870912(524288K)
> >
> > 2009-08-10 11:53:46.752 INFO [Low Memory Detector]
> > org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
> > called (Collection threshold exceeded) init = 5439488(5312K) used =
> > 401755464(392339K) committed = 482344960(471040K) max =
> > 536870912(524288K)
> >
> > 2009-08-10 11:53:50.599 INFO [Low Memory Detector]
> > org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
> > called (Collection threshold exceeded) init = 5439488(5312K) used =
> > 443763584(433362K) committed = 527171584(514816K) max =
> > 536870912(524288K)
> >
> > 2009-08-10 11:53:54.686 INFO [Low Memory Detector]
> > org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
> > called (Collection threshold exceeded) init = 5439488(5312K) used =
> > 491575560(480054K) committed = 536870912(524288K) max =
> > 536870912(524288K)
> >
> > 2009-08-10 11:53:56.414 INFO [Low Memory Detector]
> > org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
> > called (Collection threshold exceeded) init = 5439488(5312K) used =
> > 514928920(502860K) committed = 536870912(524288K) max =
> > 536870912(524288K)
> >
> > 2009-08-10 11:53:57.553 INFO [Low Memory Detector]
> > org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
> > called (Collection threshold exceeded) init = 5439488(5312K) used =
> > 520781832(508576K) committed = 536870912(524288K) max =
> > 536870912(524288K)
> >
> > 2009-08-10 11:53:58.747 INFO [Low Memory Detector]
> > org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
> > called (Collection threshold exceeded) init = 5439488(5312K) used =
> > 526636552(514293K) committed = 536870912(524288K) max =
> > 536870912(524288K)
> >
> > 2009-08-10 11:53:59.935 INFO [Low Memory Detector]
> > org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
> > called (Collection threshold exceeded) init = 5439488(5312K) used =
> > 532493568(520013K) committed = 536870912(524288K) max =
> > 536870912(524288K)
> >
> > 2009-08-10 11:54:01.158 INFO [Low Memory Detector]
> > org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
> > called (Collection threshold exceeded) init = 5439488(5312K) used =
> > 536870904(524287K) committed = 536870912(524288K) max =
> > 536870912(524288K)
> >
> > 2009-08-10 11:54:02.389 INFO [Low Memory Detector]
> > org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
> > called (Collection threshold exceeded) init = 5439488(5312K) used =
> > 536870904(524287K) committed = 536870912(524288K) max =
> > 536870912(524288K)
> >
> > 2009-08-10 11:54:03.778 INFO [Low Memory Detector]
> > org.apache.pig.impl.util.SpillableMemoryManager - low memory handler
> > called (Collection threshold exceeded) init = 5439488(5312K) used =
> > 489852536(478371K) committed = 536870912(524288K) max =
> > 536870912(524288K)
> >
> > 2009-08-10 12:03:40.298 WARN [Comm thread for
> > attempt_200908100026_0065_m_000077_1]
> > org.apache.hadoop.mapred.TaskRunner - Parent died.  Exiting
> > attempt_200908100026_0065_m_000077_1
> >
> > I have seen this before when jobs fail on the reduce phase but this is the
> > first time I am noticing jobs failing during the map phase.  Surprisingly,
> > jobs that load and process much more data ran successfully but when I tried
> > running the ones that failed, they failed again. Some of the jobs that
> > failed do nothing more than, load, filter and write out the filtered data.
> > This leads me to believe that the problem is more specific than I had
> > originally thought. Any pointers on what the issue might be will be
> > extremely helpful.
> >
> > Thanks,
> >
> > Krishna
> >
>

Re: Jobs failing during map phase due to low memory

Posted by Dmitriy Ryaboy <dv...@cloudera.com>.
Krishna,
Is it possible that the data you are reading in is malformed in such a
way that a mapper doesn't see an end of record for a very long time
and keeps reading your input? Did any other jobs that read the same
input, but perform different operations, succeed?
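
One rough way to test the malformed-input theory is to scan the raw input for
records that are far longer than expected. The sketch below is only an
illustration, not something from the failing jobs: it assumes the input is
newline-delimited text, and the function name find_long_records, the 1 MB
threshold, and the chunk size are all hypothetical choices. It reads in
fixed-size chunks so that a runaway record does not have to fit in memory.

```python
import io


def find_long_records(stream, max_len=1 << 20, chunk_size=1 << 16):
    """Yield (record_index, start_offset, length) for oversized records.

    Assumes records are separated by a single newline byte. The stream is
    read in fixed-size chunks, so a very long record is measured without
    ever being held in memory as a whole.
    """
    index = 0    # zero-based index of the record currently being measured
    start = 0    # byte offset at which the current record begins
    length = 0   # bytes seen so far in the current record
    offset = 0   # byte offset of the current chunk within the stream
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        pos = 0
        while True:
            hit = chunk.find(b"\n", pos)
            if hit == -1:
                # No delimiter in the rest of this chunk: keep counting.
                length += len(chunk) - pos
                break
            length += hit - pos
            if length > max_len:
                yield index, start, length
            index += 1
            start = offset + hit + 1
            length = 0
            pos = hit + 1
        offset += len(chunk)
    # A final record with no trailing newline still counts.
    if length > max_len:
        yield index, start, length
```

To run it against a local copy of the input, open the file in binary mode
(open(path, "rb")) and iterate over the results. For data in HDFS, piping
the output of "hadoop fs -cat" into a small wrapper that passes
sys.stdin.buffer to the function should work the same way. Any record
reported near the point where the map tasks stall would be a good
candidate for the problem.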

-Dmitriy
