Posted to issues@tez.apache.org by "Yesha Vora (JIRA)" <ji...@apache.org> on 2014/08/07 20:48:11 UTC

[jira] [Created] (TEZ-1387) Add proper diagnostics message for disk issues

Yesha Vora created TEZ-1387:
-------------------------------

             Summary: Add proper diagnostics message for disk issues
                 Key: TEZ-1387
                 URL: https://issues.apache.org/jira/browse/TEZ-1387
             Project: Apache Tez
          Issue Type: Sub-task
            Reporter: Yesha Vora


When the local disks are full, Tez reports only 'java.io.IOException: Spill failed'. It should print a clearer diagnostic message, such as "disk is full".

{noformat}
2014-06-13 12:09:37,202 INFO [main] org.apache.tez.runtime.library.common.sort.impl.dflt.DefaultSorter: (EQUATOR) 146679246 kvi 36669804(146679216)
2014-06-13 12:09:37,383 WARN [SpillThread [finalreduce_] org.apache.tez.runtime.library.common.sort.impl.dflt.DefaultSorter: Got an exception in sortAndSpill
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for attempt_1402677732456_0109_1_00_000003_1_10003_spill_0.out
        at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:402)
        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:150)
        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:131)
        at org.apache.tez.runtime.library.common.task.local.output.TezTaskOutputFiles.getSpillFileForWrite(TezTaskOutputFiles.java:183)
        at org.apache.tez.runtime.library.common.sort.impl.dflt.DefaultSorter.spill(DefaultSorter.java:739)
        at org.apache.tez.runtime.library.common.sort.impl.dflt.DefaultSorter.sortAndSpill(DefaultSorter.java:723)
        at org.apache.tez.runtime.library.common.sort.impl.dflt.DefaultSorter$SpillThread.run(DefaultSorter.java:655)
2014-06-13 12:09:37,389 INFO [main] org.apache.tez.runtime.LogicalIOProcessorRuntimeTask: Final Counters : Counters: 10 [[org.apache.tez.common.counters.TaskCounter SPLIT_RAW_BYTES=221, SPILLED_RECORDS=0, INPUT_RECORDS_PROCESSED=16398, OUTPUT_RECORDS=16397, OUTPUT_BYTES=173778252, OUTPUT_BYTES_WITH_OVERHEAD=0, OUTPUT_BYTES_PHYSICAL=0, ADDITIONAL_SPILLS_BYTES_WRITTEN=0, ADDITIONAL_SPILLS_BYTES_READ=0, ADDITIONAL_SPILL_COUNT=0]]
2014-06-13 12:09:37,390 INFO [Tez Container Heartbeat Thread [container_1402677732456_0109_01_000021]] org.apache.hadoop.mapred.YarnTezDagChild: Heartbeat thread interrupted.  stopped: true error: false
2014-06-13 12:09:37,390 INFO [Tez Container Heartbeat Thread [container_1402677732456_0109_01_000021]] org.apache.hadoop.mapred.YarnTezDagChild: Current task marked as complete. Stopping heartbeat thread and allowing normal container shutdown
2014-06-13 12:09:37,390 FATAL [main] org.apache.hadoop.mapred.YarnTezDagChild: Error running child : java.io.IOException: Spill failed
        at org.apache.tez.runtime.library.common.sort.impl.dflt.DefaultSorter.checkSpillException(DefaultSorter.java:686)
        at org.apache.tez.runtime.library.common.sort.impl.dflt.DefaultSorter.collect(DefaultSorter.java:211)
        at org.apache.tez.runtime.library.common.sort.impl.dflt.DefaultSorter.write(DefaultSorter.java:185)
        at org.apache.tez.runtime.library.output.OnFileSortedOutput$1.write(OnFileSortedOutput.java:116)
        at org.apache.tez.mapreduce.processor.map.MapProcessor$NewOutputCollector.write(MapProcessor.java:373)
        at org.apache.tez.mapreduce.hadoop.mapreduce.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:90)
        at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
        at org.apache.hadoop.mapreduce.Mapper.map(Mapper.java:124)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
        at org.apache.tez.mapreduce.processor.map.MapProcessor.runNewMapper(MapProcessor.java:247)
        at org.apache.tez.mapreduce.processor.map.MapProcessor.run(MapProcessor.java:134)
{noformat}
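
One possible way to surface the underlying cause is to inspect the exception chain before raising the generic error. The following is only a minimal sketch, not Tez code: the class SpillDiagnostics and its methods are hypothetical, and it assumes the root cause can be recognized either by the DiskErrorException class name or by the "Could not find any valid local directory" text seen in the log above. Because that exception does not prove the disk is full, the suggested message is worded as a likely cause.

```java
import java.io.IOException;

/** Hypothetical helper for illustration; not part of Tez or Hadoop. */
final class SpillDiagnostics {
    // Message fragment taken from the DiskErrorException in the log above.
    static final String NO_LOCAL_DIR = "Could not find any valid local directory";

    private SpillDiagnostics() {}

    /** Returns true if any exception in the cause chain looks like a local-disk allocation failure. */
    static boolean looksLikeDiskProblem(Throwable t) {
        int depth = 0;
        // Bound the walk so a cyclic cause chain cannot loop forever.
        for (Throwable cur = t; cur != null && depth < 32; cur = cur.getCause(), depth++) {
            String msg = cur.getMessage();
            boolean nameMatches = cur.getClass().getSimpleName().equals("DiskErrorException");
            boolean textMatches = msg != null && msg.contains(NO_LOCAL_DIR);
            if (nameMatches || textMatches) {
                return true;
            }
        }
        return false;
    }

    /** Builds the exception to rethrow, adding a disk hint when the cause suggests one. */
    static IOException spillFailed(Throwable cause) {
        String detail = looksLikeDiskProblem(cause)
                ? "Spill failed: no usable local directory for spill output; "
                        + "local disks may be full or unhealthy"
                : "Spill failed";
        return new IOException(detail, cause);
    }
}
```

Under these assumptions, the place that currently throws "Spill failed" would instead throw SpillDiagnostics.spillFailed(cause), keeping the original exception attached as the cause so the full stack trace is still logged.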



--
This message was sent by Atlassian JIRA
(v6.2#6252)