Posted to common-dev@hadoop.apache.org by "Michael Fuchs (JIRA)" <ji...@apache.org> on 2009/01/20 10:59:59 UTC

[jira] Updated: (HADOOP-5084) Reduce output data is not written to disk

     [ https://issues.apache.org/jira/browse/HADOOP-5084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Fuchs updated HADOOP-5084:
----------------------------------

    Summary: Reduce output data is not written to disk  (was: Output data not written to disk)

> Reduce output data is not written to disk
> -----------------------------------------
>
>                 Key: HADOOP-5084
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5084
>             Project: Hadoop Core
>          Issue Type: Bug
>    Affects Versions: 0.18.2
>         Environment: Linux version 2.6.22-12-generic (buildd@vernadsky) (gcc version 4.1.3 20070831 (prerelease) (Ubuntu 4.1.2-16ubuntu1)) #1 SMP Sun Sep 23 18:11:30 GMT 2007 running Hadoop 0.18.2 on two nodes
>            Reporter: Michael Fuchs
>            Priority: Critical
>
> I ran into a critical issue with Hadoop 0.18.2 on my Linux boxes:
> the jobs execute without any complaints and are listed as succeeded,
> but there is no output data besides the "_logs" directory.
> The same code works with 0.17.2.1.
>  
> Here are some sections of the logs:
> [logfile]
> hadoop@bock:~/logs$ tail hadoop-hadoop-jobtracker-bock.log
> 2008-12-23 13:30:56,707 INFO org.apache.hadoop.mapred.JobInProgress:
> Choosing a data-local task task_200812231229_0031_m_000001 for
> speculation
> 2008-12-23 13:30:56,707 INFO org.apache.hadoop.mapred.JobTracker: Adding
> task 'attempt_200812231229_0031_m_000001_1' to tip
> task_200812231229_0031_m_000001, for tracker
> 'tracker_bock:localhost/127.0.0.1:15260'
> 2008-12-23 13:31:01,065 INFO org.apache.hadoop.mapred.JobInProgress:
> Task 'attempt_200812231229_0031_m_000001_1' has completed
> task_200812231229_0031_m_000001 successfully.
> 2008-12-23 13:31:03,177 INFO org.apache.hadoop.mapred.TaskRunner: Saved
> output of task 'attempt_200812231229_0031_r_000000_0' to
> hdfs://BOCK:9000/ana/oiprocessed/2008/12/23/Sen1/92a74190-2038-4c79-82c4-2de6fdc615db
> [/logfile]
> But the output folder contains only a "_logs" folder, which holds a
> history file with the following content:
> [logfile]
> Job JOBID="job_200812231415_0001" FINISH_TIME="1230038377844"
> JOB_STATUS="SUCCESS" FINISHED_MAPS="2" FINISHED_REDUCES="1"
> FAILED_MAPS="0" FAILED_REDUCES="0" COUNTERS="Job Counters .Data-local
> map tasks:2,Job Counters .Launched reduce tasks:1,Job Counters .Launched
> map tasks:3,Map-Reduce Framework.Reduce input records:61,Map-Reduce
> Framework.Map output records:61,Map-Reduce Framework.Map output
> bytes:7194,Map-Reduce Framework.Combine output records:0,Map-Reduce
> Framework.Map input records:61,Map-Reduce Framework.Reduce input
> groups:12,Map-Reduce Framework.Combine input records:0,Map-Reduce
> Framework.Map input bytes:36396,Map-Reduce Framework.Reduce output
> records:12,File Systems.HDFS bytes written:1533,File Systems.Local bytes
> written:14858,File Systems.HDFS bytes read:38679,File Systems.Local
> bytes
> read:7388,com..ana.scheduling.HadoopTask$Counter.MAPPEED:61
> "
> [/logfile]
> So the job runs successfully and even reports that it wrote data
> ("Map-Reduce Framework.Reduce output records:12,File Systems.HDFS bytes written:1533").
> If I run the same code with 0.17.2.1, or in local mode with 0.18.2, it works
> and I get a part-0000 file with the expected data.
>  
> Please tell me if you need additional information.
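
The mismatch described in the report (counters claim reduce output, yet the output directory holds only "_logs") can be checked mechanically. The following is a minimal sketch in Python, not part of Hadoop: the function names parse_counters and output_missing are hypothetical, the directory listing is passed in by hand rather than read from HDFS, and it assumes the COUNTERS value is a comma-separated list of "Group.Name:value" entries with no commas or colons inside a value, as in the history file quoted above.

```python
# Excerpt of the COUNTERS value from the history file in the report,
# re-joined onto one line (the mail client wrapped the original).
SAMPLE_COUNTERS = (
    "Map-Reduce Framework.Reduce input records:61,"
    "Map-Reduce Framework.Reduce output records:12,"
    "File Systems.HDFS bytes written:1533"
)


def parse_counters(raw):
    """Parse 'Group.Name:value,Group.Name:value' into a dict of ints.

    Assumption: entries are separated by commas and the value follows
    the last colon of each entry.
    """
    # Collapse line breaks introduced by mail wrapping into single spaces.
    flat = " ".join(raw.split())
    counters = {}
    for entry in flat.split(","):
        name, sep, value = entry.rpartition(":")
        if not sep:
            continue  # skip empty or malformed entries
        counters[name.strip()] = int(value)
    return counters


def output_missing(counters, listing):
    """Return True if the job reported reduce output but no part-* file exists.

    'listing' is a plain list of file names in the job output directory,
    supplied by the caller (hypothetical input, not fetched from HDFS).
    """
    wrote = counters.get("Map-Reduce Framework.Reduce output records", 0) > 0
    has_parts = any(name.startswith("part-") for name in listing)
    return wrote and not has_parts


if __name__ == "__main__":
    counters = parse_counters(SAMPLE_COUNTERS)
    # The report states the output directory contains only "_logs".
    print(output_missing(counters, ["_logs"]))
```

Such a check only confirms the symptom; it says nothing about why the reduce output was not moved into the final directory.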
