Posted to common-dev@hadoop.apache.org by "Arun C Murthy (JIRA)" <ji...@apache.org> on 2008/07/25 02:41:31 UTC

[jira] Created: (HADOOP-3827) Jobs with empty map-outputs and intermediate compression fail

Jobs with empty map-outputs and intermediate compression fail
-------------------------------------------------------------

                 Key: HADOOP-3827
                 URL: https://issues.apache.org/jira/browse/HADOOP-3827
             Project: Hadoop Core
          Issue Type: Bug
          Components: mapred
    Affects Versions: 0.18.0
            Reporter: Arun C Murthy
            Assignee: Arun C Murthy
            Priority: Blocker
             Fix For: 0.18.0


In the corner case where there are zero map-outputs, the codec is not passed to the IFile.Writer. The data is therefore written uncompressed, and the reduce subsequently fails when it tries to decompress that data.

The straightforward fix is to pass the codec:
{noformat}
           Writer<K, V> writer = new Writer<K, V>(job, finalOut, 
-                                                 keyClass, valClass, null);
+                                                 keyClass, valClass, codec);
{noformat}
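
The mismatch described above can be reproduced in miniature with the JDK alone. The following is only an analogy under stated assumptions, not Hadoop's IFile code: GZIP stands in for the configured codec, and CodecMismatchDemo, EMPTY_OUTPUT, writeEmptyOutput, readCompressed and roundTrips are hypothetical names invented for this sketch. The two-byte marker is a stand-in for whatever an empty map output contains on disk.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Stdlib-only analogy of a writer/reader codec mismatch; not Hadoop code. */
final class CodecMismatchDemo {

  // Hypothetical end-of-stream marker standing in for an empty map output.
  static final byte[] EMPTY_OUTPUT = {(byte) 0xFF, (byte) 0xFF};

  /** Writes the empty output, compressing it only when a "codec" is supplied. */
  static byte[] writeEmptyOutput(boolean useCodec) throws IOException {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    try (OutputStream out = useCodec ? new GZIPOutputStream(sink) : sink) {
      out.write(EMPTY_OUTPUT);
    }
    return sink.toByteArray();
  }

  /** Reads the stored bytes the way a reader that always expects compression would. */
  static byte[] readCompressed(byte[] stored) throws IOException {
    ByteArrayOutputStream result = new ByteArrayOutputStream();
    try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(stored))) {
      byte[] chunk = new byte[64];
      int n;
      while ((n = in.read(chunk)) != -1) {
        result.write(chunk, 0, n);
      }
    }
    return result.toByteArray();
  }

  /** Returns true when the reader recovers exactly what the writer was given. */
  static boolean roundTrips(boolean writerUsesCodec) {
    try {
      return Arrays.equals(readCompressed(writeEmptyOutput(writerUsesCodec)), EMPTY_OUTPUT);
    } catch (IOException e) {
      // The reader could not decode data that was never compressed.
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("writer given the codec: " + roundTrips(true));
    System.out.println("writer given null:      " + roundTrips(false));
  }
}
```

The point of the sketch is that the writer and the reader must agree on the codec even when there is nothing to write, which is what replacing null with codec restores.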

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3827) Jobs with empty map-outputs and intermediate compression fail

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617007#action_12617007 ] 

Arun C Murthy commented on HADOOP-3827:
---------------------------------------

The test failure in TestCopyFiles is unrelated to this patch; I've checked that it works locally.


[jira] Commented: (HADOOP-3827) Jobs with empty map-outputs and intermediate compression fail

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12624692#action_12624692 ] 

Hudson commented on HADOOP-3827:
--------------------------------

Integrated in Hadoop-trunk #581 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/581/])


[jira] Commented: (HADOOP-3827) Jobs with empty map-outputs and intermediate compression fail

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616847#action_12616847 ] 

Devaraj Das commented on HADOOP-3827:
-------------------------------------

+1 (not committing it yet, as Hudson hasn't run it).


[jira] Updated: (HADOOP-3827) Jobs with empty map-outputs and intermediate compression fail

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-3827:
----------------------------------

    Status: Patch Available  (was: Open)


[jira] Updated: (HADOOP-3827) Jobs with empty map-outputs and intermediate compression fail

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-3827:
----------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I just committed this.


[jira] Commented: (HADOOP-3827) Jobs with empty map-outputs and intermediate compression fail

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616880#action_12616880 ] 

Hadoop QA commented on HADOOP-3827:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12386848/HADOOP-3827_0_20080724.patch
  against trunk revision 679743.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2950/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2950/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2950/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2950/console

This message is automatically generated.


[jira] Updated: (HADOOP-3827) Jobs with empty map-outputs and intermediate compression fail

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-3827:
----------------------------------

    Attachment: HADOOP-3827_0_20080724.patch

Straightforward fix, along with a test case.


[jira] Commented: (HADOOP-3827) Jobs with empty map-outputs and intermediate compression fail

Posted by "Viraj Bhat (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617002#action_12617002 ] 

Viraj Bhat commented on HADOOP-3827:
------------------------------------

Here are the output and error logs from the maps and reduces that can result from this bug:
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
Log output from the killed map "m_005937_0", which had zero input bytes and wrote zero output bytes to HDFS
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
attempt_200807242354_0001_m_005937_0: No outputs to promote from hdfs://ymachine.mydomain.com/myhome/dir/_temporary/_attempt_200807242354_0001_m_005937_0
2008-07-25 00:05:55,986 INFO org.apache.hadoop.mapred.TaskRunner: Task 'attempt_200807242354_0001_m_005937_0' done.
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
Error on map-side
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
Too many fetch-failures
Too many fetch-failures
Too many fetch-failures

--------------------------------------------------------------------------------------------------------------------------------------------------------------------
Log output from the reduce "attempt_200807242354_0001_r_000001_0", killed as a result of map "m_005937_0" providing zero output bytes to the reducers
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
2008-07-25 00:06:00,618 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 2 bytes (2 raw bytes) into RAM from attempt_200807242354_0001_m_005937_0
2008-07-25 00:06:00,618 INFO org.apache.hadoop.mapred.ReduceTask: Read 0 bytes from map-output for attempt_200807242354_0001_m_005937_0
2008-07-25 00:06:00,618 WARN org.apache.hadoop.mapred.ReduceTask: attempt_200807242354_0001_r_000001_0 copy failed: attempt_200807242354_0001_m_005937_0 from mymachine.mydomain.com
2008-07-25 00:06:00,618 WARN org.apache.hadoop.mapred.ReduceTask: java.io.IOException: Incomplete map output received for attempt_200807242354_0001_m_005937_0 from http://mymachine.mydomain.com:55279/mapOutput?job=job_200807242354_0001&map=attempt_200807242354_0001_m_005937_0&reduce=1 (0 instead of 2)
	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1248)
	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1093)
	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:983)
	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:932)
........
........
2008-07-25 00:06:37,696 INFO org.apache.hadoop.mapred.ReduceTask: Failed to fetch map-output from attempt_200807242354_0001_m_005937_0 even after MAX_FETCH_RETRIES_PER_MAP retries...  reporting to the JobTracker
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
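
The "(0 instead of 2)" in the reduce log above reads like a length check: the reducer expects 2 bytes from the map output, obtains 0, and rejects the copy. A minimal sketch of such a check follows. It is not taken from the ReduceTask source; FetchLengthCheck, readExactly and describe are hypothetical names, and the shortened message only mimics the shape of the logged one.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Illustrative length check on a fetched map output; not Hadoop code. */
final class FetchLengthCheck {

  /** Reads expectedLength bytes and fails if the stream ends early. */
  static byte[] readExactly(InputStream in, int expectedLength) throws IOException {
    byte[] buffer = new byte[expectedLength];
    int total = 0;
    while (total < expectedLength) {
      int n = in.read(buffer, total, expectedLength - total);
      if (n < 0) {
        break; // stream ended before the expected number of bytes arrived
      }
      total += n;
    }
    if (total != expectedLength) {
      throw new IOException("Incomplete map output received ("
          + total + " instead of " + expectedLength + ")");
    }
    return buffer;
  }

  /** Returns "ok" or the failure message for the given payload and expectation. */
  static String describe(byte[] payload, int expectedLength) {
    try {
      readExactly(new ByteArrayInputStream(payload), expectedLength);
      return "ok";
    } catch (IOException e) {
      return e.getMessage();
    }
  }

  public static void main(String[] args) {
    // A stream that yields nothing although 2 bytes were announced.
    System.out.println(describe(new byte[0], 2));
    // A stream that yields exactly what was announced.
    System.out.println(describe(new byte[] {1, 2}, 2));
  }
}
```

A check of this kind explains why the copy is retried and finally reported to the JobTracker: every retry fetches the same bytes, so the mismatch never goes away.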


[jira] Commented: (HADOOP-3827) Jobs with empty map-outputs and intermediate compression fail

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616727#action_12616727 ] 

Arun C Murthy commented on HADOOP-3827:
---------------------------------------

I've tested that the above fix works; I'll upload the patch with a test case.
