Posted to common-dev@hadoop.apache.org by "Devaraj Das (JIRA)" <ji...@apache.org> on 2008/09/29 12:42:44 UTC

[jira] Created: (HADOOP-4302) TestReduceFetch fails intermittently

TestReduceFetch fails intermittently
------------------------------------

                 Key: HADOOP-4302
                 URL: https://issues.apache.org/jira/browse/HADOOP-4302
             Project: Hadoop Core
          Issue Type: Bug
          Components: mapred
    Affects Versions: 0.19.0
            Reporter: Devaraj Das
            Assignee: Chris Douglas
            Priority: Blocker
             Fix For: 0.19.0


I see TestReduceFetch failing once in a while. Here is one such failure:
http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3350/testReport/org.apache.hadoop.mapred/TestReduceFetch/testReduceFromPartialMem/

Marking this as a blocker until we get to the root cause.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4302) TestReduceFetch fails intermittently

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635644#action_12635644 ] 

Hadoop QA commented on HADOOP-4302:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12391170/4302-0.patch
  against trunk revision 700322.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

    -1 core tests.  The patch failed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3399/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3399/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3399/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3399/console

This message is automatically generated.



[jira] Updated: (HADOOP-4302) TestReduceFetch fails intermittently

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated HADOOP-4302:
----------------------------------

    Status: Patch Available  (was: Open)



[jira] Commented: (HADOOP-4302) TestReduceFetch fails intermittently

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635892#action_12635892 ] 

Arun C Murthy commented on HADOOP-4302:
---------------------------------------

+1



[jira] Updated: (HADOOP-4302) TestReduceFetch fails intermittently

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated HADOOP-4302:
----------------------------------

    Status: Patch Available  (was: Open)



[jira] Updated: (HADOOP-4302) TestReduceFetch fails intermittently

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-4302:
--------------------------------

    Attachment: 0J5g5b71.html.part

Attaching the complete hudson log of the test failure



[jira] Commented: (HADOOP-4302) TestReduceFetch fails intermittently

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635972#action_12635972 ] 

Hadoop QA commented on HADOOP-4302:
-----------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12391262/4302-2.patch
  against trunk revision 700628.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3409/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3409/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3409/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3409/console

This message is automatically generated.



[jira] Updated: (HADOOP-4302) TestReduceFetch fails intermittently

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated HADOOP-4302:
----------------------------------

    Attachment: 4302-1.patch

Size the buffer to require an in-memory merge during the shuffle, leaving one map output in memory.



[jira] Updated: (HADOOP-4302) TestReduceFetch fails intermittently

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated HADOOP-4302:
----------------------------------

    Status: Open  (was: Patch Available)

That didn't work; canceling patch.



[jira] Commented: (HADOOP-4302) TestReduceFetch fails intermittently

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635549#action_12635549 ] 

Chris Douglas commented on HADOOP-4302:
---------------------------------------

The test is nondeterministic: although it sets mapred.inmem.merge.threshold to 2, a third map output may be fetched, closed, and included in the in-memory merge. In that case, all the outputs are merged to disk. The test should probably set mapred.reduce.parallel.copies to 1 to avoid the race. Off-by-n misses on mapred.inmem.merge.threshold aren't a real bug, not even for the final outputs; it's a heuristic, not a contract.
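
The race can be illustrated with a toy model. This is only a sketch, not Hadoop's ReduceTask code: the class MergeRaceSketch, the method leftInMemory, and the choice of three map outputs are assumptions made for illustration. It assumes the in-memory merge takes every output that is already closed at the moment it starts.

```java
// Toy model of the race (hypothetical names; not Hadoop code).
final class MergeRaceSketch {

    /**
     * Assumes the in-memory merge consumes every map output that is
     * already closed when the merge starts.
     *
     * @param totalOutputs          map outputs the reduce has to fetch
     * @param closedWhenMergeStarts outputs fetched and closed at merge start
     * @return number of outputs still held in memory after the merge
     */
    static int leftInMemory(int totalOutputs, int closedWhenMergeStarts) {
        if (closedWhenMergeStarts < 0 || closedWhenMergeStarts > totalOutputs) {
            throw new IllegalArgumentException("closed outputs out of range");
        }
        return totalOutputs - closedWhenMergeStarts;
    }

    public static void main(String[] args) {
        int threshold = 2;    // value the test gives mapred.inmem.merge.threshold
        int totalOutputs = 3; // assumed number of map outputs

        // Expected case: the merge starts when exactly `threshold` outputs are closed.
        System.out.println(leftInMemory(totalOutputs, threshold));     // prints 1

        // Race: the third output closes before the merge starts.
        System.out.println(leftInMemory(totalOutputs, threshold + 1)); // prints 0
    }
}
```

In the second case nothing is left in memory, which matches the "all the outputs will be merged to disk" outcome described above.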



[jira] Updated: (HADOOP-4302) TestReduceFetch fails intermittently

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated HADOOP-4302:
----------------------------------

    Attachment: 4302-0.patch



[jira] Updated: (HADOOP-4302) TestReduceFetch fails intermittently

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated HADOOP-4302:
----------------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

I just committed this.



[jira] Updated: (HADOOP-4302) TestReduceFetch fails intermittently

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated HADOOP-4302:
----------------------------------

    Attachment: 4302-2.patch

*sigh* There's still a race condition in the last patch. If the third output is being fetched (allocated) but not yet closed when the second closes, it's possible to merge the first two to disk before allocating the following three, which triggers a similar fault: the reduce begins with all segments merged to disk. The solution sets {{mapred.job.shuffle.merge.percent}} high enough to avoid an intermediate merge in the test until the fetch thread is stalled on the final output.
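
A rough sketch of why a higher merge percent suppresses the intermediate merge follows. It is a simplified model, not the shuffle implementation: it assumes a usage-based merge fires once the closed outputs fill at least the configured fraction of the shuffle buffer, and it ignores the other triggers (the output-count threshold and a stalled fetch). The class name, method name, buffer size, and output sizes are all hypothetical.

```java
// Simplified model of a usage-based merge trigger (hypothetical names and numbers).
final class MergePercentSketch {

    /**
     * Returns the index of the first output whose close pushes buffer usage
     * to at least mergePercent of bufferBytes, or -1 if usage never gets there.
     */
    static int firstMergeTrigger(long bufferBytes, double mergePercent, long[] outputSizes) {
        long used = 0;
        for (int i = 0; i < outputSizes.length; i++) {
            used += outputSizes[i];
            if ((double) used / bufferBytes >= mergePercent) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        long buffer = 100;          // assumed shuffle buffer size, arbitrary units
        long[] sizes = {30, 30, 30}; // assumed map output sizes

        // Low setting: usage reaches 60% on the second output, so a merge fires early.
        System.out.println(firstMergeTrigger(buffer, 0.5, sizes));  // prints 1

        // High setting: usage tops out at 90%, so no usage-based merge fires.
        System.out.println(firstMergeTrigger(buffer, 0.95, sizes)); // prints -1
    }
}
```

Under this model, the high setting leaves the merge to be forced only when a later fetch can no longer be accommodated, which is the situation the comment above aims for.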



[jira] Commented: (HADOOP-4302) TestReduceFetch fails intermittently

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12636646#action_12636646 ] 

Hudson commented on HADOOP-4302:
--------------------------------

Integrated in Hadoop-trunk #622 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/622/])
    
