Posted to common-issues@hadoop.apache.org by "Owen O'Malley (JIRA)" <ji...@apache.org> on 2010/07/26 20:53:16 UTC

[jira] Created: (HADOOP-6881) The efficient comparators aren't always used except for BytesWritable and Text

The efficient comparators aren't always used except for BytesWritable and Text
------------------------------------------------------------------------------

                 Key: HADOOP-6881
                 URL: https://issues.apache.org/jira/browse/HADOOP-6881
             Project: Hadoop Common
          Issue Type: Bug
    Affects Versions: 0.20.0
            Reporter: Owen O'Malley
            Assignee: Owen O'Malley


When we moved from Java 4 to Java 5 (and then 6), there was a change in the JVM semantics such that a reference to a class literal such as IntWritable.class no longer forces class initialization. Since all of the Writables depend on their static initializer blocks to register their fast comparators, registration can happen *after* we look up the comparator. In that case, the framework falls back to the generic comparator, which deserializes both keys and compares the resulting objects, and that may cause a huge slowdown in the sort.
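
To illustrate the behavior described above, here is a minimal sketch. It is not Hadoop code: ClassLiteralInitDemo, DemoKey and REGISTRY are hypothetical stand-ins for a Writable and the comparator registry, and only Class.forName is a real JDK call.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class ClassLiteralInitDemo {
  // Hypothetical stand-in for the comparator registry.
  static final Map<Class<?>, String> REGISTRY = new ConcurrentHashMap<>();

  // Hypothetical key type that registers itself in its static initializer,
  // the way the issue says the Writables do.
  static final class DemoKey {
    static {
      REGISTRY.put(DemoKey.class, "fast comparator");
    }
  }

  // Merely mentioning the class literal does not run DemoKey's static block.
  static boolean registeredAfterClassLiteral() {
    Class<?> c = DemoKey.class;
    return REGISTRY.containsKey(c);
  }

  // Asking for the class by name with initialize=true runs the static block.
  static boolean registeredAfterForcedInit() {
    Class<?> c = DemoKey.class;
    try {
      Class.forName(c.getName(), true, c.getClassLoader());
    } catch (ClassNotFoundException e) {
      throw new IllegalStateException("Cannot initialize " + c, e);
    }
    return REGISTRY.containsKey(c);
  }

  public static void main(String[] args) {
    System.out.println("after class literal: " + registeredAfterClassLiteral());
    System.out.println("after forced init:   " + registeredAfterForcedInit());
  }
}
```

On Java 5 or later, the first call should report that nothing is registered yet, and the second should find the registration, which is the ordering problem in miniature.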

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-6881) The efficient comparators aren't always used except for BytesWritable and Text

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-6881:
---------------------------------

           Status: Resolved  (was: Patch Available)
     Hadoop Flags: [Reviewed]
    Fix Version/s: 0.21.0
       Resolution: Fixed

Patch merged to both 0.20 and 0.21 branches.

> The efficient comparators aren't always used except for BytesWritable and Text
> ------------------------------------------------------------------------------
>
>                 Key: HADOOP-6881
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6881
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 0.20.0
>            Reporter: Owen O'Malley
>            Assignee: Owen O'Malley
>             Fix For: 0.20.3, 0.21.0
>
>         Attachments: h-6881.patch, HADOOP-6881.patch, HADOOP-6881.patch
>
>
> When we moved from Java 4 to Java 5 (and then 6), there was a change in the JVM semantics such that references to a class such as IntWritable.class no longer forces initialization. Since all of the Writables depend on their class static blocks to register their fast comparators, that can happen *after* we look up the comparator. In that case, the framework will fall back to the generic comparator that deserializes both keys and does the object compare, which may cause a huge slow down in the sort.



[jira] Commented: (HADOOP-6881) The efficient comparators aren't always used except for BytesWritable and Text

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892537#action_12892537 ] 

Doug Cutting commented on HADOOP-6881:
--------------------------------------

I just committed this.  Thanks, Owen!



[jira] Updated: (HADOOP-6881) The efficient comparators aren't always used except for BytesWritable and Text

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-6881:
---------------------------------

    Attachment: HADOOP-6881.patch

Slightly cleaned up the test code.



[jira] Updated: (HADOOP-6881) The efficient comparators aren't always used except for BytesWritable and Text

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Owen O'Malley updated HADOOP-6881:
----------------------------------

           Status: Patch Available  (was: Open)
    Fix Version/s: 0.20.3



[jira] Commented: (HADOOP-6881) The efficient comparators aren't always used except for BytesWritable and Text

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892488#action_12892488 ] 

Hadoop QA commented on HADOOP-6881:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12450519/h-6881.patch
  against trunk revision 979387.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    -1 javadoc.  The javadoc tool appears to have generated 1 warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/634/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/634/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/634/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/634/console

This message is automatically generated.



[jira] Commented: (HADOOP-6881) The efficient comparators aren't always used except for BytesWritable and Text

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892521#action_12892521 ] 

Hadoop QA commented on HADOOP-6881:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12450525/HADOOP-6881.patch
  against trunk revision 979387.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    -1 javadoc.  The javadoc tool appears to have generated 1 warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/636/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/636/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/636/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/636/console

This message is automatically generated.



[jira] Updated: (HADOOP-6881) The efficient comparators aren't always used except for BytesWritable and Text

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-6881:
---------------------------------

    Status: Open  (was: Patch Available)



[jira] Updated: (HADOOP-6881) The efficient comparators aren't always used except for BytesWritable and Text

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-6881:
---------------------------------

    Status: Patch Available  (was: Open)

Get Hudson to try the test I added.

The javadoc warnings all seem unrelated to this patch. They concern the use of proprietary Sun APIs and cannot be suppressed.

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6476630




[jira] Updated: (HADOOP-6881) The efficient comparators aren't always used except for BytesWritable and Text

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Owen O'Malley updated HADOOP-6881:
----------------------------------

    Attachment: h-6881.patch

This patch does two things:
  1. If a comparator is not defined, it forces the class loader to initialize the class.
  2. It caches the comparator so that we reuse it, even if it is the generic one.
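
The two steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of the patch: ComparatorRegistrySketch, FastKey, PlainKey and GenericComparator are hypothetical names, and the generic comparator here simply relies on Comparable instead of deserializing keys.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class ComparatorRegistrySketch {
  private static final Map<Class<?>, Comparator<?>> COMPARATORS =
      new ConcurrentHashMap<>();

  // Called from a key class's static initializer to register a fast comparator.
  static void define(Class<?> keyClass, Comparator<?> comparator) {
    COMPARATORS.put(keyClass, comparator);
  }

  static Comparator<?> get(Class<?> keyClass) {
    Comparator<?> c = COMPARATORS.get(keyClass);
    if (c == null) {
      // Step 1: make sure the key class's static initializer has run.
      forceInit(keyClass);
      c = COMPARATORS.get(keyClass);
      if (c == null) {
        // Step 2: cache the generic fallback so later lookups reuse it.
        c = new GenericComparator();
        COMPARATORS.put(keyClass, c);
      }
    }
    return c;
  }

  private static void forceInit(Class<?> cls) {
    try {
      Class.forName(cls.getName(), true, cls.getClassLoader());
    } catch (ClassNotFoundException e) {
      throw new IllegalArgumentException("Cannot initialize " + cls, e);
    }
  }

  // Hypothetical slow fallback: compares fully materialized objects.
  static final class GenericComparator implements Comparator<Object> {
    @Override
    @SuppressWarnings("unchecked")
    public int compare(Object a, Object b) {
      return ((Comparable<Object>) a).compareTo(b);
    }
  }

  // Hypothetical key type that registers a fast comparator when initialized.
  static final class FastKey {
    static final Comparator<FastKey> FAST = Comparator.comparingInt(k -> k.value);
    static {
      define(FastKey.class, FAST);
    }
    final int value;
    FastKey(int value) {
      this.value = value;
    }
  }

  // Hypothetical key type that never registers a comparator.
  static final class PlainKey implements Comparable<PlainKey> {
    final int value;
    PlainKey(int value) {
      this.value = value;
    }
    @Override
    public int compareTo(PlainKey other) {
      return Integer.compare(value, other.value);
    }
  }
}
```

With this shape, get(FastKey.class) finds the registered comparator even when FastKey has not been touched before, and repeated calls to get(PlainKey.class) hand back the same cached generic instance.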



[jira] Updated: (HADOOP-6881) The efficient comparators aren't always used except for BytesWritable and Text

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-6881:
---------------------------------

    Attachment: HADOOP-6881.patch

Here's a version of the patch that adds a unit test.  This test fails for me without the patch.
