Posted to common-issues@hadoop.apache.org by "Todd Lipcon (JIRA)" <ji...@apache.org> on 2009/07/09 00:07:14 UTC

[jira] Created: (HADOOP-6133) ReflectionUtils performance regression

ReflectionUtils performance regression
--------------------------------------

                 Key: HADOOP-6133
                 URL: https://issues.apache.org/jira/browse/HADOOP-6133
             Project: Hadoop Common
          Issue Type: Improvement
            Reporter: Todd Lipcon
            Assignee: Todd Lipcon
         Attachments: Test.java

HADOOP-4187 introduced extra calls to Class.forName in ReflectionUtils.setConf, which caused a large performance regression. Attached is a microbenchmark that shows the following timings (in seconds) for 100M constructions of new instances:

Explicit construction (new Test): ~1.6 sec
Using Test.class.newInstance: ~2.6 sec
ReflectionUtils on 0.18.3: ~8.0 sec
ReflectionUtils on 0.20.0: ~200 sec

This illustrates the ~25x slowdown relative to 0.18.3 caused by HADOOP-4187 (roughly 80x relative to plain Test.class.newInstance).
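
For readers without access to the Test.java attachment, the shape of such a microbenchmark can be sketched roughly as follows. This is not the attached file: BenchTarget, ConstructionBench, and the use of java.lang.Runnable as the looked-up interface are all hypothetical stand-ins, and the third method only imitates the idea of a per-call Class.forName, it does not call Hadoop's ReflectionUtils.

```java
import java.lang.reflect.Constructor;

/** Hypothetical stand-in for the Test class in the attachment. */
class BenchTarget {
    int value = 1;
}

class ConstructionBench {
    /** Baseline: direct construction with new. */
    static long direct(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += new BenchTarget().value;
        }
        return sum;
    }

    /** Reflective construction, with the Constructor looked up once. */
    static long reflective(int n) throws ReflectiveOperationException {
        Constructor<BenchTarget> ctor = BenchTarget.class.getDeclaredConstructor();
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += ctor.newInstance().value;
        }
        return sum;
    }

    /** Reflective construction plus a Class.forName lookup on every call. */
    static long reflectiveWithLookup(int n) throws ReflectiveOperationException {
        ClassLoader loader = BenchTarget.class.getClassLoader();
        Constructor<BenchTarget> ctor = BenchTarget.class.getDeclaredConstructor();
        long sum = 0;
        for (int i = 0; i < n; i++) {
            // Assumption: Runnable plays the role of the interface being checked.
            Class<?> marker = Class.forName("java.lang.Runnable", true, loader);
            BenchTarget t = ctor.newInstance();
            if (!marker.isInstance(t)) {
                sum += t.value;
            }
        }
        return sum;
    }

    public static void main(String[] args) throws Exception {
        int n = 1_000_000; // far fewer than the 100M used in the report
        long t0 = System.nanoTime();
        long a = direct(n);
        long t1 = System.nanoTime();
        long b = reflective(n);
        long t2 = System.nanoTime();
        long c = reflectiveWithLookup(n);
        long t3 = System.nanoTime();
        // Checksums are printed so the JIT cannot discard the loops entirely.
        System.out.printf("direct:            %d ms (checksum %d)%n", (t1 - t0) / 1_000_000, a);
        System.out.printf("reflective:        %d ms (checksum %d)%n", (t2 - t1) / 1_000_000, b);
        System.out.printf("reflective+lookup: %d ms (checksum %d)%n", (t3 - t2) / 1_000_000, c);
    }
}
```

A loop like this has no warm-up phase, so its numbers are only indicative; a harness such as JMH would give more trustworthy figures.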

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-6133) ReflectionUtils performance regression

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HADOOP-6133:
--------------------------------

    Status: Patch Available  (was: Open)


[jira] Commented: (HADOOP-6133) ReflectionUtils performance regression

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752941#action_12752941 ] 

Hudson commented on HADOOP-6133:
--------------------------------

Integrated in Hdfs-Patch-h2.grid.sp2.yahoo.net #5 (See [http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/5/])
    HADOOP-6133. Add a caching layer to Configuration::getClassByName to alleviate a performance regression introduced in a compatibility layer. Contributed by Todd Lipcon


[jira] Updated: (HADOOP-6133) ReflectionUtils performance regression

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated HADOOP-6133:
----------------------------------

    Status: Open  (was: Patch Available)


[jira] Commented: (HADOOP-6133) ReflectionUtils performance regression

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12750023#action_12750023 ] 

Todd Lipcon commented on HADOOP-6133:
-------------------------------------

I think Collections.synchronizedMap(WeakHashMap) should be fine. I'm out this week but can get to this next week, or consider this a +1 if you want to make the change.
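
A minimal sketch of what a Collections.synchronizedMap(WeakHashMap) cache could look like, for illustration only. WeakClassCache and its method are hypothetical names and this is not the attached patch; it assumes the cache is keyed by ClassLoader first and by class name second.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.WeakHashMap;

class WeakClassCache {
    // Outer map: weak keys, so an entry does not by itself pin its ClassLoader.
    // Caveat: the cached Class values reference their defining loader, so
    // entries can remain reachable; wrapping the values in WeakReference
    // would avoid that.
    private static final Map<ClassLoader, Map<String, Class<?>>> CACHE =
        Collections.synchronizedMap(new WeakHashMap<ClassLoader, Map<String, Class<?>>>());

    static Class<?> getClassByName(String name, ClassLoader loader)
            throws ClassNotFoundException {
        Map<String, Class<?>> perLoader;
        // The get-then-put pair must be atomic, so lock on the wrapper itself.
        synchronized (CACHE) {
            perLoader = CACHE.get(loader);
            if (perLoader == null) {
                perLoader = Collections.synchronizedMap(new HashMap<String, Class<?>>());
                CACHE.put(loader, perLoader);
            }
        }
        Class<?> clazz = perLoader.get(name);
        if (clazz == null) {
            // A racing thread may repeat this lookup; both get the same Class.
            clazz = Class.forName(name, true, loader);
            perLoader.put(name, clazz);
        }
        return clazz;
    }
}
```

Compared with a ConcurrentHashMap, this trades some concurrency for weak keys; every access to the outer map takes a single lock.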


[jira] Commented: (HADOOP-6133) ReflectionUtils performance regression

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12741168#action_12741168 ] 

Hadoop QA commented on HADOOP-6133:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12416013/hadoop-6133-trunk.txt
  against trunk revision 802224.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/595/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/595/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/595/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/595/console

This message is automatically generated.


[jira] Commented: (HADOOP-6133) ReflectionUtils performance regression

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753561#action_12753561 ] 

Hudson commented on HADOOP-6133:
--------------------------------

Integrated in Hdfs-Patch-h5.grid.sp2.yahoo.net #21 (See [http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/21/])
    

[jira] Commented: (HADOOP-6133) ReflectionUtils performance regression

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752223#action_12752223 ] 

Hadoop QA commented on HADOOP-6133:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12418836/hadoop-6133-trunk.txt
  against trunk revision 812127.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/22/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/22/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/22/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/22/console

This message is automatically generated.


[jira] Updated: (HADOOP-6133) ReflectionUtils performance regression

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HADOOP-6133:
--------------------------------

    Attachment: hadoop-6133-trunk.txt

Here's the same patch against trunk


[jira] Updated: (HADOOP-6133) ReflectionUtils performance regression

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HADOOP-6133:
--------------------------------

    Attachment: hadoop-6133-trunk.txt

Here's an updated patch using WeakHashMaps instead of ConcurrentHashMaps


[jira] Commented: (HADOOP-6133) ReflectionUtils performance regression

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752274#action_12752274 ] 

Hudson commented on HADOOP-6133:
--------------------------------

Integrated in Hadoop-Common-trunk-Commit #17 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk-Commit/17/])
    HADOOP-6133. Add a caching layer to Configuration::getClassByName to alleviate a performance regression introduced in a compatibility layer. Contributed by Todd Lipcon


[jira] Commented: (HADOOP-6133) ReflectionUtils performance regression

Posted by "Allen Wittenauer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12838062#action_12838062 ] 

Allen Wittenauer commented on HADOOP-6133:
------------------------------------------

Anyone know off hand what branch this was committed to and can set the 'fix version' field?


[jira] Commented: (HADOOP-6133) ReflectionUtils performance regression

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12741174#action_12741174 ] 

Todd Lipcon commented on HADOOP-6133:
-------------------------------------

No additional tests required - this code path is exercised widely by all parts of Hadoop


[jira] Updated: (HADOOP-6133) ReflectionUtils performance regression

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HADOOP-6133:
--------------------------------

    Status: Patch Available  (was: Open)


[jira] Commented: (HADOOP-6133) ReflectionUtils performance regression

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12728941#action_12728941 ] 

Todd Lipcon commented on HADOOP-6133:
-------------------------------------

Owen: the issue with that is that the JobConfigurable.class has to be loaded through Configuration's ClassLoader. In order to cache it in ReflectionUtils we'd need to have a Map<ClassLoader, Class> CACHE_JOBCONFIGURABLE_CLASSES as well. Otherwise we risk having multiple instances of the JobConfigurable.class and checking isAssignableFrom against the wrong one.
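
To make the concern concrete, here is a rough sketch of a per-ClassLoader cache for a single interface, along the lines of the Map<ClassLoader, Class> mentioned above. LoaderScopedInterfaceCache is a hypothetical name, and the sketch makes no claim about what ReflectionUtils or Configuration actually contain.

```java
import java.util.HashMap;
import java.util.Map;

class LoaderScopedInterfaceCache {
    private final String interfaceName;
    // One resolved Class per loader: the same name loaded through two
    // different loaders can yield two distinct, incompatible Class objects.
    // Note: a plain HashMap keeps loaders strongly reachable; weak keys
    // would be preferable where loaders need to be unloaded.
    private final Map<ClassLoader, Class<?>> byLoader = new HashMap<ClassLoader, Class<?>>();

    LoaderScopedInterfaceCache(String interfaceName) {
        this.interfaceName = interfaceName;
    }

    /** Resolves the interface through the given loader, at most once per loader. */
    synchronized Class<?> resolve(ClassLoader loader) throws ClassNotFoundException {
        Class<?> cached = byLoader.get(loader);
        if (cached == null) {
            cached = Class.forName(interfaceName, true, loader);
            byLoader.put(loader, cached);
        }
        return cached;
    }

    /** Checks assignability against the Class resolved through that loader. */
    boolean isImplementedBy(Object o, ClassLoader loader) throws ClassNotFoundException {
        return resolve(loader).isAssignableFrom(o.getClass());
    }
}
```

A single static field would skip the map entirely, but it would compare every object against whichever loader's copy of the interface happened to be resolved first.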


[jira] Updated: (HADOOP-6133) ReflectionUtils performance regression

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HADOOP-6133:
--------------------------------

    Attachment: hadoop-6133-0.20.patch

Here is one possible patch to fix this issue. The benchmark results in:

ReflectionUtils on post-patch branch-0.20: ~18.1sec

(still slower than 0.18.3 by about 2.3x but at least tolerable)

This is not the most elegant fix, but the ClassLoader inside Configuration makes it slightly difficult to do this at a different layer.

As for the importance of this - despite advice not to use ReflectionUtils in any hot path, there are cases when this happens. For example, MapWritable and GenericWritable do so for every deserialization. Outside libraries like Cascading also seem to not reuse objects in WritableDeserialization, and we have reports of some jobs spending nearly 100% CPU in Class.forName when profiled.

This patch is against branch-0.20 but should also work post-split after the file rename.


[jira] Commented: (HADOOP-6133) ReflectionUtils performance regression

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12728936#action_12728936 ] 

Owen O'Malley commented on HADOOP-6133:
---------------------------------------

Have you tried just caching the JobConfigurable class in a static? That should substantially speed up the code rather than looking it up over and over again. 


[jira] Updated: (HADOOP-6133) ReflectionUtils performance regression

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HADOOP-6133:
--------------------------------

    Fix Version/s: 0.22.0
                   0.21.0

Looks to me like it was committed for 0.21 and 0.22.


[jira] Updated: (HADOOP-6133) ReflectionUtils performance regression

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated HADOOP-6133:
----------------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

+1

I committed this. Thanks, Todd!

> ReflectionUtils performance regression
> --------------------------------------
>
>                 Key: HADOOP-6133
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6133
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>    Affects Versions: 0.20.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hadoop-6133-0.20.patch, hadoop-6133-trunk.txt, hadoop-6133-trunk.txt, Test.java



[jira] Updated: (HADOOP-6133) ReflectionUtils performance regression

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HADOOP-6133:
--------------------------------

          Component/s: conf
    Affects Version/s: 0.20.0

> ReflectionUtils performance regression
> --------------------------------------
>
>                 Key: HADOOP-6133
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6133
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>    Affects Versions: 0.20.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hadoop-6133-0.20.patch, Test.java



[jira] Commented: (HADOOP-6133) ReflectionUtils performance regression

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752479#action_12752479 ] 

Hudson commented on HADOOP-6133:
--------------------------------

Integrated in Hadoop-Mapreduce-trunk-Commit #23 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/23/])
    

> ReflectionUtils performance regression
> --------------------------------------
>
>                 Key: HADOOP-6133
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6133
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>    Affects Versions: 0.20.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hadoop-6133-0.20.patch, hadoop-6133-trunk.txt, hadoop-6133-trunk.txt, Test.java



[jira] Commented: (HADOOP-6133) ReflectionUtils performance regression

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749771#action_12749771 ] 

Chris Douglas commented on HADOOP-6133:
---------------------------------------

Won't this prevent classes from being unloaded, since the cache retains a reference to their ClassLoaders? While HADOOP-4187 is explicitly temporary, this solution is unlikely to be removed along with the cause of the regression.

I can't think of any place where contention on an explicit lock for the cache would be fierce enough to require a ConcurrentHashMap for the outer map. Would a WeakHashMap suffice here?
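
To make the suggestion concrete, here is a minimal sketch of a class cache whose outer map is a WeakHashMap keyed by ClassLoader and guarded by a plain lock. It is not the code from the attached patches; the class name WeakClassCache and its method are hypothetical. One detail matters: a Class holds a strong reference to its ClassLoader, so the inner map stores each Class behind a WeakReference. Otherwise the values would keep the weak keys reachable and nothing would ever be evicted.

```java
import java.lang.ref.WeakReference;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.WeakHashMap;

/** Hypothetical sketch of a per-ClassLoader class cache; not the committed patch. */
public final class WeakClassCache {

    // Outer map: weak keys, so an otherwise unreachable ClassLoader can be collected.
    // Guarded by synchronized (CACHE) rather than being a ConcurrentHashMap.
    private static final Map<ClassLoader, Map<String, WeakReference<Class<?>>>> CACHE =
        new WeakHashMap<ClassLoader, Map<String, WeakReference<Class<?>>>>();

    private WeakClassCache() {}

    /** Returns the named class, calling Class.forName only on a cache miss. */
    public static Class<?> getClassByName(ClassLoader loader, String name)
            throws ClassNotFoundException {
        Map<String, WeakReference<Class<?>>> perLoader;
        synchronized (CACHE) {
            perLoader = CACHE.get(loader);
            if (perLoader == null) {
                perLoader = Collections.synchronizedMap(
                    new HashMap<String, WeakReference<Class<?>>>());
                CACHE.put(loader, perLoader);
            }
        }

        WeakReference<Class<?>> ref = perLoader.get(name);
        Class<?> clazz = (ref == null) ? null : ref.get();
        if (clazz == null) {
            // Miss (or the referent was cleared): do the expensive lookup once.
            // Two threads may race here; both resolve the same Class, so it is harmless.
            clazz = Class.forName(name, true, loader);
            perLoader.put(name, new WeakReference<Class<?>>(clazz));
        }
        return clazz;
    }
}
```

A caller would use it as WeakClassCache.getClassByName(Thread.currentThread().getContextClassLoader(), "java.util.ArrayList"). Repeated lookups for the same loader and name then skip Class.forName, while the weak references leave the loader free to be unloaded once nothing else refers to it.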

> ReflectionUtils performance regression
> --------------------------------------
>
>                 Key: HADOOP-6133
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6133
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>    Affects Versions: 0.20.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hadoop-6133-0.20.patch, hadoop-6133-trunk.txt, Test.java



[jira] Updated: (HADOOP-6133) ReflectionUtils performance regression

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HADOOP-6133:
--------------------------------

    Attachment: Test.java

> ReflectionUtils performance regression
> --------------------------------------
>
>                 Key: HADOOP-6133
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6133
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: Test.java



[jira] Commented: (HADOOP-6133) ReflectionUtils performance regression

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752444#action_12752444 ] 

Hudson commented on HADOOP-6133:
--------------------------------

Integrated in Hadoop-Common-trunk #89 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk/89/])
    Add a caching layer to Configuration::getClassByName to alleviate a performance regression introduced in a compatibility layer. Contributed by Todd Lipcon.


> ReflectionUtils performance regression
> --------------------------------------
>
>                 Key: HADOOP-6133
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6133
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>    Affects Versions: 0.20.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hadoop-6133-0.20.patch, hadoop-6133-trunk.txt, hadoop-6133-trunk.txt, Test.java



[jira] Commented: (HADOOP-6133) ReflectionUtils performance regression

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752284#action_12752284 ] 

Hudson commented on HADOOP-6133:
--------------------------------

Integrated in Hadoop-Hdfs-trunk-Commit #17 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/17/])
    Add a caching layer to Configuration::getClassByName to alleviate a performance regression introduced in a compatibility layer. Contributed by Todd Lipcon.


> ReflectionUtils performance regression
> --------------------------------------
>
>                 Key: HADOOP-6133
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6133
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>    Affects Versions: 0.20.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hadoop-6133-0.20.patch, hadoop-6133-trunk.txt, hadoop-6133-trunk.txt, Test.java
