Posted to common-dev@hadoop.apache.org by "Doug Cutting (JIRA)" <ji...@apache.org> on 2007/12/11 20:55:43 UTC

[jira] Commented: (HADOOP-2399) Input key and value to combiner and reducer should be reused

    [ https://issues.apache.org/jira/browse/HADOOP-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12550692 ] 

Doug Cutting commented on HADOOP-2399:
--------------------------------------

+1

As a general rule, I think applications should not expect to be able to hold on to pointers to objects passed to them, but should expect to be able to hold on to pointers returned to them.  There are lots of exceptions, of course, but in this case I don't think applications should expect to be able to hold on to these objects, so any that break if we reuse them were not well written.

These were originally reused.  Reuse was removed when the combiner was added, since the original combiner kept pointers to the objects.
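
To make the hazard concrete, here is a minimal sketch in plain Java with no Hadoop dependency. Everything in it is hypothetical and for illustration only: MutableValue stands in for a Writable-style object, and reusingIterator mimics a framework that refills one instance on each call to next(). It is not the actual mapred code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class ReuseDemo {
    /** Hypothetical mutable value type, standing in for a Writable-style object. */
    static final class MutableValue {
        private String data = "";
        void set(String s) { data = s; }
        String get() { return data; }
        /** Returns an independent copy that later set() calls cannot affect. */
        MutableValue copy() {
            MutableValue c = new MutableValue();
            c.set(data);
            return c;
        }
    }

    /** Hypothetical iterator that refills and returns the same instance on every next(). */
    static Iterator<MutableValue> reusingIterator(List<String> raw) {
        final Iterator<String> source = raw.iterator();
        final MutableValue shared = new MutableValue();
        return new Iterator<MutableValue>() {
            @Override public boolean hasNext() { return source.hasNext(); }
            @Override public MutableValue next() {
                shared.set(source.next()); // overwrites the previous contents
                return shared;
            }
        };
    }

    /** Fragile: holds on to the objects passed in, so every entry aliases one instance. */
    static List<String> collectByReference(Iterator<MutableValue> values) {
        List<MutableValue> held = new ArrayList<>();
        while (values.hasNext()) {
            held.add(values.next());
        }
        List<String> out = new ArrayList<>();
        for (MutableValue v : held) {
            out.add(v.get());
        }
        return out;
    }

    /** Safe under reuse: copies each value before the next call to next(). */
    static List<String> collectByCopy(Iterator<MutableValue> values) {
        List<MutableValue> held = new ArrayList<>();
        while (values.hasNext()) {
            held.add(values.next().copy());
        }
        List<String> out = new ArrayList<>();
        for (MutableValue v : held) {
            out.add(v.get());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList("a", "b", "c");
        System.out.println(collectByReference(reusingIterator(raw))); // prints [c, c, c]
        System.out.println(collectByCopy(reusingIterator(raw)));      // prints [a, b, c]
    }
}
```

An application written like collectByCopy keeps working whether or not the framework reuses objects; one written like collectByReference only works when a fresh object is allocated on every iteration.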



> Input key and value to combiner and reducer should be reused
> ------------------------------------------------------------
>
>                 Key: HADOOP-2399
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2399
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.15.1
>            Reporter: Owen O'Malley
>            Assignee: Owen O'Malley
>             Fix For: 0.16.0
>
>
> Currently, the input key and value are recreated on every iteration for input to the combiner and reducer. It would speed up the system substantially if we reused the keys and values. The downside is that it may break applications that count on holding references to previous keys and values, but I think it is worth doing.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.