Posted to common-dev@hadoop.apache.org by "Runping Qi (JIRA)" <ji...@apache.org> on 2008/07/03 00:02:45 UTC
[jira] Created: (HADOOP-3684) The data_join should allow the user
to implement a customer cloning function
The data_join should allow the user to implement a customer cloning function
----------------------------------------------------------------------------
Key: HADOOP-3684
URL: https://issues.apache.org/jira/browse/HADOOP-3684
Project: Hadoop Core
Issue Type: Bug
Components: mapred
Reporter: Runping Qi
Currently, the framework uses serialization/deserialization to clone the values passed to the reduce function.
This amounts to a very heavyweight deep copy of the value objects.
That is far too expensive. Although it is a generic approach that works for all possible value classes, and is thus a good default,
the framework should allow the user to implement an application-specific yet efficient cloning function.
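For illustration, the serialization-based clone described above can be sketched roughly as follows. This is a self-contained approximation, not the data_join code itself: SimpleWritable, Record, and cloneBySerialization are hypothetical names standing in for Hadoop's Writable contract and a user value class.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class SerializationCloneSketch {

    // Hypothetical stand-in for a Writable-style contract (an assumption, not the Hadoop interface).
    interface SimpleWritable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    // Hypothetical value class with two fields.
    static class Record implements SimpleWritable {
        String text = "";
        int count;

        public void write(DataOutput out) throws IOException {
            out.writeUTF(text);
            out.writeInt(count);
        }

        public void readFields(DataInput in) throws IOException {
            text = in.readUTF();
            count = in.readInt();
        }
    }

    // Generic deep copy: serialize the source to bytes, then deserialize into a fresh target.
    static <T extends SimpleWritable> T cloneBySerialization(T source, T target) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            source.write(new DataOutputStream(buffer));
            target.readFields(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
            return target;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Record original = new Record();
        original.text = "row-1";
        original.count = 7;
        Record copy = cloneBySerialization(original, new Record());
        System.out.println(copy.text + " " + copy.count); // prints: row-1 7
    }
}
```

Every clone made this way pays for encoding all fields into a byte buffer and decoding them again, which is the overhead this issue is about.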
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-3684) The data_join should allow the user
to implement a customer cloning function
Posted by "Hudson (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12624794#action_12624794 ]
Hudson commented on HADOOP-3684:
--------------------------------
Integrated in Hadoop-trunk #581 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/581/])
> The data_join should allow the user to implement a customer cloning function
> ----------------------------------------------------------------------------
>
> Key: HADOOP-3684
> URL: https://issues.apache.org/jira/browse/HADOOP-3684
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Runping Qi
> Assignee: Runping Qi
> Fix For: 0.19.0
>
> Attachments: H-3684.txt
>
>
> Currently, the framework uses serialization/deserialization to clone the values passed to the reduce function.
> This amounts to a very heavyweight deep copy of the value objects.
> That is far too expensive. Although it is a generic approach that works for all possible value classes, and is thus a good default,
> the framework should allow the user to implement an application-specific yet efficient cloning function.
>
[jira] Updated: (HADOOP-3684) The data_join should allow the user
to implement a customer cloning function
Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris Douglas updated HADOOP-3684:
----------------------------------
Issue Type: Improvement (was: Bug)
[jira] Updated: (HADOOP-3684) The data_join should allow the user
to implement a customer cloning function
Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Runping Qi updated HADOOP-3684:
-------------------------------
Attachment: (was: H-3684.txt)
[jira] Updated: (HADOOP-3684) The data_join should allow the user
to implement a customer cloning function
Posted by "Robert Chansler (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Chansler updated HADOOP-3684:
------------------------------------
Release Note: Allowed user to overwrite clone function in a subclass of TaggedMapOutput class. (was: make it possible for the user to overwrite clone function in a subclass of TaggedMapOutput class)
[jira] Updated: (HADOOP-3684) The data_join should allow the user
to implement a customer cloning function
Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Runping Qi updated HADOOP-3684:
-------------------------------
Fix Version/s: 0.19.0
Release Note: make it possible for the user to overwrite clone function in a subclass of TaggedMapOutput class
Status: Patch Available (was: Open)
[jira] Updated: (HADOOP-3684) The data_join should allow the user
to implement a customer cloning function
Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris Douglas updated HADOOP-3684:
----------------------------------
Resolution: Fixed
Status: Resolved (was: Patch Available)
I just committed this. Thanks, Runping
[jira] Commented: (HADOOP-3684) The data_join should allow the user
to implement a customer cloning function
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12610127#action_12610127 ]
Hadoop QA commented on HADOOP-3684:
-----------------------------------
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12385161/H-3684.txt
against trunk revision 673517.
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no tests are needed for this patch.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
-1 contrib tests. The patch failed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2789/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2789/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2789/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2789/console
This message is automatically generated.
[jira] Commented: (HADOOP-3684) The data_join should allow the user
to implement a customer cloning function
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12610093#action_12610093 ]
Hadoop QA commented on HADOOP-3684:
-----------------------------------
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12385153/H-3684.txt
against trunk revision 673517.
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no tests are needed for this patch.
-1 patch. The patch command could not apply the patch.
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2786/console
This message is automatically generated.
[jira] Updated: (HADOOP-3684) The data_join should allow the user
to implement a customer cloning function
Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Runping Qi updated HADOOP-3684:
-------------------------------
Status: Open (was: Patch Available)
[jira] Updated: (HADOOP-3684) The data_join should allow the user
to implement a customer cloning function
Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris Douglas updated HADOOP-3684:
----------------------------------
Description:
Currently, the framework uses serialization/deserialization to clone the values passed to the reduce function.
This amounts to a very heavyweight deep copy of the value objects.
That is far too expensive. Although it is a generic approach that works for all possible value classes, and is thus a good default,
the framework should allow the user to implement an application-specific yet efficient cloning function.
was:
Currently, the framework uses serialization/deserialization to clone the values passed to the reduce function.
This amounts to a very heavyweight deep copy of the value objects.
That is far too expensive. Although it is a generic approach that works for all possible value classes, and is thus a good default,
the framework should allow the user to implement an application-specific yet efficient cloning function.
Assignee: Runping Qi
Hadoop Flags: [Reviewed]
+1
[jira] Updated: (HADOOP-3684) The data_join should allow the user
to implement a customer cloning function
Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Runping Qi updated HADOOP-3684:
-------------------------------
Attachment: H-3684.txt
Attached a simple patch.
This patch allows the user to override the clone(JobConf job)
method in a subclass of the TaggedMapOutput class.
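As a rough sketch of the idea behind the patch (not the patch itself), the example below shows a base class whose default clone is a serialization round trip, and a subclass that overrides it with a cheaper field copy. TaggedValue, TaggedLine, cloneValue, and newInstance are hypothetical stand-ins chosen so the example runs without Hadoop; the real method takes a JobConf argument, which is omitted here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class CustomCloneSketch {

    // Hypothetical stand-in for a tagged map output value (an assumption, not the Hadoop class).
    static abstract class TaggedValue {
        String tag = "";

        abstract void write(DataOutput out) throws IOException;
        abstract void readFields(DataInput in) throws IOException;
        abstract TaggedValue newInstance();

        // Generic but heavyweight deep copy through a byte buffer.
        final TaggedValue cloneBySerialization() {
            try {
                ByteArrayOutputStream buffer = new ByteArrayOutputStream();
                write(new DataOutputStream(buffer));
                TaggedValue copy = newInstance();
                copy.readFields(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
                return copy;
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        // Default clone: works for any subclass, at the cost of a full round trip.
        TaggedValue cloneValue() {
            return cloneBySerialization();
        }
    }

    // Hypothetical user value class holding one line of text.
    static class TaggedLine extends TaggedValue {
        String line = "";

        void write(DataOutput out) throws IOException {
            out.writeUTF(tag);
            out.writeUTF(line);
        }

        void readFields(DataInput in) throws IOException {
            tag = in.readUTF();
            line = in.readUTF();
        }

        TaggedValue newInstance() {
            return new TaggedLine();
        }

        // Application-specific override: Strings are immutable, so copying
        // the references yields an independent clone without serialization.
        @Override
        TaggedValue cloneValue() {
            TaggedLine copy = new TaggedLine();
            copy.tag = tag;
            copy.line = line;
            return copy;
        }
    }

    public static void main(String[] args) {
        TaggedLine original = new TaggedLine();
        original.tag = "left";
        original.line = "k1,v1";
        TaggedLine fast = (TaggedLine) original.cloneValue();
        TaggedLine slow = (TaggedLine) original.cloneBySerialization();
        // Both paths should produce equal content in distinct objects.
        System.out.println(fast.line.equals(slow.line) && fast.tag.equals(slow.tag));
    }
}
```

The override is only safe when the subclass knows its fields can be copied cheaply; a class with mutable nested state would still need a real deep copy, which is why the serialization path remains the default.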
[jira] Updated: (HADOOP-3684) The data_join should allow the user
to implement a customer cloning function
Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Runping Qi updated HADOOP-3684:
-------------------------------
Status: Patch Available (was: Open)
Regenerated the patch.
[jira] Updated: (HADOOP-3684) The data_join should allow the user
to implement a customer cloning function
Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Runping Qi updated HADOOP-3684:
-------------------------------
Attachment: H-3684.txt