Posted to common-dev@hadoop.apache.org by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org> on 2009/04/17 13:28:15 UTC
[jira] Created: (HADOOP-5698) Change
org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.
Change org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.
------------------------------------------------------------------------------
Key: HADOOP-5698
URL: https://issues.apache.org/jira/browse/HADOOP-5698
Project: Hadoop Core
Issue Type: Sub-task
Reporter: Amareshwari Sriramadasu
Assignee: Amareshwari Sriramadasu
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-5698) Change
org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12705081#action_12705081 ]
Hadoop QA commented on HADOOP-5698:
-----------------------------------
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12406761/patch-5698.txt
against trunk revision 770685.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 9 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
-1 javac. The applied patch generated 801 javac compiler warnings (more than the trunk's current 2455 warnings).
-1 findbugs. The patch appears to cause Findbugs to fail.
+1 Eclipse classpath. The patch retains Eclipse classpath integrity.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed core unit tests.
-1 contrib tests. The patch failed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/269/testReport/
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/269/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/269/console
This message is automatically generated.
[jira] Updated: (HADOOP-5698) Change
org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.
Posted by "Sharad Agarwal (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sharad Agarwal updated HADOOP-5698:
-----------------------------------
Resolution: Fixed
Fix Version/s: 0.21.0
Hadoop Flags: [Reviewed]
Status: Resolved (was: Patch Available)
I just committed this. Thanks Amareshwari!
[jira] Updated: (HADOOP-5698) Change
org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.
Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Amareshwari Sriramadasu updated HADOOP-5698:
--------------------------------------------
Hadoop Flags: [Incompatible change, Reviewed] (was: [Reviewed])
[jira] Commented: (HADOOP-5698) Change
org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.
Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714315#action_12714315 ]
dhruba borthakur commented on HADOOP-5698:
------------------------------------------
+1 Code looks great!
[jira] Updated: (HADOOP-5698) Change
org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.
Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Amareshwari Sriramadasu updated HADOOP-5698:
--------------------------------------------
Status: Patch Available (was: Open)
[jira] Commented: (HADOOP-5698) Change
org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.
Posted by "Hudson (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12719588#action_12719588 ]
Hudson commented on HADOOP-5698:
--------------------------------
Integrated in Hadoop-trunk #867 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/867/])
[jira] Updated: (HADOOP-5698) Change
org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.
Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Amareshwari Sriramadasu updated HADOOP-5698:
--------------------------------------------
Status: Patch Available (was: Open)
bq. I think it would be a better idea not to club the fixes to the existing mapred.CombineInputFormat into this Jira. That should be addressed in a separate Jira and the patch for this should be built on top of that.
Done by HADOOP-5759
bq. # Minor - Why is there a return 2 in run (instead of return 1 as in existing code)
The exit code is 2 when the usage is wrong.
bq. CombineFileInputFormat.createRecordReader - should this just return null or should it call super.createRecordReader
Implemented the abstract method from the superclass to return null, and added a comment saying so.
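The exit-code convention from the reply above (2 for wrong usage) can be sketched as follows. This is a minimal, self-contained illustration, not the patch itself: the class name, the constants, and the usage string are hypothetical, and the job setup is left as a placeholder.

```java
/**
 * Hypothetical sketch of a driver whose run method distinguishes a usage
 * error (exit code 2) from a normal run (exit code 0).
 */
final class UsageExitCodeSketch {
    static final int EXIT_OK = 0;
    static final int EXIT_USAGE = 2; // wrong command-line usage

    /** Returns the exit code for the given command-line arguments. */
    static int run(String[] args) {
        if (args.length < 2) {
            // Illustrative usage message; the real wording may differ.
            System.err.println("Usage: <program> <input_dir> <output>");
            return EXIT_USAGE;
        }
        // Job configuration and submission would go here (omitted).
        return EXIT_OK;
    }

    public static void main(String[] args) {
        System.exit(run(args));
    }
}
```

Keeping the usage error on its own code lets a calling script tell a mistyped command line apart from a job that ran and failed.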
test-patch :
{noformat}
[exec] +1 overall.
[exec]
[exec] +1 @author. The patch does not contain any @author tags.
[exec]
[exec] +1 tests included. The patch appears to include 9 new or modified tests.
[exec]
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec]
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec]
[exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
[exec]
[exec] +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
[exec]
[exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
[exec]
{noformat}
[jira] Updated: (HADOOP-5698) Change
org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.
Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Amareshwari Sriramadasu updated HADOOP-5698:
--------------------------------------------
Attachment: patch-5698.txt
The patch moves the CombineFileInputFormat, CombineFileRecordReader, and CombineFileSplit libraries, along with org.apache.hadoop.examples.MultiFileWordCount, to the new mapreduce API.
In the existing CombineFileInputFormat, the split location passed is the rack name. This results in an IllegalArgumentException during resolution, because the name contains PATH_SEPARATOR ('/'). Passing the rack name without the '/' does not work either: the node is wrongly resolved as /default-rack/<rack-name>. After an offline discussion with Dhruba, I changed the input format to pass hostnames instead of rack names to solve this problem.
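The two symptoms described above can be reproduced with a deliberately simplified model. The class and method below are hypothetical and are not Hadoop's resolver; they only assume that every split location is treated as a host name, that a '/' in the name is rejected, and that unknown hosts are placed under /default-rack.

```java
/**
 * Simplified, hypothetical model of resolving split locations to topology
 * paths. This is NOT Hadoop code; it only mirrors the behaviour described
 * in the comment above.
 */
final class SplitLocationSketch {
    static final char PATH_SEPARATOR = '/';
    static final String DEFAULT_RACK = "/default-rack";

    /** Treats the location as a host name and places it under the default rack. */
    static String resolve(String location) {
        if (location.indexOf(PATH_SEPARATOR) >= 0) {
            throw new IllegalArgumentException(
                "Location must not contain '" + PATH_SEPARATOR + "': " + location);
        }
        return DEFAULT_RACK + PATH_SEPARATOR + location;
    }

    public static void main(String[] args) {
        // A host name is placed under the default rack, as intended.
        System.out.println(resolve("host1.example.com"));
        // A bare rack name is mistaken for a host under the default rack.
        System.out.println(resolve("rack1"));
        // A rack path containing the separator is rejected outright.
        try {
            resolve("/rack1");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Under these assumptions, only host names give a meaningful result, which is consistent with the decision to pass hostnames as split locations.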
[jira] Commented: (HADOOP-5698) Change
org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.
Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714392#action_12714392 ]
Amareshwari Sriramadasu commented on HADOOP-5698:
-------------------------------------------------
ant test passed on my machine
[jira] Updated: (HADOOP-5698) Change
org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.
Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Amareshwari Sriramadasu updated HADOOP-5698:
--------------------------------------------
Attachment: patch-5698-1.txt
Patch with review comments incorporated.
[jira] Updated: (HADOOP-5698) Change
org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.
Posted by "Jothi Padmanabhan (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jothi Padmanabhan updated HADOOP-5698:
--------------------------------------
Status: Open (was: Patch Available)
I think it would be a better idea not to club the fixes to the existing mapred.CombineInputFormat into this Jira. That should be addressed in a separate Jira and the patch for this should be built on top of that.
Some other points:
# MultiFileWordCount -- I do not think we should use the MultiFileLineRecordReader to read from a CombineSplit. It is guaranteed to work only if the start offset is 0, which is not necessarily true. Instead, the CombineFileRecordReader should be used.
# Minor -- Why is there a return 2 in run (instead of return 1 as in the existing code)?
# CombineFileInputFormat.createRecordReader -- should this just return null, or should it call super.createRecordReader?
# Minor -- CombineFileRecordReader -- Remove unused imports
# Minor -- Wherever possible, keep the code/comments restricted to 80 columns
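To illustrate the first point, here is a minimal sketch of why a non-zero start offset matters when reading lines from a chunk of a file. It is not the CombineFileRecordReader or MultiFileLineRecordReader code. It is a self-contained, in-memory version of the usual technique: a chunk that does not start at offset 0 backs up one byte and discards everything through the next newline, and every chunk reads the line that straddles its end. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: offset-aware line reading over a byte range. */
final class ChunkLineReaderSketch {

    /** Returns the lines owned by the byte range [start, start + length). */
    static List<String> readLines(byte[] data, int start, int length) {
        int end = Math.min(data.length, start + length);
        int pos = start;
        if (pos != 0) {
            // Back up one byte and skip through the next newline. If the
            // chunk starts exactly on a line boundary, only the previous
            // newline is consumed; otherwise the partial first line, which
            // belongs to the previous chunk, is discarded.
            pos = indexAfterNewline(data, pos - 1);
        }
        List<String> lines = new ArrayList<>();
        // A line is owned by the chunk in which it starts, so the last
        // line may extend past 'end'.
        while (pos < end) {
            int next = indexAfterNewline(data, pos);
            int lineEnd = (data[next - 1] == '\n') ? next - 1 : next;
            lines.add(new String(data, pos, lineEnd - pos, StandardCharsets.UTF_8));
            pos = next;
        }
        return lines;
    }

    /** Index just past the first '\n' at or after 'from', or data.length. */
    static int indexAfterNewline(byte[] data, int from) {
        for (int i = from; i < data.length; i++) {
            if (data[i] == '\n') {
                return i + 1;
            }
        }
        return data.length;
    }

    public static void main(String[] args) {
        byte[] data = "alpha\nbeta\ngamma\n".getBytes(StandardCharsets.UTF_8);
        // Two chunks that split the file in the middle of "beta".
        System.out.println(readLines(data, 0, 8));
        System.out.println(readLines(data, 8, 9));
    }
}
```

A reader that simply started reading at offset 8 would emit the fragment "ta" as a line. With the skip step, each line is returned by exactly one chunk.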