Posted to common-dev@hadoop.apache.org by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org> on 2009/04/17 13:28:15 UTC

[jira] Created: (HADOOP-5698) Change org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.

Change org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.
------------------------------------------------------------------------------

                 Key: HADOOP-5698
                 URL: https://issues.apache.org/jira/browse/HADOOP-5698
             Project: Hadoop Core
          Issue Type: Sub-task
            Reporter: Amareshwari Sriramadasu
            Assignee: Amareshwari Sriramadasu




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-5698) Change org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12705081#action_12705081 ] 

Hadoop QA commented on HADOOP-5698:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12406761/patch-5698.txt
  against trunk revision 770685.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 9 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    -1 javac.  The applied patch generated 801 javac compiler warnings (more than the trunk's current 2455 warnings).

    -1 findbugs.  The patch appears to cause Findbugs to fail.

    +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed core unit tests.

    -1 contrib tests.  The patch failed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/269/testReport/
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/269/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/269/console

This message is automatically generated.

> Change org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.
> ------------------------------------------------------------------------------
>
>                 Key: HADOOP-5698
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5698
>             Project: Hadoop Core
>          Issue Type: Sub-task
>            Reporter: Amareshwari Sriramadasu
>            Assignee: Amareshwari Sriramadasu
>         Attachments: patch-5698.txt
>
>




[jira] Updated: (HADOOP-5698) Change org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.

Posted by "Sharad Agarwal (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sharad Agarwal updated HADOOP-5698:
-----------------------------------

       Resolution: Fixed
    Fix Version/s: 0.21.0
     Hadoop Flags: [Reviewed]
           Status: Resolved  (was: Patch Available)

I just committed this. Thanks Amareshwari!

> Change org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.
> ------------------------------------------------------------------------------
>
>                 Key: HADOOP-5698
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5698
>             Project: Hadoop Core
>          Issue Type: Sub-task
>            Reporter: Amareshwari Sriramadasu
>            Assignee: Amareshwari Sriramadasu
>             Fix For: 0.21.0
>
>         Attachments: patch-5698-1.txt, patch-5698.txt
>
>




[jira] Updated: (HADOOP-5698) Change org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-5698:
--------------------------------------------

    Hadoop Flags: [Incompatible change, Reviewed]  (was: [Reviewed])

> Change org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.
> ------------------------------------------------------------------------------
>
>                 Key: HADOOP-5698
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5698
>             Project: Hadoop Core
>          Issue Type: Sub-task
>            Reporter: Amareshwari Sriramadasu
>            Assignee: Amareshwari Sriramadasu
>             Fix For: 0.21.0
>
>         Attachments: patch-5698-1.txt, patch-5698.txt
>
>




[jira] Commented: (HADOOP-5698) Change org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714315#action_12714315 ] 

dhruba borthakur commented on HADOOP-5698:
------------------------------------------

+1 Code looks great!

> Change org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.
> ------------------------------------------------------------------------------
>
>                 Key: HADOOP-5698
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5698
>             Project: Hadoop Core
>          Issue Type: Sub-task
>            Reporter: Amareshwari Sriramadasu
>            Assignee: Amareshwari Sriramadasu
>         Attachments: patch-5698-1.txt, patch-5698.txt
>
>




[jira] Updated: (HADOOP-5698) Change org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-5698:
--------------------------------------------

    Status: Patch Available  (was: Open)

> Change org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.
> ------------------------------------------------------------------------------
>
>                 Key: HADOOP-5698
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5698
>             Project: Hadoop Core
>          Issue Type: Sub-task
>            Reporter: Amareshwari Sriramadasu
>            Assignee: Amareshwari Sriramadasu
>         Attachments: patch-5698.txt
>
>




[jira] Commented: (HADOOP-5698) Change org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12719588#action_12719588 ] 

Hudson commented on HADOOP-5698:
--------------------------------

Integrated in Hadoop-trunk #867 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/867/])
    

> Change org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.
> ------------------------------------------------------------------------------
>
>                 Key: HADOOP-5698
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5698
>             Project: Hadoop Core
>          Issue Type: Sub-task
>            Reporter: Amareshwari Sriramadasu
>            Assignee: Amareshwari Sriramadasu
>             Fix For: 0.21.0
>
>         Attachments: patch-5698-1.txt, patch-5698.txt
>
>




[jira] Updated: (HADOOP-5698) Change org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-5698:
--------------------------------------------

    Status: Patch Available  (was: Open)

bq. I think it would be a better idea not to club the fixes to the existing mapred.CombineInputFormat into this Jira. That should be addressed in a separate Jira and the patch for this should be built on top of that.
Done in HADOOP-5759.

bq. # Minor - Why is there a return 2 in run (instead of return 1 as in existing code)
The exit code is 2 if the usage was wrong.

bq. CombineFileInputFormat.createRecordReader - should this just return null or should it call super.createRecordReader
Implemented the abstract method from the superclass to return null, and added a comment saying so.
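
For illustration, the usage exit code mentioned above could look roughly like the sketch below. This is not the code from the patch: the class name, the constants, and the usage string are hypothetical, and the actual job setup is left out.

```java
/** Standalone sketch; not taken from the patch. All names here are hypothetical. */
final class UsageExitCodeSketch {
    static final int EXIT_OK = 0;
    static final int EXIT_USAGE = 2; // distinct code for "wrong usage"

    /** Returns EXIT_USAGE when fewer than two arguments are given. */
    static int run(String[] args) {
        if (args.length < 2) {
            // Hypothetical usage message.
            System.err.println("Usage: <program> <input_dir> <output>");
            return EXIT_USAGE;
        }
        // Job configuration and submission would go here.
        return EXIT_OK;
    }

    public static void main(String[] args) {
        // A real driver would typically pass this value to System.exit.
        System.out.println("exit code: " + run(args));
    }
}
```

Keeping the usage error on its own code lets a calling script tell a bad command line apart from other failures.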

test-patch :
{noformat}
     [exec] +1 overall.
     [exec]
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec]
     [exec]     +1 tests included.  The patch appears to include 9 new or modified tests.
     [exec]
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec]
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec]
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec]
     [exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
     [exec]
     [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
     [exec]
{noformat}

> Change org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.
> ------------------------------------------------------------------------------
>
>                 Key: HADOOP-5698
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5698
>             Project: Hadoop Core
>          Issue Type: Sub-task
>            Reporter: Amareshwari Sriramadasu
>            Assignee: Amareshwari Sriramadasu
>         Attachments: patch-5698-1.txt, patch-5698.txt
>
>




[jira] Updated: (HADOOP-5698) Change org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-5698:
--------------------------------------------

    Attachment: patch-5698.txt

The patch moves the CombineFileInputFormat, CombineFileRecordReader, and CombineFileSplit libraries and org.apache.hadoop.examples.MultiFileWordCount to the new mapreduce API.
The existing CombineFileInputFormat passes the rack name as the split location. This results in an IllegalArgumentException during resolution, because the name contains PATH_SEPARATOR ('/'). Passing the rack name without the '/' does not help either: the node is wrongly resolved as /default-rack/<rack-name>. After an offline discussion with Dhruba, I changed the input format to pass hostnames instead of rack names.
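
The two failure modes described above can be mimicked with a small standalone sketch. It does not use any Hadoop classes: the resolve method, the class name, and the example host and rack names are assumptions made up for illustration, and only the '/' separator and the /default-rack prefix come from the description.

```java
/** Standalone sketch; none of these names are Hadoop classes. */
final class SplitLocationSketch {
    static final char PATH_SEPARATOR = '/';
    static final String DEFAULT_RACK = "/default-rack";

    /**
     * Mimics a resolver that expects a bare node name: it rejects names
     * containing the separator and places everything else under the
     * default rack.
     */
    static String resolve(String location) {
        if (location.indexOf(PATH_SEPARATOR) >= 0) {
            throw new IllegalArgumentException(
                "Location name contains " + PATH_SEPARATOR + ": " + location);
        }
        return DEFAULT_RACK + PATH_SEPARATOR + location;
    }

    public static void main(String[] args) {
        // A hostname resolves as intended.
        System.out.println(resolve("host1.example.com"));
        // A rack name without '/' is accepted but treated as a node.
        System.out.println(resolve("rack1"));
        // A rack name with '/' is rejected outright.
        try {
            resolve("/rack1");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Under these assumptions, only a hostname gives a meaningful result, which matches the decision to pass hostnames as split locations.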

> Change org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.
> ------------------------------------------------------------------------------
>
>                 Key: HADOOP-5698
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5698
>             Project: Hadoop Core
>          Issue Type: Sub-task
>            Reporter: Amareshwari Sriramadasu
>            Assignee: Amareshwari Sriramadasu
>         Attachments: patch-5698.txt
>
>




[jira] Commented: (HADOOP-5698) Change org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714392#action_12714392 ] 

Amareshwari Sriramadasu commented on HADOOP-5698:
-------------------------------------------------

ant test passed on my machine

> Change org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.
> ------------------------------------------------------------------------------
>
>                 Key: HADOOP-5698
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5698
>             Project: Hadoop Core
>          Issue Type: Sub-task
>            Reporter: Amareshwari Sriramadasu
>            Assignee: Amareshwari Sriramadasu
>         Attachments: patch-5698-1.txt, patch-5698.txt
>
>




[jira] Updated: (HADOOP-5698) Change org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-5698:
--------------------------------------------

    Attachment: patch-5698-1.txt

Patch with review comments incorporated.

> Change org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.
> ------------------------------------------------------------------------------
>
>                 Key: HADOOP-5698
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5698
>             Project: Hadoop Core
>          Issue Type: Sub-task
>            Reporter: Amareshwari Sriramadasu
>            Assignee: Amareshwari Sriramadasu
>         Attachments: patch-5698-1.txt, patch-5698.txt
>
>




[jira] Updated: (HADOOP-5698) Change org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.

Posted by "Jothi Padmanabhan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jothi Padmanabhan updated HADOOP-5698:
--------------------------------------

    Status: Open  (was: Patch Available)

I think it would be a better idea not to club the fixes to the existing mapred.CombineInputFormat into this Jira. That should be addressed in a separate Jira and the patch for this should be built on top of that.

Some other points:
# MultiFileWordCount -- I do not think we should use the MultiFileLineRecordReader to read from a CombineSplit. It is guaranteed to work only if the start offset is 0, which is not necessarily true. Instead, the CombineFileRecordReader should be used.
# Minor -- Why is there a return 2 in run (instead of return 1 as in existing code)?
# CombineFileInputFormat.createRecordReader -- should this just return null or should it call super.createRecordReader?
# Minor -- CombineFileRecordReader -- Remove unused imports
# Minor -- Wherever possible, keep the code/comments restricted to 80 columns
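
To illustrate the start-offset concern in the first point, here is a standalone sketch of what can go wrong when a line reader assumes its chunk begins at offset 0. It is not the MultiFileLineRecordReader or CombineFileRecordReader code; the class, both methods, and the sample data are hypothetical, and the offset-aware variant is just one common way to handle a chunk that starts mid-line.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Standalone sketch; not Hadoop code. All names are hypothetical. */
final class ChunkOffsetSketch {

    /** Assumes the chunk starts on a line boundary, as if start were 0. */
    static List<String> naiveLines(String data, int start, int length) {
        String chunk = data.substring(start, start + length);
        return Arrays.asList(chunk.split("\n"));
    }

    /**
     * Offset-aware variant: when start > 0, skip ahead to the first line
     * that begins at or after start, then read whole lines, finishing a
     * line that runs past the end of the chunk.
     */
    static List<String> offsetAwareLines(String data, int start, int length) {
        List<String> out = new ArrayList<>();
        int end = start + length;
        int pos = start;
        if (start > 0) {
            // Searching from start - 1 keeps a line that begins exactly at start.
            int nl = data.indexOf('\n', start - 1);
            if (nl < 0) {
                return out; // no line begins inside this chunk
            }
            pos = nl + 1;
        }
        while (pos < end) {
            int nl = data.indexOf('\n', pos);
            int lineEnd = (nl < 0) ? data.length() : nl;
            out.add(data.substring(pos, lineEnd));
            pos = lineEnd + 1;
        }
        return out;
    }

    public static void main(String[] args) {
        String data = "alpha beta\ngamma delta\nepsilon\n";
        // The second chunk starts in the middle of "gamma delta".
        System.out.println(naiveLines(data, 16, 15));
        System.out.println(offsetAwareLines(data, 0, 16));
        System.out.println(offsetAwareLines(data, 16, 15));
    }
}
```

In this sketch the naive reader emits the fragment "delta" as if it were a full line, while the offset-aware reader assigns each line to exactly one chunk.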

> Change org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.
> ------------------------------------------------------------------------------
>
>                 Key: HADOOP-5698
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5698
>             Project: Hadoop Core
>          Issue Type: Sub-task
>            Reporter: Amareshwari Sriramadasu
>            Assignee: Amareshwari Sriramadasu
>         Attachments: patch-5698.txt
>
>

