Posted to common-issues@hadoop.apache.org by "John George (JIRA)" <ji...@apache.org> on 2011/08/12 17:58:27 UTC

[jira] [Created] (HADOOP-7539) merge hadoop archive goodness from trunk to .20

merge hadoop archive goodness from trunk to .20
-----------------------------------------------

                 Key: HADOOP-7539
                 URL: https://issues.apache.org/jira/browse/HADOOP-7539
             Project: Hadoop Common
          Issue Type: Bug
    Affects Versions: 0.20.205.0
            Reporter: John George
            Assignee: John George
             Fix For: 0.20.205.0


hadoop archive in branch-20 is outdated. When run recently, it produced the following bugs, all of which were fixed in trunk. This JIRA aims to bring all these changes into branch-20.


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-7539) merge hadoop archive goodness from trunk to .20

Posted by "John George (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089510#comment-13089510 ] 

John George commented on HADOOP-7539:
-------------------------------------

1. Create HAR file using version 1
{quote}
$ hadoop fs -cat /tmp/thisis1.har/_masterindex
1 
0 2127535165 0 1856 
{quote}

2. Install version 3 of HAR
{quote}
$ hadoop fs -cat /tmp/thisis3.har/_masterindex
3 
0 2127535165 0 2610 
{quote}

3. Run ls and wordcount on VERSION 1
{quote}
$ hadoop fs -ls har:///tmp/thisis1.har
$ hadoop jar hadoop-examples.jar wordcount har:///tmp/thisis1.har/x.sh /tmp/out.2
{quote}
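The `_masterindex` listings in steps 1 and 2 can also be checked programmatically. The following is a minimal sketch, assuming only what the listings show: the first non-empty line is the archive version and each later line holds four integers. The field names (`start_hash`, `end_hash`, `index_begin`, `index_end`) and the function name `parse_masterindex` are illustrative assumptions, not taken from this thread.

```python
from typing import List, NamedTuple, Tuple


class MasterIndexEntry(NamedTuple):
    # Field names are assumptions about what the four integers mean.
    start_hash: int
    end_hash: int
    index_begin: int
    index_end: int


def parse_masterindex(text: str) -> Tuple[int, List[MasterIndexEntry]]:
    """Parse the text of a HAR _masterindex file into (version, entries)."""
    # Drop blank lines and the trailing spaces seen in the listings.
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty _masterindex")
    version = int(lines[0])
    entries = []
    for ln in lines[1:]:
        fields = ln.split()
        if len(fields) != 4:
            raise ValueError(f"expected 4 fields, got {len(fields)}: {ln!r}")
        entries.append(MasterIndexEntry(*(int(f) for f in fields)))
    return version, entries


if __name__ == "__main__":
    # Content copied from the version 1 listing in step 1.
    version, entries = parse_masterindex("1 \n0 2127535165 0 1856 \n")
    print(version, entries)
```

Feeding it the output of `hadoop fs -cat /tmp/thisis3.har/_masterindex` instead should report version 3, which makes it easy to confirm which format a given archive uses before running jobs against it.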


[jira] [Updated] (HADOOP-7539) merge hadoop archive goodness from trunk to .20

Posted by "John George (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-7539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John George updated HADOOP-7539:
--------------------------------

    Attachment: HADOOP-7539-1.patch


[jira] [Commented] (HADOOP-7539) merge hadoop archive goodness from trunk to .20

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088950#comment-13088950 ] 

Mahadev konar commented on HADOOP-7539:
---------------------------------------

Looks like I might be wrong. The patch seems to be able to read the old har archives as well. John, mind testing it out? 


[jira] [Commented] (HADOOP-7539) merge hadoop archive goodness from trunk to .20

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084265#comment-13084265 ] 

Mahadev konar commented on HADOOP-7539:
---------------------------------------

John,
Since this is a big patch, can you please do some manual testing on a real cluster (a single-node cluster would do)? Just run an archive job and then a MapReduce job that uses the archives as input, and verify the results. That should suffice.


[jira] [Commented] (HADOOP-7539) merge hadoop archive goodness from trunk to .20

Posted by "John George (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084217#comment-13084217 ] 

John George commented on HADOOP-7539:
-------------------------------------

The following JIRAs were the most interesting ones, but it made sense to bring in most of the others as well, both because several of them are dependencies of the JIRAs that were needed and because doing so makes the merge easier.

MAPREDUCE-1425: archive throws OutOfMemoryError
MAPREDUCE-2317: HadoopArchives throwing NullPointerException while creating hadoop archives
MAPREDUCE-1399: The archive command shows a null error message
MAPREDUCE-1752: Implement getFileBlockLocations in HarFilesystem




[jira] [Commented] (HADOOP-7539) merge hadoop archive goodness from trunk to .20

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088819#comment-13088819 ] 

Mahadev konar commented on HADOOP-7539:
---------------------------------------

Maybe we want to add a utility to upconvert archives from version 1 to version 3?


[jira] [Commented] (HADOOP-7539) merge hadoop archive goodness from trunk to .20

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084239#comment-13084239 ] 

Owen O'Malley commented on HADOOP-7539:
---------------------------------------

No one has proposed making any more releases out of branch-0.20. Can you generate a patch for the branch-0.20-security line?


[jira] [Commented] (HADOOP-7539) merge hadoop archive goodness from trunk to .20

Posted by "John George (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084271#comment-13084271 ] 

John George commented on HADOOP-7539:
-------------------------------------

Yes, I will run the manual testing and post the results here.

I ran "ant test" and it failed the same test that fails without the patch. The results of test-patch are as follows:


     [exec] BUILD SUCCESSFUL
     [exec] Total time: 6 minutes 23 seconds
     [exec] 
     [exec] 
     [exec] 
     [exec] 
     [exec] +1 overall.  
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     +1 tests included.  The patch appears to include 6 new or modified tests.
     [exec] 
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec] 
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec] 
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.
     [exec] 
     [exec] 
     [exec] 
     [exec] 
     [exec] ======================================================================
     [exec] ======================================================================
     [exec]     Finished build.
     [exec] ======================================================================
     [exec] ======================================================================





[jira] [Commented] (HADOOP-7539) merge hadoop archive goodness from trunk to .20

Posted by "John George (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084254#comment-13084254 ] 

John George commented on HADOOP-7539:
-------------------------------------

Sorry Owen, I meant to say branch-0.20-security (not branch-0.20). I have fixed the description. The patch is also meant for branch-0.20-security.


[jira] [Updated] (HADOOP-7539) merge hadoop archive goodness from trunk to .20

Posted by "John George (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-7539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John George updated HADOOP-7539:
--------------------------------

    Status: Patch Available  (was: Open)


[jira] [Commented] (HADOOP-7539) merge hadoop archive goodness from trunk to .20

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093358#comment-13093358 ] 

Mahadev konar commented on HADOOP-7539:
---------------------------------------

Looks good to me. I'll run some ant tests and check it in to the 0.20 security branch.


[jira] [Commented] (HADOOP-7539) merge hadoop archive goodness from trunk to .20

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088817#comment-13088817 ] 

Mahadev konar commented on HADOOP-7539:
---------------------------------------

The only issue I see is that hadoop archives that already exist on the cluster will become obsolete, since the new archive code won't be able to read them?


[jira] [Updated] (HADOOP-7539) merge hadoop archive goodness from trunk to .20

Posted by "John George (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-7539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John George updated HADOOP-7539:
--------------------------------

    Description: 
hadoop archive in branch-0.20-security is outdated. When run recently, it produced some bugs, all of which were fixed in trunk. This JIRA aims to bring all those fixes into branch-0.20-security.


  was:
hadoop archive in branch-20 is outdated. When run recently, it produced the following bugs, all of which were fixed in trunk. This JIRA aims to bring all these changes into branch-20.




[jira] [Commented] (HADOOP-7539) merge hadoop archive goodness from trunk to .20

Posted by "John George (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13085154#comment-13085154 ] 

John George commented on HADOOP-7539:
-------------------------------------

Manual tests run:

- created a har file as follows:
   - hadoop fs -put test /tmp
   - hadoop archive -archiveName test.har -p /tmp test /tmp

- ran the following manual tests:
   - wordcount on a couple of har files
   - streaming on the same har file with: hadoop jar hadoop-streaming.jar -Dmapred.reduce.tasks=1 -input har:///tmp/test.har/test/aa -output /tmp/aaa.2 -mapper cat -reducer "wc -l"


Both of the above jobs completed successfully and produced output in their respective output directories.
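For anyone repeating these checks, the commands can be collected into a small helper. This is a sketch only: the function name `manual_test_commands` and its parameters are hypothetical, and the commands are the ones quoted in this comment (the wordcount run is left out because its exact paths are not given here). It assumes `hadoop` is on the `PATH` and that `hadoop-streaming.jar` is in the working directory.

```python
from typing import List


def manual_test_commands(src: str, parent: str, archive_name: str,
                         dest: str, input_rel: str,
                         output_dir: str) -> List[List[str]]:
    """Return argv lists for the manual HAR checks, in the order they were run."""
    # har:// followed by an absolute path gives har:///<dest>/<archive>/<file>.
    har_input = f"har://{dest.rstrip('/')}/{archive_name}/{input_rel}"
    return [
        # Upload the local test data.
        ["hadoop", "fs", "-put", src, parent],
        # Create the archive from <parent>/<src> under <dest>.
        ["hadoop", "archive", "-archiveName", archive_name,
         "-p", parent, src, dest],
        # Streaming job that reads from the archive and counts lines.
        ["hadoop", "jar", "hadoop-streaming.jar", "-Dmapred.reduce.tasks=1",
         "-input", har_input, "-output", output_dir,
         "-mapper", "cat", "-reducer", "wc -l"],
    ]


if __name__ == "__main__":
    # Values taken from the commands listed in this comment.
    for cmd in manual_test_commands("test", "/tmp", "test.har", "/tmp",
                                    "test/aa", "/tmp/aaa.2"):
        print(" ".join(cmd))
```

On a node with a working Hadoop install, each list could be passed to `subprocess.run(cmd, check=True)` so that a failing step stops the sequence.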


[jira] [Commented] (HADOOP-7539) merge hadoop archive goodness from trunk to .20

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089455#comment-13089455 ] 

Hadoop QA commented on HADOOP-7539:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12490263/HADOOP-7539-1.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 6 new or modified tests.

    -1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/67//console

This message is automatically generated.


[jira] [Updated] (HADOOP-7539) merge hadoop archive goodness from trunk to .20

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-7539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated HADOOP-7539:
----------------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

I just committed this. Thanks a lot John!
