Posted to common-dev@hadoop.apache.org by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org> on 2008/09/15 20:31:44 UTC

[jira] Created: (HADOOP-4176) Implement getFileChecksum(Path) in HftpFileSystem

Implement getFileChecksum(Path) in HftpFileSystem
-------------------------------------------------

                 Key: HADOOP-4176
                 URL: https://issues.apache.org/jira/browse/HADOOP-4176
             Project: Hadoop Core
          Issue Type: New Feature
          Components: fs, tools/distcp
            Reporter: Tsz Wo (Nicholas), SZE


In order to use FileChecksum in DistCp, we should implement getFileChecksum(Path) in HftpFileSystem
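
As a rough illustration of why a copy tool would want this (a sketch only, not taken from the patch): if both file systems can report a checksum for a file, the tool can skip files whose destination copy already matches. The class and method names below are hypothetical, the files are modeled as in-memory byte arrays, and MD5 over the whole content is an assumed stand-in for whatever checksum the file system actually reports.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical stand-in, not Hadoop code: shows how a copy tool could use
// per-file checksums to decide whether a destination file is already current.
public class ChecksumSkipSketch {

    // Computes an MD5 digest of the contents (algorithm assumed for illustration).
    static byte[] checksum(byte[] contents) {
        try {
            return MessageDigest.getInstance("MD5").digest(contents);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Returns true when the copy can be skipped: the destination exists,
    // has the same length, and has the same checksum as the source.
    static boolean canSkipCopy(byte[] src, byte[] dst) {
        if (dst == null || src.length != dst.length) {
            return false;
        }
        return Arrays.equals(checksum(src), checksum(dst));
    }

    public static void main(String[] args) {
        byte[] source = "block data".getBytes(StandardCharsets.UTF_8);
        byte[] sameCopy = "block data".getBytes(StandardCharsets.UTF_8);
        byte[] staleCopy = "block dat4".getBytes(StandardCharsets.UTF_8);

        System.out.println("identical copy skipped: " + canSkipCopy(source, sameCopy));
        System.out.println("stale copy skipped:     " + canSkipCopy(source, staleCopy));
        System.out.println("missing copy skipped:   " + canSkipCopy(source, null));
    }
}
```

The length check comes first because it is cheap and rules out most mismatches before any checksum is computed.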

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4176) Implement getFileChecksum(Path) in HftpFileSystem

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated HADOOP-4176:
-------------------------------------------

    Attachment: 4176_20080918.patch

> Would you expect the load on the namenode to be significant for large numbers of checksum requests? Might it make sense to redirect this request to a datanode that can compute and return the checksum?

Yes, we should do URL redirect.
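
A minimal sketch of the redirect idea, under stated assumptions: the namenode-side handler does no checksum work itself and only answers with the URL of a datanode that will compute the result. Everything here is hypothetical and not taken from the patch, including the class name, the "/checksum" path prefix, the example port, and the hash-based choice of datanode.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not the patch: builds the URL that a namenode-side
// handler could send back as an HTTP redirect for a checksum request.
public class ChecksumRedirectSketch {

    // Picks one datanode for the file and returns the redirect target.
    // The "/checksum" prefix is an assumed path, not a documented endpoint.
    static URI redirectTarget(List<String> datanodeHosts, int port, String filePath) {
        if (datanodeHosts.isEmpty()) {
            throw new IllegalArgumentException("no datanodes available");
        }
        // Spread requests across datanodes by hashing the file path
        // (an assumed policy, chosen only to keep the sketch deterministic).
        int index = Math.floorMod(filePath.hashCode(), datanodeHosts.size());
        String host = datanodeHosts.get(index);
        try {
            return new URI("http", null, host, port, "/checksum" + filePath, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("cannot build redirect for " + filePath, e);
        }
    }

    public static void main(String[] args) {
        List<String> datanodes = Arrays.asList("dn1.example.com", "dn2.example.com");
        // The port is an example value only.
        System.out.println(redirectTarget(datanodes, 50075, "/user/data/part-00000"));
    }
}
```

With this shape the namenode only selects a host and builds a URL, so the cost of computing the checksum falls on the datanode that receives the redirected request.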

4176_20080918.patch: Incorporated all of Chris's comments.



> Implement getFileChecksum(Path) in HftpFileSystem
> -------------------------------------------------
>
>                 Key: HADOOP-4176
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4176
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs, tools/distcp
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.0
>
>         Attachments: 4176_20080917.patch, 4176_20080918.patch
>
>
> In order to use FileChecksum in DistCp, we should implement getFileChecksum(Path) in HftpFileSystem



[jira] Commented: (HADOOP-4176) Implement getFileChecksum(Path) in HftpFileSystem

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12632044#action_12632044 ] 

Chris Douglas commented on HADOOP-4176:
---------------------------------------

This looks good. Only a couple of minor suggestions in DFSClient:
* Since callGetBlockLocations is private, there's no compatibility issue, so its signature can be changed and all the calls updated instead of redirecting again.
* In the javadoc for getFileChecksum, {{@see}} is more standard than "Similar to...".

Would you expect the load on the namenode to be significant for large numbers of checksum requests? Might it make sense to redirect this request to a datanode that can compute and return the checksum?

> Implement getFileChecksum(Path) in HftpFileSystem
> -------------------------------------------------
>
>                 Key: HADOOP-4176
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4176
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs, tools/distcp
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.0
>
>         Attachments: 4176_20080917.patch
>
>
> In order to use FileChecksum in DistCp, we should implement getFileChecksum(Path) in HftpFileSystem



[jira] Commented: (HADOOP-4176) Implement getFileChecksum(Path) in HftpFileSystem

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12632551#action_12632551 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-4176:
------------------------------------------------

I tested the patch locally.  It only failed on TestLimitTasksPerJobTaskScheduler, which is not related to the patch.  See HADOOP-4213.

> Implement getFileChecksum(Path) in HftpFileSystem
> -------------------------------------------------
>
>                 Key: HADOOP-4176
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4176
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs, tools/distcp
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.0
>
>         Attachments: 4176_20080917.patch, 4176_20080918.patch, 4176_20080918b.patch, 4176_20080918c.patch, 4176_20080919.patch
>
>
> In order to use FileChecksum in DistCp, we should implement getFileChecksum(Path) in HftpFileSystem



[jira] Updated: (HADOOP-4176) Implement getFileChecksum(Path) in HftpFileSystem

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated HADOOP-4176:
-------------------------------------------

    Attachment: 4176_20080917.patch

4176_20080917.patch: first complete patch for review.

> Implement getFileChecksum(Path) in HftpFileSystem
> -------------------------------------------------
>
>                 Key: HADOOP-4176
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4176
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs, tools/distcp
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.0
>
>         Attachments: 4176_20080917.patch
>
>
> In order to use FileChecksum in DistCp, we should implement getFileChecksum(Path) in HftpFileSystem



[jira] Commented: (HADOOP-4176) Implement getFileChecksum(Path) in HftpFileSystem

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12632393#action_12632393 ] 

Chris Douglas commented on HADOOP-4176:
---------------------------------------

Looks good; +1, assuming tests all pass.

> Implement getFileChecksum(Path) in HftpFileSystem
> -------------------------------------------------
>
>                 Key: HADOOP-4176
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4176
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs, tools/distcp
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.0
>
>         Attachments: 4176_20080917.patch, 4176_20080918.patch, 4176_20080918b.patch
>
>
> In order to use FileChecksum in DistCp, we should implement getFileChecksum(Path) in HftpFileSystem



[jira] Updated: (HADOOP-4176) Implement getFileChecksum(Path) in HftpFileSystem

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated HADOOP-4176:
-------------------------------------------

    Attachment: 4176_20080918b.patch

4176_20080918b.patch: The Servlet classes should be static.

Forgot to upload the latest patch.  The previous test-patch results were run on this patch.

> Implement getFileChecksum(Path) in HftpFileSystem
> -------------------------------------------------
>
>                 Key: HADOOP-4176
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4176
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs, tools/distcp
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.0
>
>         Attachments: 4176_20080917.patch, 4176_20080918.patch, 4176_20080918b.patch
>
>
> In order to use FileChecksum in DistCp, we should implement getFileChecksum(Path) in HftpFileSystem



[jira] Updated: (HADOOP-4176) Implement getFileChecksum(Path) in HftpFileSystem

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated HADOOP-4176:
-------------------------------------------

      Component/s:     (was: fs)
                   dfs
    Fix Version/s: 0.19.0
         Assignee: Tsz Wo (Nicholas), SZE

> Implement getFileChecksum(Path) in HftpFileSystem
> -------------------------------------------------
>
>                 Key: HADOOP-4176
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4176
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs, tools/distcp
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.0
>
>
> In order to use FileChecksum in DistCp, we should implement getFileChecksum(Path) in HftpFileSystem



[jira] Commented: (HADOOP-4176) Implement getFileChecksum(Path) in HftpFileSystem

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12632385#action_12632385 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-4176:
------------------------------------------------

{noformat}
     [exec] +1 overall.  

     [exec]     +1 @author.  The patch does not contain any @author tags.

     [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.

     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.

     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
{noformat}

Testing it locally.

> Implement getFileChecksum(Path) in HftpFileSystem
> -------------------------------------------------
>
>                 Key: HADOOP-4176
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4176
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs, tools/distcp
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.0
>
>         Attachments: 4176_20080917.patch, 4176_20080918.patch
>
>
> In order to use FileChecksum in DistCp, we should implement getFileChecksum(Path) in HftpFileSystem



[jira] Updated: (HADOOP-4176) Implement getFileChecksum(Path) in HftpFileSystem

Posted by "Robert Chansler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Chansler updated HADOOP-4176:
------------------------------------

    Release Note: Implemented getFileChecksum(Path) in HftpFileSystem for distcp support.  (was: Implemented getFileChecksum(Path) in HftpFileSystem.)

> Implement getFileChecksum(Path) in HftpFileSystem
> -------------------------------------------------
>
>                 Key: HADOOP-4176
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4176
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs, tools/distcp
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.0
>
>         Attachments: 4176_20080917.patch, 4176_20080918.patch, 4176_20080918b.patch, 4176_20080918c.patch, 4176_20080919.patch
>
>
> In order to use FileChecksum in DistCp, we should implement getFileChecksum(Path) in HftpFileSystem



[jira] Updated: (HADOOP-4176) Implement getFileChecksum(Path) in HftpFileSystem

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated HADOOP-4176:
-------------------------------------------

    Attachment: 4176_20080919.patch

4176_20080919.patch: I should set the conf value in the test instead of changing MiniDFSCluster; otherwise, other tests will fail.

> Implement getFileChecksum(Path) in HftpFileSystem
> -------------------------------------------------
>
>                 Key: HADOOP-4176
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4176
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs, tools/distcp
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.0
>
>         Attachments: 4176_20080917.patch, 4176_20080918.patch, 4176_20080918b.patch, 4176_20080918c.patch, 4176_20080919.patch
>
>
> In order to use FileChecksum in DistCp, we should implement getFileChecksum(Path) in HftpFileSystem



[jira] Updated: (HADOOP-4176) Implement getFileChecksum(Path) in HftpFileSystem

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated HADOOP-4176:
-------------------------------------------

      Resolution: Fixed
    Release Note: Implemented getFileChecksum(Path) in HftpFileSystem.
          Status: Resolved  (was: Patch Available)

I just committed this.

> Implement getFileChecksum(Path) in HftpFileSystem
> -------------------------------------------------
>
>                 Key: HADOOP-4176
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4176
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs, tools/distcp
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.0
>
>         Attachments: 4176_20080917.patch, 4176_20080918.patch, 4176_20080918b.patch, 4176_20080918c.patch, 4176_20080919.patch
>
>
> In order to use FileChecksum in DistCp, we should implement getFileChecksum(Path) in HftpFileSystem



[jira] Commented: (HADOOP-4176) Implement getFileChecksum(Path) in HftpFileSystem

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633305#action_12633305 ] 

Hudson commented on HADOOP-4176:
--------------------------------

Integrated in Hadoop-trunk #611 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/611/])

> Implement getFileChecksum(Path) in HftpFileSystem
> -------------------------------------------------
>
>                 Key: HADOOP-4176
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4176
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs, tools/distcp
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.0
>
>         Attachments: 4176_20080917.patch, 4176_20080918.patch, 4176_20080918b.patch, 4176_20080918c.patch, 4176_20080919.patch
>
>
> In order to use FileChecksum in DistCp, we should implement getFileChecksum(Path) in HftpFileSystem



[jira] Updated: (HADOOP-4176) Implement getFileChecksum(Path) in HftpFileSystem

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated HADOOP-4176:
-------------------------------------------

    Hadoop Flags: [Reviewed]
          Status: Patch Available  (was: Open)

Submitting...

> Implement getFileChecksum(Path) in HftpFileSystem
> -------------------------------------------------
>
>                 Key: HADOOP-4176
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4176
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs, tools/distcp
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.0
>
>         Attachments: 4176_20080917.patch, 4176_20080918.patch, 4176_20080918b.patch
>
>
> In order to use FileChecksum in DistCp, we should implement getFileChecksum(Path) in HftpFileSystem



[jira] Updated: (HADOOP-4176) Implement getFileChecksum(Path) in HftpFileSystem

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated HADOOP-4176:
-------------------------------------------

    Attachment: 4176_20080918c.patch

4176_20080918c.patch: fixed a bug in the test.

> Implement getFileChecksum(Path) in HftpFileSystem
> -------------------------------------------------
>
>                 Key: HADOOP-4176
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4176
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs, tools/distcp
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.0
>
>         Attachments: 4176_20080917.patch, 4176_20080918.patch, 4176_20080918b.patch, 4176_20080918c.patch
>
>
> In order to use FileChecksum in DistCp, we should implement getFileChecksum(Path) in HftpFileSystem
