Posted to common-dev@hadoop.apache.org by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org> on 2007/06/28 20:12:05 UTC

[jira] Created: (HADOOP-1544) combine FsShell.copyToLocal to ChecksumFileSystem.copyToLocalFile

combine FsShell.copyToLocal to ChecksumFileSystem.copyToLocalFile
-----------------------------------------------------------------

                 Key: HADOOP-1544
                 URL: https://issues.apache.org/jira/browse/HADOOP-1544
             Project: Hadoop
          Issue Type: Improvement
          Components: dfs
            Reporter: Tsz Wo (Nicholas), SZE
            Assignee: Tsz Wo (Nicholas), SZE
            Priority: Minor


- The two methods provide similar functionality.

- ChecksumFileSystem.copyToLocalFile(Path src, Path dst, boolean copyCrc) is no longer used anywhere in the system.

- It is better to use ChecksumFileSystem.getRawFileSystem() for copying CRC files in FsShell.copyToLocal.

- FileSystem.isDirectory(Path), which FsShell.copyToLocal uses, is deprecated.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Issue Comment Edited: (HADOOP-1544) combine FsShell.copyToLocal to ChecksumFileSystem.copyToLocalFile

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508931 ] 

Tsz Wo (Nicholas), SZE edited comment on HADOOP-1544 at 8/1/07 10:08 AM:
-------------------------------------------------------------------------

- Quite a few methods check whether the src or dst is a file or a directory.  It would be better to do this checking in one place.
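
As a rough illustration of doing the file-or-directory check in one place, here is a minimal sketch. It uses plain java.nio.file rather than the Hadoop FileSystem API, and the class CopyTargetResolver, the method resolveTarget, and the specific rules it applies are hypothetical; they are not taken from FsShell or FileUtil.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper (not part of Hadoop): every copy method would call
// this once instead of repeating its own file/directory checks.
class CopyTargetResolver {

    // Returns the path that a copy of src should be written to.
    static Path resolveTarget(Path src, Path dst) throws IOException {
        if (!Files.exists(src)) {
            throw new IOException("Source does not exist: " + src);
        }
        if (Files.isDirectory(dst)) {
            // Assumed rule: copying into an existing directory keeps the source name.
            return dst.resolve(src.getFileName());
        }
        if (Files.isDirectory(src) && Files.exists(dst)) {
            // Assumed rule: a directory must not replace an existing plain file.
            throw new IOException("Cannot copy directory " + src + " over file " + dst);
        }
        return dst;
    }
}
```

With a helper like this, the individual copy methods would only deal with moving bytes, and the rules for what counts as a valid src and dst would live in a single method.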


 was:
- I think we mainly need a single copy method in FileSystem, since FileSystem.copyToLocal and FileSystem.copyFromLocal ultimately call 
FileUtil.copy(FileSystem srcFS, Path src, FileSystem dstFS, Path dst, boolean deleteSource, Configuration conf),
which doesn't care whether the file systems are local or not.

- If we want to keep FileSystem.copyToLocal and FileSystem.copyFromLocal, we should make them final and have them call FileSystem.copy (a new method).

- In FsShell, the copyToLocal command first copies src to a tmp file and then renames it (see HADOOP-1292).  So it depends on whether we want this behavior everywhere.

- One more concern is ChecksumFileSystem: it checks whether the CRC files should be copied.

- Quite a few methods check whether the src or dst is a file or a directory.  It would be better to do this checking in one place.



[jira] Commented: (HADOOP-1544) combine FsShell.copyToLocal to ChecksumFileSystem.copyToLocalFile

Posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508918 ] 

Andrzej Bialecki commented on HADOOP-1544:
------------------------------------------

Not sure if I understand this... FileSystem.copyToLocal is the primary means of moving things out of DFS, so definitely yes, we are using it. What could we use instead? If there's an equivalent API that is now considered more kosher, we can adjust Nutch when upgrading to the next Hadoop release.



[jira] Commented: (HADOOP-1544) combine FsShell.copyToLocal to ChecksumFileSystem.copyToLocalFile

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508914 ] 

Doug Cutting commented on HADOOP-1544:
--------------------------------------

Other similar methods are FileSystem#copyToLocal() and FileUtil#copy().

Do we still need FileSystem#copyToLocal()?  Who uses this?  Nutch?

Does the FsShell stuff need to be different from FileUtil?




[jira] Assigned: (HADOOP-1544) combine FsShell.copyToLocal to ChecksumFileSystem.copyToLocalFile

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE reassigned HADOOP-1544:
----------------------------------------------

    Assignee:     (was: Tsz Wo (Nicholas), SZE)



[jira] Commented: (HADOOP-1544) combine FsShell.copyToLocal to ChecksumFileSystem.copyToLocalFile

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508931 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-1544:
------------------------------------------------

- I think we mainly need a single copy method in FileSystem, since FileSystem.copyToLocal and FileSystem.copyFromLocal ultimately call 
FileUtil.copy(FileSystem srcFS, Path src, FileSystem dstFS, Path dst, boolean deleteSource, Configuration conf),
which doesn't care whether the file systems are local or not.

- If we want to keep FileSystem.copyToLocal and FileSystem.copyFromLocal, we should make them final and have them call FileSystem.copy (a new method).

- In FsShell, the copyToLocal command first copies src to a tmp file and then renames it (see HADOOP-1292).  So it depends on whether we want this behavior everywhere.

- One more concern is ChecksumFileSystem: it checks whether the CRC files should be copied.

- Quite a few methods check whether the src or dst is a file or a directory.  It would be better to do this checking in one place.
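
For readers unfamiliar with the copy-to-tmp-then-rename behavior mentioned above, the following is a minimal sketch of the general pattern. It is written against java.nio.file instead of the Hadoop FileSystem API, so it is not the FsShell code; the class TmpThenRenameCopy, the method copyViaTemp, the temp file prefix, and the choice to overwrite an existing destination are all assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration of "copy to a tmp file, then rename".
class TmpThenRenameCopy {

    static void copyViaTemp(Path src, Path dst) throws IOException {
        // Create the temp file next to dst so the final rename stays
        // within one directory.
        Path parent = dst.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(parent, "_copyToLocal_", ".tmp");
        try {
            // Write the full content under the temporary name first.
            Files.copy(src, tmp, StandardCopyOption.REPLACE_EXISTING);
            // Assumption: an existing dst is overwritten by the rename.
            Files.move(tmp, dst, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            // No-op after a successful move; removes the leftover on failure.
            Files.deleteIfExists(tmp);
        }
    }
}
```

The point of the pattern is that a reader of dst never sees a half-written file: until the rename happens, the partial data only exists under the temporary name. Whether every copy method should pay for the extra rename is the open question raised in the bullet above.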
