Posted to common-issues@hadoop.apache.org by "Mithun Radhakrishnan (JIRA)" <ji...@apache.org> on 2014/10/30 01:05:35 UTC

[jira] [Commented] (HADOOP-8143) Change distcp to have -pb on by default

    [ https://issues.apache.org/jira/browse/HADOOP-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189345#comment-14189345 ] 

Mithun Radhakrishnan commented on HADOOP-8143:
----------------------------------------------

[~aw]
bq. forcing block size will break non-HDFS methods in surprising ways.

Here's the code in DistCp that is affected by preserving block-size:
{code:java}
  private static long getBlockSize(
          EnumSet<FileAttribute> fileAttributes,
          FileStatus sourceFile, FileSystem targetFS, Path tmpTargetPath) {
    boolean preserve = fileAttributes.contains(FileAttribute.BLOCKSIZE)
        || fileAttributes.contains(FileAttribute.CHECKSUMTYPE);
    return preserve ? sourceFile.getBlockSize() : targetFS
        .getDefaultBlockSize(tmpTargetPath);
  }
{code}

Is the concern that {{FileStatus.getBlockSize()}} might fail if the source file isn't on HDFS? If so, note that {{FileSystem.getDefaultBlockSize()}} is already called by default, including for non-HDFS file systems.
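
To make the two code paths concrete, here is a minimal stand-alone sketch of the same selection logic. It assumes nothing from Hadoop: {{BlockSizeSketch}}, {{Attr}} and {{chooseBlockSize}} are hypothetical names invented for illustration, and the block sizes are made-up values rather than anything measured.

```java
import java.util.EnumSet;

// Hypothetical stand-alone model of the snippet quoted above.
// None of these names are Hadoop classes; they only illustrate the branch.
class BlockSizeSketch {

    // Stand-in for the two preserve attributes that the quoted method checks.
    enum Attr { BLOCKSIZE, CHECKSUMTYPE }

    // Returns the source block size when either attribute is preserved,
    // otherwise the target file system's default block size.
    static long chooseBlockSize(EnumSet<Attr> preserved,
                                long sourceBlockSize,
                                long targetDefaultBlockSize) {
        boolean preserve = preserved.contains(Attr.BLOCKSIZE)
                || preserved.contains(Attr.CHECKSUMTYPE);
        return preserve ? sourceBlockSize : targetDefaultBlockSize;
    }

    public static void main(String[] args) {
        // Made-up sizes, chosen only so the two branches are distinguishable.
        long source = 128L * 1024 * 1024;
        long targetDefault = 32L * 1024 * 1024;

        // Nothing preserved: the target's default is used.
        System.out.println(chooseBlockSize(
                EnumSet.noneOf(Attr.class), source, targetDefault));

        // Block size preserved: the source's value is used.
        System.out.println(chooseBlockSize(
                EnumSet.of(Attr.BLOCKSIZE), source, targetDefault));

        // Checksum type preserved: the source's value is used as well.
        System.out.println(chooseBlockSize(
                EnumSet.of(Attr.CHECKSUMTYPE), source, targetDefault));
    }
}
```

In this model, either branch asks some file system for a block size. Turning -pb on by default only changes which side gets asked (source instead of target).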

> Change distcp to have -pb on by default
> ---------------------------------------
>
>                 Key: HADOOP-8143
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8143
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Dave Thompson
>            Assignee: Mithun Radhakrishnan
>            Priority: Minor
>         Attachments: HADOOP-8143.1.patch
>
>
> We should have preserve blocksize (-pb) on in distcp by default.
> The checksum, which is on by default, will always fail if the blocksize is not the same.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)