Posted to hdfs-issues@hadoop.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/02/07 15:47:00 UTC

[jira] [Commented] (HDFS-16911) Distcp with snapshot diff to support Ozone filesystem.

    [ https://issues.apache.org/jira/browse/HDFS-16911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17685364#comment-17685364 ] 

ASF GitHub Bot commented on HDFS-16911:
---------------------------------------

sadanand48 opened a new pull request, #5364:
URL: https://github.com/apache/hadoop/pull/5364

   ### Description of PR
   
   Currently `DistcpSync`, the step that applies the diff between the two snapshots passed to the distcp job with the `-diff` option, supports only the `DistributedFileSystem` and `WebHDFS` filesystems.
   Now that Ozone supports snapshots, this change makes `DistcpSync` support the Ozone filesystem (ofs) as well.
   If we added a check in `DistcpSync` for whether the filesystem is an instance of `BasicRootedOzoneFilesystem`,
   we would need to introduce an Ozone dependency in Hadoop code, but Ozone already depends on Hadoop, so the dependency would become circular. To avoid this, the PR introduces a job config named `distcp.sync.class`, whose default value is the `DistcpSync` class itself. If a new filesystem wants to provide its own implementation of `DistcpSync`, it only needs to extend the `DistcpSync` class and set the config to that class name. The class is instantiated at run time; if no class is configured, distcp falls back to the original `DistcpSync` implementation, so current behavior is unaffected.
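
For illustration only, the config-driven lookup described above could be sketched roughly as follows. This is not the code from the PR: `BaseSync`, `CustomSync`, `SyncLoader` and `SyncLoaderDemo` are hypothetical stand-ins, and `java.util.Properties` is used in place of the Hadoop job configuration so that the sketch compiles without Hadoop on the classpath. Only the key `distcp.sync.class` comes from the description.

```java
import java.util.Properties;

// Hypothetical stand-in for the default sync step; not the real Hadoop class.
class BaseSync {
    // Public no-arg constructor so implementations can be created reflectively.
    public BaseSync() {}

    public String name() {
        return "base";
    }
}

// Hypothetical file-system-specific implementation that a downstream project could ship.
class CustomSync extends BaseSync {
    public CustomSync() {}

    @Override
    public String name() {
        return "custom";
    }
}

class SyncLoader {
    // Key name taken from the PR description; everything else here is an assumption.
    static final String SYNC_CLASS_KEY = "distcp.sync.class";

    // Returns the configured implementation, or the default one if nothing is configured.
    static BaseSync load(Properties conf) throws ReflectiveOperationException {
        String className = conf.getProperty(SYNC_CLASS_KEY);
        if (className == null || className.trim().isEmpty()) {
            return new BaseSync();
        }
        // asSubclass rejects classes that do not extend the base sync step.
        Class<? extends BaseSync> cls =
            Class.forName(className.trim()).asSubclass(BaseSync.class);
        return cls.getDeclaredConstructor().newInstance();
    }
}

class SyncLoaderDemo {
    public static void main(String[] args) throws Exception {
        Properties conf = new Properties();
        System.out.println(SyncLoader.load(conf).name()); // prints "base"

        conf.setProperty(SyncLoader.SYNC_CLASS_KEY, CustomSync.class.getName());
        System.out.println(SyncLoader.load(conf).name()); // prints "custom"
    }
}
```

Because the base module only ever sees a class name, it never needs a compile-time reference to the downstream implementation, which is the point of the config approach.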
   
   The changes in this PR mostly make it possible to extend the `DistcpSync` class in Ozone, by changing some access modifiers from private to public/protected so that those members can be accessed from elsewhere.
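
As a rough sketch of why the relaxed access modifiers matter, the example below shows a subclass widening a file system check that would be unreachable if it were private. All names here (`Fs`, `HdfsLikeFs`, `OzoneLikeFs`, `SyncStep`, `OzoneLikeSyncStep`, `isSupported`, `validate`) are hypothetical and chosen for illustration; they are not the members actually changed in the PR.

```java
// Hypothetical file system types standing in for real FileSystem classes.
interface Fs {
    String scheme();
}

class HdfsLikeFs implements Fs {
    public String scheme() {
        return "hdfs";
    }
}

class OzoneLikeFs implements Fs {
    public String scheme() {
        return "ofs";
    }
}

// Hypothetical base sync step with an overridable support check.
class SyncStep {
    // If this method were private, subclasses could not change the check.
    protected boolean isSupported(Fs fs) {
        return fs instanceof HdfsLikeFs;
    }

    // Rejects any source or target that the (possibly overridden) check does not accept.
    public void validate(Fs src, Fs tgt) {
        if (!isSupported(src)) {
            throw new IllegalArgumentException(
                "Unsupported source file system: " + src.scheme() + "://");
        }
        if (!isSupported(tgt)) {
            throw new IllegalArgumentException(
                "Unsupported target file system: " + tgt.scheme() + "://");
        }
    }
}

// Hypothetical subclass living outside the base module.
class OzoneLikeSyncStep extends SyncStep {
    @Override
    protected boolean isSupported(Fs fs) {
        return fs instanceof OzoneLikeFs || super.isSupported(fs);
    }
}

class SyncStepDemo {
    public static void main(String[] args) {
        // Accepted: the subclass recognises both file system types.
        new OzoneLikeSyncStep().validate(new OzoneLikeFs(), new HdfsLikeFs());
        try {
            new SyncStep().validate(new OzoneLikeFs(), new HdfsLikeFs());
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The base class keeps its existing behaviour, and only the subclass accepts the additional file system.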
   
   ### How was this patch tested?
   
   Tested by building the Hadoop code, replacing the distcp jars in Ozone, and running the unit tests in Ozone.
   Also tested on a cluster.
   

> Distcp with snapshot diff to support Ozone filesystem.
> ------------------------------------------------------
>
>                 Key: HDFS-16911
>                 URL: https://issues.apache.org/jira/browse/HDFS-16911
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: distcp
>            Reporter: Sadanand Shenoy
>            Priority: Major
>
> Currently DistcpSync, the step that applies the diff between the two snapshots passed to the distcp job with the -diff option, supports only the DistributedFileSystem and WebHDFS filesystems.
>  
> {code:java}
> // currently we require both the source and the target file system are
> // DistributedFileSystem or (S)WebHdfsFileSystem.
> if (!(srcFs instanceof DistributedFileSystem
>         || srcFs instanceof WebHdfsFileSystem)) {
>   throw new IllegalArgumentException("Unsupported source file system: "
>       + srcFs.getScheme() + "://. " +
>       "Supported file systems: hdfs://, webhdfs:// and swebhdfs://.");
> }
> if (!(tgtFs instanceof DistributedFileSystem
>     || tgtFs instanceof WebHdfsFileSystem)) {
>   throw new IllegalArgumentException("Unsupported target file system: "
>       + tgtFs.getScheme() + "://. " +
>       "Supported file systems: hdfs://, webhdfs:// and swebhdfs://.");
> }{code}
> Since Ozone supports snapshots as of HDDS-6517, add support for using distcp with Ozone snapshots.
>  


