Posted to hdfs-issues@hadoop.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/02/07 15:47:00 UTC
[jira] [Commented] (HDFS-16911) Distcp with snapshot diff to support Ozone filesystem.
[ https://issues.apache.org/jira/browse/HDFS-16911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17685364#comment-17685364 ]
ASF GitHub Bot commented on HDFS-16911:
---------------------------------------
sadanand48 opened a new pull request, #5364:
URL: https://github.com/apache/hadoop/pull/5364
### Description of PR
Currently, `DistcpSync` (the step that applies the diff between the two snapshots passed to the distcp job with the `-diff` option) supports only the `DistributedFileSystem` and WebHDFS filesystems.
Now that Ozone supports snapshots, this change makes `DistcpSync` support the Ozone filesystem (ofs) as well.
Adding a check in `DistcpSync` for whether the filesystem is an instance of `BasicRootedOzoneFilesystem` would require introducing an Ozone dependency into the Hadoop code, which would make the dependency circular, since Ozone already depends on Hadoop. To avoid this, this PR introduces a job config named `distcp.sync.class`, whose default value is the `DistcpSync` class itself. A filesystem that wants its own implementation of `DistcpSync` only needs to extend the `DistcpSync` class and set the config to that class name. The class is instantiated at run time; if no class is configured, distcp falls back to the original `DistcpSync` implementation, so existing behavior is unaffected.
The changes in this PR mostly make it possible to extend the `DistcpSync` class in Ozone: some access modifiers are changed from private to public/protected so that the members can be accessed from outside the class.
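The mechanism described above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the code in this PR: `BaseSync`, `OzoneSync`, `SyncLoaderSketch`, `checkScheme` and `isSupported` are hypothetical names, a `Map` stands in for the Hadoop job configuration, and only the `distcp.sync.class` key comes from the description.

```java
import java.util.HashMap;
import java.util.Map;

final class SyncLoaderSketch {
    /** Config key named in the PR description. */
    static final String SYNC_CLASS_KEY = "distcp.sync.class";

    /** Instantiates the configured class, or BaseSync when the key is unset. */
    static BaseSync load(Map<String, String> conf) {
        String className = conf.get(SYNC_CLASS_KEY);
        if (className == null || className.isEmpty()) {
            return new BaseSync(); // fall back to the default implementation
        }
        try {
            // Resolve the class by name at run time; no compile-time
            // dependency on the subclass is needed here.
            return Class.forName(className)
                .asSubclass(BaseSync.class)
                .getDeclaredConstructor()
                .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(
                "Cannot instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(load(conf).getClass().getSimpleName());

        conf.put(SYNC_CLASS_KEY, OzoneSync.class.getName());
        BaseSync sync = load(conf);
        System.out.println(sync.getClass().getSimpleName());
        sync.checkScheme("ofs"); // accepted by the subclass
    }
}

/** Hypothetical stand-in for DistcpSync; it models only the filesystem check. */
class BaseSync {
    /** Rejects schemes that this implementation does not support. */
    public final void checkScheme(String scheme) {
        if (!isSupported(scheme)) {
            throw new IllegalArgumentException(
                "Unsupported file system: " + scheme + "://");
        }
    }

    /** Protected rather than private, so that a subclass can widen the check. */
    protected boolean isSupported(String scheme) {
        return scheme.equals("hdfs") || scheme.equals("webhdfs")
            || scheme.equals("swebhdfs");
    }
}

/** Hypothetical Ozone-side subclass that additionally accepts ofs://. */
class OzoneSync extends BaseSync {
    @Override
    protected boolean isSupported(String scheme) {
        return scheme.equals("ofs") || super.isSupported(scheme);
    }
}
```

The loader only knows the base type, so the Hadoop side needs no reference to Ozone classes. In Hadoop itself, `Configuration.getClass` would likely be the more natural lookup; the sketch avoids it to stay dependency-free.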
### How was this patch tested?
Tested by building the Hadoop code, replacing the distcp jars in Ozone, and running unit tests in Ozone.
Also tested on a cluster.
### For code changes:
> Distcp with snapshot diff to support Ozone filesystem.
> ------------------------------------------------------
>
> Key: HDFS-16911
> URL: https://issues.apache.org/jira/browse/HDFS-16911
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: distcp
> Reporter: Sadanand Shenoy
> Priority: Major
>
> Currently, DistcpSync (the step that applies the diff between the two snapshots passed to the distcp job with the -diff option) supports only the DistributedFileSystem and WebHDFS filesystems.
>
> {code:java}
> // currently we require both the source and the target file system are
> // DistributedFileSystem or (S)WebHdfsFileSystem.
> if (!(srcFs instanceof DistributedFileSystem
>     || srcFs instanceof WebHdfsFileSystem)) {
>   throw new IllegalArgumentException("Unsupported source file system: "
>       + srcFs.getScheme() + "://. " +
>       "Supported file systems: hdfs://, webhdfs:// and swebhdfs://.");
> }
> if (!(tgtFs instanceof DistributedFileSystem
>     || tgtFs instanceof WebHdfsFileSystem)) {
>   throw new IllegalArgumentException("Unsupported target file system: "
>       + tgtFs.getScheme() + "://. " +
>       "Supported file systems: hdfs://, webhdfs:// and swebhdfs://.");
> }
> {code}
> Now that Ozone supports snapshots (since HDDS-6517), add support for using distcp with Ozone snapshots.
>