Posted to common-dev@hadoop.apache.org by "Virajith Jalaparti (Jira)" <ji...@apache.org> on 2020/06/17 02:27:00 UTC

[jira] [Created] (HADOOP-17072) Add getClusterRoot and getClusterRoots methods to FileSystem and ViewFilesystem

Virajith Jalaparti created HADOOP-17072:
-------------------------------------------

             Summary: Add getClusterRoot and getClusterRoots methods to FileSystem and ViewFilesystem
                 Key: HADOOP-17072
                 URL: https://issues.apache.org/jira/browse/HADOOP-17072
             Project: Hadoop Common
          Issue Type: Task
          Components: fs, viewfs
            Reporter: Virajith Jalaparti


In a federated setting (HDFS federation, federation across multiple buckets on S3, multiple containers across Azure storage), certain system tools/pipelines require the ability to map paths to the clusters/accounts.

Consider GDPR compliance/retention jobs that need to go over the datasets ingested over a period of T days and remove/quarantine datasets that are not properly annotated or have reached their retention period. Such jobs can rely on renames to a global trash/quarantine directory to accomplish their task. However, in a federated setting, efficient, atomic renames (like those within a single HDFS cluster) are not supported across the different clusters/shards in the federation. As a result, such jobs need to determine the cluster to which each path maps.

To address such cases, this JIRA proposes adding two new methods to {{FileSystem}}: {{getClusterRoot}} and {{getClusterRoots}}.
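
As a rough illustration of how a retention job might use such methods, here is a minimal, self-contained sketch. It does not use the Hadoop API: the class {{ClusterRootSketch}}, the String-based signatures, the in-memory mount table, and the helper {{groupByCluster}} are all assumptions made for this example, since this issue does not yet specify signatures or semantics.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/**
 * Hypothetical sketch only: an in-memory mount table standing in for a
 * federated file system. None of these signatures come from Hadoop.
 */
class ClusterRootSketch {
    // Mount prefix -> cluster root URI (illustrative values only).
    private final Map<String, String> mounts = new TreeMap<>();

    void addMount(String prefix, String clusterRoot) {
        mounts.put(prefix, clusterRoot);
    }

    /** Returns the cluster root of the longest mount prefix containing the path. */
    String getClusterRoot(String path) {
        String bestPrefix = null;
        String bestRoot = null;
        for (Map.Entry<String, String> e : mounts.entrySet()) {
            String prefix = e.getKey();
            String dirPrefix = prefix.endsWith("/") ? prefix : prefix + "/";
            // Match whole path components so "/data" does not claim "/database".
            boolean matches = path.equals(prefix) || path.startsWith(dirPrefix);
            if (matches && (bestPrefix == null || prefix.length() > bestPrefix.length())) {
                bestPrefix = prefix;
                bestRoot = e.getValue();
            }
        }
        if (bestRoot == null) {
            throw new IllegalArgumentException("No mount point for " + path);
        }
        return bestRoot;
    }

    /** Returns the distinct cluster roots known to the mount table. */
    List<String> getClusterRoots() {
        return new ArrayList<>(new LinkedHashSet<>(mounts.values()));
    }

    /** Groups paths by cluster root so each group can be handled within one cluster. */
    Map<String, List<String>> groupByCluster(List<String> paths) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String p : paths) {
            groups.computeIfAbsent(getClusterRoot(p), k -> new ArrayList<>()).add(p);
        }
        return groups;
    }
}
```

With a grouping like this, a job could rename each group into a trash/quarantine directory under the same cluster root, so that every rename stays within a single cluster.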





--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-dev-help@hadoop.apache.org