You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@gobblin.apache.org by GitBox <gi...@apache.org> on 2022/10/08 00:53:33 UTC

[GitHub] [gobblin] Will-Lo commented on a diff in pull request #3577: [GOBBLIN-1720]Add ancestors owner permissions preservations for iceberg distcp

Will-Lo commented on code in PR #3577:
URL: https://github.com/apache/gobblin/pull/3577#discussion_r990564886


##########
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/iceberg/IcebergDataset.java:
##########
@@ -137,13 +139,17 @@ Collection<CopyEntity> generateCopyEntities(FileSystem targetFs, CopyConfigurati
       // TODO: determine whether unnecessarily expensive to repeatedly re-create what should be the same FS: could it
       // instead be created once and reused thereafter?
       FileSystem actualSourceFs = getSourceFileSystemFromFileStatus(srcFileStatus, defaultHadoopConfiguration);
+      Path toPath = PathUtils.getRootPathParent(srcPath);

Review Comment:
   I'm unsure if we need to recursively ensure that all permissions are the same until root, I think this can cause issues cross cluster. At least with hive copy, we only validate the table path and its parent (not saying that this is the most robust way of doing it though, we've seen issues with that approach too).
   
   I think the safest and most predictable approach is to ensure that all the directories that Gobblin will create will have the same permissions as the source.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org