You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/12/03 21:52:04 UTC

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6354: Spark: Check fileIO instead of reading location when determining locality enabled

amogh-jahagirdar commented on code in PR #6354:
URL: https://github.com/apache/iceberg/pull/6354#discussion_r1038876030


##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkReadConf.java:
##########
@@ -67,10 +67,9 @@ public boolean caseSensitive() {
   }
 
   public boolean localityEnabled() {
-    InputFile file = table.io().newInputFile(table.location());
-
-    if (file instanceof HadoopInputFile) {
-      String scheme = ((HadoopInputFile) file).getFileSystem().getScheme();
+    if (table.io() instanceof HadoopFileIO) {
+      HadoopInputFile file = (HadoopInputFile) table.io().newInputFile(table.location());
+      String scheme = file.getFileSystem().getScheme();
       boolean defaultValue = LOCALITY_WHITELIST_FS.contains(scheme);
       return PropertyUtil.propertyAsBoolean(readOptions, SparkReadOptions.LOCALITY, defaultValue);
     }

Review Comment:
   Another approach is to treat this as best effort and in case of any errors during this process just return false. This also looks to be done in the flink case as well but I'm not sure if this is desired for Spark; I think it should be since in the worst case whether locality is enabled or not cannot be determined and just opt for assuming it's not. This shouldn't have any correctness impact, but maybe @aokolnychyi @RussellSpitzer can help me out here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org