Posted to commits@hudi.apache.org by "sivabalan narayanan (Jira)" <ji...@apache.org> on 2020/05/23 21:54:00 UTC
[jira] [Commented] (HUDI-539) RO Path filter does not pick up
hadoop configs from the spark context
[ https://issues.apache.org/jira/browse/HUDI-539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17114961#comment-17114961 ]
sivabalan narayanan commented on HUDI-539:
------------------------------------------
[~ssomuah] / [~vinoth]: Is this still open? May I know if someone is working on it?
> RO Path filter does not pick up hadoop configs from the spark context
> ---------------------------------------------------------------------
>
> Key: HUDI-539
> URL: https://issues.apache.org/jira/browse/HUDI-539
> Project: Apache Hudi
> Issue Type: Bug
> Components: Common Core
> Affects Versions: 0.5.1
> Environment: Spark version : 2.4.4
> Hadoop version : 2.7.3
> Databricks Runtime: 6.1
> Reporter: Sam Somuah
> Assignee: Vinoth Chandar
> Priority: Major
> Labels: bug-bash-0.6.0, pull-request-available
> Fix For: 0.6.0, 0.5.3
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> Hi,
> I'm trying to use Hudi to write to one of the Azure storage container file systems, ADLS Gen 2 (abfs://). ABFS:// is one of the whitelisted file schemes. The issue I'm facing is that {{HoodieROTablePathFilter}} tries to resolve the file system for a path by passing in a blank Hadoop configuration. This manifests as {{java.io.IOException: No FileSystem for scheme: abfss}} because the blank configuration doesn't carry any of the configuration set in the environment.
> The problematic line is
> [https://github.com/apache/incubator-hudi/blob/2bb0c21a3dd29687e49d362ed34f050380ff47ae/hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieROTablePathFilter.java#L96]
>
> {code:java}
> Stacktrace
> java.io.IOException: No FileSystem for scheme: abfss
> at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2660)
> at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2667)
> at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
> at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
> at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
> at org.apache.hudi.hadoop.HoodieROTablePathFilter.accept(HoodieROTablePathFilter.java:96)
> at org.apache.spark.sql.execution.datasources.InMemoryFileIndex$$anonfun$16.apply(InMemoryFileIndex.scala:349){code}
>
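To illustrate why a blank configuration produces the error in the stack trace, here is a minimal toy model in plain Java. It is not Hadoop or Hudi code: the class {{SchemeLookupSketch}} and its {{resolve}} method are hypothetical, and the model assumes the file system class is found only through an {{fs.<scheme>.impl}} entry in the configuration (real Hadoop also consults other sources, which this sketch ignores). The implementation class name in the example is only an illustrative value.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Toy model only: a plain Map stands in for a Hadoop configuration.
class SchemeLookupSketch {

    // Looks up "fs.<scheme>.impl" in the supplied configuration and fails
    // with the same message as the stack trace when no entry is present.
    static String resolve(Map<String, String> conf, String scheme) throws IOException {
        String impl = conf.get("fs." + scheme + ".impl");
        if (impl == null) {
            throw new IOException("No FileSystem for scheme: " + scheme);
        }
        return impl;
    }

    public static void main(String[] args) {
        // Stand-in for the configuration held by the Spark context.
        Map<String, String> sessionConf = new HashMap<>();
        sessionConf.put("fs.abfss.impl",
                "org.apache.hadoop.fs.azurebfs.SecureAzureBlobFileSystem");

        // Stand-in for the blank configuration created inside the path filter.
        Map<String, String> blankConf = new HashMap<>();

        try {
            System.out.println("session conf -> " + resolve(sessionConf, "abfss"));
            System.out.println("blank conf   -> " + resolve(blankConf, "abfss"));
        } catch (IOException e) {
            System.out.println("blank conf   -> " + e.getMessage());
        }
    }
}
```

Under these assumptions, the lookup succeeds only when the configuration that reaches the filter is the one carrying the {{abfss}} entry, which is why the filter needs the Hadoop configuration from the Spark context rather than a fresh one.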