Posted to issues@nifi.apache.org by "Hari Sekhon (JIRA)" <ji...@apache.org> on 2019/03/15 11:26:00 UTC

[jira] [Created] (NIFI-6123) NiFi HDFS processors don't allow using nameservice / namenode address that is not the defaultFS

Hari Sekhon created NIFI-6123:
---------------------------------

             Summary: NiFi HDFS processors don't allow using nameservice / namenode address that is not the defaultFS
                 Key: NIFI-6123
                 URL: https://issues.apache.org/jira/browse/NIFI-6123
             Project: Apache NiFi
          Issue Type: Improvement
          Components: Configuration
    Affects Versions: 1.7.0
         Environment: Hortonworks
            Reporter: Hari Sekhon


NiFi HDFS processors (ListHDFS, PutHDFS, etc.) don't allow using a nameservice / namenode address that is present in the provided hdfs-site.xml if it is not the HDFS defaultFS:
{code:java}
2019-03-15 11:10:15,589 ERROR [Timer-Driven Process Thread-11] o.apache.nifi.processors.hadoop.ListHDFS ListHDFS[id=8101dc52-0169-1000-ffff-ffffe716ad6e] Failed to perform listing of HDFS due to java.lang.IllegalArgumentException: Wrong FS: hdfs://<fqdn>:8020/HDFSTest3/<custom>-NiFi-test/OneFS-in, expected: hdfs://<hostname>: java.lang.IllegalArgumentException: Wrong FS: hdfs://<fqdn>:8020/HDFSTest3/<custom>-NiFi-test/OneFS-in, expected: hdfs://<hostname>
java.lang.IllegalArgumentException: Wrong FS: hdfs://<fqdn>:8020/HDFSTest3/<custom>-NiFi-test/OneFS-in, expected: hdfs://<hostname>
        at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:780)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:226)
        at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:974)
        at org.apache.hadoop.hdfs.DistributedFileSystem.access$1000(DistributedFileSystem.java:118)
        at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1041)
        at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1038)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:1048)
        at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1853)
        at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1895)
        at org.apache.nifi.processors.hadoop.ListHDFS.lambda$getStatuses$0(ListHDFS.java:398)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
        at org.apache.nifi.processors.hadoop.ListHDFS.getStatuses(ListHDFS.java:398)
        at org.apache.nifi.processors.hadoop.ListHDFS.onTrigger(ListHDFS.java:347)
        at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
        at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1165)
        at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:203)
        at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
{code}
Using a non-default nameservice works fine with other Hadoop services such as MapReduce / DistCp and Spark, so this appears to be a limitation of the NiFi HDFS processors.
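
For illustration, the exception in the trace above is raised from FileSystem.checkPath, which rejects a fully qualified path whose scheme and authority do not match the file system instance it is called on. The following is only a simplified model of that kind of check using java.net.URI, not Hadoop's actual code (the real method handles further cases such as default ports). The class name WrongFsSketch and the hosts nn1 and nn2.example.com are hypothetical placeholders.

```java
import java.net.URI;

/** Simplified model of a scheme/authority check; NOT Hadoop's actual implementation. */
class WrongFsSketch {

    /** Returns true if 'path' may be served by the file system rooted at 'fsUri'. */
    static boolean sameFileSystem(URI fsUri, URI path) {
        if (path.getScheme() == null) {
            // Unqualified path such as /data/in: resolved against this file system.
            return true;
        }
        if (!path.getScheme().equalsIgnoreCase(fsUri.getScheme())) {
            return false;
        }
        String expected = fsUri.getAuthority();
        String actual = path.getAuthority();
        return expected == null ? actual == null : expected.equalsIgnoreCase(actual);
    }

    /** Throws in the same style as the error in the log when the path belongs elsewhere. */
    static void checkPath(URI fsUri, URI path) {
        if (!sameFileSystem(fsUri, path)) {
            throw new IllegalArgumentException("Wrong FS: " + path + ", expected: " + fsUri);
        }
    }

    public static void main(String[] args) {
        // Hypothetical default file system and a path on a second, non-default namenode.
        URI defaultFs = URI.create("hdfs://nn1");
        URI otherPath = URI.create("hdfs://nn2.example.com:8020/data/in");

        checkPath(defaultFs, URI.create("/data/in")); // accepted: no scheme or authority
        try {
            checkPath(defaultFs, otherPath);          // rejected: authority differs
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If the processors behave like this model, a file system handle obtained once for the defaultFS would reject any path that names a different nameservice, whereas resolving the file system from each path's own URI would not. Whether that is the actual cause in the NiFi processors is an assumption here.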

It can be worked around by using a separate configuration per HDFS processor, but ideally the processors should be able to use a normal HDFS configuration containing more than one nameservice, which is common in environments with data transfer workflows between clusters.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)