Posted to dev@manifoldcf.apache.org by "Shashank Dwivedi (Jira)" <ji...@apache.org> on 2021/10/20 13:54:00 UTC

[jira] [Created] (CONNECTORS-1676) Support for H.A. Hadoop Cluster in HDFS output Connector

Shashank Dwivedi created CONNECTORS-1676:
--------------------------------------------

             Summary: Support for H.A. Hadoop Cluster in HDFS output Connector
                 Key: CONNECTORS-1676
                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1676
             Project: ManifoldCF
          Issue Type: Improvement
          Components: HDFS connector
    Affects Versions: ManifoldCF 2.20
            Reporter: Shashank Dwivedi


Currently, the HDFS Output Connector configuration accepts only a single NameNode URI. However, Hadoop 2.0 introduced High Availability (HA) for Hadoop clusters, which allows a cluster to have multiple NameNodes. At any given time, only one of the NameNodes is active, and the others are on standby. Connecting to a standby NameNode results in no output being written to HDFS and the job getting stuck.

Is there any way to configure multiple NameNodes in the ManifoldCF HDFS Output Connector today, so that it automatically uses the active NameNode?

If this is not supported, the existing HDFS Output Connector could be enhanced to handle multiple NameNodes.
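
For illustration, here is a minimal sketch of the client-side settings a Hadoop HDFS client normally needs to address an HA cluster through a logical nameservice rather than a single NameNode URI. The property keys and the ConfiguredFailoverProxyProvider class are the standard Hadoop ones. The class name HaClientConfig, the nameservice "mycluster", the generated IDs nn1/nn2 and the example.com hosts are hypothetical. How the connector would accept these values is an assumption and is not something it is known to support today.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: builds the HDFS client properties for an HA NameNode set. */
final class HaClientConfig {

    // Failover proxy provider shipped with Hadoop for HA-aware clients.
    static final String PROXY_PROVIDER =
        "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider";

    /**
     * @param nameservice  logical name of the HA cluster (assumed, e.g. "mycluster")
     * @param rpcAddresses host:port of each NameNode's RPC endpoint
     */
    static Map<String, String> build(String nameservice, List<String> rpcAddresses) {
        if (rpcAddresses.size() < 2) {
            throw new IllegalArgumentException("HA needs at least two NameNodes");
        }
        Map<String, String> props = new LinkedHashMap<>();
        // Clients address the logical nameservice, not an individual host.
        props.put("fs.defaultFS", "hdfs://" + nameservice);
        props.put("dfs.nameservices", nameservice);

        List<String> ids = new ArrayList<>();
        for (int i = 0; i < rpcAddresses.size(); i++) {
            String id = "nn" + (i + 1); // generated NameNode ID (illustrative)
            ids.add(id);
            props.put("dfs.namenode.rpc-address." + nameservice + "." + id,
                      rpcAddresses.get(i));
        }
        props.put("dfs.ha.namenodes." + nameservice, String.join(",", ids));
        // The proxy provider is what lets the client locate the active NameNode.
        props.put("dfs.client.failover.proxy.provider." + nameservice, PROXY_PROVIDER);
        return props;
    }

    public static void main(String[] args) {
        build("mycluster", List.of("nn1.example.com:8020", "nn2.example.com:8020"))
            .forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

Each entry could then be applied with Hadoop's Configuration.set(key, value) before the FileSystem is obtained, after which the Hadoop client itself fails over to the active NameNode. Whether the connector should take these as separate fields or read them from an existing hdfs-site.xml is an open design choice.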



--
This message was sent by Atlassian Jira
(v8.3.4#803005)