Posted to dev@sqoop.apache.org by "Jarek Jarcec Cecho (JIRA)" <ji...@apache.org> on 2015/03/16 16:34:39 UTC

[jira] [Created] (SQOOP-2201) Sqoop2: Add possibility to read Hadoop configuration files to HDFS connector

Jarek Jarcec Cecho created SQOOP-2201:
-----------------------------------------

             Summary: Sqoop2: Add possibility to read Hadoop configuration files to HDFS connector
                 Key: SQOOP-2201
                 URL: https://issues.apache.org/jira/browse/SQOOP-2201
             Project: Sqoop
          Issue Type: Bug
    Affects Versions: 1.99.5
            Reporter: Jarek Jarcec Cecho
            Assignee: Jarek Jarcec Cecho
             Fix For: 1.99.6


Currently the HDFS connector does not explicitly read Hadoop configuration files. During the [Initialization|https://github.com/apache/sqoop/blob/sqoop2/connector/connector-hdfs/src/main/java/org/apache/sqoop/connector/hdfs/HdfsToInitializer.java] phase it doesn't do anything, so the configuration files are not needed there. During other parts of the workflow, we're [explicitly casting|https://github.com/apache/sqoop/blob/sqoop2/connector/connector-hdfs/src/main/java/org/apache/sqoop/connector/hdfs/HdfsExtractor.java#L61] the generic {{Context}} object to a Hadoop {{Configuration}}.
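
For illustration, the cast in question has roughly this shape (paraphrased from the linked extractor code; class and method names may differ slightly from the actual source):

{code:java}
// Paraphrased illustration of the current coupling (not a verbatim copy of
// HdfsExtractor): the generic connector context is down-cast to a
// MapReduce-specific wrapper just to get hold of the Hadoop Configuration.
Configuration conf = ((PrefixContext) context.getContext()).getConfiguration();
{code}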

This is unfortunate because:

* It couples the HDFS connector to the MapReduce execution engine and will break as soon as a non-MapReduce based execution engine is added.
* We can't do any HDFS-specific checks in the {{Initializer}}, because the Hadoop {{Configuration}} object is not available there.

As a result, I would like to propose breaking this coupling between the HDFS connector and the MapReduce execution engine by adding a configuration option to the HDFS Link that specifies the directory from which the appropriate Hadoop configuration files should be read (with a reasonable default such as {{/etc/conf/hadoop}}).
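
A minimal sketch of what the connector side could then do, assuming a hypothetical directory option on the link; the class name and option wiring below are illustrative and not existing Sqoop2 APIs, only Hadoop's stock {{Configuration.addResource()}} is relied upon:

{code:java}
import java.io.File;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;

public final class HadoopConfDirLoader {

  // Files commonly found in a Hadoop configuration directory.
  private static final String[] SITE_FILES = {"core-site.xml", "hdfs-site.xml"};

  /**
   * Builds a Hadoop Configuration from the *-site.xml files found in the
   * given directory, e.g. the value of the proposed HDFS Link option
   * (defaulting to something like /etc/conf/hadoop).
   */
  public static Configuration load(String confDir) {
    Configuration conf = new Configuration();
    for (String name : SITE_FILES) {
      File file = new File(confDir, name);
      if (file.isFile()) {
        conf.addResource(new Path(file.toURI()));
      }
    }
    return conf;
  }
}
{code}

With the {{Configuration}} built this way inside the {{Initializer}}, HDFS-specific checks become possible there without touching anything MapReduce-specific.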



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)