Posted to dev@sqoop.apache.org by "Jarek Jarcec Cecho (JIRA)" <ji...@apache.org> on 2015/03/16 16:34:39 UTC
[jira] [Created] (SQOOP-2201) Sqoop2: Add possibility to read Hadoop configuration files to HDFS connector
Jarek Jarcec Cecho created SQOOP-2201:
-----------------------------------------
Summary: Sqoop2: Add possibility to read Hadoop configuration files to HDFS connector
Key: SQOOP-2201
URL: https://issues.apache.org/jira/browse/SQOOP-2201
Project: Sqoop
Issue Type: Bug
Affects Versions: 1.99.5
Reporter: Jarek Jarcec Cecho
Assignee: Jarek Jarcec Cecho
Fix For: 1.99.6
Currently the HDFS connector does not explicitly read Hadoop configuration files. The [Initialization|https://github.com/apache/sqoop/blob/sqoop2/connector/connector-hdfs/src/main/java/org/apache/sqoop/connector/hdfs/HdfsToInitializer.java] phase doesn't do anything, so the configuration files are not needed there. During other parts of the workflow, we're [explicitly casting|https://github.com/apache/sqoop/blob/sqoop2/connector/connector-hdfs/src/main/java/org/apache/sqoop/connector/hdfs/HdfsExtractor.java#L61] the general {{Context}} object to the Hadoop {{Configuration}}.
This is unfortunate because:
* It couples the HDFS connector to the MapReduce execution engine, and it will break once a non-MapReduce execution engine is added.
* We can't do any HDFS-specific checks in the {{Initializer}}, as the Hadoop {{Configuration}} object is not available there.
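The coupling can be illustrated with a small JDK-only sketch. The class names below ({{EngineContext}}, {{MapreduceEngineContext}}, {{SparkEngineContext}}) are hypothetical stand-ins, not the real Sqoop or Hadoop types; the point is only that a hard cast to the MapReduce-flavored context works today but throws {{ClassCastException}} the moment a different engine supplies the context:

```java
// Hypothetical stand-ins for illustration only -- not actual Sqoop classes.
interface EngineContext {
    String getString(String key);
}

// The context flavor the MapReduce engine would supply.
class MapreduceEngineContext implements EngineContext {
    public String getString(String key) { return "mr-" + key; }
}

// A context from some future, non-MapReduce engine.
class SparkEngineContext implements EngineContext {
    public String getString(String key) { return "spark-" + key; }
}

public class CastCoupling {
    // Mirrors the connector's current pattern: blindly assume the
    // general context is the MapReduce flavor.
    static MapreduceEngineContext asMapreduce(EngineContext ctx) {
        return (MapreduceEngineContext) ctx;  // throws for any other engine
    }

    public static void main(String[] args) {
        asMapreduce(new MapreduceEngineContext());  // fine today
        try {
            asMapreduce(new SparkEngineContext());  // breaks with a new engine
        } catch (ClassCastException expected) {
            System.out.println("cast failed for non-MapReduce engine");
        }
    }
}
```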
As a result, I would like to propose breaking this coupling between the HDFS connector and the MapReduce execution engine by adding a configuration option to the HDFS Link that specifies the directory from which we should read the appropriate Hadoop configuration files (with reasonable defaults such as {{/etc/conf/hadoop}}).
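One way the discovery step could look is sketched below, using only the JDK so it stays self-contained. The class and method names are hypothetical (this is not the eventual connector code); it simply scans the configured directory for Hadoop's {{*-site.xml}} files, which the connector would then feed into a Hadoop {{Configuration}} via {{addResource}}:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class HadoopConfDiscovery {

    /**
     * Returns the Hadoop *-site.xml files found directly under the given
     * directory, or an empty list if the directory does not exist (in which
     * case the connector could fall back to built-in defaults).
     */
    static List<Path> findSiteFiles(Path confDir) throws IOException {
        List<Path> found = new ArrayList<>();
        if (!Files.isDirectory(confDir)) {
            return found;
        }
        // Glob matches core-site.xml, hdfs-site.xml, mapred-site.xml, ...
        try (DirectoryStream<Path> files =
                 Files.newDirectoryStream(confDir, "*-site.xml")) {
            for (Path f : files) {
                found.add(f);
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        // Default directory taken from the proposal above.
        Path dir = Paths.get(args.length > 0 ? args[0] : "/etc/conf/hadoop");
        for (Path f : findSiteFiles(dir)) {
            System.out.println(f);
        }
    }
}
```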
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)