Posted to dev@sqoop.apache.org by "Pavel Benes (JIRA)" <ji...@apache.org> on 2015/08/19 17:46:45 UTC
[jira] [Updated] (SQOOP-2471) Support arrays and structs datatypes
with Sqoop Hcatalog integration
[ https://issues.apache.org/jira/browse/SQOOP-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pavel Benes updated SQOOP-2471:
-------------------------------
Summary: Support arrays and structs datatypes with Sqoop Hcatalog integration (was: Support complex datatypes with Sqoop Hcatalog integration)
> Support arrays and structs datatypes with Sqoop Hcatalog integration
> --------------------------------------------------------------------
>
> Key: SQOOP-2471
> URL: https://issues.apache.org/jira/browse/SQOOP-2471
> Project: Sqoop
> Issue Type: New Feature
> Components: hive-integration
> Affects Versions: 1.4.6
> Reporter: Pavel Benes
> Priority: Critical
>
> Currently, Sqoop import cannot handle any complex type. Hive, on the other hand, already supports the following complex types:
> - arrays: ARRAY<data_type>
> - structs: STRUCT<col_name : data_type [COMMENT col_comment], ...>
> - maps: MAP<primitive_type, data_type>
> - union: UNIONTYPE<data_type, data_type, ...>
> The most frequently used and most important of these is probably the ARRAY type, followed by the STRUCT type.
> Since it is probably not possible to obtain all the necessary information about those types from a generic JDBC database, this feature should use external information supplied through the --map-column-java and --map-column-hive arguments.
> For example it could look like this:
> --map-column-java item='inventory_item(name text, supplier_id integer, price numeric)'
> --map-column-hive item='STRUCT<name : string, supplier_id : int, price : decimal>'
> If no additional information is provided, a more general type should be created where possible.
> It should also be possible to serialize complex datatype values into strings when the Hive target column's type is explicitly set to 'STRING'.
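To make the proposal above more concrete, here is a minimal sketch of how a STRUCT definition passed via --map-column-hive might be interpreted, and how a struct value could be serialized to a string for the 'STRING' fallback. It is written in Python for brevity and is not Sqoop code: the function names parse_struct_type and struct_to_string are hypothetical, JSON as the string encoding is an assumption (the issue does not specify a format), and only flat structs of primitive types without COMMENT clauses are handled.

```python
import json


def parse_struct_type(spec):
    """Parse 'STRUCT<a : t1, b : t2>' into a list of (name, type) pairs.

    Hypothetical helper: handles only flat structs of primitive types
    (no nested types, no COMMENT clauses).
    """
    spec = spec.strip()
    if not (spec.upper().startswith("STRUCT<") and spec.endswith(">")):
        raise ValueError("not a STRUCT type: %r" % spec)
    body = spec[len("STRUCT<"):-1]
    fields = []
    for part in body.split(","):
        name, sep, type_name = part.partition(":")
        if not sep or not name.strip() or not type_name.strip():
            raise ValueError("malformed struct field: %r" % part)
        fields.append((name.strip(), type_name.strip().lower()))
    return fields


def struct_to_string(fields, values):
    """Serialize one struct value to a JSON string (assumed encoding).

    Values must already be JSON-serializable; decimals are passed as
    strings here to avoid losing precision.
    """
    if len(fields) != len(values):
        raise ValueError("expected %d values, got %d" % (len(fields), len(values)))
    record = {name: value for (name, _type), value in zip(fields, values)}
    return json.dumps(record, sort_keys=True)


fields = parse_struct_type(
    "STRUCT<name : string, supplier_id : int, price : decimal>"
)
print(fields)
print(struct_to_string(fields, ["fuzzy dice", 42, "1.99"]))
```

A real implementation would also have to cope with nested types and with the commas inside a type definition when splitting the column=type pairs of the argument itself, which this sketch does not attempt.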
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)