Posted to dev@sqoop.apache.org by "Pavel Benes (JIRA)" <ji...@apache.org> on 2015/08/19 17:46:45 UTC

[jira] [Updated] (SQOOP-2471) Support arrays and structs datatypes with Sqoop Hcatalog integration

     [ https://issues.apache.org/jira/browse/SQOOP-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pavel Benes updated SQOOP-2471:
-------------------------------
    Summary: Support arrays and structs datatypes with Sqoop Hcatalog integration  (was: Support complex datatypes with Sqoop Hcatalog integration)

> Support arrays and structs datatypes with Sqoop Hcatalog integration
> --------------------------------------------------------------------
>
>                 Key: SQOOP-2471
>                 URL: https://issues.apache.org/jira/browse/SQOOP-2471
>             Project: Sqoop
>          Issue Type: New Feature
>          Components: hive-integration
>    Affects Versions: 1.4.6
>            Reporter: Pavel Benes
>            Priority: Critical
>
> Currently, Sqoop import cannot handle any complex type. Hive, on the other hand, already supports the following complex types:
>  - arrays: ARRAY<data_type>
>  - structs: STRUCT<col_name : data_type [COMMENT col_comment], ...>
>  - maps: MAP<primitive_type, data_type>
>  - unions: UNIONTYPE<data_type, data_type, ...>
> The most frequently used and most important is probably the ARRAY type, followed by the STRUCT type.
> Since it is probably not possible to obtain all the necessary information about these types from a generic JDBC database, this feature should use external information supplied through the --map-column-java and --map-column-hive arguments.
> For example it could look like this:
>  --map-column-java item='inventory_item(name text, supplier_id integer,price numeric)'
>  --map-column-hive item='STRUCT<name : string, supplier_id : int, price : decimal>'
> If no additional information is provided, a more general type should be created where possible.
> It should be possible to serialize complex datatype values into strings when the Hive target column's type is explicitly set to 'STRING'.
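
To make the proposal above more concrete, here is a minimal Python sketch of how a single mapping value such as the --map-column-hive example might be split into a column name and a flat STRUCT field list, and how a struct value could be serialized for a STRING target column. This is not Sqoop code: the function names parse_column_mapping, parse_struct_type and serialize_struct are hypothetical, JSON is only one assumed string encoding, and nested complex types and multiple comma-separated mappings are deliberately not handled.

```python
import json


def parse_column_mapping(arg):
    """Split one 'column=type' value (hypothetical helper, single mapping only)."""
    column, sep, type_spec = arg.partition("=")
    if not sep or not column.strip():
        raise ValueError("expected column=type, got: " + arg)
    # Drop the shell-style single quotes used in the issue's example.
    return column.strip(), type_spec.strip().strip("'")


def parse_struct_type(hive_type):
    """Parse a flat 'STRUCT<name : type, ...>' string into (field, type) pairs.

    Assumption: no nested complex types, so splitting on ',' is safe.
    """
    text = hive_type.strip()
    if not (text.upper().startswith("STRUCT<") and text.endswith(">")):
        raise ValueError("not a STRUCT type: " + hive_type)
    body = text[len("STRUCT<"):-1]
    fields = []
    for part in body.split(","):
        name, _, type_name = part.partition(":")
        if not name.strip() or not type_name.strip():
            raise ValueError("malformed struct field: " + part)
        fields.append((name.strip(), type_name.strip()))
    return fields


def serialize_struct(fields, values):
    """Serialize struct values to a JSON string (one assumed STRING encoding)."""
    if len(fields) != len(values):
        raise ValueError("field/value count mismatch")
    return json.dumps({name: value for (name, _), value in zip(fields, values)})


if __name__ == "__main__":
    column, hive_type = parse_column_mapping(
        "item='STRUCT<name : string, supplier_id : int, price : decimal>'"
    )
    fields = parse_struct_type(hive_type)
    print(column, fields)
    print(serialize_struct(fields, ["fuzzy dice", 42, "1.99"]))
```

A real implementation would also have to decide how such values interact with the existing comma-separated mapping syntax, since a STRUCT definition itself contains commas.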



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)