Posted to issues@nifi.apache.org by "Matt Burgess (JIRA)" <ji...@apache.org> on 2018/08/14 14:44:00 UTC

[jira] [Assigned] (NIFI-5517) PutHive3Streaming does not correctly support all Hive types

     [ https://issues.apache.org/jira/browse/NIFI-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matt Burgess reassigned NIFI-5517:
----------------------------------

    Assignee: Matt Burgess

> PutHive3Streaming does not correctly support all Hive types
> -----------------------------------------------------------
>
>                 Key: NIFI-5517
>                 URL: https://issues.apache.org/jira/browse/NIFI-5517
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Extensions
>            Reporter: Matt Burgess
>            Assignee: Matt Burgess
>            Priority: Major
>
> NIFI-5475 upgraded the version of Hive 3 to Apache Hive 3.1.0, and some code changes had to be made as there were new Writable types to be used in place of the old ones. As a result of the upgrade, NIFI-5491 was discovered and included some fixes for PutHive3Streaming to support various primitive data types as well as structs.
> However, it appears that NIFI-5491 did not cover all the available Hive types. NIFI-5475's changes for time/date types were made to PutORC, but similar changes have to be made to the Hive writer as well.
> PutHive3Streaming should support all available Hive types where prudent (i.e. where NiFi Record Field Types can be converted to Hive Column Types). The current list of "top-level" types includes:
> PRIMITIVE, LIST, MAP, STRUCT, UNION
> And the current list of PRIMITIVE types includes:
> VOID, BOOLEAN, BYTE, SHORT, INT, LONG, FLOAT, DOUBLE, STRING,
>     DATE, TIMESTAMP, TIMESTAMPLOCALTZ, BINARY, DECIMAL, VARCHAR, CHAR,
>     INTERVAL_YEAR_MONTH, INTERVAL_DAY_TIME, UNKNOWN
> As of NIFI-5475, I believe PutHive3Streaming supports the primitive types BOOLEAN, BYTE, SHORT, INT, LONG, FLOAT, DOUBLE, STRING, VARCHAR, and CHAR (VOID and UNKNOWN are, I believe, used "under the hood"). Of the top-level types, it supports LIST, STRUCT, and UNION over those supported primitive types.
> The remaining list is MAP, DATE, TIMESTAMP, TIMESTAMPLOCALTZ, BINARY, DECIMAL, INTERVAL_YEAR_MONTH, and INTERVAL_DAY_TIME. Some of these may already be supported (such as the INTERVALs if the incoming data type is INT or LONG), but this needs to be confirmed.
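
The primitive-type gap described above can be checked mechanically. The following is only a sketch: PrimitiveKind is a local, hypothetical enum that mirrors the names listed in the issue (it is not Hive's own enum), and Hive3TypeCoverage is an illustrative helper, not part of NiFi. The "believed supported" set reflects the reporter's stated belief, not a verified fact.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical local mirror of the primitive categories listed in the issue.
// This is NOT Hive's enum; it only reuses the same names for illustration.
enum PrimitiveKind {
    VOID, BOOLEAN, BYTE, SHORT, INT, LONG, FLOAT, DOUBLE, STRING,
    DATE, TIMESTAMP, TIMESTAMPLOCALTZ, BINARY, DECIMAL, VARCHAR, CHAR,
    INTERVAL_YEAR_MONTH, INTERVAL_DAY_TIME, UNKNOWN
}

// Illustrative helper (not a NiFi class) that derives the outstanding types.
final class Hive3TypeCoverage {

    // Primitive types the issue believes are handled as of NIFI-5475.
    static final Set<PrimitiveKind> BELIEVED_SUPPORTED = EnumSet.of(
            PrimitiveKind.BOOLEAN, PrimitiveKind.BYTE, PrimitiveKind.SHORT,
            PrimitiveKind.INT, PrimitiveKind.LONG, PrimitiveKind.FLOAT,
            PrimitiveKind.DOUBLE, PrimitiveKind.STRING, PrimitiveKind.VARCHAR,
            PrimitiveKind.CHAR);

    // Types the issue describes as used "under the hood".
    static final Set<PrimitiveKind> INTERNAL =
            EnumSet.of(PrimitiveKind.VOID, PrimitiveKind.UNKNOWN);

    // Everything that is neither believed supported nor internal.
    static Set<PrimitiveKind> remaining() {
        EnumSet<PrimitiveKind> rest = EnumSet.allOf(PrimitiveKind.class);
        rest.removeAll(BELIEVED_SUPPORTED);
        rest.removeAll(INTERNAL);
        return rest;
    }

    public static void main(String[] args) {
        // Prints the primitive types still to be confirmed or implemented.
        System.out.println(remaining());
    }
}
```

Under these assumptions, remaining() yields the seven primitive types named in the issue's remaining list; MAP is a top-level type and would have to be tracked separately.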



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)