Posted to issues@nifi.apache.org by "Matt Burgess (Jira)" <ji...@apache.org> on 2023/06/15 19:18:00 UTC

[jira] [Resolved] (NIFI-6435) ConvertAvroToORC Should support Date and Timestamp type

     [ https://issues.apache.org/jira/browse/NIFI-6435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matt Burgess resolved NIFI-6435.
--------------------------------
    Resolution: Won't Fix

The version of Avro used by ConvertAvroToORC does not support logical types such as date and timestamp, and the entire Hive 1 NAR has been deprecated and removed from the Apache NiFi binaries. As you mentioned, PutORC is the replacement. If the Hive 3 NAR is not included in your Apache NiFi binary distribution, you can download it: use https://repository.apache.org/service/local/repositories/releases/content/org/apache/nifi/nifi-hive3-nar/ as the base URL, select the directory matching your NiFi version, and find the NAR artifact there.
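
As a rough sketch of how the download location could be derived from that base URL: the snippet below assumes the repository follows the standard Maven layout (a directory per version, containing a file named artifact-version.nar). The class name Hive3NarUrl, the method narUrl, and the example version 1.9.2 are illustrative assumptions; whether a NAR is actually published for a given version has to be checked by browsing the base URL.

```java
// Hypothetical helper for illustration; not part of NiFi.
class Hive3NarUrl {
    // Base URL quoted in the resolution comment above.
    static final String BASE =
        "https://repository.apache.org/service/local/repositories/releases/content/org/apache/nifi/nifi-hive3-nar/";

    // Assumes the standard Maven layout: <base>/<version>/nifi-hive3-nar-<version>.nar
    static String narUrl(String version) {
        return BASE + version + "/nifi-hive3-nar-" + version + ".nar";
    }

    public static void main(String[] args) {
        // The version is a placeholder; substitute the version of your NiFi installation.
        System.out.println(narUrl("1.9.2"));
    }
}
```

The downloaded NAR would then typically be placed in the NiFi lib directory before restarting NiFi.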

> ConvertAvroToORC Should support Date and Timestamp type
> -------------------------------------------------------
>
>                 Key: NIFI-6435
>                 URL: https://issues.apache.org/jira/browse/NIFI-6435
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Core Framework
>            Reporter: archon gum
>            Priority: Major
>
> h1. From AVRO
> data
>  
> {code:java}
> {
> "date_type": 17897,  // 2019-01-01
> "timestamp_type": 1546300800000 // 2019-01-01 00:00:00 +0000
> }
> {code}
>  
> schema
>  
> {code:java}
> {
> "name": "test_types",
> "type": "record",
> "fields": [
>  {
>    "name": "date_type",
>    "type": "int",
>    "logicalType": "date"
>  },
>  {
>    "name": "timestamp_type",
>    "type": "long",
>    "logicalType": "timestamp-millis" // or others
>  }
> ]
> }
> {code}
>  or schema
>  
> {code:java}
> {
> "name": "test_types",
> "type": "record",
> "fields": [
>  {
>    "name": "date_type",
>    "type": {
>        "type": "int",
>        "logicalType": "date"
>    }
>  },
>  {
>    "name": "timestamp_type",
>    "type": {
>        "type": "long",
>        "logicalType": "timestamp-millis" // or others
>    }
>  }
> ]
> }
> {code}
>  
> h1. To ORC
> schema should be
> {code:java}
> struct<date_type:date, timestamp_type:timestamp>
> {code}
>  
> ---
>  
> Update: 2019-07-15
>  
> h1. PutORC
> [https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-hive-bundle/nifi-hive3-processors/src/main/java/org/apache/nifi/processors/orc/PutORC.java]
>  
> This processor can do the trick.
> *PutORC*: Reads records using a RecordReader (e.g. JsonReader, AvroReader), converts them to ORC, and writes the ORC file to HDFS.
> *PutORC* can convert Avro logical types such as:
> From Avro schema
> {code:java}
> {
> "name": "test_types",
> "type": "record",
> "fields": [
>  {
>    "name": "date_type",
>    "type": {
>        "type": "int",
>        "logicalType": "date"
>    }
>  },
>  {
>    "name": "timestamp_type",
>    "type": {
>        "type": "long",
>        "logicalType": "timestamp-millis"
>    }
>  }
> ]
> }{code}
> And the result ORC schema would be:
> {code:java}
> struct<date_type:date, timestamp_type:timestamp>{code}
>  
> h1. Question?
> Is *PutORC* still in development? I ask because the *nifi-hive3-processors* module is not included in the build (at least in NiFi 1.9.2).
>  
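
For reference, the sample values in the quoted issue can be checked against the Avro logical type definitions: a date is an int counting days since 1970-01-01, and a timestamp-millis is a long counting milliseconds since 1970-01-01T00:00:00Z. The sketch below uses only java.time; the class and method names are hypothetical and do not reflect how PutORC or ConvertAvroToORC perform the conversion internally.

```java
import java.time.Instant;
import java.time.LocalDate;

// Hypothetical helper for illustration; not part of NiFi or Avro.
class AvroLogicalValues {
    // Avro "date": int counting days since 1970-01-01.
    static LocalDate decodeDate(int daysSinceEpoch) {
        return LocalDate.ofEpochDay(daysSinceEpoch);
    }

    // Avro "timestamp-millis": long counting milliseconds since 1970-01-01T00:00:00Z.
    static Instant decodeTimestampMillis(long millisSinceEpoch) {
        return Instant.ofEpochMilli(millisSinceEpoch);
    }

    public static void main(String[] args) {
        // Sample values from the issue description.
        System.out.println(decodeDate(17897));                     // prints 2019-01-01
        System.out.println(decodeTimestampMillis(1546300800000L)); // prints 2019-01-01T00:00:00Z
    }
}
```

A converter that ignores the logical type would write these fields as plain int and long columns, which is the behavior the issue asks to change.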


