You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@nifi.apache.org by "Sven Van Kerrebroeck (Jira)" <ji...@apache.org> on 2024/01/24 14:36:00 UTC

[jira] [Commented] (NIFI-11470) SQL Record query TimeZone issue

    [ https://issues.apache.org/jira/browse/NIFI-11470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17810422#comment-17810422 ] 

Sven Van Kerrebroeck commented on NIFI-11470:
---------------------------------------------

We also ran into some serious problems with our data processing because of this bug. 

We are in "Europe\Brussels" timezone, and each time the data passes through one of these processors (like QueryRecord), the timestamps are moved 1 or 2 hours backwards (depending on summer or winter time). 
Depending on how these timestamps are used afterwards, this can have serious consequences (for us it nearly resulted in data loss further down the line). 

The danger is also that people don't know about the bug untill problems are detected in the final output of a flow. And then it's quite a challenge to find where things are going wrong... 

It would be good to have this resolved (even in the 1.x versions, because not everyone is able to switch to 2.x yet...). For the moment it's really dangerous to use these processors without checking if there are any timestamps in the dataset. 

> SQL Record query TimeZone issue
> -------------------------------
>
>                 Key: NIFI-11470
>                 URL: https://issues.apache.org/jira/browse/NIFI-11470
>             Project: Apache NiFi
>          Issue Type: Bug
>    Affects Versions: 1.19.1, 1.21.0, 1.23.2
>            Reporter: Julien G.
>            Priority: Major
>         Attachments: Example_Issue_Timestamp_SQL_Query_on_record.json, TIMEZONE_ISSUE.json
>
>
> In the case of a cluster with a timezone different from UTC (+0 hours) like CEST (+2 hours), in processors like QueryRecord or JoinEnrichment that use an SQL query to manipulate the record, the TIMESTAMP type field will be converted again and again to UTC.
> So, for example, if you have JSON with a field with the value 2023/04/19 18:04:00 +0200 and you say it's a TIMESTAMP field in the Avro schema and convert it to Avro, the field will be set to UTC (2023/04/19 16:04:00 +0000). But if you then use a QueryRecord you will have 2023/04/19 14:04:00 +0000 and if you put another QueryRecord you will have 2023/04/19 14:04:00 +0000, ...
> The field is reinterpreted as CEST time zone instead of UTC each time.
> Same issue with SQL join strategy in the JoinEnrichment.
> You can find a dataflow illustrating the point attached to the Jira.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)