Posted to dev@nifi.apache.org by Arek Burdach <ar...@gmail.com> on 2021/04/19 15:54:54 UTC

NIFI-8161 PR 4773 migration from SimpleDateFormat to DateTimeFormatter in NiFi expressions

Hi,

Some time ago I prepared a PR,
https://github.com/apache/nifi/pull/4773, that changes the formatter
used for formatting dates to strings and parsing dates from strings
from SimpleDateFormat to DateTimeFormatter.
I did it because my benchmarks showed that date formatting is a
significant bottleneck in my flows.
The reason is that SimpleDateFormat is not thread-safe, so a new
instance is created for each format/parse call. DateTimeFormatter, on
the other hand, can be created once per expression and reused many times.
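
To illustrate the difference, here is a minimal sketch. It is not the NiFi implementation; the class name, method names, and pattern are hypothetical and chosen only for this example. It contrasts creating a SimpleDateFormat per call with sharing a single immutable DateTimeFormatter.

```java
import java.text.SimpleDateFormat;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper for illustration only, not part of NiFi.
class FormatterReuseSketch {
    // Assumed example pattern; any pattern valid for both APIs would do.
    static final String PATTERN = "yyyy-MM-dd HH:mm:ss";

    // DateTimeFormatter is immutable and thread-safe, so one instance
    // can be built once and shared by all callers.
    static final DateTimeFormatter SHARED =
            DateTimeFormatter.ofPattern(PATTERN, Locale.ENGLISH)
                    .withZone(ZoneOffset.UTC);

    // SimpleDateFormat is mutable and not thread-safe, so the safe
    // legacy approach is to create a fresh instance on every call.
    static String formatLegacy(long epochMillis) {
        SimpleDateFormat sdf = new SimpleDateFormat(PATTERN, Locale.ENGLISH);
        sdf.setTimeZone(TimeZone.getTimeZone("UTC"));
        return sdf.format(new Date(epochMillis));
    }

    // Reuses the shared formatter; nothing is allocated per pattern.
    static String formatShared(long epochMillis) {
        return SHARED.format(Instant.ofEpochMilli(epochMillis));
    }
}
```

Both methods should produce the same text for the same instant; only the cost of building the formatter differs.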

The problem with this approach is that DateTimeFormatter does not
accept exactly the same patterns. An almost backward-compatible
formatter can be built with:
new DateTimeFormatterBuilder()
                 .parseLenient()
                 .parseCaseInsensitive()

but in some edge cases the two formatters behave differently, which
means some flows would need to be modified. Some examples (old pattern
with sample input -> pattern required by DateTimeFormatter):
- yyyy-MM-dd'T'HH:mm:ss.SSSX with input 2021-01-28T15:00:14.270+01:00 ->
yyyy-MM-dd'T'HH:mm:ss.SSSXXX
- dd/MMM/yyyy:HH:mm:ss with input 28/Jan/2021:14:58:00 +0100 ->
dd/MMM/yyyy:HH:mm:ss X
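
The two examples can be checked with a small sketch like the one below. The class and method names are hypothetical, and the use of Locale.ENGLISH is an assumption made so that "Jan" is recognized. My understanding is that SimpleDateFormat.parse(String) ignores trailing text it did not consume, while DateTimeFormatter.parse requires the whole input to match, which would explain both cases.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;
import java.util.Locale;

// Hypothetical helper for illustration only, not part of NiFi.
class PatternCompatSketch {
    // Lenient, case-insensitive formatter built as described in the post.
    static DateTimeFormatter lenient(String pattern) {
        return new DateTimeFormatterBuilder()
                .parseLenient()
                .parseCaseInsensitive()
                .appendPattern(pattern)
                .toFormatter(Locale.ENGLISH);
    }

    // True if SimpleDateFormat parses the input with this pattern.
    static boolean legacyAccepts(String pattern, String input) {
        try {
            new SimpleDateFormat(pattern, Locale.ENGLISH).parse(input);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    // True if the lenient DateTimeFormatter parses the whole input.
    static boolean modernAccepts(String pattern, String input) {
        try {
            lenient(pattern).parse(input);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }
}
```

With these helpers, legacyAccepts should return true for both old patterns, while modernAccepts is expected to return true only for the adjusted patterns (ending in XXX and " X" respectively).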

Because of these differences, and after discussions with
@exceptionfactory and @turcsanyip, we see two options for handling this
issue:
1. Modify the implementation of the current format / toDate functions.
This won't clutter the API, but users will need to adjust some date
patterns (the changes will be described in tests and a migration guide).
An additional question is which version this should be introduced in.
2. Add new formatDateTime / toDateTime functions (using
DateTimeFormatter) next to the existing format / toDate functions (using
SimpleDateFormat, to be deprecated). This won't break flows that use the
current functions, but it adds some complexity to the API until the
deprecated functions are removed.

What do you guys think about both options?

Cheers,
Arek