Posted to issues@nifi.apache.org by "Peter Turcsanyi (Jira)" <ji...@apache.org> on 2021/02/04 16:11:03 UTC

[jira] [Resolved] (NIFI-8081) List[S]FTP can miss files when multiple subdirectories are written while listing

     [ https://issues.apache.org/jira/browse/NIFI-8081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Turcsanyi resolved NIFI-8081.
-----------------------------------
    Fix Version/s: 1.14.0
       Resolution: Fixed

> List[S]FTP can miss files when multiple subdirectories are written while listing
> --------------------------------------------------------------------------------
>
>                 Key: NIFI-8081
>                 URL: https://issues.apache.org/jira/browse/NIFI-8081
>             Project: Apache NiFi
>          Issue Type: Improvement
>            Reporter: Tamas Palfy
>            Assignee: Tamas Palfy
>            Priority: Major
>             Fix For: 1.14.0
>
>          Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> ListFTP and ListSFTP scan subdirectories one after the other. Because of this, they can miss files when 'Listing Strategy' is set to 'Tracking Timestamps':
> # Processor starts and finishes listing directory1
> # Processor starts listing directory2
> # file1 arrives in directory1 with ts(timestamp)=1
> # file2 arrives in directory2 (or any other, not yet listed directory) with ts=2
> # Processor finishes listing directory2
> # Processor returns a result that contains file2 (ts=2) but not file1 (ts=1)
> # Processor stores ts=2 as the latest seen timestamp
> # file1 will be filtered out next time (and in every subsequent listing) because its timestamp is less than the latest seen timestamp
> Fix: Leave the 'Tracking Timestamps' behaviour as it is (just update the documentation) and create a new strategy. This strategy checks the current time in each cycle and lists all files that arrived before the current time (but after the previous cycle). It compares file timestamps to the current time, so the comparison needs to be adjusted for the timezone difference between NiFi and the file hosting system.
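
For illustration, the sequence above and the proposed fix can be modelled in a few lines of Python. This is a simplified sketch, not NiFi's code: RemoteFile, tracking_timestamps, time_window and remote_offset are hypothetical names, timestamps are small integers, and the exact window boundaries (inclusive start, exclusive end) are an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RemoteFile:
    path: str
    ts: int  # last-modified time on the remote host (arbitrary integer units)


def tracking_timestamps(listed, latest_seen):
    """Keep files newer than the stored timestamp; return (files, new latest_seen)."""
    fresh = [f for f in listed if f.ts > latest_seen]
    return fresh, max([latest_seen] + [f.ts for f in fresh])


def time_window(listed, previous_cycle, now, remote_offset=0):
    """Keep files whose adjusted timestamp falls in [previous_cycle, now).

    'now' is meant to be read once, before the directory scan starts.
    remote_offset is the assumed clock/timezone offset of the file host.
    """
    return [f for f in listed if previous_cycle <= f.ts - remote_offset < now]


file1 = RemoteFile("directory1/file1", ts=1)
file2 = RemoteFile("directory2/file2", ts=2)

# Cycle 1: directory1 was scanned before file1 arrived, so only file2 is seen.
cycle1_listing = [file2]
# Cycle 2: both files are visible on the remote host.
cycle2_listing = [file1, file2]

# Timestamp tracking: latest_seen jumps to 2, so file1 (ts=1) is never emitted.
emitted1, latest = tracking_timestamps(cycle1_listing, latest_seen=0)
emitted2, latest = tracking_timestamps(cycle2_listing, latest)
print([f.path for f in emitted1 + emitted2])  # ['directory2/file2']

# Window-based sketch: cycle 1 starts at time 0, cycle 2 at time 5.
# file2 is deferred in cycle 1 (ts=2 is not before 0) and both files
# are picked up together in cycle 2.
window1 = time_window(cycle1_listing, previous_cycle=-1, now=0)
window2 = time_window(cycle2_listing, previous_cycle=0, now=5)
print([f.path for f in window1 + window2])  # ['directory1/file1', 'directory2/file2']
```

In this model the window-based variant emits a file one cycle later than timestamp tracking would, but a file that arrives in an already-scanned directory is no longer dropped.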



--
This message was sent by Atlassian Jira
(v8.3.4#803005)