Posted to issues@nifi.apache.org by "Mark Payne (JIRA)" <ji...@apache.org> on 2019/02/05 16:27:00 UTC

[jira] [Reopened] (NIFI-5997) If swap file written but FlowFile Repository fails to update, connection queue counts wrong and flowfiles are duplicated upon restart

     [ https://issues.apache.org/jira/browse/NIFI-5997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Payne reopened NIFI-5997:
------------------------------

I am reopening this because I found an edge case. If we stop NiFi with data swapped out, then copy one of the swap files, leaving the copy in the same directory but changing the first digit of its filename (filenames matter), we end up in a situation where the size of the queue is incremented by the counts in the swap file but the FlowFiles in it are never swapped in.

> If swap file written but FlowFile Repository fails to update, connection queue counts wrong and flowfiles are duplicated upon restart
> -------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: NIFI-5997
>                 URL: https://issues.apache.org/jira/browse/NIFI-5997
>             Project: Apache NiFi
>          Issue Type: Bug
>            Reporter: Mark Payne
>            Assignee: Mark Payne
>            Priority: Blocker
>             Fix For: 1.9.0
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> If a queue writes out a Swap File but the FlowFile Repository then throws an Exception when attempting to update, the size of the queue increases by 10,000 FlowFiles (the number of FlowFiles to be written to the swap file), along with the corresponding size of those FlowFiles. The Swap File is also left on disk even though the FlowFile Repo was not updated, so on restart those FlowFiles exist in the FlowFile Repo as well as in the Swap File, and we end up with two of the same FlowFile. This can cause odd behavior because two FlowFiles exist with the same ID, and the counts on the queues are very wrong, which also causes a lot of confusion.
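
To make the failure mode in the description easier to follow, here is a minimal sketch of it in Java. It is a simplified model, not NiFi's actual code: `SwapOutSketch`, `SketchQueue`, `Repo`, `recordSwapOut`, `swapOutBuggy`, and `swapOutGuarded` are all hypothetical names, and a set of strings stands in for swap files on disk. The guarded variant shows one plausible way to keep counts and files consistent (roll back the swap file if the repository update fails); it is an assumption for illustration and may not match the fix that was committed.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SwapOutSketch {

    /** Hypothetical stand-in for the FlowFile Repository; not NiFi's real interface. */
    interface Repo {
        void recordSwapOut(String swapLocation, List<Long> flowFileIds) throws Exception;
    }

    /** Hypothetical queue: FlowFile IDs held in memory plus a count of swapped-out FlowFiles. */
    static class SketchQueue {
        final List<Long> active = new ArrayList<>();
        final Set<String> swapFiles = new HashSet<>(); // stands in for swap files on disk
        int swappedCount = 0;

        int size() {
            return active.size() + swappedCount;
        }

        /** Models the reported bug: count and swap file survive a failed repository update. */
        void swapOutBuggy(String location, int n, Repo repo) {
            List<Long> batch = new ArrayList<>(active.subList(0, n));
            swapFiles.add(location);   // swap file written
            swappedCount += n;         // count bumped before the repository confirms
            try {
                repo.recordSwapOut(location, batch);
                active.subList(0, n).clear();
            } catch (Exception e) {
                // FlowFiles stay active AND are counted as swapped; the swap file remains.
            }
        }

        /** Assumed remedy: only change counts after the repository update succeeds. */
        void swapOutGuarded(String location, int n, Repo repo) {
            List<Long> batch = new ArrayList<>(active.subList(0, n));
            swapFiles.add(location);   // swap file written
            try {
                repo.recordSwapOut(location, batch);
            } catch (Exception e) {
                swapFiles.remove(location); // roll back the swap file, leave the queue untouched
                return;
            }
            active.subList(0, n).clear();
            swappedCount += n;
        }
    }

    public static void main(String[] args) {
        Repo failing = (location, ids) -> { throw new Exception("repository update failed"); };

        SketchQueue buggy = new SketchQueue();
        SketchQueue guarded = new SketchQueue();
        for (long id = 0; id < 5; id++) {
            buggy.active.add(id);
            guarded.active.add(id);
        }

        buggy.swapOutBuggy("a.swap", 3, failing);
        guarded.swapOutGuarded("a.swap", 3, failing);

        System.out.println("buggy size:   " + buggy.size() + ", swap files: " + buggy.swapFiles);
        System.out.println("guarded size: " + guarded.size() + ", swap files: " + guarded.swapFiles);
    }
}
```

In this model the buggy queue reports 8 FlowFiles for 5 real ones and keeps a swap file that duplicates FlowFiles still known to the repository, which mirrors the inflated counts and duplicates described above. The guarded queue stays at 5 with no leftover swap file.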



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)