Posted to issues@nifi.apache.org by "Adam Debreceni (Jira)" <ji...@apache.org> on 2020/06/29 07:50:00 UTC

[jira] [Updated] (MINIFICPP-1274) Flow restart could double-spend flowfiles

     [ https://issues.apache.org/jira/browse/MINIFICPP-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam Debreceni updated MINIFICPP-1274:
--------------------------------------
    Description: 
Flowfiles are async-deleted from the FlowFileRepository and no flush happens after shutdown, leaving these marked files in the repository. If we restart the agent, this allows these zombie files to be resurrected and be put back into their last connections. This could cause files to be processed multiple times even if we marked them for deletion.

Solution proposal: flush the FlowFileRepository after shutdown, so all marked files are actually deleted (this won't save us from double-processing flowfiles after a crash)

(also make sure that the FlowFileRepository shutdown happens after no more processors are running)

  was:
Flowfiles are async-deleted from the FlowFileRepository and no flush happens after shutdown leaving these marked files in the repository. If we restart the agent, this allows these zombie files to be resurrected and be put back into their last connections. This could cause files to be processed multiple times even if we marked them for deletion.

Solution proposal: flush the FlowFileRepository after shutdown, so all marked files are actually deleted (this won't save us from double-processing flowfiles after a crash)


> Flow restart could double-spend flowfiles
> -----------------------------------------
>
>                 Key: MINIFICPP-1274
>                 URL: https://issues.apache.org/jira/browse/MINIFICPP-1274
>             Project: Apache NiFi MiNiFi C++
>          Issue Type: Improvement
>            Reporter: Adam Debreceni
>            Priority: Major
>
> Flowfiles are async-deleted from the FlowFileRepository and no flush happens after shutdown, leaving these marked files in the repository. If we restart the agent, this allows these zombie files to be resurrected and be put back into their last connections. This could cause files to be processed multiple times even if we marked them for deletion.
> Solution proposal: flush the FlowFileRepository after shutdown, so all marked files are actually deleted (this won't save us from double-processing flowfiles after a crash)
> (also make sure that the FlowFileRepository shutdown happens after no more processors are running)
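
The failure mode and the proposed fix can be illustrated with a small sketch. This is not the MiNiFi C++ implementation: SketchRepository, its methods, and zombieSurvivesRestart are hypothetical names, and the repository is modelled as a single-threaded in-memory set in which an "async" delete only marks a record until flush() is called. It also assumes the shutdown ordering from the issue, i.e. that the repository is stopped only after no processors are running.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical, single-threaded stand-in for a flowfile repository.
// None of these names come from the MiNiFi C++ code base.
class SketchRepository {
 public:
  // Persist a flowfile record.
  void put(const std::string& id) { persisted_.insert(id); }

  // "Async" delete: the record is only marked; the purge happens later.
  void markForDeletion(const std::string& id) { pending_deletes_.push_back(id); }

  // Purge every marked record from the persisted set.
  void flush() {
    for (const auto& id : pending_deletes_) persisted_.erase(id);
    pending_deletes_.clear();
  }

  void processorStarted() { ++active_processors_; }
  void processorStopped() { --active_processors_; }

  // Shut down the repository. Processors must already be stopped, otherwise
  // they could mark more records after the final flush.
  void stop(bool flush_on_stop) {
    assert(active_processors_ == 0);
    if (flush_on_stop) flush();
  }

  // Records a restarted agent would load back into their connections.
  const std::set<std::string>& recover() const { return persisted_; }

 private:
  std::set<std::string> persisted_;
  std::vector<std::string> pending_deletes_;
  int active_processors_ = 0;
};

// Runs one processor that consumes "ff1", then shuts down and "restarts".
// Returns true if the already-processed "ff1" would be resurrected.
bool zombieSurvivesRestart(bool flush_on_stop) {
  SketchRepository repo;
  repo.put("ff1");
  repo.put("ff2");
  repo.processorStarted();
  repo.markForDeletion("ff1");  // ff1 was fully processed
  repo.processorStopped();
  repo.stop(flush_on_stop);
  return repo.recover().count("ff1") == 1;
}
```

In this model, zombieSurvivesRestart(false) reproduces the reported behaviour (the marked record is still present after a restart), while zombieSurvivesRestart(true) corresponds to the proposed flush on shutdown. As the issue notes, a flush at shutdown does nothing for a crash, where stop() is never reached.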



--
This message was sent by Atlassian Jira
(v8.3.4#803005)