Posted to issues@nifi.apache.org by "Michael Moser (JIRA)" <ji...@apache.org> on 2017/04/10 19:04:41 UTC

[jira] [Commented] (NIFI-3686) EOFException on swap in causes tight loop in polling for flowfiles

    [ https://issues.apache.org/jira/browse/NIFI-3686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15963355#comment-15963355 ] 

Michael Moser commented on NIFI-3686:
-------------------------------------

Note: I didn't encounter this on a production system; I simulated it by truncating a swap file while NiFi was not running.

I have a simple code patch to StandardFlowFileQueue that will remove the swap contents from the swapQueue if the swap summary is valid.  This fixes the user experience: the EOFException ERROR is logged to nifi-app.log, the queue size then goes to 0, and the processor reading from this queue is no longer triggered.  On the next NiFi restart, if the corrupt swap file is still there, the EOFException ERROR happens again.  I'm not sure this is the desired approach, though.
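
For illustration, here is a minimal Java sketch of the idea behind the patch. It is not the patch itself and it does not use NiFi's API: SwapQueueSketch, SwapFile, and the file layout (an int count followed by that many long ids) are all hypothetical names and assumptions made up for this example. The point is only that when a swap file cannot be read, its expected count is dropped from the queue size so that the queue stops reporting FlowFiles it can never deliver.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in for a FlowFile queue. Not NiFi's StandardFlowFileQueue.
class SwapQueueSketch {

    // One swapped-out batch: raw bytes plus the record count the queue believes it holds.
    static final class SwapFile {
        final byte[] contents;
        final int expectedCount;

        SwapFile(byte[] contents, int expectedCount) {
            this.contents = contents;
            this.expectedCount = expectedCount;
        }
    }

    private final Deque<SwapFile> swapFiles = new ArrayDeque<>();
    private final List<Long> active = new ArrayList<>();
    private int swappedCount = 0;

    void addSwapFile(byte[] contents, int expectedCount) {
        swapFiles.add(new SwapFile(contents, expectedCount));
        swappedCount += expectedCount;
    }

    // Size as a user would see it: records in memory plus records believed to be swapped out.
    int size() {
        return active.size() + swappedCount;
    }

    // Swap in the next file. A corrupt file is logged and forgotten instead of
    // being counted forever.
    void swapIn() {
        SwapFile next = swapFiles.poll();
        if (next == null) {
            return;
        }
        // Whether or not the read succeeds, this file no longer counts as swapped out.
        swappedCount -= next.expectedCount;
        try {
            active.addAll(deserialize(next.contents));
        } catch (EOFException e) {
            System.err.println("Failed to swap in; swap file appears to be corrupt: " + e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // All-or-nothing read: returns every record or throws (EOFException if truncated).
    static List<Long> deserialize(byte[] contents) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(contents))) {
            int count = in.readInt();
            List<Long> ids = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                ids.add(in.readLong());
            }
            return ids;
        }
    }

    // Helper for building example swap files in the assumed layout.
    static byte[] serialize(long... ids) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(ids.length);
            for (long id : ids) {
                out.writeLong(id);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

In this sketch, adding a truncated three-record file makes size() report 3, and a single swapIn() call logs the error and brings size() back to 0. As with the patch, nothing here deletes the corrupt file, so a restart that re-registers it would hit the same error again.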

[~markap14] if you can ponder this, please let me know if I should submit this as a PR or if it should be resolved in another way.  Thanks!

> EOFException on swap in causes tight loop in polling for flowfiles
> ------------------------------------------------------------------
>
>                 Key: NIFI-3686
>                 URL: https://issues.apache.org/jira/browse/NIFI-3686
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 1.1.1
>            Reporter: Michael Moser
>
> If the flowfile_repository partition fills to 100% while swapping files out to a new swap file, then this swap file becomes corrupt (partially written).  When NiFi tries to swap this file in, an EOFException happens and we get the following ERROR, which is nice.
> 2017-04-10 18:02:58,855 ERROR [Timer-Driven Process Thread-3] o.a.n.controller.StandardFlowFileQueue Failed to swap in FlowFiles from Swap File /local/mwmoser/nifi-1.2.0-SNAPSHOT/./flowfile_repository/swap/1491574631605-2840b630-57fc-4f49-615b-0b37d77bec66-5dbc0ad0-921c-483e-a05d-5c65d014fa48.swap; Swap File appears to be corrupt!
> However, once all other dataflow stops, the queue now shows 10000 flowfiles in it.  The processor reading from this queue constantly has its onTrigger() called, and session.get() polls the queue and gets 0 files returned.  This happens in a tight loop, with no other errors.
> To a user it appears that the processor is doing lots of work but just not processing those 10000 files.  The error message above only appears once in the nifi-app.log, so you don't see anything wrong if you tail the log.
> When you restart NiFi, the error message above appears again, but the user experience of 10000 files not processing remains.
> The new SchemaSwapDeserializer does not (and perhaps cannot) implement the IncompleteSwapFileException that the old SimpleSwapDeserializer does.  So, reading a swap file is currently all-or-nothing.
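
To make the all-or-nothing point concrete, here is a small Java sketch contrasting the two read strategies. Everything in it is hypothetical: PartialSwapReadException, PartialReadSketch, and the file layout (an int count followed by that many long ids) are invented for the example and are not NiFi classes or the real swap file format. It only shows the general technique of handing back whatever was read before the end of a truncated file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical exception that carries the records read before the file ended.
class PartialSwapReadException extends IOException {
    private final List<Long> partial;

    PartialSwapReadException(List<Long> partial, EOFException cause) {
        super("swap file truncated after " + partial.size() + " records", cause);
        this.partial = partial;
    }

    List<Long> getPartial() {
        return partial;
    }
}

class PartialReadSketch {

    // Builds an example file in the assumed layout: int count, then that many longs.
    static byte[] write(long... ids) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(ids.length);
            for (long id : ids) {
                out.writeLong(id);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // All-or-nothing: a truncated file surfaces as a bare EOFException and
    // the records already read are lost to the caller.
    static List<Long> readAllOrNothing(byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int count = in.readInt();
            List<Long> ids = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                ids.add(in.readLong());
            }
            return ids;
        }
    }

    // Partial recovery: on EOF, wrap the complete records read so far in the
    // exception so the caller can still use them.
    static List<Long> readKeepingPartial(byte[] data) throws IOException {
        List<Long> ids = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                ids.add(in.readLong());
            }
        } catch (EOFException e) {
            throw new PartialSwapReadException(ids, e);
        }
        return ids;
    }
}
```

This works here only because each record is a fixed-size value that is either fully present or not. Whether a schema-based format can offer the same guarantee is exactly the open question raised in the issue.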



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)