Posted to commits@nifi.apache.org by "Toivo Adams (JIRA)" <ji...@apache.org> on 2015/11/12 20:15:11 UTC

[jira] [Commented] (NIFI-1008) NiFi should swap out FlowFiles to disk even before the session is committed

    [ https://issues.apache.org/jira/browse/NIFI-1008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15002679#comment-15002679 ] 

Toivo Adams commented on NIFI-1008:
-----------------------------------

Mark,

Are you currently working on this,
or do you plan to resolve it soon?

If not, I may try to tackle it.
Do you have any hints or new ideas on how to start?

Thanks
Toivo


> NiFi should swap out FlowFiles to disk even before the session is committed
> ---------------------------------------------------------------------------
>
>                 Key: NIFI-1008
>                 URL: https://issues.apache.org/jira/browse/NIFI-1008
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Core Framework
>            Reporter: Mark Payne
>
> Currently, NiFi will swap out FlowFiles if there are a large number in a FlowFile Queue. This is done to avoid running out of JVM heap space. However, if we have a simple flow like GetFile -> SplitText and GetFile pulls in a large file, SplitText can quickly cause an OutOfMemoryError. This is not because it buffers the content of the FlowFile in memory, but because it holds millions of FlowFile objects in memory. We can do better.
> When we call session.transfer for the FlowFiles, once we hit a magical threshold (say 10,000), we should swap those FlowFiles to disk, and the session should transfer them to the queue as already "swapped out" FlowFiles, rather than buffering all of them in memory and swapping them out only once they land in the queue.
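
The threshold idea in the description above could be sketched roughly as follows. This is only an illustration under stated assumptions, not NiFi's actual ProcessSession or swap manager code: the class SwappingTransferBuffer, its methods, and the use of plain string IDs written to temp files are all hypothetical stand-ins for FlowFile records and real swap files.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch: buffers transferred records in memory and spills
 * them to a temp file whenever the in-memory count reaches a threshold.
 * None of these names come from NiFi itself.
 */
class SwappingTransferBuffer {
    private final int swapThreshold;
    private final List<String> inMemory = new ArrayList<>();
    private final List<Path> swapFiles = new ArrayList<>();
    private long swappedCount = 0;

    SwappingTransferBuffer(int swapThreshold) {
        if (swapThreshold < 1) {
            throw new IllegalArgumentException("swapThreshold must be >= 1");
        }
        this.swapThreshold = swapThreshold;
    }

    /** Stand-in for session.transfer: record one FlowFile ID. */
    void transfer(String flowFileId) {
        inMemory.add(flowFileId);
        if (inMemory.size() >= swapThreshold) {
            swapOut();
        }
    }

    /** Writes the buffered IDs to a new temp file and frees the heap copy. */
    private void swapOut() {
        try {
            Path file = Files.createTempFile("swap-sketch-", ".txt");
            file.toFile().deleteOnExit();
            Files.write(file, inMemory, StandardCharsets.UTF_8);
            swapFiles.add(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        swappedCount += inMemory.size();
        inMemory.clear();
    }

    int inMemoryCount() { return inMemory.size(); }

    long swappedCount() { return swappedCount; }

    /** At commit time, the queue would receive these files plus the in-memory remainder. */
    List<Path> swapFiles() { return new ArrayList<>(swapFiles); }
}
```

With this shape, the heap never holds more than swapThreshold records per session, regardless of how many FlowFiles a processor such as SplitText emits. A real implementation would also need to handle rollback (deleting the swap files) and serialize full FlowFile attributes, which this sketch leaves out.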



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)