Posted to commits@nifi.apache.org by "Mark Payne (JIRA)" <ji...@apache.org> on 2016/02/29 22:06:18 UTC

[jira] [Commented] (NIFI-1577) NiFi holds open too many files when using a Run Duration > 0 ms and calling session.append

    [ https://issues.apache.org/jira/browse/NIFI-1577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172612#comment-15172612 ] 

Mark Payne commented on NIFI-1577:
----------------------------------

The easiest way I've found to test this is to run a Processor that supports batching and calls session.append(), such as ListenSyslog. With the Run Duration set to 25 ms and a fairly large amount of data pushed to it, the logs start filling with "Too many open files" errors. Once this patch is applied, those errors go away.

Unfortunately, the patch does not lend itself well to unit tests: a test would have to inspect a lot of private internal state of the StandardProcessSession, which would make it very brittle. However, closing the streams on checkpoint loses nothing. The session maps each ContentClaim (which belongs to exactly one RepositoryRecord in the 'records' map) to an OutputStream, and checkpoint() clears the 'records' map. Once that map is cleared, the OutputStreams can no longer be reached through it, so they were being held open without any benefit.
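
To make the reasoning above concrete, here is a minimal, self-contained sketch of the idea. It is not the code from the attached patch and does not use NiFi's real classes: SketchClaim, TrackedStream, SketchSession and openStreamCount() are hypothetical names invented for illustration, and the real StandardProcessSession is considerably more involved. The sketch only assumes what is described above: append() keeps one stream open per claim, and checkpoint() clears the 'records' map.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a content claim; not NiFi's ContentClaim.
final class SketchClaim {
    final String id;
    SketchClaim(String id) { this.id = id; }
}

// In-memory stream that remembers whether close() was called,
// standing in for a file handle held by the content repository.
final class TrackedStream extends ByteArrayOutputStream {
    private boolean closed;

    @Override
    public void close() throws IOException {
        closed = true;
        super.close();
    }

    boolean isClosed() { return closed; }
}

// Simplified, illustrative session. Assumption: one claim per FlowFile id.
final class SketchSession {
    // FlowFile id -> claim; plays the role of the 'records' map.
    private final Map<String, SketchClaim> records = new HashMap<>();
    // Claim -> stream kept open so later append() calls can reuse it.
    private final Map<SketchClaim, TrackedStream> appendableStreams = new HashMap<>();

    TrackedStream append(String flowFileId, byte[] data) {
        SketchClaim claim = records.computeIfAbsent(flowFileId, SketchClaim::new);
        TrackedStream out = appendableStreams.computeIfAbsent(claim, c -> new TrackedStream());
        out.write(data, 0, data.length);
        return out;
    }

    void checkpoint() {
        // Close every stream left open by append() before forgetting the records.
        // Without this loop the streams would stay open but become unreachable
        // through 'records', which is the leak described in this issue.
        for (OutputStream out : appendableStreams.values()) {
            try {
                out.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        appendableStreams.clear();
        records.clear();
    }

    int openStreamCount() { return appendableStreams.size(); }
}
```

In this sketch, two append() calls for the same FlowFile id share one stream, and after checkpoint() every tracked stream reports isClosed() and openStreamCount() drops to zero. Deleting the loop in checkpoint() reproduces the leak in miniature: the streams are never closed, yet nothing in the session can reach them any more.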

> NiFi holds open too many files when using a Run Duration > 0 ms and calling session.append
> ------------------------------------------------------------------------------------------
>
>                 Key: NIFI-1577
>                 URL: https://issues.apache.org/jira/browse/NIFI-1577
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>            Reporter: Mark Payne
>         Attachments: 0001-NIFI-1577-Close-any-streams-that-are-left-open-for-a.patch
>
>
> If a Processor calls ProcessSession.append() and has a Run Duration scheduled > 0 ms, we quickly end up with "Too many open files" exceptions.
> This appears to be due to the fact that calling append() holds the content repository's stream open so that the session can keep appending to it, but on checkpoint() the session does not close these streams. It should close these streams on checkpoint, since the Processor is no longer allowed to reference these FlowFiles anyway at that point.


