Posted to issues@nifi.apache.org by "Mark Payne (Jira)" <ji...@apache.org> on 2020/06/18 16:53:00 UTC

[jira] [Created] (NIFI-7557) Cache large/common FlowFile attributes when restoring FlowFile Repository

Mark Payne created NIFI-7557:
--------------------------------

             Summary: Cache large/common FlowFile attributes when restoring FlowFile Repository
                 Key: NIFI-7557
                 URL: https://issues.apache.org/jira/browse/NIFI-7557
             Project: Apache NiFi
          Issue Type: Improvement
          Components: Core Framework
            Reporter: Mark Payne
            Assignee: Mark Payne


When NiFi is restarted, it restores FlowFiles from the FlowFile Repository. Each attribute on a FlowFile is read from disk and put into a HashMap. Sometimes a Processor adds the same large attribute to every FlowFile that it sees, and storing those FlowFiles then takes much more heap after a restart than it does while NiFi is running. This is because, while NiFi is running, the Processor holds the value of that attribute as a single String object and adds that same String to the attribute HashMap of every FlowFile.

However, on restart, NiFi deserializes a byte stream to come up with the attribute value. As a result, each FlowFile that has that attribute value ends up with its own String object, even though the same value is repeated many times.

Consequently, a huge amount of heap may be used on restart, causing NiFi to encounter an OutOfMemoryError (OOME) when attempting to restore the FlowFile Repository.
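
The caching idea named in the summary could be sketched roughly as follows. This is only an illustration of the technique, not NiFi's actual implementation: the class AttributeValueCache, its decode method, and the maxEntries and minLength parameters are all hypothetical names chosen for the example. It assumes attribute values arrive as UTF-8 bytes and that the restore logic runs on a single thread.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical helper (not part of NiFi): a bounded LRU cache that lets
 * equal attribute values share one String instance after deserialization.
 * Not thread-safe; assumes a single restore thread.
 */
final class AttributeValueCache {
    private final int minLength;
    private final Map<String, String> cache;

    AttributeValueCache(final int maxEntries, final int minLength) {
        this.minLength = minLength;
        // accessOrder=true keeps entries in least-recently-used order,
        // so removeEldestEntry evicts the value that was used longest ago.
        this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(final Map.Entry<String, String> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Decodes the bytes and returns a cached equal String if one exists. */
    String decode(final byte[] serialized) {
        final String value = new String(serialized, StandardCharsets.UTF_8);
        if (value.length() < minLength) {
            return value; // short values are returned without caching
        }
        // If an equal String was seen before, reuse it; the freshly decoded
        // copy then becomes garbage and can be collected right away.
        final String existing = cache.putIfAbsent(value, value);
        return existing == null ? value : existing;
    }
}
```

With a helper like this, the restore code would call decode for each attribute value before putting it into the FlowFile's attribute HashMap, so FlowFiles carrying the same large value would reference one String rather than one copy each. The temporary String is still allocated on every call, so the saving is in retained heap, not in allocation rate. The cache is bounded so that it cannot itself grow without limit when values are mostly unique.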



--
This message was sent by Atlassian Jira
(v8.3.4#803005)