Posted to issues@nifi.apache.org by "Mark Payne (JIRA)" <ji...@apache.org> on 2017/03/08 22:41:38 UTC

[jira] [Created] (NIFI-3577) Provide ability to migrate session from one component to another

Mark Payne created NIFI-3577:
--------------------------------

             Summary: Provide ability to migrate session from one component to another
                 Key: NIFI-3577
                 URL: https://issues.apache.org/jira/browse/NIFI-3577
             Project: Apache NiFi
          Issue Type: New Feature
          Components: Core Framework
            Reporter: Mark Payne
            Assignee: Mark Payne


Quite often, dataflows are created to process a very large volume of small FlowFiles. When this is the case, NiFi's architecture of continually updating the FlowFile Repository, Provenance Repository, and Content Repository for each FlowFile, and then queuing the data in between components and waiting for scheduling to occur, can result in a significant amount of overhead. In many cases, we can improve the handling of these FlowFiles.

Specifically, when we scale up the number of threads on a Processor, the lock contention on the queue can become quite significant. In addition, there is significant overhead in the scheduling mechanism used to schedule Processors.

When a component runs and commits its session, if there is only a single FlowFile in (or the component is a 'source') and only a single FlowFile out, we can often avoid queuing the FlowFile into a FlowFile queue. Instead, we can use the same thread that called Processor 1 to call Processor 2, passing along a specialized session. This specialized session would return only a single FlowFile when session.get() is called: the FlowFile that was transferred by Processor 1. If Processor 2 then transfers this FlowFile to exactly one (uncloned) relationship without creating any child FlowFiles, we can migrate the session further to the next component. This will cut down on both queuing overhead and scheduling overhead and should provide both better efficiency and lower latency.
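
To make the idea concrete, here is a rough, self-contained sketch of the control flow described above. It does not use NiFi's actual classes: FlowFile, MigratedSession, Processor and runChain are hypothetical stand-ins invented for illustration, and "exactly one FlowFile transferred" is used as a simplified proxy for "one uncloned relationship, no children". A real implementation would also have to deal with repository updates, provenance events and rollback, none of which is modeled here.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

final class SessionMigrationSketch {

    // Hypothetical stand-in for a FlowFile (not NiFi's FlowFile interface).
    static final class FlowFile {
        final String id;
        FlowFile(String id) { this.id = id; }
    }

    // Hypothetical specialized session: get() returns the single FlowFile
    // handed over by the upstream component once, then null.
    static final class MigratedSession {
        private FlowFile pending;
        private final List<FlowFile> transferred = new ArrayList<>();

        MigratedSession(FlowFile handedOver) { this.pending = handedOver; }

        FlowFile get() {
            FlowFile f = pending;
            pending = null;
            return f;
        }

        void transfer(FlowFile f) { transferred.add(f); }

        List<FlowFile> transferred() { return transferred; }

        // Simplified migration condition: exactly one FlowFile went out.
        boolean canMigrate() { return transferred.size() == 1; }
    }

    // Hypothetical component contract for this sketch only.
    interface Processor {
        void onTrigger(MigratedSession session);
    }

    // Runs the chain on the calling thread, skipping the intermediate queues
    // for as long as each component emits exactly one FlowFile. Whatever is
    // left over is placed on the (assumed) downstream queue.
    // Returns the number of components invoked on this thread.
    static int runChain(FlowFile input, List<Processor> chain, Queue<FlowFile> queue) {
        FlowFile current = input;
        int invoked = 0;
        for (Processor p : chain) {
            MigratedSession session = new MigratedSession(current);
            p.onTrigger(session);
            invoked++;
            if (!session.canMigrate()) {
                // Fall back to normal queuing; scheduling takes over from here.
                queue.addAll(session.transferred());
                return invoked;
            }
            current = session.transferred().get(0);
        }
        queue.add(current);
        return invoked;
    }

    public static void main(String[] args) {
        Processor passThrough = s -> s.transfer(s.get());
        Queue<FlowFile> queue = new ArrayDeque<>();
        int invoked = runChain(new FlowFile("a"),
                Arrays.asList(passThrough, passThrough, passThrough), queue);
        System.out.println("invoked=" + invoked + ", queued=" + queue.size());
    }
}
```

In this sketch the FlowFile touches a queue only once, at the end of the chain or at the first component that emits zero or multiple FlowFiles, which is where the savings in queuing and scheduling overhead would come from.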



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)