Posted to issues@nifi.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2020/06/04 21:25:00 UTC

[jira] [Commented] (NIFI-7476) Allow users to configure FlowFile Concurrency on a Process Group

    [ https://issues.apache.org/jira/browse/NIFI-7476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17126194#comment-17126194 ] 

ASF subversion and git services commented on NIFI-7476:
-------------------------------------------------------

Commit 359fd3ff299c6abb7e7b4d5dfb99e48570aeede5 in nifi's branch refs/heads/master from Mark Payne
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=359fd3f ]

NIFI-7476: Implemented FlowFileGating / FlowFileConcurrency at the ProcessGroup level
Added FlowFileOutboundPolicy to ProcessGroups and updated LocalPort to make use of it
Persisted FlowFile Concurrency and FlowFile Output Policy to flow.xml.gz and included in flow fingerprint
Added configuration for FlowFile concurrency and outbound policy to UI for configuration of Process Groups
Added system tests. Fixed a couple of bugs that were found
Fixed a couple of typos in the RecordPath guide

Signed-off-by: Pierre Villard <pi...@gmail.com>

This closes #4306.


> Allow users to configure FlowFile Concurrency on a Process Group
> ----------------------------------------------------------------
>
>                 Key: NIFI-7476
>                 URL: https://issues.apache.org/jira/browse/NIFI-7476
>             Project: Apache NiFi
>          Issue Type: New Feature
>          Components: Core Framework, Core UI, Documentation & Website
>            Reporter: Mark Payne
>            Assignee: Mark Payne
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The Wait/Notify processors are used quite heavily. These processors are very powerful and allow for many different use cases. However, offering this power is done at the expense of making the processors difficult to configure.
> The most common use case, it seems, is to simply allow a Process Group to process only a single FlowFile at a time. We see questions about how to accomplish this fairly frequently in Slack and on the mailing list.
> I propose that we add a new feature to NiFi so that when a user configures a Process Group, they can configure the FlowFile Concurrency: either unbounded (which is the current behavior) or a single FlowFile at a time on each node. In the latter case, only a single FlowFile will be ingested by a Local Input Port, and no more FlowFiles will be ingested as long as there is data queued in the Process Group. Once all data has left the Process Group, the next FlowFile will be allowed through.
> This has several advantages over the Wait/Notify pair of Processors. Firstly, there's no need to create two Processors and ensure that they are used together properly. Secondly, there aren't a lot of properties to configure. Thirdly, implementing this at the framework level and with limited features means the implementation can be much simpler than that of Wait/Notify, which means it is much easier to maintain.
> Additionally, a related concept can be easily introduced: the notion of a FlowFile Outbound Policy. This is analogous to the FlowFile Concurrency but is related to Output Ports. Here, the user could configure the group such that data is transferred out of the Process Group as soon as it's available (which is the current behavior) or is transferred as a batch. In the batch mode, the Output Ports would not transfer any data out of the Process Group until all FlowFiles are queued up at an Output Port (i.e., all processing has finished).
> This allows for very simple configuration for an oft-requested capability: the ability to perform some action only after processing of a batch of data has completed.
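
The gating behavior described above can be illustrated with a small toy model. This is only a sketch in Python, not NiFi's Java implementation: the class ProcessGroupSketch, its method names, and the in-memory lists are all hypothetical and exist purely to show how "single FlowFile at a time" ingestion and "batch" output interact.

```python
from collections import deque


class ProcessGroupSketch:
    """Toy model of a Process Group. Hypothetical names; not NiFi's API."""

    def __init__(self, single_flowfile=True, batch_output=True):
        self.single_flowfile = single_flowfile  # False models "unbounded"
        self.batch_output = batch_output        # False models "as soon as available"
        self.input_queue = deque()  # FlowFiles waiting outside the Input Port
        self.in_flight = []         # FlowFiles still being processed in the group
        self.at_output = []         # FlowFiles queued at an Output Port

    def offer(self, flowfile):
        """Queue a FlowFile in front of the group's Input Port."""
        self.input_queue.append(flowfile)

    def ingest(self):
        """Input Port: admit FlowFiles as the concurrency setting allows."""
        admitted = []
        while self.input_queue:
            group_has_data = bool(self.in_flight or self.at_output)
            if self.single_flowfile and group_has_data:
                break  # wait until all data has left the group
            flowfile = self.input_queue.popleft()
            self.in_flight.append(flowfile)
            admitted.append(flowfile)
        return admitted

    def split(self, flowfile, children):
        """Simulate a processor that replaces one FlowFile with several."""
        self.in_flight.remove(flowfile)
        self.in_flight.extend(children)

    def finish(self, flowfile):
        """Simulate a FlowFile finishing processing and reaching an Output Port."""
        self.in_flight.remove(flowfile)
        self.at_output.append(flowfile)

    def release(self):
        """Output Port: transfer data out according to the outbound policy."""
        if self.batch_output and self.in_flight:
            return []  # batch mode: hold everything until processing is done
        released, self.at_output = self.at_output, []
        return released


group = ProcessGroupSketch(single_flowfile=True, batch_output=True)
group.offer("a")
group.offer("b")
first = group.ingest()        # only "a" is admitted; "b" waits outside
group.split("a", ["a1", "a2"])
group.finish("a1")
held = group.release()        # nothing leaves while "a2" is still in flight
group.finish("a2")
batch = group.release()       # both children leave together
second = group.ingest()       # the group is empty again, so "b" is admitted
```

With both settings switched off, the same model admits every queued FlowFile at once and releases each one as soon as it reaches the Output Port, which corresponds to the current behavior mentioned in the description.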



--
This message was sent by Atlassian Jira
(v8.3.4#803005)