Posted to issues@nifi.apache.org by "Joseph Percivall (JIRA)" <ji...@apache.org> on 2016/12/19 17:08:58 UTC

[jira] [Commented] (NIFI-3225) Abstract Processor type that batches session.get() and session.commit() calls

    [ https://issues.apache.org/jira/browse/NIFI-3225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15761716#comment-15761716 ] 

Joseph Percivall commented on NIFI-3225:
----------------------------------------

Calling "get(X)" with a hardcoded number is how stateless processors used to do things[1] before the ability to set the "Run Duration"[2] was added. The run duration takes care of batching together the checkpoint and commit. Also, if there is setup that isn't dependent on the attributes or content and can be reused, shouldn't it already be done in OnScheduled?

Is there a specific use-case you have come across recently that warrants a need for this?


[1] https://github.com/apache/nifi/blob/0.x/nifi-nar-bundles/nifi-update-attribute-bundle/nifi-update-attribute-processor/src/main/java/org/apache/nifi/processors/attributes/UpdateAttribute.java#L338
[2] https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#scheduling-tab

> Abstract Processor type that batches session.get() and session.commit() calls
> -----------------------------------------------------------------------------
>
>                 Key: NIFI-3225
>                 URL: https://issues.apache.org/jira/browse/NIFI-3225
>             Project: Apache NiFi
>          Issue Type: Improvement
>            Reporter: Bryan Rosander
>            Assignee: Bryan Rosander
>            Priority: Minor
>
> For processors that are stateless and support batching, it should be safe to get and process multiple input FlowFiles for each onTrigger() call.
> This should amortize the cost of session.get(), session.checkpoint(), and session.commit(), as well as any setup in onTrigger() that isn't dependent on the FlowFiles' attributes or content.
> An AbstractBatchingProcessor type should reduce boilerplate code in candidate processors and encourage uniform configurability via a property to control batch size.
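
To make the proposal above concrete, here is a rough, self-contained Java sketch of the batching pattern it describes. It deliberately does not use the NiFi API: FakeSession, the String "FlowFiles", the process() hook, and UpperCaseProcessor are all hypothetical stand-ins invented for illustration, and only the name AbstractBatchingProcessor comes from the issue description. The idea shown is one get(batchSize) and one commit() per onTrigger() call, with the batch size supplied as configuration rather than hardcoded.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

// Illustration only: these types are hypothetical stand-ins, NOT NiFi's ProcessSession API.
final class BatchingSketch {

    /** Minimal stand-in for a session: an input queue of String "FlowFiles". */
    static final class FakeSession {
        private final Queue<String> input = new ArrayDeque<>();
        private final List<String> pending = new ArrayList<>();
        final List<String> output = new ArrayList<>();
        int commits = 0;

        FakeSession(List<String> items) {
            input.addAll(items);
        }

        /** Pulls up to max items from the input queue (may return fewer, or none). */
        List<String> get(int max) {
            List<String> batch = new ArrayList<>();
            while (batch.size() < max && !input.isEmpty()) {
                batch.add(input.poll());
            }
            return batch;
        }

        /** Stages an item; it only becomes visible in output after commit(). */
        void transfer(String item) {
            pending.add(item);
        }

        /** Publishes everything staged so far and counts the commit. */
        void commit() {
            output.addAll(pending);
            pending.clear();
            commits++;
        }
    }

    /** Hypothetical base class: one get() and one commit() per onTrigger() call. */
    abstract static class AbstractBatchingProcessor {
        private final int batchSize;

        AbstractBatchingProcessor(int batchSize) {
            if (batchSize < 1) {
                throw new IllegalArgumentException("batchSize must be >= 1");
            }
            this.batchSize = batchSize;
        }

        /** Per-item work; assumed stateless, so items can be handled independently. */
        protected abstract String process(String item);

        /** Processes up to batchSize items and returns how many were handled. */
        final int onTrigger(FakeSession session) {
            List<String> batch = session.get(batchSize);
            if (batch.isEmpty()) {
                return 0; // nothing queued, so skip the commit entirely
            }
            for (String item : batch) {
                session.transfer(process(item));
            }
            session.commit(); // a single commit covers the whole batch
            return batch.size();
        }
    }

    /** Example subclass: the only code a candidate processor would need to write. */
    static final class UpperCaseProcessor extends AbstractBatchingProcessor {
        UpperCaseProcessor(int batchSize) {
            super(batchSize);
        }

        @Override
        protected String process(String item) {
            return item.toUpperCase();
        }
    }

    public static void main(String[] args) {
        FakeSession session = new FakeSession(Arrays.asList("a", "b", "c", "d", "e"));
        AbstractBatchingProcessor processor = new UpperCaseProcessor(2);
        while (processor.onTrigger(session) > 0) {
            // keep triggering until the input queue is drained
        }
        // Five items with a batch size of 2 take three commits instead of five.
        System.out.println(session.output + " commits=" + session.commits);
    }
}
```

In this toy setup the commit count drops from one per item to one per batch, which is the amortization the description is after. Whether a dedicated base class is needed in NiFi itself is a separate question, since the comment above points out that the Run Duration setting already batches the checkpoint and commit.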



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)