Posted to issues@nifi.apache.org by "Brandon DeVries (JIRA)" <ji...@apache.org> on 2019/03/29 17:47:00 UTC

[jira] [Commented] (NIFI-6157) StandardFunnel transferring too slowly

    [ https://issues.apache.org/jira/browse/NIFI-6157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16805250#comment-16805250 ] 

Brandon DeVries commented on NIFI-6157:
---------------------------------------

I'm leaning towards increasing the FlowFile transfer limit to 100K and adding a property to configure the number of concurrent tasks for funnels/ports.  Making the concurrent tasks configurable seems necessary, because even just increasing the throughput allowed in each onTrigger still results in bursty transfer behavior, which is undesirable.  And the more work the instance is doing, the further apart those bursts are.  "1" is a good default, but it's not always sufficient.

100K as a FlowFile transfer limit seems like a more reasonable number than 10K, which has proven to be too low.  On a system with default back pressure set to non-zero, the back pressure threshold will be the de facto transfer limit, as there generally won't be more room than that on the output queue.  Additionally, the previous behavior was "loop indefinitely while there's work to do".  If the new behavior were "loop indefinitely while you're getting full transfer batches of 1000", that would probably be fine by itself.  Then the limit is essentially just a sanity check... 1M would even be fine.
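
To make the proposed loop concrete, here is a minimal sketch of "keep going while batches come back full, up to a cap".  It is not the actual StandardFunnel code and does not use the NiFi API: the class, method, and parameter names are hypothetical, and plain in-memory queues stand in for connections.  Back pressure is modeled simply as a hard capacity on the output queue.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative only: a stand-in for the funnel transfer loop discussed above.
// All names here are hypothetical; real funnels work with ProcessSession and FlowFiles.
class FunnelTransferSketch {

    // Numbers taken from the discussion: batches of 1000, proposed cap of 100K.
    static final int BATCH_SIZE = 1_000;
    static final int MAX_TRANSFERRED = 100_000;

    /**
     * Moves items from input to output in batches and stops when:
     *  - a batch comes back short (input drained, or no room for a full batch),
     *  - the output queue reaches outputCapacity (assumed model of back pressure),
     *  - maxTransferred items have been moved (the sanity-check cap).
     * Returns the number of items moved in this invocation.
     */
    static <T> int transfer(Deque<T> input, Deque<T> output,
                            int outputCapacity, int batchSize, int maxTransferred) {
        int total = 0;
        while (total < maxTransferred && output.size() < outputCapacity) {
            int room = outputCapacity - output.size();
            int want = Math.min(batchSize, Math.min(room, maxTransferred - total));
            int moved = 0;
            while (moved < want && !input.isEmpty()) {
                output.addLast(input.pollFirst());
                moved++;
            }
            total += moved;
            if (moved < batchSize) {
                break; // short batch: release the thread instead of spinning
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Deque<Integer> in = new ArrayDeque<>();
        for (int i = 0; i < 50_000; i++) {
            in.add(i);
        }
        Deque<Integer> out = new ArrayDeque<>();
        // With an output capacity of 10K, back pressure (not the 100K cap) ends the run.
        int moved = transfer(in, out, 10_000, BATCH_SIZE, MAX_TRANSFERRED);
        System.out.println("moved " + moved + ", remaining " + in.size());
    }
}
```

In this model the amount moved per invocation is bounded by the smaller of the cap and the free space on the output queue, which is why a larger cap mostly acts as a safety net when back pressure is configured.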

> StandardFunnel transferring too slowly
> --------------------------------------
>
>                 Key: NIFI-6157
>                 URL: https://issues.apache.org/jira/browse/NIFI-6157
>             Project: Apache NiFi
>          Issue Type: Improvement
>            Reporter: Brandon DeVries
>            Priority: Major
>
> NIFI-6068 made modifications such that a funnel wouldn't hold on to a TimerDriven thread excessively.  However, now it isn't holding on to the thread long enough...
> Since Funnels and Local Ports are scheduled with the timer driven thread pool, they're competing for threads with all of the other processors on the graph.  In a large flow with a large number of processors, potentially with multiple assigned concurrent tasks, funnels and ports get to run less and less frequently, since they are hard coded to 1 concurrent task.
> I'm open to implementation options, but a couple of possibilities are:
>  * Increase the transferred FlowFile cap from 10K to 100K.  The thread will still be released if fewer than the requested 1000 FlowFiles are moved in a loop, so it won't hold on inappropriately, but it will still have the opportunity to move the files that need to be moved.  Furthermore, if the back pressure on the outgoing relationship is engaged, it will cause the thread to be released.  Effectively, the amount transferred would be limited by the min of 100K and outgoing queue capacity.
>  * Like above, but add a property to specify the max number of FlowFiles transferred per run.  Removing hard coded magic numbers is good... but cluttering nifi.properties is bad, so it's a trade-off.
>  * Increase the number of concurrent threads for funnels / ports.  This probably would want to be a configurable property, as the value should really likely be proportional to the "size" of your flow, whatever that means for the system in question.
>  * Increase the "run duration"... but I don't think I like that.
>  * If session.getQueueSize exceeds some threshold, spin off a new thread to transfer those files... but that could be dangerous.
>  * Create a new thread pool for ports / funnels, so they aren't starved by processors.  Similar to above, but reuses resources.  Still would need to determine the correct size of the pool.  This could be the best answer in theory, but would also require the most code work.
> [~markap14], thoughts?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)