Posted to issues@nifi.apache.org by "Dennis Jaheruddin (Jira)" <ji...@apache.org> on 2020/06/02 03:10:00 UTC
[jira] [Created] (NIFI-7501) Generate Flowfile does not scale
Dennis Jaheruddin created NIFI-7501:
---------------------------------------
Summary: Generate Flowfile does not scale
Key: NIFI-7501
URL: https://issues.apache.org/jira/browse/NIFI-7501
Project: Apache NiFi
Issue Type: Improvement
Components: Extensions
Affects Versions: 1.11.4
Reporter: Dennis Jaheruddin
Attachments: generationperformance.xml
One of the purposes of GenerateFlowFile is load testing. Unfortunately, it often appears to become the bottleneck itself, and I have found that it does not scale well.
Example result from my laptop:
I want to generate messages and bring them to a single processor, let's call it processor X.
With 1 concurrent task, a batch size of 1, a message size of 10 MB, and uniqueness set to false, it can generate approximately 2 GB/sec.
Allowing more concurrent tasks or a larger batch size makes no noticeable difference.
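For scale, the throughput figure above can be converted into a flowfile rate. The following is only a back-of-the-envelope sketch: the function name is made up for illustration, and it assumes decimal units (1 MB = 10^6 bytes, 1 GB = 10^9 bytes), which the measurements above do not specify. With binary units the result would be slightly higher.

```python
# Rough conversion of the reported throughput into flowfiles per second.
# Assumption: decimal units; the report does not say which convention applies.
MB = 1000 ** 2
GB = 1000 ** 3


def flowfiles_per_second(throughput_bytes_per_sec, flowfile_size_bytes):
    """Return how many flowfiles of the given size fit into the throughput."""
    return throughput_bytes_per_sec / flowfile_size_bytes


rate = flowfiles_per_second(2 * GB, 10 * MB)
print(f"{rate:.0f} flowfiles/sec, about {1000 / rate:.1f} ms per flowfile")
```

Under these assumptions, approximately 2 GB/sec at 10 MB per message is on the order of 200 flowfiles per second, or about 5 ms per flowfile.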
However, if instead of increasing the batch size I route the success relationship to multiple processors that do 'nothing' (like UpdateAttribute), and then bring the success relationships of all of these to processor X, I can get much more than 2 GB/sec.
In conclusion: I don't appear to be hitting a hardware limit, as I am able to generate the higher number of messages in this inelegant way, but no matter how I set up my GenerateFlowFile processor, it just will not scale. This suggests there may be a smarter way to generate data when uniqueness is not required.
I have attached a template to illustrate my findings.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)