
Posted to users@nifi.apache.org by "V, Prashanth (Nokia - IN/Bangalore)" <pr...@nokia.com> on 2018/06/13 08:10:02 UTC

RE: NiFi Performance Analysis Clarification

Hi Team,

I am doing some performance testing in NiFi. The workflow is GetSFTP -> update -> PutKafka. I want to tune my setup to achieve high throughput without much queuing.
But my average throughput drops while the FlowFile Repository is checkpointing. I believe a stop-the-world pause is happening during that time.

I can read roughly 100MB/s from SFTP and send almost the same to Kafka. But every 2 mins, execution stops completely. See the logs below:

2018-06-13 13:24:21,160 INFO [pool-10-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Initiating checkpoint of FlowFile Repository
2018-06-13 13:24:49,420 INFO [Write-Ahead Local State Provider Maintenance] org.wali.MinimalLockingWriteAheadLog org.wali.MinimalLockingWriteAheadLog@cf82c58 checkpointed with 23 Records and 0 Swap Files in 39353 milliseconds (Stop-the-world time = 3 milliseconds, Clear Edit Logs time = 3 millis), max Transaction ID 68
2018-06-13 13:25:00,165 INFO [pool-10-thread-1] o.a.n.wali.SequentialAccessWriteAheadLog Checkpointed Write-Ahead Log with 7 Records and 0 Swap Files in 39002 milliseconds (Stop-the-world time = 28275 milliseconds), max Transaction ID 316705
2018-06-13 13:25:00,169 INFO [pool-10-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Successfully checkpointed FlowFile Repository with 7 records in 39008 milliseconds

I think all processors go idle for 39 seconds ☹. Please guide me on how to tune this.
I changed the heap memory to 32G [I am testing on a 12-core, 48G machine]. I disabled content-repository archiving. All other properties remain the same.

Thanks & Regards,
Prashanth
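
A minimal sketch of how the stop-the-world figure could be pulled out of checkpoint log lines like the ones above, so that pauses can be tracked across a test run. The regular expression is only an assumption based on the wording of the log lines quoted in this message; other NiFi versions may phrase them differently.

```python
import re

# Matches the "in N milliseconds (Stop-the-world time = M milliseconds" fragment
# as it appears in the log lines quoted in this thread (assumed format).
CHECKPOINT_RE = re.compile(
    r"in (?P<total>\d+) milliseconds \(Stop-the-world time = (?P<stw>\d+) milliseconds"
)


def checkpoint_times(line):
    """Return (total_ms, stop_the_world_ms) for a checkpoint log line, or None."""
    match = CHECKPOINT_RE.search(line)
    if match is None:
        return None
    return int(match.group("total")), int(match.group("stw"))


# Log line copied from the message above.
SAMPLE = (
    "2018-06-13 13:25:00,165 INFO [pool-10-thread-1] "
    "o.a.n.wali.SequentialAccessWriteAheadLog Checkpointed Write-Ahead Log with 7 Records "
    "and 0 Swap Files in 39002 milliseconds (Stop-the-world time = 28275 milliseconds), "
    "max Transaction ID 316705"
)

if __name__ == "__main__":
    print(checkpoint_times(SAMPLE))
```

Lines without a stop-the-world figure, such as the "Successfully checkpointed FlowFile Repository" line, simply return None.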

Re: NiFi Performance Analysis Clarification

Posted by Mark Payne <ma...@hotmail.com>.

Prashanth,

"Will it spread out the stop-the-world time across the intervals? In that case, my average would fall to the same figures, right?"

It's hard to say - you'd have to give it a try and see if it improves. There are a lot of different optimizations, both at the JVM
and the Operating System level, that come into play here. It may give much better performance. Or perhaps worse performance,
but it's certainly worth trying out.

Thanks
-Mark
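
The question of whether a shorter interval merely spreads the pause out can be put as rough arithmetic. This is only a sketch under stated assumptions: the flow is taken to be fully stalled for the whole stop-the-world time, the interval is measured from one checkpoint start to the next, and the function names are made up for illustration. With the figures from this thread (about 100MB/s peak, 28275 ms of stop-the-world time, a "2 mins" interval) the flow would be stalled for roughly 24% of the time, giving an average of about 76MB/s.

```python
def stalled_fraction(pause_ms, interval_ms):
    """Fraction of each checkpoint interval spent in the stop-the-world pause."""
    return pause_ms / interval_ms


def effective_throughput(peak_mb_per_s, pause_ms, interval_ms):
    """Average throughput, assuming the flow is fully stalled during the pause."""
    return peak_mb_per_s * (1.0 - stalled_fraction(pause_ms, interval_ms))


if __name__ == "__main__":
    # Figures reported in the thread: 28275 ms pause every 2 minutes.
    print(effective_throughput(100.0, 28275, 120_000))
    # Hypothetical pessimistic case: the pause shrinks only in proportion to
    # the interval (8x shorter interval, 8x shorter pause), so nothing changes.
    print(effective_throughput(100.0, 28275 / 8, 15_000))
```

The second call models the worry raised in the question: if the pause scales linearly with the interval, the average stays the same. Whether it actually scales that way depends on the JVM and operating system effects mentioned above, so it has to be measured.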


On Jun 13, 2018, at 1:04 PM, V, Prashanth (Nokia - IN/Bangalore) <pr...@nokia.com> wrote:

Mark,
Thanks for the reply. Please find the comments inline.

Thanks & Regards,
Prashanth

From: Mark Payne [mailto:markap14@hotmail.com]
Sent: Wednesday, June 13, 2018 6:07 PM
To: users@nifi.apache.org
Subject: Re: NiFi Performance Analysis Clarification

Prashanth,

Whenever the FlowFile Repository performs a Checkpoint, it has to ensure that it has flushed all data to disk
before continuing, so it performs an fsync() call so that any data buffered by the Operating System is flushed
to disk as well. If you're using the same physical drive / physical partition for FlowFile Repository as you are
for content, provenance, logs, etc. then this can be very costly.

It is always a best practice for any production system to try to isolate the FlowFile Repository to its own physical
partition, the Content Repository to its own physical partition (or multiple partitions) and the Provenance Repository
to its own physical partition (or multiple partitions). Placing the FlowFile Repo on its own partition is likely to address
the issue on its own (Update the value of the "nifi.flowfile.repository.directory" property in nifi.properties - but be warned,
you'll lose any data in your flow if you point to an empty directory so you'll need to also move the contents of ./flowfile_repository
to the new directory or stop your source processors and bleed out all the data from your flow first).
[Prashanth] I once tried placing the FlowFile repo on one partition and content & provenance on another, but I think I still faced this problem. I don't remember well, though.

Additionally, you may see better results by adjusting the value of the "nifi.flowfile.repository.checkpoint.interval" property
from "2 mins" to something smaller like "15 secs".
[Prashanth] Oh, that's nice. I will try this config. But, just curious, will it spread out the stop-the-world time across the intervals? In that case, my average would fall to the same figures, right?

Thanks
-Mark




On Jun 13, 2018, at 8:10 AM, V, Prashanth (Nokia - IN/Bangalore) <pr...@nokia.com> wrote:

Hi Mike,

Thanks for the reply. Actually, we did all those optimisations with Kafka. I am converting to Avro, and I also configured the Kafka producer properties accordingly. I believe Kafka is not the bottleneck.
I am sure because I can see pretty good throughput with my flow. But the average throughput is reduced because the stop-the-world pause lasts a long time. Correct me if I am wrong.

Thanks & Regards,
Prashanth

From: Mike Thomsen [mailto:mikerthomsen@gmail.com]
Sent: Wednesday, June 13, 2018 4:23 PM
To: V, Prashanth (Nokia - IN/Bangalore) <pr...@nokia.com>
Cc: users@nifi.apache.org; pierre.villard.fr@gmail.com
Subject: Re: NiFi Performance Analysis Clarification

Relevant: http://www.idata.co.il/2016/09/moving-binary-data-with-kafka/

If you're throwing 1MB and bigger files at Kafka, that's probably where your slowdown is occurring. Particularly if you're running a single node or just two nodes. Kafka was designed to process extremely high volumes of small messages (at most 10s of kb, not MB and certainly not GB). What you can try is building an Avro schema for your CSV files and using PublishKafkaRecord to break everything down into records that are an appropriate fit for Kafka.
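
To illustrate the idea of breaking a CSV file into per-record messages rather than sending the whole file as one message, here is a minimal sketch. It is not how PublishKafkaRecord is implemented: JSON stands in for Avro purely to keep the example free of third-party dependencies, and the sample data and function name are hypothetical.

```python
import csv
import io
import json


def csv_to_messages(csv_text):
    """Yield one encoded message per CSV row, using the header row for field names."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        # JSON is a stand-in for Avro here; a real flow would use an Avro writer.
        yield json.dumps(row, sort_keys=True).encode("utf-8")


# Hypothetical two-row CSV file.
SAMPLE_CSV = "id,name\n1,alpha\n2,beta\n"

if __name__ == "__main__":
    for message in csv_to_messages(SAMPLE_CSV):
        # Each message is a few dozen bytes instead of one ~1MB file.
        print(len(message), message)
```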

On Wed, Jun 13, 2018 at 6:38 AM V, Prashanth (Nokia - IN/Bangalore) <pr...@nokia.com> wrote:
Please find answers inline

Thanks & Regards,
Prashanth

From: Pierre Villard [mailto:pierre.villard.fr@gmail.com]
Sent: Wednesday, June 13, 2018 3:56 PM
To: users@nifi.apache.org
Subject: Re: NiFi Performance Analysis Clarification

Hi,

What's the version of NiFi you're using?
[Prashanth] 1.6.0
What are the file systems you're using for the repositories?
[Prashanth] Local RHEL file system (/home dir)

I think that changing the heap won't make any difference in this case. I'd keep it at something like 8GB (unless you're doing very specific stuff that is memory consuming) and leave the rest to the OS and disk caching.
[Prashanth] I think NiFi holds the snapshotmap in memory, and since we are dealing with pretty huge ingress data, I increased it [I allocated 32GB out of 42GB to NiFi]. Does this have anything to do with the FlowFile checkpoint delay?

Pierre

2018-06-13 11:58 GMT+02:00 V, Prashanth (Nokia - IN/Bangalore) <pr...@nokia.com>:
Hi Mike,

I am retrieving many small CSV files, each about 1MB in size (total folder size around ~100GB). In the update step, I am doing some enrichment on the ingress CSV. Anyway, my flow doesn't have anything to do with the stop-the-world time, right?

Can you please tell me about FlowFile checkpointing related tunings?

Thanks & Regards,
Prashanth

From: Mike Thomsen [mailto:mikerthomsen@gmail.com]
Sent: Wednesday, June 13, 2018 2:33 PM
To: users@nifi.apache.org
Subject: Re: NiFi Performance Analysis Clarification

What are you retrieving (particularly size) and what happens in the "update" step?

Thanks,

Mike


RE: NiFi Performance Analysis Clarification

Posted by "V, Prashanth (Nokia - IN/Bangalore)" <pr...@nokia.com>.

Mark,
Thanks for the reply. Please find the comments inline.

Thanks & Regards,
Prashanth

From: Mark Payne [mailto:markap14@hotmail.com]
Sent: Wednesday, June 13, 2018 6:07 PM
To: users@nifi.apache.org
Subject: Re: NiFi Performance Analysis Clarification

Prashanth,

Whenever the FlowFile Repository performs a Checkpoint, it has to ensure that it has flushed all data to disk
before continuing, so it performs an fsync() call so that any data buffered by the Operating System is flushed
to disk as well. If you're using the same physical drive / physical partition for FlowFile Repository as you are
for content, provenance, logs, etc. then this can be very costly.

It is always a best practice for any production system to try to isolate the FlowFile Repository to its own physical
partition, the Content Repository to its own physical partition (or multiple partitions) and the Provenance Repository
to its own physical partition (or multiple partitions). Placing the FlowFile Repo on its own partition is likely to address
the issue on its own (Update the value of the "nifi.flowfile.repository.directory" property in nifi.properties - but be warned,
you'll lose any data in your flow if you point to an empty directory so you'll need to also move the contents of ./flowfile_repository
to the new directory or stop your source processors and bleed out all the data from your flow first).
[Prashanth] I once tried placing the FlowFile repo on one partition and content & provenance on another, but I think I still faced this problem. I don't remember well, though.

Additionally, you may see better results by adjusting the value of the "nifi.flowfile.repository.checkpoint.interval" property
from "2 mins" to something smaller like "15 secs".
[Prashanth] Oh, that's nice. I will try this config. But, just curious, will it spread out the stop-the-world time across the intervals? In that case, my average would fall to the same figures, right?

Thanks
-Mark


Re: NiFi Performance Analysis Clarification

Posted by Mark Payne <ma...@hotmail.com>.

Prashanth,

Whenever the FlowFile Repository performs a Checkpoint, it has to ensure that it has flushed all data to disk
before continuing, so it performs an fsync() call so that any data buffered by the Operating System is flushed
to disk as well. If you're using the same physical drive / physical partition for FlowFile Repository as you are
for content, provenance, logs, etc. then this can be very costly.

It is always a best practice for any production system to try to isolate the FlowFile Repository to its own physical
partition, the Content Repository to its own physical partition (or multiple partitions) and the Provenance Repository
to its own physical partition (or multiple partitions). Placing the FlowFile Repo on its own partition is likely to address
the issue on its own (Update the value of the "nifi.flowfile.repository.directory" property in nifi.properties - but be warned,
you'll lose any data in your flow if you point to an empty directory so you'll need to also move the contents of ./flowfile_repository
to the new directory or stop your source processors and bleed out all the data from your flow first).

Additionally, you may see better results by adjusting the value of the "nifi.flowfile.repository.checkpoint.interval" property
from "2 mins" to something smaller like "15 secs".

Thanks
-Mark
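
The two nifi.properties changes suggested above could be applied with a small helper like the following. This is a sketch, not an official NiFi tool: the helper name and the mount point /mnt/flowfile_repo are hypothetical, and the sample text only mirrors the two property names and values quoted in this message. As warned above, the contents of ./flowfile_repository must be moved to the new directory (or the flow drained) before NiFi is restarted with the new path.

```python
def set_properties(text, updates):
    """Return properties text with each key in `updates` set, appending missing keys."""
    remaining = dict(updates)
    out = []
    for line in text.splitlines():
        key = line.split("=", 1)[0].strip()
        is_setting = "=" in line and not line.lstrip().startswith("#")
        if is_setting and key in remaining:
            out.append(f"{key}={remaining.pop(key)}")
        else:
            out.append(line)
    # Keys that were not present in the file are appended at the end.
    out.extend(f"{key}={value}" for key, value in remaining.items())
    return "\n".join(out) + "\n"


# Excerpt with the values mentioned in this thread.
SAMPLE = (
    "nifi.flowfile.repository.directory=./flowfile_repository\n"
    "nifi.flowfile.repository.checkpoint.interval=2 mins\n"
)

UPDATES = {
    # Hypothetical directory on a dedicated physical partition.
    "nifi.flowfile.repository.directory": "/mnt/flowfile_repo/flowfile_repository",
    "nifi.flowfile.repository.checkpoint.interval": "15 secs",
}

if __name__ == "__main__":
    print(set_properties(SAMPLE, UPDATES), end="")
```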




RE: NiFi Performance Analysis Clarification

Posted by "V, Prashanth (Nokia - IN/Bangalore)" <pr...@nokia.com>.

Hi Jeremy,

With the built-in processor [UpdateRecord] and the CSVReader & AvroRecordSetWriter controller services, I can send an average of ~50MBps to Kafka. I also created a custom processor for my business logic with internal Avro conversion (not using a controller service); with it I can push an average of ~80MBps.

Note: The average is based on the network send figures from dstat. I can assure you that only my application runs on my machine, but I would have to write a Kafka consumer to validate it properly.

Thanks & Regards,
Prashanth

From: Jeremy Dyer [mailto:jdye64@gmail.com]
Sent: Wednesday, June 13, 2018 5:43 PM
To: users@nifi.apache.org; Mike Thomsen <mi...@gmail.com>
Cc: users@nifi.apache.org; pierre.villard.fr@gmail.com
Subject: Re: NiFi Performance Analysis Clarification

Prashanth - just out of curiosity could you share the average size of those Avro files you are pushing to Kafka? It would be nice to know for some other benchmark tests I am doing

Thanks,
Jeremy Dyer

Thanks - Jeremy Dyer

RE: NiFi Performance Analysis Clarification

Posted by "V, Prashanth (Nokia - IN/Bangalore)" <pr...@nokia.com>.

I am updating & adding a few fields in the CSV, hence I used UpdateRecord.

Thanks & Regards,
Prashanth

From: Mark Payne [mailto:markap14@hotmail.com]
Sent: Wednesday, June 13, 2018 10:49 PM
To: users@nifi.apache.org
Subject: Re: NiFi Performance Analysis Clarification

Prashanth,

Also of note, are you actually updating any fields in the CSV that you receive with UpdateRecord / your custom processor?
Or are you just using that to convert the CSV to Avro? If the latter, you can actually just remove this processor from your flow
entirely and simply use PublishKafkaRecord processor with a CSV Reader and an Avro Writer.

Thanks
-Mark



On Jun 13, 2018, at 12:56 PM, V, Prashanth (Nokia - IN/Bangalore) <pr...@nokia.com> wrote:

Joe,
Thanks for the reply.  Please find the answers inline.

Thanks & Regards,
Prashanth

-----Original Message-----
From: Joe Witt [mailto:joe.witt@gmail.com]
Sent: Wednesday, June 13, 2018 6:04 PM
To: users@nifi.apache.org
Subject: Re: NiFi Performance Analysis Clarification

Prashanth

I strongly recommend you reduce your JVM heap size for NiFi to 2 or 4 and no more than 8GB.  The flow, well configured, will certainly not need anywhere near that much and the more ram you give it the more work GC has to do (some GCs are different and can be tuned/etc.. but ...that is for another day).
[Prashanth] I used 4GB of buffer memory for Kafka, so I think I will retry once with my heap memory reduced to 8GB.

You are absolutely right that the log entries you showed are really problematic and performance in nifi can be dramatically improved.

The flow I think you're describing is:
- ListSFTP  (Tried with GetSFTP also)
- FetchSFTP
- Convert to Avro (Tried with update record & also created custom processor)
- Publish to Kafka

First, we should look at nifi.properties (Didn't update much)

Second, we should focus on the processors employed

Third, we should look at the configuration of those processors (I have a 12-core machine. I give 4 threads to GetSFTP, 12 threads to UpdateRecord, and 8 threads to PublishKafka)

Sounds like you only turned off flow archival and changed heap.  Any other settings changes?  I'd recommend putting archival back on as it can allow nifi to more efficiently remove data. (I thought archival was slowing things down. I will enable it and try.)

Please list precisely which processors you have and how they're connected.  Sharing a flow template would be extremely helpful. (I think I answered this in the questions above.)

Thanks
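
The heap reduction recommended above amounts to changing the JVM's -Xms and -Xmx flags. A minimal sketch of doing that on configuration text follows. The helper name is made up, and the sample excerpt (including the java.arg.2 / java.arg.3 keys) is an assumption about what the bootstrap.conf in this setup looks like, not something stated in the thread.

```python
import re


def set_heap(bootstrap_text, heap):
    """Replace the value of every -Xms / -Xmx flag in the text with `heap`."""
    return re.sub(r"-(Xms|Xmx)\S+", lambda m: f"-{m.group(1)}{heap}", bootstrap_text)


# Hypothetical bootstrap.conf excerpt reflecting the 32G heap from the thread;
# the java.arg indices are an assumption.
SAMPLE = "java.arg.2=-Xms32g\njava.arg.3=-Xmx32g\n"

if __name__ == "__main__":
    # 8g is the upper bound suggested in the message above.
    print(set_heap(SAMPLE, "8g"), end="")
```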

> Sent: Wednesday, June 13, 2018 2:33 PM
> To: users@nifi.apache.org<ma...@nifi.apache.org>
> Subject: Re: NiFi Performance Analysis Clarification
>
>
>
> What are you retrieving (particularly size) and what happens in the "update"
> step?
>
>
>
> Thanks,
>
>
>
> Mike
>
>
>
> On Wed, Jun 13, 2018 at 4:10 AM V, Prashanth (Nokia - IN/Bangalore)
> <pr...@nokia.com>> wrote:
>
> Hi Team,
>
>
>
> I am doing some performance testing in NiFi. WorkFlow is GetSFTP ->
> update
> -> PutKafka. I want to tune my setup to achieve high throughput
> -> without much
> queuing.
>
> But my throughput average drops during flowfile checkpointing
> duration. I believe stop-the-world  is happening during that time.
>
>
>
> I can roughly read ~100MB/s from SFTP and send almost same to Kafka.
> But every 2 mins, it stops the complete execution. Check below logs
>
>
>
> 2018-06-13 13:24:21,160 INFO [pool-10-thread-1]
> o.a.n.c.r.WriteAheadFlowFileRepository Initiating checkpoint of
> FlowFile Repository
>
> 2018-06-13 13:24:49,420 INFO [Write-Ahead Local State Provider
> Maintenance] org.wali.MinimalLockingWriteAheadLog
> org.wali.MinimalLockingWriteAheadLog@cf82c58<ma...@cf82c58> checkpointed with 23
> Records and 0 Swap Files in 39353 milliseconds (Stop-the-world time =
> 3 milliseconds, Clear Edit Logs time = 3 millis), max Transaction ID
> 68
>
> 2018-06-13 13:25:00,165 INFO [pool-10-thread-1]
> o.a.n.wali.SequentialAccessWriteAheadLog Checkpointed Write-Ahead Log
> with 7 Records and 0 Swap Files in 39002 milliseconds (Stop-the-world
> time = 28275 milliseconds), max Transaction ID 316705
>
> 2018-06-13 13:25:00,169 INFO [pool-10-thread-1]
> o.a.n.c.r.WriteAheadFlowFileRepository Successfully checkpointed
> FlowFile Repository with 7 records in 39008 milliseconds
>
>
>
> I think all processor goes in idle state for 39 seconds ☹ .. Please
> guide how to tune it..
>
> I changed the heap memory with 32G [I am testing on 12 core, 48G
> machine]. I disabled content-repository archiving. All other properties remains same.
>
>
>
> Thanks & Regards,
>
> Prashanth
>
>

Re: NiFi Performance Analysis Clarification

Posted by Mark Payne <ma...@hotmail.com>.

Prashanth,

Also of note, are you actually updating any fields in the CSV that you receive with UpdateRecord / your custom processor?
Or are you just using that to convert the CSV to Avro? If the latter, you can actually just remove this processor from your flow
entirely and simply use the PublishKafkaRecord processor with a CSV Reader and an Avro Writer.

Thanks
-Mark


On Jun 13, 2018, at 12:56 PM, V, Prashanth (Nokia - IN/Bangalore) <pr...@nokia.com> wrote:

Joe,
Thanks for the reply.  Please find the answers inline.

Thanks & Regards,
Prashanth

-----Original Message-----
From: Joe Witt [mailto:joe.witt@gmail.com]
Sent: Wednesday, June 13, 2018 6:04 PM
To: users@nifi.apache.org
Subject: Re: NiFi Performance Analysis Clarification

Prashanth

I strongly recommend you reduce your JVM heap size for NiFi to 2 or 4 GB, and no more than 8 GB.  The flow, well configured, will certainly not need anywhere near that much, and the more RAM you give it the more work GC has to do (some GCs are different and can be tuned, but that is for another day).  (I used 4 GB of buffer memory for Kafka, so I think I will retry once with my heap memory reduced to 8 GB.)

You are absolutely right that the log entries you showed are really problematic and performance in nifi can be dramatically improved.

The flow I think you're describing is:
- ListSFTP  (Tried with GetSFTP also)
- FetchSFTP
- Convert to Avro (Tried with update record & also created custom processor)
- Publish to Kafka

First, we should look at nifi.properties (Didn’t update much)

Second, we should focus on the processors employed

Third, we should look at the configuration of those processors (I have a 12-core machine. I give 4 threads to GetSFTP, 12 threads to UpdateRecord, and 8 threads to PublishKafka.)

Sounds like you only turned off flow archival and changed heap.  Any other settings changes?  I'd recommend putting archival back on as it can allow nifi to more efficiently remove data. (I thought archival was slowing things down. I will enable it and try.)

Please list precisely which processors you have and how they're connected.  Sharing a flow template would be extremely helpful. (I think I answered this in the questions above.)

Thanks

RE: NiFi Performance Analysis Clarification

Posted by "V, Prashanth (Nokia - IN/Bangalore)" <pr...@nokia.com>.

Joe,

Thanks for the reply.  Please find the answers inline.

Thanks & Regards,
Prashanth

-----Original Message-----
From: Joe Witt [mailto:joe.witt@gmail.com]
Sent: Wednesday, June 13, 2018 6:04 PM
To: users@nifi.apache.org
Subject: Re: NiFi Performance Analysis Clarification

Prashanth

I strongly recommend you reduce your JVM heap size for NiFi to 2 or 4 GB, and no more than 8 GB.  The flow, well configured, will certainly not need anywhere near that much, and the more RAM you give it the more work GC has to do (some GCs are different and can be tuned, but that is for another day).  (I used 4 GB of buffer memory for Kafka, so I think I will retry once with my heap memory reduced to 8 GB.)

You are absolutely right that the log entries you showed are really problematic and performance in nifi can be dramatically improved.

The flow I think you're describing is:
- ListSFTP  (Tried with GetSFTP also)
- FetchSFTP
- Convert to Avro (Tried with update record & also created custom processor)
- Publish to Kafka

First, we should look at nifi.properties (Didn’t update much)

Second, we should focus on the processors employed

Third, we should look at the configuration of those processors (I have a 12-core machine. I give 4 threads to GetSFTP, 12 threads to UpdateRecord, and 8 threads to PublishKafka.)

Sounds like you only turned off flow archival and changed heap.  Any other settings changes?  I'd recommend putting archival back on as it can allow nifi to more efficiently remove data. (I thought archival was slowing things down. I will enable it and try.)

Please list precisely which processors you have and how they're connected.  Sharing a flow template would be extremely helpful. (I think I answered this in the questions above.)

Thanks



Re: NiFi Performance Analysis Clarification

Posted by Joe Witt <jo...@gmail.com>.

Prashanth

I strongly recommend you reduce your JVM heap size for NiFi to 2 or 4 GB,
and no more than 8 GB.  The flow, well configured, will certainly not
need anywhere near that much, and the more RAM you give it the more
work GC has to do (some GCs are different and can be tuned, but
that is for another day).

You are absolutely right that the log entries you showed are really
problematic and performance in nifi can be dramatically improved.

The flow I think you're describing is:
- ListSFTP
- FetchSFTP
- Convert to Avro
- Publish to Kafka

First, we should look at nifi.properties

Second, we should focus on the processors employed

Third, we should look at the configuration of those processors

Sounds like you only turned off flow archival and changed heap.  Any
other settings changes?  I'd recommend putting archival back on as it
can allow nifi to more efficiently remove data.

Please list precisely which processors you have and how they're
connected.  Sharing a flow template would be extremely helpful.

Thanks
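
To compare checkpoints before and after a settings change, the timings can be pulled out of the log lines quoted in this thread. The following is only a minimal sketch in Python: the regular expression is written against the single SequentialAccessWriteAheadLog message shown in the original post, and the function name `parse_checkpoint` is made up for illustration. Other NiFi versions may word the message differently.

```python
import re

# Log line copied from the original post (joined onto one line).
LOG_LINE = (
    "2018-06-13 13:25:00,165 INFO [pool-10-thread-1] "
    "o.a.n.wali.SequentialAccessWriteAheadLog Checkpointed Write-Ahead Log "
    "with 7 Records and 0 Swap Files in 39002 milliseconds "
    "(Stop-the-world time = 28275 milliseconds), max Transaction ID 316705"
)

# Assumption: the pattern mirrors the message format above and nothing else.
CHECKPOINT_RE = re.compile(
    r"with (?P<records>\d+) Records and (?P<swap>\d+) Swap Files "
    r"in (?P<total_ms>\d+) milliseconds "
    r"\(Stop-the-world time = (?P<stw_ms>\d+) milliseconds"
)


def parse_checkpoint(line):
    """Return the checkpoint timings as a dict of ints, or None if no match."""
    match = CHECKPOINT_RE.search(line)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}


stats = parse_checkpoint(LOG_LINE)
print(stats)
```

Running this over nifi-app.log and tracking `stw_ms` over time would show whether a heap or archival change actually shortens the pause.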

On Wed, Jun 13, 2018 at 8:13 AM, Jeremy Dyer <jd...@gmail.com> wrote:
> Prashanth - just out of curiosity could you share the average size of those
> Avro files you are pushing to Kafka? It would be nice to know for some other
> benchmark tests I am doing
>
> Thanks,
> Jeremy Dyer
>
> Thanks - Jeremy Dyer
> ________________________________
> From: V, Prashanth (Nokia - IN/Bangalore) <pr...@nokia.com>
> Sent: Wednesday, June 13, 2018 8:10:27 AM
> To: Mike Thomsen
> Cc: users@nifi.apache.org; pierre.villard.fr@gmail.com
> Subject: RE: NiFi Performance Analysis Clarification
>
>
> Hi Mike,
>
>
>
> Thanks for the reply. Actually , we did all those optimisations with kafka.
> I am converting to avro, also I configured kafka producer properties
> accordingly. I believe kafka is not a bottleneck.
>
> I am sure because, I can see pretty good throughput with my flow. But
> average throughput is reduced as stop-the-world signal happening for long
> time. Correct me if I am wrong..
>
>
>
> Thanks & Regards,
>
> Prashanth
>
>
>
> From: Mike Thomsen [mailto:mikerthomsen@gmail.com]
> Sent: Wednesday, June 13, 2018 4:23 PM
> To: V, Prashanth (Nokia - IN/Bangalore) <pr...@nokia.com>
> Cc: users@nifi.apache.org; pierre.villard.fr@gmail.com
> Subject: Re: NiFi Performance Analysis Clarification
>
>
>
> Relevant: http://www.idata.co.il/2016/09/moving-binary-data-with-kafka/
>
>
>
> If you're throwing 1MB and bigger files at Kafka, that's probably where your
> slowdown is occurring. Particularly if you're running a single node or just
> two nodes. Kafka was designed to process extremely high volumes of small
> messages (at most 10s of kb, not MB and certainly not GB). What you can try
> is building an Avro schema for your CSV files and using PublishKafkaRecord
> to break everything down into records that are an appropriate fit for Kafka.
>
>
>
> On Wed, Jun 13, 2018 at 6:38 AM V, Prashanth (Nokia - IN/Bangalore)
> <pr...@nokia.com> wrote:
>
> Please find answers inline
>
>
>
> Thanks & Regards,
>
> Prashanth
>
>
>
> From: Pierre Villard [mailto:pierre.villard.fr@gmail.com]
> Sent: Wednesday, June 13, 2018 3:56 PM
>
>
> To: users@nifi.apache.org
> Subject: Re: NiFi Performance Analysis Clarification
>
>
>
> Hi,
>
>
>
> What's the version of NiFi you're using?  1.6.0
>
> What are the file systems you're using for the repositories? Local rhel file
> system (/home dir)
>
>
>
> I think that changing the heap won't make any different in this case. I'd
> keep it to something like 8GB (unless you're doing very specific stuff that
> are memory consuming) and let the remaining to OS and disk caching.
>
> I think NiFi holds the snapshotmap in memory.. since we are dealing with
> pretty huge ingress data [I allocated 32GB out of 42GB to NiFi]. Hence, I
> increased so.  Does this has anything to do with flowfile checkpoint delay?
>
>
>
> Pierre
>
>
>
> 2018-06-13 11:58 GMT+02:00 V, Prashanth (Nokia - IN/Bangalore)
> <pr...@nokia.com>:
>
> Hi Mike,
>
>
>
> I am retrieving many small csv files each of size 1MB (total folder size
> around ~100GB). In update step, I am doing some enrichment on ingress csv.
> Anyway my flow doesn’t do anything with the stop the world time right?
>
>
>
> Can you please tell me about flowfile checkpointing related tunings?
>
>
>
> Thanks & Regards,
>
> Prashanth
>
>
>
> From: Mike Thomsen [mailto:mikerthomsen@gmail.com]
> Sent: Wednesday, June 13, 2018 2:33 PM
> To: users@nifi.apache.org
> Subject: Re: NiFi Performance Analysis Clarification
>
>
>
> What are you retrieving (particularly size) and what happens in the "update"
> step?
>
>
>
> Thanks,
>
>
>
> Mike
>
>
>
> On Wed, Jun 13, 2018 at 4:10 AM V, Prashanth (Nokia - IN/Bangalore)
> <pr...@nokia.com> wrote:
>
> Hi Team,
>
>
>
> I am doing some performance testing in NiFi. WorkFlow is GetSFTP -> update
> -> PutKafka. I want to tune my setup to achieve high throughput without much
> queuing.
>
> But my throughput average drops during flowfile checkpointing duration. I
> believe stop-the-world  is happening during that time.
>
>
>
> I can roughly read ~100MB/s from SFTP and send almost same to Kafka. But
> every 2 mins, it stops the complete execution. Check below logs
>
>
>
> 2018-06-13 13:24:21,160 INFO [pool-10-thread-1]
> o.a.n.c.r.WriteAheadFlowFileRepository Initiating checkpoint of FlowFile
> Repository
>
> 2018-06-13 13:24:49,420 INFO [Write-Ahead Local State Provider Maintenance]
> org.wali.MinimalLockingWriteAheadLog
> org.wali.MinimalLockingWriteAheadLog@cf82c58 checkpointed with 23 Records
> and 0 Swap Files in 39353 milliseconds (Stop-the-world time = 3
> milliseconds, Clear Edit Logs time = 3 millis), max Transaction ID 68
>
> 2018-06-13 13:25:00,165 INFO [pool-10-thread-1]
> o.a.n.wali.SequentialAccessWriteAheadLog Checkpointed Write-Ahead Log with 7
> Records and 0 Swap Files in 39002 milliseconds (Stop-the-world time = 28275
> milliseconds), max Transaction ID 316705
>
> 2018-06-13 13:25:00,169 INFO [pool-10-thread-1]
> o.a.n.c.r.WriteAheadFlowFileRepository Successfully checkpointed FlowFile
> Repository with 7 records in 39008 milliseconds
>
>
>
> I think all processor goes in idle state for 39 seconds ☹ .. Please guide
> how to tune it..
>
> I changed the heap memory with 32G [I am testing on 12 core, 48G machine]. I
> disabled content-repository archiving. All other properties remains same.
>
>
>
> Thanks & Regards,
>
> Prashanth
>
>

Re: NiFi Performance Analysis Clarification

Posted by Jeremy Dyer <jd...@gmail.com>.

Prashanth - just out of curiosity could you share the average size of those Avro files you are pushing to Kafka? It would be nice to know for some other benchmark tests I am doing

Thanks,
Jeremy Dyer
________________________________
From: V, Prashanth (Nokia - IN/Bangalore) <pr...@nokia.com>
Sent: Wednesday, June 13, 2018 8:10:27 AM
To: Mike Thomsen
Cc: users@nifi.apache.org; pierre.villard.fr@gmail.com
Subject: RE: NiFi Performance Analysis Clarification

Hi Mike,

Thanks for the reply. Actually , we did all those optimisations with kafka. I am converting to avro, also I configured kafka producer properties accordingly. I believe kafka is not a bottleneck.
I am sure because, I can see pretty good throughput with my flow. But average throughput is reduced as stop-the-world signal happening for long time. Correct me if I am wrong..

Thanks & Regards,
Prashanth

From: Mike Thomsen [mailto:mikerthomsen@gmail.com]
Sent: Wednesday, June 13, 2018 4:23 PM
To: V, Prashanth (Nokia - IN/Bangalore) <pr...@nokia.com>
Cc: users@nifi.apache.org; pierre.villard.fr@gmail.com
Subject: Re: NiFi Performance Analysis Clarification

Relevant: http://www.idata.co.il/2016/09/moving-binary-data-with-kafka/

If you're throwing 1MB and bigger files at Kafka, that's probably where your slowdown is occurring. Particularly if you're running a single node or just two nodes. Kafka was designed to process extremely high volumes of small messages (at most 10s of kb, not MB and certainly not GB). What you can try is building an Avro schema for your CSV files and using PublishKafkaRecord to break everything down into records that are an appropriate fit for Kafka.
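
The idea of breaking a file down into record-sized messages can be sketched outside NiFi. This is only an illustration, not how PublishKafkaRecord is implemented: it uses JSON instead of Avro so that it runs with the Python standard library alone, the sample data is hypothetical, and the function name `csv_to_record_messages` is invented for this sketch.

```python
import csv
import io
import json


def csv_to_record_messages(csv_text):
    """Yield one small JSON-encoded message (bytes) per CSV row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        # Each row becomes its own message instead of one large payload.
        yield json.dumps(row).encode("utf-8")


# Hypothetical sample; the files discussed in the thread are ~1 MB CSVs.
sample = "id,name\n1,alpha\n2,beta\n"
messages = list(csv_to_record_messages(sample))
print(len(messages), "messages, largest is", max(len(m) for m in messages), "bytes")
```

Each message then stays in the size range Kafka handles well, regardless of how large the source file is.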

On Wed, Jun 13, 2018 at 6:38 AM V, Prashanth (Nokia - IN/Bangalore) <pr...@nokia.com> wrote:
Please find answers inline

Thanks & Regards,
Prashanth

From: Pierre Villard [mailto:pierre.villard.fr@gmail.com]
Sent: Wednesday, June 13, 2018 3:56 PM

To: users@nifi.apache.org
Subject: Re: NiFi Performance Analysis Clarification

Hi,

What's the version of NiFi you're using?  1.6.0
What are the file systems you're using for the repositories? Local rhel file system (/home dir)

I think that changing the heap won't make any different in this case. I'd keep it to something like 8GB (unless you're doing very specific stuff that are memory consuming) and let the remaining to OS and disk caching.
I think NiFi holds the snapshotmap in memory.. since we are dealing with pretty huge ingress data [I allocated 32GB out of 42GB to NiFi]. Hence, I increased so.  Does this has anything to do with flowfile checkpoint delay?

Pierre

2018-06-13 11:58 GMT+02:00 V, Prashanth (Nokia - IN/Bangalore) <pr...@nokia.com>:
Hi Mike,

I am retrieving many small csv files each of size 1MB (total folder size around ~100GB). In update step, I am doing some enrichment on ingress csv. Anyway my flow doesn’t do anything with the stop the world time right?

Can you please tell me about flowfile checkpointing related tunings?

Thanks & Regards,
Prashanth

From: Mike Thomsen [mailto:mikerthomsen@gmail.com]
Sent: Wednesday, June 13, 2018 2:33 PM
To: users@nifi.apache.org
Subject: Re: NiFi Performance Analysis Clarification

What are you retrieving (particularly size) and what happens in the "update" step?

Thanks,

Mike

On Wed, Jun 13, 2018 at 4:10 AM V, Prashanth (Nokia - IN/Bangalore) <pr...@nokia.com> wrote:
Hi Team,

I am doing some performance testing in NiFi. WorkFlow is GetSFTP -> update -> PutKafka. I want to tune my setup to achieve high throughput without much queuing.
But my throughput average drops during flowfile checkpointing duration. I believe stop-the-world  is happening during that time.

I can roughly read ~100MB/s from SFTP and send almost same to Kafka. But every 2 mins, it stops the complete execution. Check below logs

2018-06-13 13:24:21,160 INFO [pool-10-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Initiating checkpoint of FlowFile Repository
2018-06-13 13:24:49,420 INFO [Write-Ahead Local State Provider Maintenance] org.wali.MinimalLockingWriteAheadLog org.wali.MinimalLockingWriteAheadLog@cf82c58 checkpointed with 23 Records and 0 Swap Files in 39353 milliseconds (Stop-the-world time = 3 milliseconds, Clear Edit Logs time = 3 millis), max Transaction ID 68
2018-06-13 13:25:00,165 INFO [pool-10-thread-1] o.a.n.wali.SequentialAccessWriteAheadLog Checkpointed Write-Ahead Log with 7 Records and 0 Swap Files in 39002 milliseconds (Stop-the-world time = 28275 milliseconds), max Transaction ID 316705
2018-06-13 13:25:00,169 INFO [pool-10-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Successfully checkpointed FlowFile Repository with 7 records in 39008 milliseconds

I think all processor goes in idle state for 39 seconds ☹ .. Please guide how to tune it..
I changed the heap memory with 32G [I am testing on 12 core, 48G machine]. I disabled content-repository archiving. All other properties remains same.

Thanks & Regards,
Prashanth

RE: NiFi Performance Analysis Clarification

Posted by "V, Prashanth (Nokia - IN/Bangalore)" <pr...@nokia.com>.

Hi Mike,

Thanks for the reply. Actually, we have already done all those optimisations with Kafka. I am converting to Avro, and I have configured the Kafka producer properties accordingly. I believe Kafka is not the bottleneck.
I am sure because I can see pretty good throughput with my flow. But the average throughput drops because the stop-the-world pause lasts for a long time. Correct me if I am wrong.

Thanks & Regards,
Prashanth
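
The effect of the pause on average throughput can be estimated with simple arithmetic. This is a rough sketch in Python under two assumptions taken from the figures reported in this thread rather than measured: the flow runs at about 100 MB/s when it is not paused, and it is completely idle for the reported stop-the-world time (28275 ms) once every 2 minutes. The function name `average_throughput` is only illustrative.

```python
def average_throughput(peak_mb_s, interval_s, pause_s):
    """Average MB/s when the flow fully stalls for pause_s out of every interval_s."""
    if not 0 <= pause_s <= interval_s:
        raise ValueError("pause must fit inside the interval")
    return peak_mb_s * (interval_s - pause_s) / interval_s


# Figures from the thread: ~100 MB/s, a checkpoint every 2 minutes,
# and a reported stop-the-world time of 28275 ms.
avg = average_throughput(peak_mb_s=100.0, interval_s=120.0, pause_s=28.275)
print(f"{avg:.1f} MB/s")
```

Under those assumptions the average comes out at roughly 76 MB/s, so the pause alone would cost close to a quarter of the peak rate.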

Re: NiFi Performance Analysis Clarification

Posted by Mike Thomsen <mi...@gmail.com>.

Relevant: http://www.idata.co.il/2016/09/moving-binary-data-with-kafka/

If you're throwing 1MB and bigger files at Kafka, that's probably where
your slowdown is occurring. Particularly if you're running a single node or
just two nodes. Kafka was designed to process extremely high volumes of
small messages (at most 10s of kb, not MB and certainly not GB). What you
can try is building an Avro schema for your CSV files and using
PublishKafkaRecord to break everything down into records that are an
appropriate fit for Kafka.
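
To illustrate the idea of breaking a file into records, here is a minimal sketch using only the Python standard library. It is not what PublishKafkaRecord does internally: it encodes each row as JSON rather than Avro to stay dependency-free, it does not talk to Kafka at all, and the column names and `csv_to_record_messages` helper are made up for the example.

```python
import csv
import io
import json


def csv_to_record_messages(csv_text):
    """Yield one JSON-encoded message per CSV data row, keyed by the header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        yield json.dumps(row, sort_keys=True).encode("utf-8")


# Hypothetical three-column CSV with 1000 rows; the real files in this
# thread are about 1 MB each.
rows = ["id,site,value"] + [f"{i},blr,{i % 50}" for i in range(1000)]
sample_csv = "\n".join(rows) + "\n"

messages = list(csv_to_record_messages(sample_csv))
whole_file_size = len(sample_csv.encode("utf-8"))
largest_record = max(len(m) for m in messages)

print(len(messages), "messages; largest is", largest_record,
      "bytes; the whole file is", whole_file_size, "bytes")
```

Each message is then a few dozen bytes instead of one message carrying the entire file, which is the size range described above.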

RE: NiFi Performance Analysis Clarification

Posted by "V, Prashanth (Nokia - IN/Bangalore)" <pr...@nokia.com>.

Please find answers inline

Thanks & Regards,
Prashanth

From: Pierre Villard [mailto:pierre.villard.fr@gmail.com]
Sent: Wednesday, June 13, 2018 3:56 PM
To: users@nifi.apache.org
Subject: Re: NiFi Performance Analysis Clarification

> Hi,
>
> What's the version of NiFi you're using?

1.6.0

> What are the file systems you're using for the repositories?

Local RHEL file system (/home dir)

> I think that changing the heap won't make any difference in this case. I'd keep it at something like 8GB (unless you're doing very specific things that are memory-consuming) and leave the rest to the OS and disk caching.

I think NiFi holds the snapshotmap in memory, and since we are dealing with a pretty large volume of ingress data, I increased the heap [I allocated 32GB out of 42GB to NiFi]. Does this have anything to do with the flowfile checkpoint delay?

> Pierre

Re: NiFi Performance Analysis Clarification

Posted by Pierre Villard <pi...@gmail.com>.

Hi,

What's the version of NiFi you're using?
What are the file systems you're using for the repositories?

I think that changing the heap won't make any difference in this case. I'd
keep it at something like 8GB (unless you're doing very specific things that
are memory-consuming) and leave the rest to the OS and disk caching.

Pierre

RE: NiFi Performance Analysis Clarification

Posted by "V, Prashanth (Nokia - IN/Bangalore)" <pr...@nokia.com>.

Hi Mike,

I am retrieving many small CSV files, each about 1MB in size (total folder size is around 100GB). In the update step, I am doing some enrichment on the ingress CSV. In any case, my flow has nothing to do with the stop-the-world time, right?

Can you please tell me about tunings related to flowfile checkpointing?

Thanks & Regards,
Prashanth

Re: NiFi Performance Analysis Clarification

Posted by Mike Thomsen <mi...@gmail.com>.

What are you retrieving (particularly size) and what happens in the
"update" step?

Thanks,

Mike
