Posted to users@nifi.apache.org by "Peter Wicks (pwicks)" <pw...@micron.com> on 2018/10/08 17:02:15 UTC

RE: [EXT] Re: Maximum Memory for NiFi?

Bryan,

Our minimum is set to 32 GB. Under normal conditions the heap does not exceed roughly 50% usage (out of 70 GB), and is often lower. We collect and track these metrics, and over the last 30 days it has been closer to 35% usage.
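
For anyone following along, this is roughly how those heap bounds look in conf/bootstrap.conf (a sketch; the java.arg numbers below match the stock file but can differ between NiFi versions):

    # conf/bootstrap.conf -- JVM heap bounds passed to NiFi at startup
    java.arg.2=-Xms32g    # minimum (initial) heap
    java.arg.3=-Xmx70g    # maximum heap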

But during database maintenance we have to shut down a lot of processors, and FlowFiles start to back up in the system across lots of different feeds. Then, when the database comes back online, all of those separate feeds (lots of different processors, not a single one) catch up on their backlog at once, and the combined processing causes heap usage to spike. What we saw in the GC logs was that we would reach 70 GB, GC would do a stop-the-world pause and bring us down to about 65 GB, we'd climb back to 70 GB, and the next collection would only get us down to 68 GB. This repeated until GC was trimming off just a few MB per pass and running full collections every few seconds, leaving the system inoperable.
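
For reference, this is the kind of GC logging we have enabled, added as extra JVM arguments in conf/bootstrap.conf (a sketch for Java 8; the names after "java.arg." are arbitrary, and the log path is just an example):

    java.arg.gc1=-verbose:gc
    java.arg.gc2=-XX:+PrintGCDetails
    java.arg.gc3=-XX:+PrintGCDateStamps
    java.arg.gc4=-Xloggc:/var/log/nifi/gc.log
    java.arg.gc5=-XX:+UseGCLogFileRotation
    java.arg.gc6=-XX:NumberOfGCLogFiles=5
    java.arg.gc7=-XX:GCLogFileSize=20M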

We brought our cluster back online by:
 1. Shutting everything down
 2. Going into a single node and setting NiFi to not auto-resume state (see the properties sketch after this list); we also set the maximum thread count to 10.
 3. We turned on a single node and verified we could process a single feed without crashing. We then synchronized the flow to the rest of the nodes and brought them back online.
 4. We then manually turned feeds back on to flush out the backlogged data; of course, more data was backing up on our edge servers while we did this.
 5. We decided to set the maximum thread count to 140 per node (significantly lower than the 1,500 threads we used to have) and the heap to 200 GB. We allowed 2 threads per virtual core, plus enough threads to cover all of the site-to-site input ports. It's strange, because NiFi used to happily run 1000+ threads per node all the time, but it keeps up just as well now with 140 threads...
 6. With these settings in place we caught up on our backlog without running out of heap. We maxed out around 100 GB of heap usage per node.
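
For anyone wanting to replicate step 2 above, the no-auto-resume setting lives in conf/nifi.properties (a sketch; the maximum timer-driven thread count itself is set in the UI under Controller Settings, not in a properties file):

    # conf/nifi.properties -- bring the node up with all components stopped
    nifi.flowcontroller.autoResumeState=false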

--Peter

-----Original Message-----
From: Bryan Bende [mailto:bbende@gmail.com] 
Sent: Friday, October 5, 2018 7:26 AM
To: users@nifi.apache.org
Subject: [EXT] Re: Maximum Memory for NiFi?

Generally the larger the heap, the more likely to have long GC pauses.

I'm surprised that you would need a 70GB heap given NiFi's design where the content of the flow files is generally not held in memory, unless many of the processors you are using are not written in an optimal way to process the content in a streaming fashion.

Did you initially start out lower than 70GB and have to increase it to that point? Just wondering what happens at lower levels, like maybe 32GB.

On Thu, Oct 4, 2018 at 4:20 PM Peter Wicks (pwicks) <pw...@micron.com> wrote:
>
> We've had some more clustering issues, and found that some nodes run out of memory when we have unexpected spikes in data, and then we hit a GC stop-the-world event... We lowered our thread count, and that has allowed the cluster to stabilize for the time being.
>
>
>
> Our hardware is pretty robust; we usually have 1,000+ threads running on each node in the cluster (~4,000 threads cumulatively). Each node has about 500 GB of RAM, but we've only been running NiFi with a 70 GB heap, and it usually uses only about 50 GB.
>
>
>
> I enabled GC logging, and after analyzing the data we decided to increase the heap size. We are experimenting with raising the max to 200 GB of heap to better absorb spikes in data. We are using the default G1GC.
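>
> For context, the relevant lines in conf/bootstrap.conf would look roughly like this (a sketch; the java.arg numbers mirror the stock file but can differ between NiFi versions):
>
>     java.arg.3=-Xmx200g          # experimental maximum heap
>     java.arg.13=-XX:+UseG1GC     # enable G1 explicitly; this line ships in the stock bootstrap.conf, commented out in some versions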
>
>
>
> Also, how much impact is there from doing GC logging all the time? The metrics we are getting are really helpful for debugging/analyzing, but we don't want to slow down the cluster too much.
>
>
>
> Thoughts on issues we might encounter? Things we should consider?
>
>
>
> --Peter