Posted to users@nifi.apache.org by Shawn Weeks <sw...@weeksconsulting.us> on 2021/11/08 15:08:54 UTC

ReplaceText Potential Memory Leak

I'm seeing an issue on 1.14.0 where, if ReplaceText continuously hits a buffer overflow, it will eventually exhaust the memory on the server it's running on and cause the server to stop responding at the OS level. In my instance I've got vm.swappiness off and am running the latest OpenJDK 11 on RedHat 7. I was wondering if anyone had seen this before. I've found some old Java message boards that talk about text buffers bypassing the normal Java memory constraints, but nothing all that recent.

An example flow will be forthcoming once I work out the details.

Thanks
Shawn

RE: ReplaceText Potential Memory Leak

Posted by Shawn Weeks <sw...@weeksconsulting.us>.
What has made this difficult to track down is that the usual suspects, like heap usage, aren’t what’s using all the memory. My production systems that are locking up like this show less than 50% heap usage, and garbage collection seems to be functioning normally up until the point the box runs out of memory. Even docker stats shows the container not using all the RAM, while something like pmap shows the process consuming nearly all the memory on the box. Once I get a reproducible example I’ll start adding things like native memory tracking to see where the memory might be going.
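
For anyone wanting to see the same kind of gap from inside the JVM, here is a minimal sketch (not part of NiFi, and not a substitute for native memory tracking). It prints what the JVM reports for heap, non-heap and NIO buffer pools next to the resident set size the kernel reports in /proc/self/status. The class name MemoryGap and its helper methods are made up for illustration, and the RSS part assumes Linux.

```java
import java.io.IOException;
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.OptionalLong;

// Hypothetical helper: compares JVM-reported memory with the process RSS.
public class MemoryGap {

    /** Extracts the VmRSS value (in kB) from the lines of a /proc/<pid>/status file. */
    static OptionalLong parseVmRssKb(List<String> statusLines) {
        for (String line : statusLines) {
            if (line.startsWith("VmRSS:")) {
                // The line looks like "VmRSS:     123456 kB".
                String[] parts = line.trim().split("\\s+");
                return OptionalLong.of(Long.parseLong(parts[1]));
            }
        }
        return OptionalLong.empty();
    }

    /** Heap bytes currently in use, as reported by the JVM. */
    static long heapUsedBytes() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
    }

    /** Bytes held by NIO buffer pools (direct and mapped), which live outside the heap. */
    static long bufferPoolBytes() {
        long total = 0;
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            total += Math.max(0, pool.getMemoryUsed()); // -1 means "no estimate available"
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        System.out.printf("heap used          : %d MiB%n", heapUsedBytes() >> 20);
        System.out.printf("heap committed     : %d MiB%n", memory.getHeapMemoryUsage().getCommitted() >> 20);
        System.out.printf("non-heap committed : %d MiB%n", memory.getNonHeapMemoryUsage().getCommitted() >> 20);
        System.out.printf("NIO buffer pools   : %d MiB%n", bufferPoolBytes() >> 20);

        Path status = Paths.get("/proc/self/status"); // assumption: Linux procfs is available
        if (Files.isReadable(status)) {
            OptionalLong rssKb = parseVmRssKb(Files.readAllLines(status));
            if (rssKb.isPresent()) {
                System.out.printf("process RSS        : %d MiB%n", rssKb.getAsLong() >> 10);
            }
        } else {
            System.out.println("process RSS        : not available on this platform");
        }
    }
}
```

If RSS keeps growing while the JVM-side numbers stay flat, the growth is somewhere the management beans do not see, which is the situation native memory tracking is meant to break down.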

Thanks
Shawn

From: Chris Sampson <ch...@naimuri.com>
Sent: Tuesday, November 23, 2021 4:01 PM
To: users@nifi.apache.org
Subject: Re: ReplaceText Potential Memory Leak

Is this a "JVM doesn't play well with containers"[1] sort of situation?

I've read a couple of things around this area in the past but not too much in-depth because so far I've been lucky enough to avoid such issues (so far as I'm aware).

Generally, articles suggest that the various machinations of the different GCs often play a part - I suspect the default GC has changed between 8 & 11 (and even if it's the same GC, the implementation is likely different). So maybe it's a case of looking at the JVM flags in bootstrap.conf and considering making these more easily configurable when running in Docker (although I think NiFi does have at least a little configuration in that area already using env vars)?

Some applications ship with different default settings depending on the installation/deployment mechanism the user chooses (e.g. RPM vs. Docker) - Elasticsearch is one such example. JVM and logging are often things that are best using different defaults in such artefacts (e.g. for Docker the recommendation would normally be to log everything to STDOUT/ERR and not files that fill up the container and cause it to crash, etc.).


[1] https://stackoverflow.com/questions/52607839/unexplained-extra-memory-consumed-by-docker-running-java-process

Cheers,

Chris Sampson

On Tue, 23 Nov 2021, 20:21 Shawn Weeks, <sw...@weeksconsulting.us> wrote:
I’ve made some progress narrowing things down and I’m running a long-term test to see if it fails. The issue seems to be an interaction between OpenJDK 11, Docker and NiFi where, if you continuously run the heap usage up and down, memory isn’t released and the NiFi process slowly uses more and more memory until the host no longer functions. My test is being run using the Dockerfile in NiFi’s GitHub repo, with the only change being the use of the 11-jre base image instead of the 8-jre base image. The memory leak seems to be only a few MB a day, so it’s going to take a while to reproduce it.

Thanks
Shawn

From: Shawn Weeks <sw...@weeksconsulting.us>
Sent: Monday, November 15, 2021 7:20 AM
To: users@nifi.apache.org
Subject: RE: ReplaceText Potential Memory Leak

I’m still trying to get it narrowed down to a single example. It seems to be centered around ReplaceText and some really long lines (100MB+), but when it happens the servers go completely non-responsive, so it’s been hard getting thread dumps and logs from them.

Thanks
Shawn

From: Otto Fowler <ot...@gmail.com>
Sent: Sunday, November 14, 2021 9:54 PM
To: users@nifi.apache.org
Subject: Re: ReplaceText Potential Memory Leak


I think this is worth a jira. Do you have a stack trace or anything?



From: Shawn Weeks <sw...@weeksconsulting.us>
Reply: users@nifi.apache.org
Date: November 8, 2021 at 10:09:12
To: users@nifi.apache.org
Subject:  ReplaceText Potential Memory Leak

I’m seeing an issue on 1.14.0 where, if ReplaceText continuously hits a buffer overflow, it will eventually exhaust the memory on the server it’s running on and cause the server to stop responding at the OS level. In my instance I’ve got vm.swappiness off and am running the latest OpenJDK 11 on RedHat 7. I was wondering if anyone had seen this before. I’ve found some old Java message boards that talk about text buffers bypassing the normal Java memory constraints, but nothing all that recent.

An example flow will be forthcoming once I work out the details.

Thanks
Shawn

Re: ReplaceText Potential Memory Leak

Posted by Chris Sampson <ch...@naimuri.com>.
Is this a "JVM doesn't play well with containers"[1] sort of situation?

I've read a couple of things around this area in the past but not too much
in-depth because so far I've been lucky enough to avoid such issues (so far
as I'm aware).

Generally, articles suggest that the various machinations of the different
GCs often play a part - I suspect the default GC has changed between 8 & 11
(and even if it's the same GC, the implementation is likely different). So
maybe it's a case of looking at the JVM flags in bootstrap.conf and
considering making these more easily configurable when running in Docker
(although I think NiFi does have at least a little configuration in that
area already using env vars)?
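
As a quick way to check which collector and limits a given JVM actually ends up with, something like the sketch below can be run with the same Java binary and flags as NiFi, once under 8 and once under 11, inside and outside the container. It only uses the standard management beans; the class name JvmDefaults is made up for illustration, and the output will vary by JVM build and flags.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: prints the settings the running JVM has actually picked.
public class JvmDefaults {

    /** Names of the garbage collectors active in this JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        Runtime runtime = Runtime.getRuntime();
        System.out.println("java.version   = " + System.getProperty("java.version"));
        System.out.println("collectors     = " + collectorNames());
        System.out.println("max heap (MiB) = " + (runtime.maxMemory() >> 20));
        System.out.println("cpus visible   = " + runtime.availableProcessors());
        // Flags passed on the command line, e.g. whatever bootstrap.conf supplied.
        System.out.println("jvm arguments  = " + ManagementFactory.getRuntimeMXBean().getInputArguments());
    }
}
```

Comparing the output between the two base images shows whether the collector, heap ceiling or visible CPU count differs before any flags are changed.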

Some applications ship with different default settings depending on the
installation/deployment mechanism the user chooses (e.g. RPM vs. Docker) -
Elasticsearch is one such example. JVM and logging are often things that
are best using different defaults in such artefacts (e.g. for Docker the
recommendation would normally be to log everything to STDOUT/ERR and not
files that fill up the container and cause it to crash, etc.).


[1]
https://stackoverflow.com/questions/52607839/unexplained-extra-memory-consumed-by-docker-running-java-process


Cheers,

Chris Sampson


RE: ReplaceText Potential Memory Leak

Posted by Shawn Weeks <sw...@weeksconsulting.us>.
I’ve made some progress narrowing things down and I’m running a long-term test to see if it fails. The issue seems to be an interaction between OpenJDK 11, Docker and NiFi where, if you continuously run the heap usage up and down, memory isn’t released and the NiFi process slowly uses more and more memory until the host no longer functions. My test is being run using the Dockerfile in NiFi’s GitHub repo, with the only change being the use of the 11-jre base image instead of the 8-jre base image. The memory leak seems to be only a few MB a day, so it’s going to take a while to reproduce it.
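
The "heap up and down" pattern can be approximated outside NiFi with a sketch like the one below. It is not the actual test flow, just an assumed stand-in: it repeatedly allocates a large array, drops it, requests a GC and records how much heap the JVM still has committed. The class name HeapChurn, the 4 KiB page size and the sizes used in main are all illustrative choices.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

// Hypothetical reproduction aid: churns the heap and records committed heap size.
public class HeapChurn {

    /**
     * Allocates and discards a block of the given size for each round and returns
     * the committed heap (bytes) observed after each round. Keep megabytes well
     * below 2048 so the byte count fits in an int.
     */
    static long[] churn(int rounds, int megabytes) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        long[] committed = new long[rounds];
        for (int i = 0; i < rounds; i++) {
            byte[] block = new byte[megabytes * 1024 * 1024];
            // Touch one byte per assumed 4 KiB page so the memory is really backed by the OS.
            for (int p = 0; p < block.length; p += 4096) {
                block[p] = 1;
            }
            block = null;  // drop the only reference
            System.gc();   // a request only; the JVM may ignore it
            committed[i] = memory.getHeapMemoryUsage().getCommitted();
        }
        return committed;
    }

    public static void main(String[] args) {
        long[] committed = churn(10, 16);
        for (int i = 0; i < committed.length; i++) {
            System.out.printf("round %2d: heap committed = %d MiB%n", i, committed[i] >> 20);
        }
    }
}
```

Watching the committed figure alongside the process RSS over many rounds gives a rough idea of whether memory handed to the heap is ever given back to the OS on a particular JVM and collector.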

Thanks
Shawn


RE: ReplaceText Potential Memory Leak

Posted by Shawn Weeks <sw...@weeksconsulting.us>.
I’m still trying to get it narrowed down to a single example. It seems to be centered around ReplaceText and some really long lines (100MB+), but when it happens the servers go completely non-responsive, so it’s been hard getting thread dumps and logs from them.
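
To illustrate why very long lines matter for anything that buffers a line at a time, here is a generic sketch of a line reader that refuses to buffer more than a fixed number of characters. This is not ReplaceText's implementation; BoundedLineReader, LineTooLongException and the cap used in main are made-up names and values, and only '\n' is treated as a line terminator.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical illustration of capping per-line buffering.
public class BoundedLineReader {

    /** Thrown when a single line would exceed the configured cap. */
    static class LineTooLongException extends IOException {
        LineTooLongException(int maxChars) {
            super("line exceeds " + maxChars + " chars");
        }
    }

    /**
     * Reads one line terminated by '\n' or end of stream, buffering at most
     * maxChars characters. Returns null when the stream is already exhausted.
     */
    static String readLine(Reader in, int maxChars) throws IOException {
        StringBuilder line = new StringBuilder();
        boolean sawAny = false;
        int c;
        while ((c = in.read()) != -1) {
            sawAny = true;
            if (c == '\n') {
                return line.toString();
            }
            if (line.length() >= maxChars) {
                // Fail before the buffer grows past the cap.
                throw new LineTooLongException(maxChars);
            }
            line.append((char) c);
        }
        return sawAny ? line.toString() : null;
    }

    public static void main(String[] args) throws IOException {
        Reader in = new StringReader("short\n" + "x".repeat(50) + "\n");
        System.out.println(readLine(in, 16)); // the first line fits within the cap
        try {
            readLine(in, 16);                 // the second line is 50 chars long
        } catch (LineTooLongException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Without a cap like this, a single 100MB+ line has to be held in memory in full before any replacement can be applied to it.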

Thanks
Shawn


Re: ReplaceText Potential Memory Leak

Posted by Otto Fowler <ot...@gmail.com>.
I think this is worth a jira. Do you have a stack trace or anything?



