Posted to users@nifi.apache.org by Shawn Weeks <sw...@weeksconsulting.us> on 2019/12/23 21:11:28 UTC

Advanced QueryRecord Brings NiFi Down

I was playing around with QueryRecord this afternoon and succeeded in bringing down a decent-sized NiFi instance (16-core, 64 GB AWS m5). The input was an 8 GB CSV file and the query used the row_number() analytic function. Memory usage went to 51 GB and CPU went to 100% on all cores, even though QueryRecord was only running in a single thread. Has anyone else run into this with QueryRecord? I’m assuming Calcite is trying to run the query multi-threaded, but even if it loaded the whole file into memory it shouldn’t have used that much. Documentation on Calcite query execution is pretty much nonexistent, so I’m not sure where to even begin debugging.

Thanks
Shawn Weeks

Re: Advanced QueryRecord Brings NiFi Down

Posted by Shawn Weeks <sw...@weeksconsulting.us>.
I realized I attached my example to the wrong thread, sorry. Don’t run it on a cluster you care about, as you’ll probably have to drop the flow.xml.gz to stop it.

Thanks
Shawn

From: Mike Thomsen <mi...@gmail.com>
Reply-To: "users@nifi.apache.org" <us...@nifi.apache.org>
Date: Tuesday, December 24, 2019 at 7:42 PM
To: "users@nifi.apache.org" <us...@nifi.apache.org>
Subject: Re: Advanced QueryRecord Brings NiFi Down

If you could share more details, such as the query and schema, that would be a big help toward setting up a Jira ticket to investigate.


Re: Advanced QueryRecord Brings NiFi Down

Posted by Mike Thomsen <mi...@gmail.com>.
If you could share more details, such as the query and schema, that would be
a big help toward setting up a Jira ticket to investigate.
