Posted to users@nifi.apache.org by Shawn Weeks <sw...@weeksconsulting.us> on 2021/05/03 16:30:10 UTC

NiFi Get's Stuck Waiting On Non Existent Archive Cleanup

I'm not sure if this is specific to clustering or not, but with the default configuration of 50% content archiving it is possible to make NiFi quit processing any data simply by filling up a queue with 50% of your content_repository storage. In my example the content_repository is 1 TB, and once a queue gets to 500 GB or so the next processor won't process any more data. Once this occurs, even stopping GenerateFlowFile won't fix the problem, and my CompressContent never does anything. It's my understanding that "nifi.content.repository.archive.max.usage.percentage" only sets the maximum amount of space that archives will use and should never prevent new content from being written; in 1.13.2 it appears to be functioning as a reserve instead. I haven't seen this in older versions of NiFi like 1.9.2, and I'm not sure when the behavior changed, but even the documentation seems to indicate that this should not be happening. For example: 'If the archive is empty and content repository disk usage is above this percentage, then archiving is temporarily disabled.'

[screenshot attachments: image001.png, image002.png]

Thanks
Shawn Weeks

Re: NiFi Get's Stuck Waiting On Non Existent Archive Cleanup

Posted by Elli Schwarz <el...@yahoo.com>.
 We are experiencing this issue as well. We just upgraded from NiFi 1.11.4 to 1.13.2 and are running into this issue where many of our high-usage NiFi instances are just hanging. For example, we have a 7-node cluster that has FlowFiles stuck in queues and not moving. We noticed that on 3 of those nodes the FlowFile content storage was over 50%, and those are the nodes that have FlowFiles stuck in the queue. The other nodes have nothing on them. No new data is flowing into the cluster at all, and nothing is moving on any of the nodes. We see this problem also on non-cluster machines; the cluster just makes it more obvious that this archive max usage percentage might be the cause.
We have a lot of MergeContent processors. We realize that there were a lot of I/O improvements in the newer versions of NiFi - Joe, we suspect these efficiencies might be exacerbating the problem:
NiFi 1.13.1 - [full list]

   - [NIFI-7646] - Improve performance of MergeContent / others that read content of many small FlowFiles

   - [NIFI-8222] - When processing a lot of small FlowFiles, Provenance Repo spends most of its time in lock contention. That can be improved.

NiFi 1.14.0 - [full list]

   - [NIFI-8633] - Content Repository can be improved to make fewer disk accesses on read.

   - Mark Payne's notes: "For those interested in the actual performance numbers here, I ran a pretty simple flow that generated a lot of tiny JSON messages, and then used ConvertRecord to convert from JSON to Avro. Ran a profiler against it and found that about 50% of the time for ConvertRecord was spent in FileSystemRepository.read(). This is called twice - once when we read the data for inferring schema, a second time when we parse the data. Of the time spent in FileSystemRepository.read(), about 50% of that time was spent in Files.exists(). So this should improve performance of that flow by something like 25%"

We didn't know about the ...archive.backpressure.percentage property - we don't see it in the Admin Guide at https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html. We will set this property to much more than 2% above the max usage percentage and see how it goes. Now that we think about it, we believe we've experienced this problem occasionally before the upgrade, but it has become very frequent since the upgrade.

-Elli

Re: NiFi Get's Stuck Waiting On Non Existent Archive Cleanup

Posted by Ryan Hendrickson <ry...@gmail.com>.
I've added a ticket to the NiFi Jira outlining all the missing properties
from the Sys Admin Guide. We'd really appreciate them getting into the
documentation.

https://issues.apache.org/jira/browse/NIFI-9029

As for this issue, it seems like a pretty repeatable way to lock up
the canvas and freeze all processing. It would be really great if there
was a visual indicator on the canvas to report this to users.
https://issues.apache.org/jira/browse/NIFI-9030

Perhaps the default 2% should also be changed?  That seems like a fairly
low high-watermark value.

Thanks,
Ryan


Re: NiFi Get's Stuck Waiting On Non Existent Archive Cleanup

Posted by Ryan Hendrickson <ry...@gmail.com>.
Elli replied to this, although it looks like his email got flagged for
spam, so I'm replying to make sure his comments got through; they are
reproduced in full earlier in this thread.

RE: NiFi Get's Stuck Waiting On Non Existent Archive Cleanup

Posted by Shawn Weeks <sw...@weeksconsulting.us>.
Sorry, I wasn't saying that 'nifi.content.repository.archive.max.usage.percentage' was new; I just hadn't managed to get a NiFi instance stuck this way, and even the documentation says that if the archive is empty and the content repo needs more room it will disable the archive. I'm having trouble finding where 'nifi.content.repository.archive.backpressure.percentage' is documented.

Thanks


Re: NiFi Get's Stuck Waiting On Non Existent Archive Cleanup

Posted by Mark Payne <ma...@hotmail.com>.
Shawn,

There are a couple of properties at play. The “nifi.content.repository.archive.max.usage.percentage" property behaves as you have described. But there’s also a second property: nifi.content.repository.archive.backpressure.percentage
This controls at what point the Content Repository will actually apply back-pressure in order to avoid filling the disk. This property defaults to 2% more than the max.usage.percentage. So by default it uses 50% and 52%.
You can adjust the backpressure percentage to something much higher like 80%. So then if you reach 50% it would start clearing things out, and if you reach 80% it’ll start applying the brakes. This is here as a safeguard because we’ve had data flows that can produce the data much faster than it could archive/delete the data. This is common for data flows that produce huge numbers of files in the content repository. So that backpressure is there to ensure that the archive has a chance to run.
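As a rough sketch, the interaction of the two thresholds Mark describes can be modeled like this (an illustration of the behavior only, not NiFi's actual implementation; function and key names are made up for clarity):

```python
def content_repo_state(disk_usage_pct, max_usage_pct=50.0, backpressure_pct=None):
    """Illustrates the two content-repository thresholds (not NiFi's real code).

    Above max_usage_pct, archive cleanup kicks in (archiving may be
    temporarily disabled). At backpressure_pct, which defaults to 2% above
    max_usage_pct, writes to the content repository are blocked until
    cleanup frees space.
    """
    if backpressure_pct is None:
        backpressure_pct = max_usage_pct + 2.0
    return {
        "archive_cleanup": disk_usage_pct >= max_usage_pct,
        "writes_blocked": disk_usage_pct >= backpressure_pct,
    }

# With the defaults (50% / 52%), a repo at 51% usage is cleaning up but not
# blocked; at 52% it blocks writes, matching the stuck-queue symptom in this
# thread. Raising backpressure_pct to 80 widens the cleanup-only band.
```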

This has always been here, though, ever since the initial open sourcing; it's not something new. It may be the case that in later versions we have become more efficient at creating the data, such that it's now exceeding the rate at which cleanup can happen - not sure. But adjusting the "nifi.content.repository.archive.backpressure.percentage" property should get you into a better state.
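In nifi.properties, the adjustment Mark suggests would look something like the following (the 80% value is just the example from above; tune it for your own disk sizing):

```properties
# Above this usage, archive cleanup runs (default 50%)
nifi.content.repository.archive.max.usage.percentage=50%
# Above this usage, writes to the content repository are blocked
# (defaults to 2% above the max usage percentage, i.e. 52%)
nifi.content.repository.archive.backpressure.percentage=80%
```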

Thanks
-Mark



RE: NiFi Get's Stuck Waiting On Non Existent Archive Cleanup

Posted by Shawn Weeks <sw...@weeksconsulting.us>.
Note I have a 2-node cluster, which is why it's sitting at around 900 GB. The per-node content repo is sitting at 535 GB currently, and I'm not sure where the rest of the space is. I have 472 GB free on each node in the content_repository partition as shown in the Cluster panel.
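To chase down where the space is actually going, the repository partition and its archive subdirectories can be inspected directly. The path below is an assumption; substitute whatever nifi.content.repository.directory.default points at in your nifi.properties.

```shell
# Overall usage of the partition holding the content repository (illustrative path)
df -h /opt/nifi/content_repository

# NiFi keeps archived claims in "archive" subdirectories inside each numbered
# section directory; this shows how much of the usage is archive vs. active content
du -sh /opt/nifi/content_repository/*/archive 2>/dev/null | sort -h | tail -n 5
```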

Thanks
Shawn Weeks
