Posted to users@nifi.apache.org by James McMahon <js...@gmail.com> on 2017/05/24 18:46:04 UTC

Content Repository Retention

Good afternoon. I am running Apache NiFi 0.7.x. This morning I ran a large
body of content through my workflow, and my content repo usage as reported
by 'df -kh' spiked from less than one gigabyte of used capacity to 40G.

My files fully processed without error. My processors and queues now show 0
bytes across the board, including all process groups.

My content repository is still showing 22G usage. My expectation was that
it would reduce back down to minimal capacity usage when my workflow
completed processing all these files, but it has not.

I run nifi on my Linux box as a service. I gracefully stopped the service.
Let it sit for a minute or two. Checked my logs. No problems noted. I
started nifi again as a service. No errors in the logs. I still find that
the content repo disk device shows 22G used capacity.

Does the content repository retain content history for a period of time
after a flowfile has exited from nifi? Can that be regulated?

Thanks in advance for your insights.

Jim

Re: Content Repository Retention

Posted by James McMahon <js...@gmail.com>.
Thank you very much Matt. I will tune this property to get the behavior we
require. Thanks again. -Jim

On Wed, May 24, 2017 at 2:57 PM, Matt Gilman <ma...@gmail.com>
wrote:

> Jim,
>
> Yes, the content can be archived once the flowfile has exited NiFi. This
> allows the content to be downloaded/viewed from provenance events for that
> flowfile. This behavior can be controlled using the nifi.content.repository.archive.*
> properties [1].
>
> Thanks
>
> Matt
>
> [1] https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#file-system-content-repository-properties

Re: Content Repository Retention

Posted by Matt Gilman <ma...@gmail.com>.
Jim,

Yes, the content can be archived once the flowfile has exited NiFi. This
allows the content to be downloaded/viewed from provenance events for that
flowfile. This behavior can be controlled using
the nifi.content.repository.archive.* properties [1].
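
As a rough illustration of what tuning these properties could look like, here
is a sketch of the relevant lines in conf/nifi.properties. The property names
are the ones I recall from the linked section of the Administration Guide; the
values are only examples, not recommendations, so please confirm both against
the guide for your NiFi version. NiFi reads nifi.properties at startup, so a
restart is needed after editing.

```properties
# Illustrative values only; check the Administration Guide for your version.
# Keep archived content so it can be viewed/downloaded from provenance events.
nifi.content.repository.archive.enabled=true
# Archived content older than this becomes eligible for removal.
nifi.content.repository.archive.max.retention.period=12 hours
# Archived content is also removed once the content repository's disk
# usage rises above this percentage.
nifi.content.repository.archive.max.usage.percentage=50%
```

Lowering the retention period or the usage percentage should release disk
space sooner, at the cost of a shorter window for viewing content from
provenance events.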

Thanks

Matt

[1]
https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#file-system-content-repository-properties
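
To check how much of the reported usage is archived content rather than
content still in use, something like the following Python sketch may help. It
assumes that the file-system content repository keeps archived claims in
subdirectories named "archive" beneath the repository root; repo_usage is a
hypothetical helper written for this thread, not part of NiFi.

```python
import os


def repo_usage(repo_root):
    """Return (live_bytes, archived_bytes) for all files under repo_root.

    Assumption: archived content lives in subdirectories named "archive".
    """
    live = 0
    archived = 0
    for dirpath, _dirnames, filenames in os.walk(repo_root):
        # Path components of this directory relative to the repository root.
        parts = os.path.relpath(dirpath, repo_root).split(os.sep)
        in_archive = "archive" in parts
        for name in filenames:
            size = os.path.getsize(os.path.join(dirpath, name))
            if in_archive:
                archived += size
            else:
                live += size
    return live, archived
```

Calling repo_usage with the path to your content repository directory returns
two byte counts. If most of the 22G falls in the second number, the space is
being held by the archive and should be governed by the properties above.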