Posted to users@nifi.apache.org by "McDermott, Chris Kevin (MSDU - STaTS/StorefrontRemote)" <ch...@hpe.com> on 2016/04/03 22:09:26 UTC

Question about content repository archiving

I’m having a problem where the disk on which I am storing my NiFi data keeps filling up.  The content repository is taking up the overwhelming majority of that space.  It seems that the archiver is not working the way I would expect.

$ du -sh /opt/mount2/nifi_data/*
712K /opt/mount2/nifi_data/conf
147G /opt/mount2/nifi_data/content_repository
656K /opt/mount2/nifi_data/database_repository
318M /opt/mount2/nifi_data/flowfile_repository
937M /opt/mount2/nifi_data/provenance_repository

$ df -lh | grep mount2
/dev/mapper/disk2-mount2
                      148G  148G     0 100% /opt/mount2

# Content Repository
nifi.content.repository.implementation=org.apache.nifi.controller.repository.FileSystemRepository
nifi.content.claim.max.appendable.size=10 MB
nifi.content.claim.max.flow.files=100
nifi.content.repository.directory.default=/opt/mount2/nifi_data/content_repository
nifi.content.repository.archive.max.retention.period=12 hours
nifi.content.repository.archive.max.usage.percentage=50%
nifi.content.repository.archive.enabled=true
nifi.content.repository.always.sync=false
nifi.content.viewer.url=/nifi-content-viewer/

Can anyone point out what I might be missing?

Thanks,
Chris
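
One way to put the numbers above side by side is to read the archive limit from nifi.properties and compare it with what df reports for the repository's filesystem. This is only a sketch: the helper names (get_prop, used_pct, check_archive) are made up for illustration, the location of nifi.properties is an assumption, and the comparison is a rough check rather than a description of how NiFi itself decides when to purge.

```shell
#!/bin/sh
# Hypothetical helpers; none of these names come from NiFi.

# Print the value of one property from a Java-style properties file.
# $1 = properties file, $2 = property name
get_prop() {
  sed -n "s/^$2=//p" "$1" | tail -n 1
}

# Print the used percentage (digits only) of the filesystem holding $1,
# taken from the Capacity column of POSIX "df -P" output.
used_pct() {
  df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Compare current usage with the configured archive limit.
# $1 = properties file, $2 = content repository directory
check_archive() {
  limit=$(get_prop "$1" nifi.content.repository.archive.max.usage.percentage | tr -d '% ')
  used=$(used_pct "$2")
  if [ -z "$limit" ] || [ -z "$used" ]; then
    echo "could not determine limit or usage" >&2
    return 2
  fi
  if [ "$used" -gt "$limit" ]; then
    echo "usage ${used}% exceeds archive limit ${limit}%"
  else
    echo "usage ${used}% is within archive limit ${limit}%"
  fi
}

# Example (the properties path is an assumption):
#   check_archive conf/nifi.properties /opt/mount2/nifi_data/content_repository
```

With the settings and df output shown in this message, such a check would report usage well above the 50% limit, which is what prompted the question.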

Re: Question about content repository archiving

Posted by "McDermott, Chris Kevin (MSDU - STaTS/StorefrontRemote)" <ch...@hpe.com>.
Hi Joe,

There were around 50 or so “Queued” < 100MB.

Restarting the cluster nodes cleared up the used space, so apparently the archiver stopped doing its thing at some point.  I looked for odd messages from FileSystemRepository but didn’t see anything.  This problem looks similar to the one discussed in the thread at the URL below.  The difference seems to be that in my case just restarting with the existing settings got things back on track.

http://markmail.org/message/5kj3akgexosg4jav#query:+page:1+mid:ohxtkuayhqqhcmjc+state:results

Thanks,

Chris




On 4/3/16, 5:23 PM, "Joe Witt" <jo...@gmail.com> wrote:

>Chris,
>
>How many flow files do you have actively in the flow?  The archive
>goal is 50% of total space, and the volume appears to be 148GB, so the
>archive should start removing old files at around 70ish GB.  But NiFi
>will retain any data that is actually sitting in the flow.
>
>Thanks
>Joe

Re: Question about content repository archiving

Posted by Joe Witt <jo...@gmail.com>.
Chris,

How many flow files do you have actively in the flow?  The archive
goal is 50% of total space, and the volume appears to be 148GB, so the
archive should start removing old files at around 70ish GB.  But NiFi
will retain any data that is actually sitting in the flow.

Thanks
Joe
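
The estimate above can be checked with a line of arithmetic. A minimal sketch, assuming the cleanup threshold is simply the configured percentage of the volume's total size; archive_threshold is a hypothetical helper name, not something NiFi provides:

```shell
#!/bin/sh
# Hypothetical helper: the usage level (in the same unit as the total)
# at which archive cleanup would be expected to begin.
# $1 = total size of the volume, $2 = configured percentage without the % sign
archive_threshold() {
  echo $(( $1 * $2 / 100 ))
}

# 148 GB volume with archive.max.usage.percentage=50%
archive_threshold 148 50   # prints 74
```

So with a 148GB volume and a 50% setting, old archived content would be expected to start disappearing at roughly 74GB, unless the space is held by data still in the flow.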


Re: Question about content repository archiving

Posted by Joe Witt <jo...@gmail.com>.
All,

Just to help close the loop on this one: the root cause of the issue
appears to have been identified.  In a really specific case we can
end up in an infinite, non-productive loop in the archive cleanup
thread.  The patch is staged for the 0.6.1 release we're working on, and
we will make sure it is on the 0.7 and 1.x lines as well.

Big thanks to Chris for helping dig into the details needed to find the issue.

Thanks
Joe

On Mon, Apr 4, 2016 at 3:24 PM, Joe Witt <jo...@gmail.com> wrote:
> Sounds good.  Let's dig in on this one.  I've added some replies to the JIRA.

Re: Question about content repository archiving

Posted by Joe Witt <jo...@gmail.com>.
Sounds good.  Let's dig in on this one.  I've added some replies to the JIRA.

On Mon, Apr 4, 2016 at 2:58 PM, McDermott, Chris Kevin (MSDU -
STaTS/StorefrontRemote) <ch...@hpe.com> wrote:
> The problem reoccurred.  I’ve created NIFI-1726:
>
> https://issues.apache.org/jira/browse/NIFI-1726
>
>
> Thanks,
> Chris

Re: Question about content repository archiving

Posted by "McDermott, Chris Kevin (MSDU - STaTS/StorefrontRemote)" <ch...@hpe.com>.
The problem reoccurred.  I’ve created NIFI-1726:

https://issues.apache.org/jira/browse/NIFI-1726


Thanks,
Chris




On 4/4/16, 11:40 AM, "McDermott, Chris Kevin (MSDU - STaTS/StorefrontRemote)" <ch...@hpe.com> wrote:

>Hi Joe,
>
>I am running from 16108467c19f59 plus the pull request for NIFI-1660
>
>Will do on the dump.
>
>Thanks,
>Chris

Re: Question about content repository archiving

Posted by "McDermott, Chris Kevin (MSDU - STaTS/StorefrontRemote)" <ch...@hpe.com>.
Hi Joe,

I am running from 16108467c19f59 plus the pull request for NIFI-1660

Will do on the dump.

Thanks,
Chris

From: Joe Witt <jo...@gmail.com>
Reply-To: "users@nifi.apache.org" <us...@nifi.apache.org>
Date: Sunday, April 3, 2016 at 8:34 PM
To: "users@nifi.apache.org" <us...@nifi.apache.org>
Subject: Re: Question about content repository archiving


Chris

What version of nifi are you on?

If you happen to see such a case again, please take a thread dump of NiFi as well:  bin/nifi.sh dump

Thanks
Joe



Re: Question about content repository archiving

Posted by Joe Witt <jo...@gmail.com>.
Chris

What version of nifi are you on?

If you happen to see such a case again, please take a thread dump of
NiFi as well:  bin/nifi.sh dump

Thanks
Joe
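
Once a thread dump exists, the frames that involve the content repository can be pulled out with grep. The sketch below makes two assumptions: that the dump has been written to a file (the dump command is believed to accept an optional file name, otherwise the dump goes to the bootstrap log), and that the repository class named in nifi.properties appears in the relevant stack traces. frames_for is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: show every line of a thread dump that mentions a
# class, with line numbers and five lines of leading context so the
# owning thread's header is usually visible.
# $1 = thread dump file, $2 = class name fragment to search for
frames_for() {
  grep -n -B 5 -- "$2" "$1"
}

# Example (file name is an assumption; passing it to "dump" is assumed to work):
#   bin/nifi.sh dump /tmp/nifi-threads.txt
#   frames_for /tmp/nifi-threads.txt FileSystemRepository
```

Taking two dumps a minute or so apart and comparing the matching frames is one way to see whether a thread is sitting in the same place each time.
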
On Apr 3, 2016 8:20 PM, "Andre" <an...@fucs.org> wrote:

> Chris,
>
> Can you navigate under /opt/mount2/nifi_data/content_repository and run
>
> du -hsc */archive
>
> keen to see if your archives aren't being deleted after expiry...
>
> Seems pretty much like what I experienced as well (
> http://markmail.org/message/37a2dxtkfvhwyty7 )

Re: Question about content repository archiving

Posted by Andre <an...@fucs.org>.
Chris,

Can you navigate under /opt/mount2/nifi_data/content_repository and run

du -hsc */archive

keen to see if your archives aren't being deleted after expiry...

Seems pretty much like what I experienced as well (
http://markmail.org/message/37a2dxtkfvhwyty7 )
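
To go one step beyond the totals, the archived files that are already past the retention period can be listed directly. A rough sketch, assuming file modification time is a fair proxy for when a claim was archived and that find supports -mmin (GNU and BSD find do; it is not POSIX). expired_archives is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: list files under any "archive" directory of the
# content repository whose modification time is older than the retention
# period.
# $1 = content repository directory, $2 = retention in minutes
expired_archives() {
  find "$1" -type f -path '*/archive/*' -mmin "+$2"
}

# Example: 12 hours of retention is 720 minutes.
#   expired_archives /opt/mount2/nifi_data/content_repository 720 | wc -l
```

A large and growing count here would suggest that expired archives are not being cleaned up.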

