Posted to dev@nifi.apache.org by "Jens M. Kofoed" <jm...@gmail.com> on 2022/08/10 10:10:40 UTC

Check Provenance data

Hi

Is it possible to check the content of the Provenance data?
I have a 3-node cluster, and for months the provenance data has used about
10% of the disk space (10 GB of a 100 GB disk), which isn't very much. But
yesterday the disk filled up, so I expanded it to 500 GB, and the
provenance data is now using around 102 GB on each node.

I can't see any significant change in the daily number of flowfiles. So
how can I debug/check why the provenance data filled up the disk?

kind regards
Jens M. Kofoed

Re: Check Provenance data

Posted by Mark Payne <ma...@hotmail.com>.
Jens,

If you are not seeing any difference in the number of FlowFiles (and are not doing a significantly larger amount of processing),
the increase in provenance space likely means that you’re now storing a lot more information in attributes. The provenance
events include FlowFile attributes in them, so adding large attributes (or large numbers of attributes) will take up a lot more
disk space.

Thanks
-Mark
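
One way to narrow down when the growth started is to total the on-disk size of the provenance repository by file modification date. The following is only a rough sketch, not a NiFi tool: the function name bytes_by_day is hypothetical, and the fallback path ./provenance_repository is an assumption that should be replaced with whatever directory the node's provenance repository is actually configured to use. It only measures files on disk; it does not read the provenance events themselves.

```python
import os
import sys
from collections import defaultdict
from datetime import datetime, timezone


def bytes_by_day(root):
    """Sum file sizes under root, grouped by each file's modification date (UTC)."""
    totals = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                # The file may have been removed while the scan was running.
                continue
            day = datetime.fromtimestamp(st.st_mtime, tz=timezone.utc).date()
            totals[day.isoformat()] += st.st_size
    return dict(totals)


if __name__ == "__main__":
    # Assumed location; pass the real repository directory as the first argument.
    root = sys.argv[1] if len(sys.argv) > 1 else "./provenance_repository"
    for day, size in sorted(bytes_by_day(root).items()):
        print(f"{day}  {size / 1024 ** 2:10.1f} MiB")
```

If the per-day totals jump on a particular date while the FlowFile count stayed flat, that date is a reasonable place to start looking for a flow change that added large attributes (or many attributes).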

