Posted to users@nifi.apache.org by Darren Govoni <da...@ontrenet.com> on 2020/04/08 18:30:58 UTC

Not Seeing Provenance data

Hi,
  When I go to "View data provenance" in NiFi, I never see any logs for my flow. Am I missing a configuration setting somewhere?

thanks,
Darren


Re: Not Seeing Provenance data

Posted by Darren Govoni <da...@ontrenet.com>.
It has never worked for me with a simple, out-of-the-box install on one machine in EC2.

But there should be a configuration that keeps the last X hours of provenance, NOT based on wall-clock time.

For example, I want the last 24 hours of provenance even if the last time a processor ran was 3 days ago. The retention window would be relative to the latest logged data.
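
The behaviour requested above can be illustrated with a small Python sketch. It is not NiFi code: the function names, the in-memory list of event timestamps, and the `relative_to_latest` switch are all hypothetical, and it only contrasts a wall-clock cutoff with a cutoff measured from the newest recorded event.

```python
from datetime import datetime, timedelta


def retention_cutoff(event_times, window, now, relative_to_latest):
    """Return the timestamp before which events would be aged off."""
    if relative_to_latest and event_times:
        reference = max(event_times)  # newest recorded event
    else:
        reference = now  # wall-clock time
    return reference - window


def surviving_events(event_times, window, now, relative_to_latest):
    """Return the events that fall inside the retention window."""
    cutoff = retention_cutoff(event_times, window, now, relative_to_latest)
    return [t for t in event_times if t >= cutoff]


# Hypothetical scenario: the processor last ran three days ago.
now = datetime(2020, 4, 11, 12, 0)
events = [now - timedelta(days=3, hours=h) for h in (0, 1, 2)]
window = timedelta(hours=24)

wall_clock = surviving_events(events, window, now, relative_to_latest=False)
relative = surviving_events(events, window, now, relative_to_latest=True)
print(len(wall_clock), len(relative))
```

With a wall-clock cutoff every event in this scenario is older than the window, whereas a cutoff measured from the newest event keeps all three.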


________________________________
From: Wyllys Ingersoll <wy...@keepertech.com>
Sent: Saturday, April 11, 2020 9:45:57 AM
To: users@nifi.apache.org <us...@nifi.apache.org>
Subject: Re: Not Seeing Provenance data

Nope, already checked that.

On Fri, Apr 10, 2020 at 8:23 PM Patrick Timmins <pt...@cox.net> wrote:

No issues here.  Sounds like a timezone / system clock / clock drift issue (in a cluster).

On 4/10/2020 11:59 AM, Joe Witt wrote:
The provenance repo is in large-scale use by many, many users, so fundamentally it does work.  There are conditions that apparently need improving.  In the past couple of days these items have been flagged by folks on this list, and JIRAs and PRs have been raised and merged, all good. If you can help by creating a build of the latest and confirming that it fixes your case, please do so.

Thanks

On Fri, Apr 10, 2020 at 12:48 PM Darren Govoni <da...@ontrenet.com> wrote:
It would seem the feature is either completely broken or only works under specific conditions.

Can the NiFi team put a fix on their roadmap for this?
It's a rather central feature of NiFi.

________________________________
From: Wyllys Ingersoll <wy...@keepertech.com>
Sent: Friday, April 10, 2020 11:17:42 AM
To: users@nifi.apache.org <us...@nifi.apache.org>
Subject: Re: Not Seeing Provenance data

I have a similar problem with viewing provenance.  I have a 3-node cluster in a Kubernetes environment. The provenance_repository directory for each node is on a persistent data store, so it is not deleted or lost between container restarts (which are not very common).  My nifi.provenance.repository.max.storage.time is 24 hours.

Whenever I try to view any provenance, nothing is ever shown.  If I manually inspect the provenance_repository directory, I can see a Lucene index and TOC being created.

I see log messages like these:

Submitting query +processorId:882133fe-b684-148b-ad88-7850437ca591 with identifier 64a703fe-0171-1000-0000-000065abd91a against index directories [./provenance_repository/lucene-8-index-1560864819888]
Returning the following list of index locations because they were finished being written to before 1586531601311: []
Found no events in the Provenance Repository. In order to perform maintenace of the indices, will assume that the first event time is now (1586531601311)


Any suggestions?

-Wyllys Ingersoll
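
The long numbers in the log lines above look like epoch-millisecond timestamps. The following Python sketch converts them to UTC so they can be compared with the 24-hour retention setting. It assumes that the numeric suffix of the lucene-8-index-... directory name is also an epoch-millisecond timestamp, which the thread does not confirm; `from_epoch_millis` is a helper made up for this illustration.

```python
from datetime import datetime, timezone


def from_epoch_millis(ms):
    """Convert an epoch-millisecond value to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


# Suffix of the index directory name in the "Submitting query" line
# (assumed here to be a timestamp).
index_time = from_epoch_millis(1560864819888)
# Value from the "finished being written to before" line.
query_time = from_epoch_millis(1586531601311)

print(index_time.isoformat())
print(query_time.isoformat())
print((query_time - index_time).days, "days apart")
```

If the assumption about the directory name holds, the gap between the two values shows how old the only listed index directory was when the query ran.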



On Thu, Apr 9, 2020 at 11:25 AM Dobbernack, Harald (Key-Work) <ha...@key-work.de> wrote:

Hey Mark,

great news and thank you very much!

Happy Holidays!

Harald

From: Mark Payne <ma...@hotmail.com>
Sent: Thursday, April 9, 2020 17:18
To: users@nifi.apache.org
Subject: Re: Not Seeing Provenance data



Thanks Harald,



I have created a Jira [1] for this. There’s currently a PR up for it as well.



Thanks

-Mark



[1] https://issues.apache.org/jira/browse/NIFI-7346



On Apr 9, 2020, at 11:14 AM, Dobbernack, Harald (Key-Work) <ha...@key-work.de> wrote:



Hi Mark,



I can confirm after testing that if no provenance event has been generated for longer than the configured nifi.provenance.repository.max.storage.time, then, as expected, the last recorded provenance events no longer exist. From then on, however, new provenance events are not searchable either: the provenance search remains completely empty regardless of how many flows are active.  As described, the *.prov file is then also missing from the provenance repository. After a restart of NiFi a new .prov file is generated and provenance works again, but it only shows events generated since the last NiFi start.



So yes, I’d say your idea

    ‘If so, then I think that would explain why it deleted the data. It’s trying to age off old data
     but unfortunately it doesn’t perform a check to first determine whether or not the “old file”
     that it’s about to delete is also the “active file”.’

fits my test very nicely.



As a workaround we’re going to set a greater nifi.provenance.repository.max.storage.time until this can be resolved.



Thanks again for looking into this.

Harald
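
The workaround above amounts to making the retention period longer than the longest stretch in which no provenance events are produced. A rough Python check of that condition is sketched below. The helper names are hypothetical, the sample text only imitates a nifi.properties excerpt, the duration parser handles just a few units rather than NiFi's full time-period syntax, and the three-day idle gap is an invented figure.

```python
import re
from datetime import timedelta

# Only a handful of units are handled; this is a simplification.
_UNITS = {
    "sec": "seconds", "secs": "seconds",
    "min": "minutes", "mins": "minutes",
    "hour": "hours", "hours": "hours",
    "day": "days", "days": "days",
}


def parse_duration(text):
    """Parse strings such as '120 hours' into a timedelta."""
    match = re.fullmatch(r"\s*(\d+)\s*([A-Za-z]+)\s*", text)
    if not match or match.group(2).lower() not in _UNITS:
        raise ValueError(f"unsupported duration: {text!r}")
    return timedelta(**{_UNITS[match.group(2).lower()]: int(match.group(1))})


def read_property(properties_text, key):
    """Return the value of key from properties-style text, or None."""
    for line in properties_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == key:
            return value.strip()
    return None


SAMPLE = """\
# excerpt in the style of nifi.properties (illustrative only)
nifi.provenance.repository.max.storage.time=120 hours
"""
KEY = "nifi.provenance.repository.max.storage.time"

retention = parse_duration(read_property(SAMPLE, KEY))
longest_idle_gap = timedelta(days=3)  # hypothetical: longest period without events
print(retention > longest_idle_gap)
```

A result of False would suggest that the configured retention could still expire while the flow is idle.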





From: Dobbernack, Harald (Key-Work)
Sent: Thursday, April 9, 2020 15:22
To: users@nifi.apache.org
Subject: Re: Not Seeing Provenance data



Hi Mark,



thank you for looking into this.



The nifi.provenance.repository.max.storage.time setting might explain why I haven’t been experiencing the effect so often since changing from the default to 120 hours a few months ago 😉



But I believe that the last time provenance stopped working there was an ‘active’ flow in a Wait processor, expiring every hour and going on to ‘send a message’ before being rerouted to the same Wait processor. I would have expected this to generate provenance entries.  As I am not actually 100% sure whether that Wait processor was in use when provenance was last lost, I will check on a test system to see whether I can reproduce the provenance breakage when no flows are active for longer than nifi.provenance.repository.max.storage.time, and I will get back to you.



Thank you!

Harald





From: Mark Payne <ma...@hotmail.com>
Sent: Thursday, April 9, 2020 14:41
To: users@nifi.apache.org
Subject: Re: Not Seeing Provenance data

Hey Darren, Harald,



Thanks for the note. I have seen this once before but couldn’t figure out what caused it. Restarting addressed the issue.



I think I may understand the problem, now, though, after looking at it again.



In nifi.properties, there is a property named “nifi.provenance.repository.max.storage.time” that defaults to “24 hours”.

Is it possible that you went 24 hours (or whatever value is set for that property) without generating any Provenance events?

If so, then I think that would explain why it deleted the data. It’s trying to age off old data, but unfortunately it doesn’t first check whether the “old file” that it’s about to delete is also the “active file”.



Can you confirm whether or not you would expect to see 24 hours pass without any provenance data?



Thanks

-Mark
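
The suspected failure mode described in the message above can be shown with a toy model in Python. This is not the NiFi implementation: the `EventFile` class, the file names, and the `protect_active` flag are invented for illustration. It only shows how an age-off pass that ignores which file is currently being written can delete that file after a long idle period.

```python
from dataclasses import dataclass


@dataclass
class EventFile:
    name: str
    last_event_ms: int  # timestamp of the newest event stored in the file


def files_to_delete(files, active_name, now_ms, max_age_ms, protect_active):
    """Return names of files whose newest event is older than max_age_ms."""
    expired = [f for f in files if now_ms - f.last_event_ms > max_age_ms]
    if protect_active:
        # The check described as missing: never age off the file still being written.
        expired = [f for f in expired if f.name != active_name]
    return [f.name for f in expired]


HOUR = 3_600_000
now = 100 * HOUR
files = [
    EventFile("0.prov", now - 72 * HOUR),
    EventFile("500.prov", now - 30 * HOUR),  # active file, idle for 30 hours
]
active = "500.prov"

unguarded = files_to_delete(files, active, now, 24 * HOUR, protect_active=False)
guarded = files_to_delete(files, active, now, 24 * HOUR, protect_active=True)
print(unguarded)
print(guarded)
```

In the unguarded case the active file is selected for deletion along with the genuinely old one, which matches the symptom of new events having nowhere searchable to go until a restart.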







On Apr 9, 2020, at 4:32 AM, Dobbernack, Harald (Key-Work) <ha...@key-work.de> wrote:



What I noticed is that as long as provenance is working there are *.prov files in the directory. When provenance isn’t working, these files are gone. Maybe some cleanup process deletes those files prematurely, or the process that builds them no longer works?
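
The observation above suggests a quick diagnostic: look for event files in the provenance repository directory. A minimal Python sketch follows. The `*.prov*` pattern (chosen so that a compressed variant would also match), the helper name, and the temporary directory layout used for the demonstration are assumptions, not details taken from the thread.

```python
import tempfile
from pathlib import Path


def list_event_files(repo_dir):
    """Return files under repo_dir whose names contain a .prov suffix."""
    repo = Path(repo_dir)
    return sorted(p for p in repo.rglob("*.prov*") if p.is_file())


# Demonstration against a throwaway directory with a made-up layout.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "12345.prov").touch()
    (Path(tmp) / "toc").mkdir()
    (Path(tmp) / "toc" / "12345.toc").touch()
    found = [p.name for p in list_event_files(tmp)]

print(found)
```

Pointing `list_event_files` at a real provenance_repository path and getting an empty list would match the broken state described above.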



From: Dobbernack, Harald (Key-Work) <ha...@key-work.de>
Sent: Thursday, April 9, 2020 10:27
To: users@nifi.apache.org
Subject: Re: Not Seeing Provenance data



This is something I experience too from time to time. My quick-and-dirty workaround is to stop NiFi, delete everything in the provenance directory, and restart.  Then provenance is usable again (of course only with data since the delete). I very much hope there is a better way: that someone can show us better settings, or that a potential bug can be discovered.





Re: Not Seeing Provenance data

Posted by Jairo Henao <ja...@gmail.com>.
Hi,

I went from an instance without security to one where I configured HTTPS.
After enabling security with policies and users, I couldn't check the
provenance with the admin user.

When I added the following policy, the provenance was again visible to me
(I don't know if this is documented anywhere):

<policy identifier="e097fd1d-0171-1000-fc74-4ca6dc8b3aed"
        resource="/provenance-data/process-groups/<<MAIN_GROUP_ID>>" action="R">
    <user identifier="<<USER_IDENTIFIER>>"/>
</policy>
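
For readers who want to see the shape of that entry without the placeholders getting in the way, here is a small Python sketch that builds the same element. It reproduces only the attributes shown in the message; the function name is hypothetical, the group and user values remain placeholders, and the message does not say which file the entry was added to, so nothing here writes to disk.

```python
import xml.etree.ElementTree as ET


def provenance_read_policy(policy_id, process_group_id, user_id):
    """Build a <policy> element granting read access to a group's provenance data."""
    policy = ET.Element("policy", {
        "identifier": policy_id,
        "resource": f"/provenance-data/process-groups/{process_group_id}",
        "action": "R",
    })
    ET.SubElement(policy, "user", {"identifier": user_id})
    return policy


element = provenance_read_policy(
    "e097fd1d-0171-1000-fc74-4ca6dc8b3aed",  # identifier copied from the message
    "MAIN_GROUP_ID",                         # placeholder for the process group id
    "USER_IDENTIFIER",                       # placeholder for the user id
)
xml_text = ET.tostring(element, encoding="unicode")
print(xml_text)
```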



On Sat, Apr 11, 2020 at 12:58 PM Wyllys Ingersoll <
wyllys.ingersoll@keepertech.com> wrote:

> Yes, each node has its persistent stores for each of those directories.
>
> On Sat, Apr 11, 2020 at 10:20 AM Patrick Timmins <pt...@cox.net> wrote:
>
>> Is the underlying storage for the four repositories (provenance,
>> database, flowfile, and content) consistent within a node?
>>
>> Are all three nodes in the cluster using the same type of underlying
>> storage/device for the various NiFi repositories?

-- 
Regards

Jairo Henao

*Chat Skype: jairo.henao.05*

Re: Not Seeing Provenance data

Posted by Wyllys Ingersoll <wy...@keepertech.com>.
Yes, each node has its persistent stores for each of those directories.

On Sat, Apr 11, 2020 at 10:20 AM Patrick Timmins <pt...@cox.net> wrote:

> Is the underlying storage for the four repositories (provenance,
> database, flowfile, and content) consistent within a node?
>
> Are all three nodes in the cluster using the same type of underlying
> storage/device for the various NiFi repositories?

Re: Not Seeing Provenance data

Posted by Darren Govoni <da...@ontrenet.com>.
Yes. I didn't change anything. I just unzipped the NiFi distro onto a big partition and ran it as is.

________________________________
From: Patrick Timmins <pt...@cox.net>
Sent: Saturday, April 11, 2020 10:20:12 AM
To: users@nifi.apache.org <us...@nifi.apache.org>
Subject: Re: Not Seeing Provenance data


Is the underlying storage for the four repositories (provenance, database, flowfile, and content) consistent within a node?

Are all three nodes in the cluster using the same type of underlying storage/device for the various NiFi repositories?


On 4/11/2020 8:45 AM, Wyllys Ingersoll wrote:
Nope, already checked that.

On Fri, Apr 10, 2020 at 8:23 PM Patrick Timmins <pt...@cox.net>> wrote:

No issues here.  Sounds like a timezone / system clock / clock drift issue (in a cluster).

On 4/10/2020 11:59 AM, Joe Witt wrote:
The provenance repo is in large scale use by many many users so fundamentally it does work.  There are conditions that apparently need improving.  In the past couple days these items have been flagged by folks on this list, JIRAs and PRs raised and merged, all good. If you can help by creating a build of the latest and confirm it fixes your case then please do so.

Thanks

On Fri, Apr 10, 2020 at 12:48 PM Darren Govoni <da...@ontrenet.com>> wrote:
It would seem the feature is either broken completely or only works in specific conditions.

Can the Nifi team put a fix on their road map for this?
Its a rather central feature to Nifi.

Sent from my Verizon, Samsung Galaxy smartphone

________________________________
From: Wyllys Ingersoll <wy...@keepertech.com>>
Sent: Friday, April 10, 2020 11:17:42 AM
To: users@nifi.apache.org<ma...@nifi.apache.org> <us...@nifi.apache.org>>
Subject: Re: Not Seeing Provenance data

I have a similar problem with viewing provenance.  I have a 3-node cluster in a kubernetes environment, the provenance_repository directory for each node is on a persistent data store so it is not deleted or lost between container restarts (which are not very common).  My nifi.provenance.repository.max.storage.time is 24 hours.

Whenever I try to view any provenance, nothing is ever shown.  If I manually inspect the provenance_repository directory, there is a lucene index and TOC being created.

I see log messages like these:

Submitting query +processorId:882133fe-b684-148b-ad88-7850437ca591 with identifier 64a703fe-0171-1000-0000-000065abd91a against index directories [./provenance_repository/lucene-8-index-1560864819888]
Returning the following list of index locations because they were finished being written to before 1586531601311: []
Found no events in the Provenance Repository. In order to perform maintenace of the indices, will assume that the first event time is now (1586531601311)


Any suggestions?

-Wyllys Ingersoll



On Thu, Apr 9, 2020 at 11:25 AM Dobbernack, Harald (Key-Work) <ha...@key-work.de>> wrote:

Hey Mark,



great news and thank you very much!



Happy Holidays!

Harald



Von: Mark Payne <ma...@hotmail.com>>
Gesendet: Donnerstag, 9. April 2020 17:18
An: users@nifi.apache.org<ma...@nifi.apache.org>
Betreff: Re: Not Seeing Provenance data



Thanks Harald,



I have created a Jira [1] for this. There’s currently a PR up for it as well.



Thanks

-Mark



[1] https://issues.apache.org/jira/browse/NIFI-7346



On Apr 9, 2020, at 11:14 AM, Dobbernack, Harald (Key-Work) <ha...@key-work.de>> wrote:



Hi Mark,



I can confirm after testing that if no provenance event has been generated in a time greater than the set nifi.provenance.repository.max.storage.time then as expected the last recorded provenance events don’t exist anymore but also from then on any new provenance events are also not searchable, the provenance Search remains completely empty regardless of how many flows are active.  As described also *.prov file is then missing in provenance repository. After restart of Nifi new prov File will be generated and provenance will work again, but only showing stuff generated since last NiFi Start.



So yes, I’d say your idea

    ‘If so, then I think that would explain why it deleted the data. It’s trying to age off old data
     but unfortunately it doesn’t perform a check to first determine whether or not the “old file”
     that it’s about to delete is also the “active file”.’

fits my test very nicely.



As a workaround we’re going to set a greater nifi.provenance.repository.max.storage.time until this can be resolved.
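
To judge whether a given retention setting is long enough for this workaround, one can compare the longest expected idle period against the configured value. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, the parser only handles simple values such as "24 hours" or "5 days" (a simplification, not NiFi's full time-period syntax), and the rule "idle time longer than retention is risky" is taken from the test described above.

```python
import re
from datetime import timedelta

PROPERTY = "nifi.provenance.repository.max.storage.time"
_UNIT_SECONDS = {"sec": 1, "min": 60, "hour": 3600, "day": 86400}


def parse_period(value: str) -> timedelta:
    """Parse simple periods such as '24 hours' or '5 days' (simplified syntax)."""
    match = re.fullmatch(r"\s*(\d+)\s*(sec|min|hour|day)s?\s*", value.lower())
    if match is None:
        raise ValueError(f"unsupported period: {value!r}")
    return timedelta(seconds=int(match.group(1)) * _UNIT_SECONDS[match.group(2)])


def max_storage_time(properties_text: str) -> timedelta:
    """Find the retention property in the text of a nifi.properties file."""
    for line in properties_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == PROPERTY:
            return parse_period(value)
    raise KeyError(PROPERTY)


def idle_gap_is_risky(idle: timedelta, retention: timedelta) -> bool:
    """True if the flow may go without provenance events longer than the retention."""
    return idle > retention


if __name__ == "__main__":
    # Hypothetical example: a flow that can sit idle over a three-day weekend.
    longest_idle = timedelta(hours=72)
    for setting in ("24 hours", "120 hours"):
        retention = max_storage_time(f"{PROPERTY}={setting}\n")
        print(setting, "risky" if idle_gap_is_risky(longest_idle, retention) else "ok")
```

Raising the value only makes the problem less likely; any idle period longer than the new retention would presumably still trigger it until the fix is available.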



Thanks again for looking into this.

Harald





From: Dobbernack, Harald (Key-Work)
Sent: Thursday, April 9, 2020 15:22
To: users@nifi.apache.org
Subject: Re: Not Seeing Provenance data



Hi Mark,



thank you for looking into this.



The nifi.provenance.repository.max.storage.time setting might explain why I haven’t been experiencing the effect so often since changing from the default to 120 hours a few months ago 😉



But I believe provenance stopped working last time even though there were ‘active’ flows in a Wait processor, expiring every hour and going on to ‘send a message’ before being rerouted to the same Wait processor. I would have expected this to generate provenance entries.  As I am not actually 100% sure that the Wait processor was in use when provenance was last lost, I will check on a test system whether I can reproduce the provenance breakage when no flows are active for longer than nifi.provenance.repository.max.storage.time, and I will get back to you.



Thank you!

Harald





From: Mark Payne <ma...@hotmail.com>
Sent: Thursday, April 9, 2020 14:41
To: users@nifi.apache.org
Subject: Re: Not Seeing Provenance data



Hey Darren, Harald,



Thanks for the note. I have seen this once before but couldn’t figure out what caused it. Restarting addressed the issue.



I think I may understand the problem, now, though, after looking at it again.



In nifi.properties, there is a property named “nifi.provenance.repository.max.storage.time” that defaults to “24 hours”.

Is it possible that you went 24 hours (or whatever value is set for that property) without generating any Provenance events?



If so, then I think that would explain why it deleted the data. It’s trying to age off old data but unfortunately it doesn’t perform a check to first determine whether or not the “old file” that it’s about to delete is also the “active file”.
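
The failure mode described above can be illustrated with a small model. This is a sketch under stated assumptions, not NiFi's actual age-off code and not necessarily how the fix works: `EventFile`, `files_to_age_off`, and the `skip_active` flag are hypothetical names invented for the illustration.

```python
from dataclasses import dataclass

HOUR_MILLIS = 60 * 60 * 1000


@dataclass
class EventFile:
    """Hypothetical stand-in for a provenance event file."""
    name: str
    newest_event_millis: int
    active: bool = False  # True for the file currently being written to


def files_to_age_off(files, now_millis, max_storage_millis, skip_active):
    """Return the names of files whose newest event is older than the retention.

    With skip_active=False this mimics the behavior described above: the
    active file is deleted along with the old ones once it looks expired.
    """
    cutoff = now_millis - max_storage_millis
    expired = [f for f in files if f.newest_event_millis < cutoff]
    if skip_active:
        expired = [f for f in expired if not f.active]
    return [f.name for f in expired]


if __name__ == "__main__":
    last_event = 1_586_000_000_000  # arbitrary epoch-millisecond value
    files = [EventFile("100.prov", last_event, active=True)]
    now = last_event + 25 * HOUR_MILLIS  # 25 hours without any new event

    print(files_to_age_off(files, now, 24 * HOUR_MILLIS, skip_active=False))
    print(files_to_age_off(files, now, 24 * HOUR_MILLIS, skip_active=True))
```

In this model, 25 idle hours with a 24-hour retention cause the only (active) file to be selected for deletion unless the active file is explicitly excluded.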



Can you confirm whether or not you would expect to see 24 hours pass without any provenance data?



Thanks

-Mark







On Apr 9, 2020, at 4:32 AM, Dobbernack, Harald (Key-Work) <ha...@key-work.de> wrote:



What I noticed is that as long as provenance is working there are *.prov files in the directory. When provenance isn’t working, these files are missing. Maybe some cleanup process deletes those files prematurely, or the process that builds them no longer works?



From: Dobbernack, Harald (Key-Work) <ha...@key-work.de>
Sent: Thursday, April 9, 2020 10:27
To: users@nifi.apache.org
Subject: Re: Not Seeing Provenance data



This is something I experience too from time to time. My quick and dirty workaround is to stop NiFi, delete everything in the provenance directory, and restart. Provenance is then usable again (of course only with data since the delete). I’m hoping very much that there is a better way: that someone can show us better settings, or that a potential bug can be discovered.



From: Darren Govoni <da...@ontrenet.com>
Sent: Wednesday, April 8, 2020 20:31
To: users@nifi.apache.org
Subject: Not Seeing Provenance data



Hi,

  When I go to "View data provenance" in Nifi, I never see any logs for my flow. Am I missing some configuration setting somewhere?



thanks,

Darren





Harald Dobbernack
Key-Work Consulting GmbH | Kriegsstr. 100 | 76133 | Karlsruhe | Germany | https://www.key-work.de | Privacy policy <https://www.key-work.de/de/footer/datenschutz.html>
Phone: +49-721-78203-264 | E-Mail: harald.dobbernack@key-work.de | Fax: +49-721-78203-10

Key-Work Consulting GmbH, Karlsruhe, HRB 108695, HRG Mannheim
Managing directors: Andreas Stappert, Tobin Wotring



Re: Not Seeing Provenance data

Posted by Patrick Timmins <pt...@cox.net>.
Is the underlying storage for the four repositories (provenance, 
database, flowfile, and content) consistent within a node?

Are all three nodes in the cluster using the same type of underlying 
storage/device for the various NiFi repositories?


On 4/11/2020 8:45 AM, Wyllys Ingersoll wrote:
> Nope, already checked that.
>
> On Fri, Apr 10, 2020 at 8:23 PM Patrick Timmins <ptimmins@cox.net> wrote:
>
>     No issues here.  Sounds like a timezone / system clock / clock
>     drift issue (in a cluster).

Re: Not Seeing Provenance data

Posted by Wyllys Ingersoll <wy...@keepertech.com>.
Nope, already checked that.

On Fri, Apr 10, 2020 at 8:23 PM Patrick Timmins <pt...@cox.net> wrote:

> No issues here.  Sounds like a timezone / system clock / clock drift issue
> (in a cluster).

Re: Not Seeing Provenance data

Posted by Patrick Timmins <pt...@cox.net>.
No issues here.  Sounds like a timezone / system clock / clock drift 
issue (in a cluster).

On 4/10/2020 11:59 AM, Joe Witt wrote:
> The provenance repo is in large scale use by many many users so 
> fundamentally it does work.  There are conditions that apparently need 
> improving.  In the past couple days these items have been flagged by 
> folks on this list, JIRAs and PRs raised and merged, all good. If you 
> can help by creating a build of the latest and confirm it fixes your 
> case then please do so.
>
> Thanks
>
> On Fri, Apr 10, 2020 at 12:48 PM Darren Govoni <darren@ontrenet.com 
> <ma...@ontrenet.com>> wrote:
>
>     It would seem the feature is either broken completely or only
>     works in specific conditions.
>
>     Can the Nifi team put a fix on their road map for this?
>     Its a rather central feature to Nifi.
>
>     Sent from my Verizon, Samsung Galaxy smartphone
>
>     ------------------------------------------------------------------------
>     *From:* Wyllys Ingersoll <wyllys.ingersoll@keepertech.com
>     <ma...@keepertech.com>>
>     *Sent:* Friday, April 10, 2020 11:17:42 AM
>     *To:* users@nifi.apache.org <ma...@nifi.apache.org>
>     <users@nifi.apache.org <ma...@nifi.apache.org>>
>     *Subject:* Re: Not Seeing Provenance data
>     I have a similar problem with viewing provenance.  I have a 3-node
>     cluster in a kubernetes environment, the provenance_repository
>     directory for each node is on a persistent data store so it is not
>     deleted or lost between container restarts (which are not very
>     common).  My nifi.provenance.repository.max.storage.time is 24 hours.
>
>     Whenever I try to view any provenance, nothing is ever shown.  If
>     I manually inspect the provenance_repository directory, there is a
>     lucene index and TOC being created.
>
>     I see log messages like these:
>
>     Submitting query +processorId:882133fe-b684-148b-ad88-7850437ca591
>     with identifier 64a703fe-0171-1000-0000-000065abd91a against index
>     directories [./provenance_repository/lucene-8-index-1560864819888]
>     Returning the following list of index locations because they were
>     finished being written to before 1586531601311: []
>     Found no events in the Provenance Repository. In order to perform
>     maintenace of the indices, will assume that the first event time
>     is now (1586531601311)
>
>
>     Any suggestions?
>
>     -Wyllys Ingersoll
>
>
>
>     On Thu, Apr 9, 2020 at 11:25 AM Dobbernack, Harald (Key-Work)
>     <harald.dobbernack@key-work.de
>     <ma...@key-work.de>> wrote:
>
>         Hey Mark,
>
>         great news and thank you very much!
>
>         Happy Holidays!
>
>         Harald
>
>         *Von:* Mark Payne <markap14@hotmail.com
>         <ma...@hotmail.com>>
>         *Gesendet:* Donnerstag, 9. April 2020 17:18
>         *An:* users@nifi.apache.org <ma...@nifi.apache.org>
>         *Betreff:* Re: Not Seeing Provenance data
>
>         Thanks Harald,
>
>         I have created a Jira [1] for this. There’s currently a PR up
>         for it as well.
>
>         Thanks
>
>         -Mark
>
>         [1] https://issues.apache.org/jira/browse/NIFI-7346
>
>
>
>             On Apr 9, 2020, at 11:14 AM, Dobbernack, Harald (Key-Work)
>             <harald.dobbernack@key-work.de
>             <ma...@key-work.de>> wrote:
>
>             Hi Mark,
>
>             I can confirm after testing that if no provenance event
>             has been generated in a time greater than the
>             setnifi.provenance.repository.max.storage.time then as
>             expected the last recorded provenance events don’t exist
>             anymore but also from then on any new provenance events
>             are also not searchable, the provenance Search remains
>             completely empty regardless of how many flows are active. 
>             As described also *.prov file is then missing in
>             provenance repository. After restart of Nifi new prov File
>             will be generated and provenance will work again, but only
>             showing stuff generated since last NiFi Start.
>
>             So yes, I’d say your Idea
>
>                 ‘If so, then I think that would understand why it
>             deleted the data. It’s trying to age off old data
>
>                  but unfortunately it doesn’t perform a check to first
>             determine whether or not the “old file”
>
>                  that it’s about to delete is also the “active file”.’
>
>             fits very nicely to my test.
>
>             As a workaround we’re going to set a
>             greaternifi.provenance.repository.max.storage.time until
>             this can be resolved.
>
>             Thanks again for looking into this.
>
>             Harald
>
>             *Von:*Dobbernack, Harald (Key-Work)
>             *Gesendet:*Donnerstag, 9. April 2020 15:22
>             *An:*users@nifi.apache.org <ma...@nifi.apache.org>
>             *Betreff:*AW: Not Seeing Provenance data
>
>             Hi Mark,
>
>             thank you for looking into this.
>
>             The nifi.provenance.repository.max.storage.time setting
>             might explain why I haven’t been experiencing the effect
>             so often since changing from the default to 120 hours a
>             few months ago😉
>
>             But I believe provenance stopped working last time
>             although there was an ‘active’ flows in wait Processor,
>             expiring every hour, going on to ‘send a message’ before
>             being rerouted to the same wait processor. I would have
>             expected this generates provenance entries?  As I am not
>             actually 100% sure if that wait processor was in use when
>             last provenance got lost I will check with a testing
>             system to see if I can reproduce provenance breakage when
>             no active flows are around for a time greater
>              nifi.provenance.repository.max.storage.timeand I will get
>             back to you.
>
>             Thank you!
>
>             Harald
>
>             *Von:*Mark Payne <markap14@hotmail.com
>             <ma...@hotmail.com>>
>             *Gesendet:*Donnerstag, 9.April 2020 14:41
>             *An:*users@nifi.apache.org <ma...@nifi.apache.org>
>             *Betreff:*Re: Not Seeing Provenance data
>
>             Hey Daren, Herald,
>
>             Thanks for the note. I have seen this once before but
>             couldn’t figure out what caused it. Restarting addressed
>             the issue.
>
>             I think I may understand the problem, now, though, after
>             looking at it again.
>
>             In nifi.properties, there are a couple of property named
>             “nifi.provenance.repository.max.storage.time” that
>             defaults to “24 hours"
>
>             Is it possible that you went 24 hours (or whatever value
>             is set for that property) without generating any
>             Provenance events?
>
>             If so, then I think that would understand why it deleted
>             the data. It’s trying to age off old data but
>             unfortunately it doesn’t perform a check to first
>             determine whether or not the “old file” that it’s about to
>             delete is also the “active file”.
>
>             Can you confirm whether or not you would expect to see 24
>             hours pass without any provenance data?
>
>             Thanks
>
>             -Mark
>
>                 On Apr 9, 2020, at 4:32 AM, Dobbernack, Harald
>                 (Key-Work) <harald.dobbernack@key-work.de
>                 <ma...@key-work.de>> wrote:
>
>                 What I noticed is that as long as provenance is
>                 working there will be *.prov files in the directory.
>                 When provenance isn’t working, these files are
>                 missing. Maybe some cleanup process deletes those
>                 files prematurely, or the process building them doesn’t
>                 work any more?
>
>                 *From:* Dobbernack, Harald (Key-Work)
>                 <harald.dobbernack@key-work.de
>                 <ma...@key-work.de>>
>                 *Sent:* Thursday, 9 April 2020 10:27
>                 *To:* users@nifi.apache.org <ma...@nifi.apache.org>
>                 *Subject:* RE: Not Seeing Provenance data
>
>                 This is something I experience too from time to time.
>                 My quick and dirty workaround is to stop NiFi, delete
>                 everything in the provenance directory, and restart.
>                 Then provenance is usable again (of course only with
>                 data since the delete). I very much hope there is
>                 a better way: that someone can show us better settings,
>                 or that a potential bug can be discovered.
>
>                 *From:* Darren Govoni <darren@ontrenet.com
>                 <ma...@ontrenet.com>>
>                 *Sent:* Wednesday, 8 April 2020 20:31
>                 *To:* users@nifi.apache.org <ma...@nifi.apache.org>
>                 *Subject:* Not Seeing Provenance data
>
>                 Hi,
>
>                   When I go to "View data provenance" in Nifi, I never
>                 see any logs for my flow. Am I missing some
>                 configuration setting somewhere?
>
>                 thanks,
>
>                 Darren
>
>
>
>                 *Harald Dobbernack*
>                 Key-Work Consulting GmbH | Kriegsstr. 100 | 76133 |
>                 Karlsruhe | Germany
>                 | https://www.key-work.de | Privacy policy
>                 <https://www.key-work.de/de/footer/datenschutz.html>
>                 Phone: +49-721-78203-264 |
>                 E-mail: harald.dobbernack@key-work.de
>                 <ma...@key-work.de> | Fax:
>                 +49-721-78203-10
>
>                 Key-Work Consulting GmbH, Karlsruhe, HRB 108695, HRG
>                 Mannheim
>                 Managing directors: Andreas Stappert, Tobin Wotring
>

Re: Not Seeing Provenance data

Posted by Joe Witt <jo...@gmail.com>.
The provenance repo is in large-scale use by many, many users, so
fundamentally it does work.  There are conditions that apparently need
improving.  In the past couple of days these items have been flagged by
folks on this list, and JIRAs and PRs have been raised and merged, all
good. If you can help by building the latest and confirming that it fixes
your case, then please do so.

Thanks


Re: Not Seeing Provenance data

Posted by Darren Govoni <da...@ontrenet.com>.
It would seem the feature is either broken completely or only works under specific conditions.

Can the NiFi team put a fix on their roadmap for this?
It's a rather central feature of NiFi.



Re: Not Seeing Provenance data

Posted by Wyllys Ingersoll <wy...@keepertech.com>.
I have a similar problem with viewing provenance.  I have a 3-node cluster
in a Kubernetes environment. The provenance_repository directory for each
node is on a persistent data store, so it is not deleted or lost between
container restarts (which are not very common).  My
nifi.provenance.repository.max.storage.time is 24 hours.

Whenever I try to view any provenance, nothing is ever shown.  If I
manually inspect the provenance_repository directory, there is a Lucene
index and TOC being created.

I see log messages like these:

Submitting query +processorId:882133fe-b684-148b-ad88-7850437ca591 with
identifier 64a703fe-0171-1000-0000-000065abd91a against index directories
[./provenance_repository/lucene-8-index-1560864819888]
Returning the following list of index locations because they were finished
being written to before 1586531601311: []
Found no events in the Provenance Repository. In order to perform
maintenace of the indices, will assume that the first event time is now
(1586531601311)
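
When reading log lines like these, it can help to turn the long numbers into dates. The thread does not say what they are; they look like epoch timestamps in milliseconds, and the sketch below assumes exactly that (the same assumption is applied to the numeric suffix of the index directory name). The function name epoch_millis_to_utc is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the values are milliseconds since 1970-01-01T00:00:00Z.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_millis_to_utc(millis: int) -> datetime:
    """Convert epoch milliseconds to a timezone-aware UTC datetime."""
    # Integer timedelta arithmetic avoids float rounding of the millisecond part.
    return EPOCH + timedelta(milliseconds=millis)


# Value from the "finished being written to before" log line.
print(epoch_millis_to_utc(1586531601311).isoformat())
# Numeric suffix of the lucene-8-index-... directory name.
print(epoch_millis_to_utc(1560864819888).isoformat())
```

Under that assumption, the first value falls on 10 April 2020 at about 15:13 UTC, the day the message was sent, while the directory suffix falls in June 2019. Comparing such values with the system clock is one way to check the clock and time zone explanation raised elsewhere in the thread.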


Any suggestions?

-Wyllys Ingersoll




Re: Not Seeing Provenance data

Posted by "Dobbernack, Harald (Key-Work)" <ha...@key-work.de>.
Hey Mark,

great news and thank you very much!

Happy Holidays!
Harald



Re: Not Seeing Provenance data

Posted by Mark Payne <ma...@hotmail.com>.
Thanks Harald,

I have created a Jira [1] for this. There’s currently a PR up for it as well.

Thanks
-Mark

[1] https://issues.apache.org/jira/browse/NIFI-7346

On Apr 9, 2020, at 11:14 AM, Dobbernack, Harald (Key-Work) <ha...@key-work.de>> wrote:

Hi Mark,

I can confirm after testing: if no provenance event has been generated for longer than the configured nifi.provenance.repository.max.storage.time, then, as expected, the last recorded provenance events no longer exist. From then on, however, new provenance events are not searchable either; the provenance search remains completely empty regardless of how many flows are active. As described, the *.prov file is then also missing from the provenance repository. After a restart of NiFi, a new .prov file is generated and provenance works again, but it only shows events generated since the last NiFi start.

So yes, I’d say your idea
    ‘If so, then I think that would explain why it deleted the data. It’s trying to age off old data
     but unfortunately it doesn’t perform a check to first determine whether or not the “old file”
     that it’s about to delete is also the “active file”.’
fits my test very nicely.

As a workaround, we’re going to set a larger value for nifi.provenance.repository.max.storage.time until this can be resolved.
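
To check what a given nifi.properties file currently sets before raising the value, something like the following Python sketch could be used. It is only an illustration: the helper names read_property and parse_period are hypothetical, the simple key=value parsing ignores escapes and line continuations, and the list of accepted unit words is an assumption rather than NiFi's own parser. Only the "24 hours" and "120 hours" spellings appear in this thread.

```python
import re
from datetime import timedelta
from typing import Optional

KEY = "nifi.provenance.repository.max.storage.time"

# Unit words accepted by this sketch only (assumption, not NiFi's full list).
_UNIT_SECONDS = {
    "sec": 1, "secs": 1, "second": 1, "seconds": 1,
    "min": 60, "mins": 60, "minute": 60, "minutes": 60,
    "hour": 3600, "hours": 3600,
    "day": 86400, "days": 86400,
}


def read_property(properties_text: str, key: str) -> Optional[str]:
    """Return the value of `key` from simple key=value text, or None."""
    for line in properties_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return value.strip()
    return None


def parse_period(text: str) -> timedelta:
    """Parse a value such as '24 hours' into a timedelta."""
    match = re.fullmatch(r"\s*(\d+)\s*([A-Za-z]+)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized period: {text!r}")
    amount, unit = int(match.group(1)), match.group(2).lower()
    if unit not in _UNIT_SECONDS:
        raise ValueError(f"unrecognized unit: {unit!r}")
    return timedelta(seconds=amount * _UNIT_SECONDS[unit])


sample = "# provenance settings\n" + KEY + "=120 hours\n"
print(parse_period(read_property(sample, KEY)))
```

In practice the text would come from the conf/nifi.properties file of the installation being checked, and NiFi has to be restarted for a changed value to take effect.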

Thanks again for looking into this.
Harald


From: Dobbernack, Harald (Key-Work)
Sent: Thursday, 9 April 2020 15:22
To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: RE: Not Seeing Provenance data

Hi Mark,

thank you for looking into this.

The nifi.provenance.repository.max.storage.time setting might explain why I haven’t been experiencing the effect so often since changing from the default to 120 hours a few months ago 😉

But I believe that the last time provenance stopped working there was an ‘active’ flow in a Wait processor, expiring every hour and going on to ‘send a message’ before being rerouted to the same Wait processor. I would have expected this to generate provenance entries.  As I am not actually 100% sure that the Wait processor was in use when provenance was last lost, I will check on a test system whether I can reproduce the provenance breakage when no flows are active for longer than nifi.provenance.repository.max.storage.time, and I will get back to you.

Thank you!
Harald


From: Mark Payne <ma...@hotmail.com>>
Sent: Thursday, 9 April 2020 14:41
To: users@nifi.apache.org<ma...@nifi.apache.org>
Subject: Re: Not Seeing Provenance data

Hey Darren, Harald,

Thanks for the note. I have seen this once before but couldn’t figure out what caused it. Restarting addressed the issue.

I think I may understand the problem, now, though, after looking at it again.

In nifi.properties, there is a property named “nifi.provenance.repository.max.storage.time” that defaults to “24 hours”.
Is it possible that you went 24 hours (or whatever value is set for that property) without generating any Provenance events?

If so, then I think that would explain why it deleted the data. It’s trying to age off old data but unfortunately it doesn’t perform a check to first determine whether or not the “old file” that it’s about to delete is also the “active file”.

Can you confirm whether or not you would expect to see 24 hours pass without any provenance data?

Thanks
-Mark



On Apr 9, 2020, at 4:32 AM, Dobbernack, Harald (Key-Work) <ha...@key-work.de>> wrote:

What I noticed is that as long as provenance is working there will be *.prov files in the directory. When Provenance isn’t working these files are not to be seen. Maybe some Cleaning Process deletes those files prematurely or the process building them doesn’t work any more?

Von: Dobbernack, Harald (Key-Work) <ha...@key-work.de>>
Gesendet: Donnerstag, 9. April 2020 10:27
An: users@nifi.apache.org<ma...@nifi.apache.org>
Betreff: AW: Not Seeing Provenance data

This is something I experience too from time to time. My quick and dirty workaround is stop nifi, delete everything in the provenance directory, restart….  Then Provenance is usable again (of course only with data since the delete) . I’m hoping very much there is a better way, someone can show us better settings or a potential bug can be discovered…

Von: Darren Govoni <da...@ontrenet.com>>
Gesendet: Mittwoch, 8. April 2020 20:31
An: users@nifi.apache.org<ma...@nifi.apache.org>
Betreff: Not Seeing Provenance data

Hi,
  When I go to "View data provenance" in Nifi, I never see any logs for my flow. Am I missing some configuration setting somewhere?

thanks,
Darren



Harald Dobbernack
Key-Work Consulting GmbH | Kriegsstr. 100 | 76133 | Karlsruhe | Germany | https://www.key-work.de | Privacy Policy: https://www.key-work.de/de/footer/datenschutz.html
Phone: +49-721-78203-264 | E-Mail: harald.dobbernack@key-work.de | Fax: +49-721-78203-10

Key-Work Consulting GmbH, Karlsruhe, HRB 108695, HRG Mannheim
Managing Directors: Andreas Stappert, Tobin Wotring


Re: Not Seeing Provenance data

Posted by "Dobbernack, Harald (Key-Work)" <ha...@key-work.de>.
Hi Mark,

I can confirm after testing that if no provenance event has been generated for longer than the configured nifi.provenance.repository.max.storage.time, then, as expected, the last recorded provenance events no longer exist. But from then on, new provenance events are not searchable either: the provenance search remains completely empty regardless of how many flows are active. As described, the *.prov file is then also missing from the provenance repository. After a restart of NiFi, a new .prov file is generated and provenance works again, but it only shows events generated since the last NiFi start.

So yes, I’d say your idea
    ‘If so, then I think that would explain why it deleted the data. It’s trying to age off old data
     but unfortunately it doesn’t perform a check to first determine whether or not the “old file”
     that it’s about to delete is also the “active file”.’
fits my test very nicely.

As a workaround, we’re going to set nifi.provenance.repository.max.storage.time to a larger value until this can be resolved.
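
To verify which retention value an instance is actually running with, a small script along these lines could help. It is only a sketch: the helper names (read_property, parse_period, max_storage_time) are made up for illustration, and the list of accepted time units is an assumed subset, not NiFi's own parser.

```python
from datetime import timedelta

PROPERTY = "nifi.provenance.repository.max.storage.time"

# Assumption: only these unit spellings are handled by this sketch.
_UNIT_SECONDS = {
    "sec": 1, "secs": 1, "second": 1, "seconds": 1,
    "min": 60, "mins": 60, "minute": 60, "minutes": 60,
    "hr": 3600, "hrs": 3600, "hour": 3600, "hours": 3600,
    "day": 86400, "days": 86400,
}


def read_property(properties_text, key):
    """Return the value of `key` from properties-style text, or None."""
    for line in properties_text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and lines without a key/value separator.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == key:
            return value.strip()
    return None


def parse_period(value):
    """Parse a period such as '24 hours' into a timedelta."""
    amount, unit = value.split()
    return timedelta(seconds=float(amount) * _UNIT_SECONDS[unit.lower()])


def max_storage_time(properties_text):
    """Return the configured provenance retention, or None if unset."""
    value = read_property(properties_text, PROPERTY)
    return None if value is None else parse_period(value)
```

Reading the contents of conf/nifi.properties into a string and passing it to max_storage_time gives the retention as a timedelta, which can then be compared against the longest idle period you expect between provenance events.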

Thanks again for looking into this.
Harald




Re: Not Seeing Provenance data

Posted by "Dobbernack, Harald (Key-Work)" <ha...@key-work.de>.
Hi Mark,

thank you for looking into this.

The nifi.provenance.repository.max.storage.time setting might explain why I haven’t been experiencing the effect so often since changing from the default to 120 hours a few months ago 😉

But I believe that the last time provenance stopped working, there was an ‘active’ flow in a Wait processor, expiring every hour and going on to ‘send a message’ before being rerouted to the same Wait processor. I would have expected this to generate provenance entries? As I am not actually 100% sure whether that Wait processor was in use when provenance was last lost, I will check on a test system whether I can reproduce the provenance breakage when no flows are active for longer than nifi.provenance.repository.max.storage.time, and I will get back to you.

Thank you!
Harald




Re: Not Seeing Provenance data

Posted by Mark Payne <ma...@hotmail.com>.
Hey Darren, Harald,

Thanks for the note. I have seen this once before but couldn’t figure out what caused it. Restarting addressed the issue.

I think I may understand the problem, now, though, after looking at it again.

In nifi.properties, there is a property named “nifi.provenance.repository.max.storage.time” that defaults to “24 hours”.
Is it possible that you went 24 hours (or whatever value is set for that property) without generating any Provenance events?

If so, then I think that would explain why it deleted the data. It’s trying to age off old data but unfortunately it doesn’t perform a check to first determine whether or not the “old file” that it’s about to delete is also the “active file”.
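
To make the suspected failure mode concrete, here is a toy model of the age-off decision. This is not NiFi's code: EventFile, files_to_age_off, and the skip_active flag are hypothetical names used purely to illustrate the missing check described above.

```python
from dataclasses import dataclass


@dataclass
class EventFile:
    name: str
    last_modified: float   # seconds since the epoch
    active: bool = False   # True for the file currently being written to


def files_to_age_off(files, now, max_storage_seconds, skip_active=True):
    """Return the files an age-off pass would delete.

    With skip_active=False, an old file is selected even if it is still
    the active file, which models the suspected bug.
    """
    expired = []
    for f in files:
        if now - f.last_modified <= max_storage_seconds:
            continue  # still within the retention window
        if skip_active and f.active:
            continue  # never delete the file that is still being written
        expired.append(f)
    return expired
```

In this model, an instance that has been idle for longer than the retention period has an active file that also looks "old". Without the guard, that file is selected for deletion, which would match the symptom of the *.prov file disappearing and new events no longer showing up until a restart.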

Can you confirm whether or not you would expect to see 24 hours pass without any provenance data?

Thanks
-Mark





Re: Not Seeing Provenance data

Posted by "Dobbernack, Harald (Key-Work)" <ha...@key-work.de>.
What I noticed is that as long as provenance is working, there are *.prov files in the directory. When provenance isn't working, these files are missing. Maybe some cleanup process deletes those files prematurely, or the process that builds them no longer works?
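
A quick way to check for this symptom is to list the *.prov files below the provenance directory. The sketch below assumes the files may sit in subdirectories (hence the recursive search), and find_prov_files is just an illustrative helper name.

```python
from pathlib import Path


def find_prov_files(repo_dir):
    """Return all *.prov files found below repo_dir, sorted by path."""
    return sorted(Path(repo_dir).rglob("*.prov"))
```

Calling find_prov_files with the directory configured for the provenance repository and getting an empty list back while flows are running would match the broken state described here.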


Re: Not Seeing Provenance data

Posted by "Dobbernack, Harald (Key-Work)" <ha...@key-work.de>.
This is something I experience too from time to time. My quick and dirty workaround is to stop NiFi, delete everything in the provenance directory, and restart. Then provenance is usable again (of course only with data since the delete). I very much hope there is a better way: that someone can show us better settings, or that a potential bug can be discovered.
