Posted to users@nifi.apache.org by Fred S <fs...@gmail.com> on 2017/10/04 20:50:20 UTC

SiteToSiteProvenanceReportingTask reports no data when event type filter is defined

Dear community.

I'm currently trying to set up a SiteToSiteProvenanceReportingTask to keep
track of DOWNLOAD provenance events.

If I set it up without any filters at all everything gets sent to my
dedicated NiFi instance for handling these messages. When I set a filter in
the "Event Type" property (I've tried "DOWNLOAD" and "DOWNLOAD, EXPIRE") -
nothing.

I've been looking around for anyone experiencing a similar issue, without
any luck. Has anyone else set this up successfully without having to log
*all* provenance events? Due to the number of flowfiles I have flying by
each day it would not be feasible to go with that approach.

I see this issue on both NiFi 1.3.0 and 1.4.0. For what it's worth I've
also tried to accomplish this both with and without security/ssl.

If anyone else needs any additional information about my setup, please let
me know.

All the best,
Fred

Re: SiteToSiteProvenanceReportingTask reports no data when event type filter is defined

Posted by Mark Payne <ma...@hotmail.com>.
Fred, Drew,

Just to close the loop here: I was able to recreate and understand the issue. It turns out that if a batch of events is read
(default batch size of 1,000) and none of the events read match the filter, the reporting task would stop advancing through
the events and would continually just read those same 1,000 events, none of which would match. I submitted a PR to address
this.
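
A minimal sketch of the behavior described above, in Python rather than NiFi's Java and not taken from the NiFi source. The Event class, the run_reporting_task function, and the advance_when_no_match flag are hypothetical names chosen for illustration; the only details from this thread are that events are read in batches and that a batch with no matching events left the read position unchanged.

```python
from dataclasses import dataclass


@dataclass
class Event:
    event_id: int
    event_type: str


def run_reporting_task(events, wanted, batch_size, max_polls, advance_when_no_match):
    """Poll `events` in batches and collect those whose type is in `wanted`.

    With advance_when_no_match=False the cursor only moves after a batch that
    produced at least one match, which mimics the stuck behavior described in
    the thread. With True the cursor always moves past the batch it just read.
    """
    cursor = 0
    sent = []
    for _ in range(max_polls):
        batch = events[cursor:cursor + batch_size]
        if not batch:
            break  # nothing new to read
        matches = [e for e in batch if e.event_type in wanted]
        sent.extend(matches)
        if matches or advance_when_no_match:
            cursor += len(batch)
        # otherwise the same batch is read again on the next poll
    return sent, cursor


# Five common events followed by one DOWNLOAD event, read in batches of five.
events = [Event(i, "RECEIVE") for i in range(5)] + [Event(5, "DOWNLOAD")]
stuck = run_reporting_task(events, {"DOWNLOAD"}, 5, 10, advance_when_no_match=False)
fixed = run_reporting_task(events, {"DOWNLOAD"}, 5, 10, advance_when_no_match=True)
```

In this toy run the stuck variant never gets past the first batch and sends nothing, while the variant that always advances reaches the DOWNLOAD event in the second batch.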

Thanks
-Mark

On Oct 6, 2017, at 3:43 AM, Fred S <fs...@gmail.com> wrote:

Thanks for looking into the issue and sort of helping me rule out that I'm way off and doing something wrong here.

I've created an issue in JIRA: https://issues.apache.org/jira/browse/NIFI-4468

Thanks,
-Fred

Re: SiteToSiteProvenanceReportingTask reports no data when event type filter is defined

Posted by Fred S <fs...@gmail.com>.
Thanks for looking into the issue and sort of helping me rule out that I'm
way off and doing something wrong here.

I've created an issue in JIRA:
https://issues.apache.org/jira/browse/NIFI-4468

Thanks,
-Fred

On Thu, Oct 5, 2017 at 5:27 PM, Andrew Lim <an...@gmail.com>
wrote:

> I did some more testing and found that if I restart the NiFi instance that
> is running the reporting task, I start getting provenance events again in
> my other instance.
>
> If I stop the reporting task, add the DOWNLOAD filter, restart the task
> and then generate only download events, those download events do show up in
> my other instance.  But if I generate different types of provenance events
> (RETRIEVE, DOWNLOAD, etc.), the latest download events are not showing up
> again.
>
> -Drew

Re: SiteToSiteProvenanceReportingTask reports no data when event type filter is defined

Posted by Andrew Lim <an...@gmail.com>.
I did some more testing and found that if I restart the NiFi instance that is running the reporting task, I start getting provenance events again in my other instance.

If I stop the reporting task, add the DOWNLOAD filter, restart the task and then generate only download events, those download events do show up in my other instance.  But if I generate different types of provenance events (RETRIEVE, DOWNLOAD, etc.), the latest download events are not showing up again.

-Drew


> On Oct 5, 2017, at 11:02 AM, Andrew Lim <an...@gmail.com> wrote:
> 
> I had a SiteToSiteProvenanceReportingTask running in 1.3.0, so I gave this a try myself.
> 
> I think I’m able to reproduce the issue that Fred is seeing.  But I’m not sure if the issue is only related to adding an Event Type filter to the reporting task.
> 
> With no filter, I generated some DOWNLOAD events by downloading some FlowFiles.  I did see these come into my NiFi instance that handles the events.  Everything was working as expected.
> 
> But then I stopped the reporting task, added the DOWNLOAD filter and then restarted the task.  Then I downloaded some FlowFiles again.  As Fred was seeing, these events did not come in.
> 
> However, then I stopped the reporting task, removed the filter, and then restarted the task.  Then I generated numerous provenance events (RETRIEVE, DOWNLOAD, etc).  But no events came in.
> 
> So maybe this has something to do with starting/stopping the reporting task in addition to adding the filter.
> 
> Fred, can you file a Jira?  I can add my comments to it.
> 
> Thanks,
> 
> -Drew


Re: SiteToSiteProvenanceReportingTask reports no data when event type filter is defined

Posted by Andrew Lim <an...@gmail.com>.
I had a SiteToSiteProvenanceReportingTask running in 1.3.0, so I gave this a try myself.

I think I’m able to reproduce the issue that Fred is seeing.  But I’m not sure if the issue is only related to adding an Event Type filter to the reporting task.

With no filter, I generated some DOWNLOAD events by downloading some FlowFiles.  I did see these come into my NiFi instance that handles the events.  Everything was working as expected.

But then I stopped the reporting task, added the DOWNLOAD filter and then restarted the task.  Then I downloaded some FlowFiles again.  As Fred was seeing, these events did not come in.

However, then I stopped the reporting task, removed the filter, and then restarted the task.  Then I generated numerous provenance events (RETRIEVE, DOWNLOAD, etc).  But no events came in.

So maybe this has something to do with starting/stopping the reporting task in addition to adding the filter.

Fred, can you file a Jira?  I can add my comments to it.

Thanks,

-Drew


> On Oct 5, 2017, at 9:51 AM, Mark Payne <ma...@hotmail.com> wrote:
> 
> Hi Fred,
> 
> I have used the reporting task with specific event types listed, but I've not run into issues personally.
> 
> I am curious if you have tried looking for more common event types, such as RECEIVE, CONTENT_MODIFIED, or ATTRIBUTES_MODIFIED?
> EXPIRE events only fire when FlowFiles get aged off from a Connection due to not being processed within the
> Connection's "FlowFile Expiration" threshold, and DOWNLOAD events are only triggered when a user downloads
> the contents of the FlowFile to look at it manually. So both of these are fairly rare events.
> RECEIVE, ATTRIBUTES_MODIFIED, and CONTENT_MODIFIED, on the other hand, will generally happen constantly.
> 
> Thanks
> -Mark


Re: SiteToSiteProvenanceReportingTask reports no data when event type filter is defined

Posted by Mark Payne <ma...@hotmail.com>.
Hi Fred,

I have used the reporting task with specific event types listed, but I've not run into issues personally.

I am curious if you have tried looking for more common event types, such as RECEIVE, CONTENT_MODIFIED, or ATTRIBUTES_MODIFIED?
EXPIRE events only fire when FlowFiles get aged off from a Connection due to not being processed within the
Connection's "FlowFile Expiration" threshold, and DOWNLOAD events are only triggered when a user downloads
the contents of the FlowFile to look at it manually. So both of these are fairly rare events.
RECEIVE, ATTRIBUTES_MODIFIED, and CONTENT_MODIFIED, on the other hand, will generally happen constantly.
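
To illustrate why a filter on rare event types can leave whole batches without a match, here is a rough Python sketch. It is not NiFi code: parse_event_types and filter_events are hypothetical helpers, the comma-separated format follows the "DOWNLOAD, EXPIRE" value quoted in this thread, and treating an empty filter as "send everything" is an assumption based on the unfiltered behavior Fred describes.

```python
def parse_event_types(prop):
    """Split a comma-separated property value into a set of type names."""
    return {part.strip() for part in prop.split(",") if part.strip()}


def filter_events(event_types, wanted):
    """Keep only the wanted types; an empty filter keeps everything (assumption)."""
    if not wanted:
        return list(event_types)
    return [t for t in event_types if t in wanted]


# A made-up stream of 1,000 events: mostly common types plus a single DOWNLOAD.
stream = ["RECEIVE", "ATTRIBUTES_MODIFIED", "CONTENT_MODIFIED"] * 333 + ["DOWNLOAD"]
wanted = parse_event_types("DOWNLOAD, EXPIRE")
matched = filter_events(stream, wanted)
unfiltered = filter_events(stream, parse_event_types(""))
```

With this made-up stream only one of the 1,000 events passes the filter, so most batches read from a busy flow would contain no matching events at all.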

Thanks
-Mark



On Oct 4, 2017, at 4:50 PM, Fred S <fs...@gmail.com> wrote:

Dear community.

I'm currently trying to set up a SiteToSiteProvenanceReportingTask to keep track of DOWNLOAD provenance events.

If I set it up without any filters at all everything gets sent to my dedicated NiFi instance for handling these messages. When I set a filter in the "Event Type" property (I've tried "DOWNLOAD" and "DOWNLOAD, EXPIRE") - nothing.

I've been looking around for anyone experiencing a similar issue, without any luck. Has anyone else set this up successfully without having to log *all* provenance events? Due to the number of flowfiles I have flying by each day it would not be feasible to go with that approach.

I see this issue on both NiFi 1.3.0 and 1.4.0. For what it's worth I've also tried to accomplish this both with and without security/ssl.

If anyone else needs any additional information about my setup, please let me know.

All the best,
Fred