Posted to dev@nifi.apache.org by marlborino <p....@gmail.com> on 2016/11/02 08:27:32 UTC

Get the provenance of a flowfile from java api

Hi all,
I have developed a custom processor to parse the XML content of a
flowfile.

Sometimes the XML is not well-formed, so the processor crashes.
I would like to retrieve the provenance data from the received flowfile in
order to get the ID of the previous processor.
Is this possible through the Java API?
Thanks a lot.



--
View this message in context: http://apache-nifi-developer-list.39713.n7.nabble.com/Get-the-provenance-of-a-flowfile-from-java-api-tp13788.html
Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.

Re: Get the provenance of a flowfile from java api

Posted by Matt Burgess <ma...@apache.org>.
Self-modifying flows are not a common use case. It is possible to stop
a processor using the REST API (even from inside another processor),
but I recommend you handle the flow using a different approach.

For example, instead of your processor crashing when the XML is
malformed, it could catch the error, log it, and send the flow file
out to a "failure" relationship. That could be used to alert the user
somehow (by sending an email, for example) so they can go in and stop
the InvokeHTTP processor. Alternatively (or in addition), you can route
that "failure" relationship to a processor that is not started, and
you can set the backpressure for that part of the flow (all processors
downstream from the failure relationship) to a single object. At that
point, it will indicate to the framework that InvokeHTTP should no
longer be scheduled, and it will effectively stop running (although it
will still be "started" in the UI, just not triggered while the
backpressure is being applied).
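
As a rough illustration of the "catch the error and route to failure"
idea, here is a minimal Java sketch that uses only the JDK's XML parser.
The class name XmlRouting and its route method are hypothetical and not
part of NiFi. In a real processor, the same try/catch would presumably
sit inside onTrigger, with the flow file transferred to a "success" or
"failure" relationship through the process session instead of returning
a string.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not a NiFi class: decides where a flow file's
// content should be routed instead of letting a parse error escape.
final class XmlRouting {
    static final String SUCCESS = "success";
    static final String FAILURE = "failure";

    private XmlRouting() {}

    // Returns "success" if the text parses as well-formed XML, else "failure".
    static String route(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return SUCCESS;
        } catch (SAXException | IOException | ParserConfigurationException e) {
            // A real processor would log the error here before routing.
            return FAILURE;
        }
    }

    public static void main(String[] args) {
        System.out.println(route("<items><item>a</item></items>"));
        System.out.println(route("<items><item>a</items>"));
    }
}
```

For example, route("<items><item>a</items>") takes the failure branch
because the item element is never closed.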

Note that in this case, you'd want one of your custom processors for
each instance of InvokeHTTP, so the applied backpressure would not
hold up the InvokeHTTP instances that are fetching valid XML.
Alternatively, if you know what the XML is supposed to look like (i.e.
you have an XML Schema Definition for the incoming files), you can use
ValidateXml after each InvokeHTTP, and the "failure" relationship can
be handled as described, while all the "success" relationships could
go to a single instance of your custom processor.

For completeness, if you do want to stop an offending processor from
inside your custom processor, you'd need to make a REST API call to
query the provenance events for the offending flow file's UUID, along
with a Component Type of "InvokeHTTP". That call is asynchronous: it
returns JSON including a URL to check for completion, so you'll
have to retry that check manually until the query is finished. Then you
can parse the results for the processor UUID corresponding to the
InvokeHTTP processor and make another REST API call to stop that
processor (I believe by setting its status to "STOPPED"). As you can
see, it is pretty complicated, so you may want to try something else.
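
The "retry until finished" step above can be sketched as a small
polling helper. This is only an outline under assumptions:
PollUntilFinished and poll are hypothetical names, and the check
supplier stands in for whatever HTTP request you make against the
completion URL returned in the JSON. No actual NiFi REST endpoints are
called here.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical helper, not part of NiFi: repeats a status check until it
// yields a result or the attempt budget is used up.
final class PollUntilFinished {
    private PollUntilFinished() {}

    // check returns Optional.empty() while the query is still running and
    // a value (for example the parsed results) once it has finished.
    static <T> Optional<T> poll(Supplier<Optional<T>> check,
                                int maxAttempts,
                                long delayMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Optional<T> result = check.get();
            if (result.isPresent()) {
                return result;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException e) {
                    // Preserve the interrupt flag and give up polling.
                    Thread.currentThread().interrupt();
                    return Optional.empty();
                }
            }
        }
        return Optional.empty();
    }
}
```

Bounding the number of attempts keeps the custom processor from
blocking indefinitely if the provenance query never completes.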

Regards,
Matt

On Thu, Nov 3, 2016 at 3:41 AM, marlborino <p....@gmail.com> wrote:
> Thanks for your reply.
> I understand this approach, but I don't think it fits my
> requirements.
>
> If I may, I would like to explain my use case better in order to get some
> suggestions about how I can develop my process.
>
> My workflow is composed of several InvokeHTTP processors.
> Each of them retrieves an XML page from a specific URL and sends the
> flowfile to a custom processor, which parses the XML and writes each
> item into a text file.
> Sometimes, as I said in the first post, some processors crash because
> the XML is not well-formed. In this case I would like to stop the
> InvokeHTTP processor that retrieved the malformed XML.
> Keeping in mind that the parser processor is custom, so I can
> rewrite it as I want, is there a way to do this?
>
> Thanks a lot for any suggestion.

Re: Get the provenance of a flowfile from java api

Posted by marlborino <p....@gmail.com>.
Thanks for your reply.
I understand this approach, but I don't think it fits my
requirements.

If I may, I would like to explain my use case better in order to get some
suggestions about how I can develop my process.

My workflow is composed of several InvokeHTTP processors.
Each of them retrieves an XML page from a specific URL and sends the
flowfile to a custom processor, which parses the XML and writes each
item into a text file.
Sometimes, as I said in the first post, some processors crash because
the XML is not well-formed. In this case I would like to stop the
InvokeHTTP processor that retrieved the malformed XML.
Keeping in mind that the parser processor is custom, so I can
rewrite it as I want, is there a way to do this?

Thanks a lot for any suggestion.




--
View this message in context: http://apache-nifi-developer-list.39713.n7.nabble.com/Get-the-provenance-of-a-flowfile-from-java-api-tp13788p13793.html
Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.

Re: Get the provenance of a flowfile from java api

Posted by Bryan Bende <bb...@gmail.com>.
Hello,

Provenance data is not directly accessible through the processor API.

You can retrieve provenance data through the REST API [1] or from a
reporting task [2].

-Bryan

[1] https://nifi.apache.org/docs/nifi-docs/rest-api/index.html
[2] https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-site-to-site-reporting-bundle/nifi-site-to-site-reporting-task/src/main/java/org/apache/nifi/reporting/SiteToSiteProvenanceReportingTask.java#L161


On Wed, Nov 2, 2016 at 4:27 AM, marlborino <p....@gmail.com> wrote:

> Hi all,
> I have developed a custom processor to parse the XML content of a
> flowfile.
>
> Sometimes the XML is not well-formed, so the processor crashes.
> I would like to retrieve the provenance data from the received flowfile in
> order to get the ID of the previous processor.
> Is this possible through the Java API?
> Thanks a lot.