Posted to users@nifi.apache.org by Igor Kravzov <ig...@gmail.com> on 2016/05/11 14:35:05 UTC

Assign FlowFile (split text) to an attribute - which processor to use?

Hi,

I am about to build the following workflow:

1. Read a file (one search keyword per line) from disk (GetFile PR)
2. Split the file (SplitText PR)
3. Assign each split line to an attribute (UpdateAttribute, how???)
4. Pass the attribute to an HTTP query (InvokeHTTP PR)
....

So, how do I assign a split line to an attribute?

Thanks in advance.

Re: Assign FlowFile (split text) to an attribute - which processor to use?

Posted by Igor Kravzov <ig...@gmail.com>.
Thanks Mark. It worked.

On Wed, May 11, 2016 at 10:41 AM, Mark Payne <ma...@hotmail.com> wrote:

> Igor,
>
> You can use the ExtractText processor instead of UpdateAttribute. It
> allows you to use a Regular Expression
> to match against the content of a FlowFile and create an attribute from
> the Capturing Group. So you could add
> a property to the processor named "myAttribute" with a value of "(.+)" and
> that will extract the text of the FlowFile
> into an attribute named "myAttribute".
>
> Thanks
> -Mark
>
> > On May 11, 2016, at 10:35 AM, Igor Kravzov <ig...@gmail.com>
> wrote:
> >
> > Hi,
> >
> > I am about to build the following workflow:
> >
> > 1. Read a file (one search keyword per line) from disk (GetFile PR)
> > 2. Split the file (SplitText PR)
> > 3. Assign each split line to an attribute (UpdateAttribute, how???)
> > 4. Pass the attribute to an HTTP query (InvokeHTTP PR)
> > ....
> >
> > So, how do I assign a split line to an attribute?
> >
> > Thanks in advance.
>
>

Re: How to extract scalar info of json array using EvaluateJsonPath processor?

Posted by Keith Lim <Ke...@ds-iq.com>.
Thanks Aldrin for providing a clear sample in such a short time.   That Advanced menu implementation is very eye-opening. :) I appreciate your great effort.


Thanks,
Keith

________________________________
From: Aldrin Piri <al...@gmail.com>
Sent: Thursday, May 12, 2016 6:36 PM
To: users@nifi.apache.org
Subject: Re: How to extract scalar info of json array using EvaluateJsonPath processor?

Hi Keith,

I threw together a quick example of how this is done and made it available as a GitHub Gist [1].  In that template, the core logic of transforming the evaluated JsonPath expression result is in the "Count IDs as Attribute" processor which is making use of the  'Advanced' menu for UpdateAttribute.  As a caveat, this is quite quick and dirty, but illustrates how you can extract values from content to attributes for your case.

One thing I would like to highlight is that Expression Language (EL) is evaluated exclusively against FlowFile attributes.  The pattern that typically happens is an extraction/selection of key features/characteristics from the content, promoted as attributes for further manipulation and handling.  This approach is taken in the referenced template.

I additionally noticed that I was also incorrect in my previous statement and overlooked the length() operator for the JsonPath library [2].  This, however, does not seem to apply to expressions for EvaluateJsonPath and will likely need some additional inspection.  I have created an issue for that [3].

Please let us know if you have any additional questions!

--aldrin

[1] https://gist.github.com/apiri/0e2d0c9b1a7a4f109fbc91da56693d30


[2] https://github.com/jayway/JsonPath#functions
[3] https://issues.apache.org/jira/browse/NIFI-1875

On Thu, May 12, 2016 at 2:52 PM, Keith Lim <Ke...@ds-iq.com> wrote:

Hi Aldrin,

I am still not able to get this to work.  I see that the expression language guide (http://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html) mentions how to work with flowfile attributes, but not with the flowfile content itself.  What is the built-in variable that points to the flowfile content that I can use with the expression language?

For illustration, let's say I have the JSON below as the content of a flowfile.  What processor can I use to take this flowfile as input, and how can I extract the count of items in the JSON array "results" below?  I want to assign it to a user-defined attribute in the processor.

{ "results" :  [   {  "name" : "Jane Doe",  "id" : "1" },   {  "name" : "John Doe", "id" : "2" }  ] }

Thanks for all your help.

Thanks,
keith
________________________________
From: Aldrin Piri <al...@gmail.com>
Sent: Thursday, May 12, 2016 1:32 PM
To: users@nifi.apache.org
Subject: Re: How to extract scalar info of json array using EvaluateJsonPath processor?

Hi Keith,

Scanning over some of the docs, it does not appear that JsonPath supports a count operator, but it could possibly be used to extract values from your source document that could then be manipulated using something like allDelineatedValues [1].  Certainly not the most elegant approach, but it could work.

If you would like some additional help, a sample to work from would be nice so we can give some more concrete assistance.

[1] http://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html#alldelineatedvalues

On Thu, May 12, 2016 at 11:52 AM, Keith Lim <Ke...@ds-iq.com> wrote:
I have a flow file with a JSON array and would like to use the EvaluateJsonPath processor to extract the item count of that array.  Does the NiFi Expression Language, in combination with the JSON path feature, support this without writing a script?  What is the syntax?

Thanks,
Keith





Re: How to extract scalar info of json array using EvaluateJsonPath processor?

Posted by Aldrin Piri <al...@gmail.com>.
Hi Keith,

I threw together a quick example of how this is done and made it available
as a GitHub Gist [1].  In that template, the core logic of transforming the
evaluated JsonPath expression result is in the "Count IDs as Attribute"
processor which is making use of the  'Advanced' menu for UpdateAttribute.
As a caveat, this is quite quick and dirty, but illustrates how you can
extract values from content to attributes for your case.

One thing I would like to highlight is that Expression Language (EL) is
evaluated exclusively against FlowFile attributes.  The pattern that typically
happens is an extraction/selection of key features/characteristics from the
content, promoted as attributes for further manipulation and handling.
This approach is taken in the referenced template.

I additionally noticed that I was also incorrect in my previous statement
and overlooked the length() operator for the JsonPath library [2].  This,
however, does not seem to apply to expressions for EvaluateJsonPath and
will likely need some additional inspection.  I have created an issue for
that [3].
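
To make the "extract to an attribute, then count" pattern concrete, here is a
rough sketch in plain Python rather than in NiFi itself. It only imitates the
two steps described above (promote the ids from the content to an attribute,
then count them if the result is non-empty). The function names and the
attribute names "ids" and "id.count" are made up for illustration and are not
taken from the template.

```python
import json


def extract_ids(content: str) -> str:
    """Imitate the extraction step: pull every "id" out of "results"
    and return the list serialized as a JSON array string."""
    document = json.loads(content)
    ids = [item["id"] for item in document["results"]]
    return json.dumps(ids)


def count_values(attribute_value: str) -> int:
    """Imitate the counting step on the already-extracted attribute."""
    return len(json.loads(attribute_value))


# Sample content from Keith's question.
content = (
    '{ "results" : [ { "name" : "Jane Doe", "id" : "1" }, '
    '{ "name" : "John Doe", "id" : "2" } ] }'
)

# Attributes are plain strings, so the count is stored as a string as well.
attributes = {"ids": extract_ids(content)}
if attributes["ids"] != "[]":  # only count when the extracted result is non-empty
    attributes["id.count"] = str(count_values(attributes["ids"]))

print(attributes)
```

The point of the sketch is only the ordering: the content is read once to
produce an attribute, and everything after that works on attributes alone.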

Please let us know if you have any additional questions!

--aldrin

[1] https://gist.github.com/apiri/0e2d0c9b1a7a4f109fbc91da56693d30
[2] https://github.com/jayway/JsonPath#functions
[3] https://issues.apache.org/jira/browse/NIFI-1875

On Thu, May 12, 2016 at 2:52 PM, Keith Lim <Ke...@ds-iq.com> wrote:

> Hi Aldrin,
>
> I am still not able to get this to work.  I see that the expression
> language guide (
> http://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html)
> mentions how to work with flowfile attributes, but not with the flowfile
> content itself.  What is the built-in variable that points to the flowfile
> content that I can use with the expression language?
>
> For illustration, let's say I have the JSON below as the content of a
> flowfile.  What processor can I use to take this flowfile as input, and how
> can I extract the count of items in the JSON array "results" below?  I want
> to assign it to a user-defined attribute in the processor.
>
> { "results" :  [   {  "name" : "Jane Doe",  "id" : "1" },   {  "name" :
> "John Doe", "id" : "2" }  ] }
>
> Thanks for all your help.
>
> Thanks,
> keith
> ------------------------------
> *From:* Aldrin Piri <al...@gmail.com>
> *Sent:* Thursday, May 12, 2016 1:32 PM
> *To:* users@nifi.apache.org
> *Subject:* Re: How to extract scalar info of json array using
> EvaluateJsonPath processor?
>
> Hi Keith,
>
> Scanning over some of the docs, it does not appear that JsonPath supports
> a count operator, but it could possibly be used to extract values from your
> source document that could then be manipulated using something like
> allDelineatedValues [1].  Certainly not the most elegant approach, but it
> could work.
>
> If you would like some additional help, a sample to work from would be
> nice so we can give some more concrete assistance.
>
> [1]
> http://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html#alldelineatedvalues
>
> On Thu, May 12, 2016 at 11:52 AM, Keith Lim <Ke...@ds-iq.com> wrote:
>
>> I have a flow file with a JSON array and would like to use the
>> EvaluateJsonPath processor to extract the item count of that array.  Does
>> the NiFi Expression Language, in combination with the JSON path feature,
>> support this without writing a script?  What is the syntax?
>>
>> Thanks,
>> Keith
>>
>>
>>
>

Re: How to extract scalar info of json array using EvaluateJsonPath processor?

Posted by Keith Lim <Ke...@ds-iq.com>.
Hi Aldrin,

I am still not able to get this to work.  I see that the expression language guide (http://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html) mentions how to work with flowfile attributes, but not with the flowfile content itself.  What is the built-in variable that points to the flowfile content that I can use with the expression language?

For illustration, let's say I have the JSON below as the content of a flowfile.  What processor can I use to take this flowfile as input, and how can I extract the count of items in the JSON array "results" below?  I want to assign it to a user-defined attribute in the processor.

{ "results" :  [   {  "name" : "Jane Doe",  "id" : "1" },   {  "name" : "John Doe", "id" : "2" }  ] }

Thanks for all your help.

Thanks,
keith
________________________________
From: Aldrin Piri <al...@gmail.com>
Sent: Thursday, May 12, 2016 1:32 PM
To: users@nifi.apache.org
Subject: Re: How to extract scalar info of json array using EvaluateJsonPath processor?

Hi Keith,

Scanning over some of the docs, it does not appear that JsonPath supports a count operator, but it could possibly be used to extract values from your source document that could then be manipulated using something like allDelineatedValues [1].  Certainly not the most elegant approach, but it could work.

If you would like some additional help, a sample to work from would be nice so we can give some more concrete assistance.

[1] http://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html#alldelineatedvalues

On Thu, May 12, 2016 at 11:52 AM, Keith Lim <Ke...@ds-iq.com> wrote:
I have a flow file with a JSON array and would like to use the EvaluateJsonPath processor to extract the item count of that array.  Does the NiFi Expression Language, in combination with the JSON path feature, support this without writing a script?  What is the syntax?

Thanks,
Keith




Re: How to extract scalar info of json array using EvaluateJsonPath processor?

Posted by Aldrin Piri <al...@gmail.com>.
Hi Keith,

Scanning over some of the docs, it does not appear that JsonPath supports a
count operator, but it could possibly be used to extract values from your
source document that could then be manipulated using something like
allDelineatedValues [1].  Certainly not the most elegant approach, but it
could work.

If you would like some additional help, a sample to work from would be nice
so we can give some more concrete assistance.

[1]
http://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html#alldelineatedvalues

On Thu, May 12, 2016 at 11:52 AM, Keith Lim <Ke...@ds-iq.com> wrote:

> I have a flow file with a JSON array and would like to use the
> EvaluateJsonPath processor to extract the item count of that array.  Does
> the NiFi Expression Language, in combination with the JSON path feature,
> support this without writing a script?  What is the syntax?
>
> Thanks,
> Keith
>
>
>

How to extract scalar info of json array using EvaluateJsonPath processor?

Posted by Keith Lim <Ke...@ds-iq.com>.
I have a flow file with a JSON array and would like to use the EvaluateJsonPath processor to extract the item count of that array.  Does the NiFi Expression Language, in combination with the JSON path feature, support this without writing a script?  What is the syntax?

Thanks,
Keith



Re: How to extract mutiple json properties/fields into processor properties?

Posted by Matt Burgess <ma...@gmail.com>.
Are you using update attribute to fill HTTP header attributes? In any case, I think InvokeHttp will be a solution.

Regards,
Matt

> On May 11, 2016, at 6:15 PM, Keith Lim <Ke...@ds-iq.com> wrote:
> 
> Thanks Bryan, that works.  I have a follow-up question.  I want to take the flowfile with the updated attributes from this EvaluateJSONPath processor and flow it to a GetHttp.  However, GetHttp does not allow me to link to it, i.e. it does not take a flowfile as input.   Is there a different processor I should use to do this?
> 
> Thanks,
> Keith
> 
> 
> From: Bryan Bende <bb...@gmail.com>
> Sent: Wednesday, May 11, 2016 12:01 PM
> To: users@nifi.apache.org
> Subject: Re: How to extract mutiple json properties/fields into processor properties?
>  
> Hi Keith,
> 
> If you have a single JSON document and want to extract data from it into attributes, then the processor you would be interested in is EvaluateJSONPath.
> 
> You add user defined properties to the processor, where the name will be the id of the resulting attribute, and the value will be the json path to extract.
> 
> For instance, if you added properties:
> 
> myId = $.id
> myMsg = $.msg
> 
> That would extract the values of the "id" and "msg" fields from the JSON and put them on to the flow file as attributes with the names "myId" and "myMsg".
> 
> You would also want to set "Destination" to "flowfile-attribute".
> 
> -Bryan
> 
> 
>> On Wed, May 11, 2016 at 2:47 PM, Keith Lim <Ke...@ds-iq.com> wrote:
>> I want to extract several properties from a single JSON blob (single record) into user-defined properties.
>> Can I use a single SplitJson processor to extract all of them?  How do you set that up?
>> Or do I have to use one instance for each property extraction and flow through from one to another?
>> I.e. if I have 3 properties that I want to extract, do I need to string 3 SplitJson in a series?
>> 
>> Thanks,
>> Keith
> 

Re: How to extract mutiple json properties/fields into processor properties?

Posted by Keith Lim <Ke...@ds-iq.com>.
Thanks Bryan, that works.  I have a follow-up question.  I want to take the flowfile with the updated attributes from this EvaluateJSONPath processor and flow it to a GetHttp.  However, GetHttp does not allow me to link to it, i.e. it does not take a flowfile as input.   Is there a different processor I should use to do this?

Thanks,
Keith


________________________________
From: Bryan Bende <bb...@gmail.com>
Sent: Wednesday, May 11, 2016 12:01 PM
To: users@nifi.apache.org
Subject: Re: How to extract mutiple json properties/fields into processor properties?

Hi Keith,

If you have a single JSON document and want to extract data from it into attributes, then the processor you would be interested in is EvaluateJSONPath.

You add user defined properties to the processor, where the name will be the id of the resulting attribute, and the value will be the json path to extract.

For instance, if you added properties:

myId = $.id
myMsg = $.msg

That would extract the values of the "id" and "msg" fields from the JSON and put them on to the flow file as attributes with the names "myId" and "myMsg".

You would also want to set "Destination" to "flowfile-attribute".

-Bryan


On Wed, May 11, 2016 at 2:47 PM, Keith Lim <Ke...@ds-iq.com> wrote:
I want to extract several properties from a single JSON blob (single record) into user-defined properties.
Can I use a single SplitJson processor to extract all of them?  How do you set that up?
Or do I have to use one instance for each property extraction and flow through from one to another?
I.e. if I have 3 properties that I want to extract, do I need to string 3 SplitJson in a series?

Thanks,
Keith



Re: How to extract mutiple json properties/fields into processor properties?

Posted by Bryan Bende <bb...@gmail.com>.
Hi Keith,

If you have a single JSON document and want to extract data from it into
attributes, then the processor you would be interested in is
EvaluateJSONPath.

You add user defined properties to the processor, where the name will be
the id of the resulting attribute, and the value will be the json path to
extract.

For instance, if you added properties:

myId = $.id
myMsg = $.msg

That would extract the values of the "id" and "msg" fields from the JSON
and put them on to the flow file as attributes with the names "myId" and
"myMsg".

You would also want to set "Destination" to "flowfile-attribute".
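
As a rough illustration of the property-to-attribute mapping described above,
here is a small Python sketch. It is not NiFi code and does not use a real
JsonPath library: it only handles simple "$.field" paths, and the sample JSON
document and the function name are invented for the example.

```python
import json


def evaluate_json_path(content: str, properties: dict[str, str]) -> dict[str, str]:
    """Imitate the mapping: each property name becomes an attribute name,
    and each property value is a simple "$.a.b" style path into the JSON."""
    document = json.loads(content)
    attributes = {}
    for name, path in properties.items():
        if not path.startswith("$."):
            raise ValueError(f"only simple $.field paths are handled here: {path}")
        value = document
        for key in path[2:].split("."):
            value = value[key]
        # Attributes are strings; non-string values are kept as JSON text.
        attributes[name] = value if isinstance(value, str) else json.dumps(value)
    return attributes


# Hypothetical single-record JSON document.
flowfile_content = '{"id": "42", "msg": "hello", "extra": true}'

attrs = evaluate_json_path(flowfile_content, {"myId": "$.id", "myMsg": "$.msg"})
print(attrs)
```

One property per field is enough, so a single processor instance can carry all
of the extractions instead of chaining several processors in series.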

-Bryan


On Wed, May 11, 2016 at 2:47 PM, Keith Lim <Ke...@ds-iq.com> wrote:

> I want to extract several properties from a single JSON blob (single
> record) into user-defined properties.
> Can I use a single SplitJson processor to extract all of them?  How do you
> set that up?
> Or do I have to use one instance for each property extraction and flow
> through from one to another?
> I.e. if I have 3 properties that I want to extract, do I need to string 3
> SplitJson in a series?
>
> Thanks,
> Keith
>
>

How to extract mutiple json properties/fields into processor properties?

Posted by Keith Lim <Ke...@ds-iq.com>.
I want to extract several properties from a single JSON blob (single record) into user-defined properties.
Can I use a single SplitJson processor to extract all of them?  How do you set that up?
Or do I have to use one instance for each property extraction and flow through from one to another? 
I.e. if I have 3 properties that I want to extract, do I need to string 3 SplitJson in a series?

Thanks,
Keith


Re: Assign FlowFile (split text) to an attribute - which processor to use?

Posted by Mark Payne <ma...@hotmail.com>.
Igor,

You can use the ExtractText processor instead of UpdateAttribute. It allows you to use a Regular Expression
to match against the content of a FlowFile and create an attribute from the Capturing Group. So you could add
a property to the processor named "myAttribute" with a value of "(.+)" and that will extract the text of the FlowFile
into an attribute named "myAttribute".
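
To show what the "(.+)" capturing group does with a single split line, here is
a minimal Python sketch. It is only an imitation of the idea using Python's
re module, not the processor's actual implementation; the function name and
the sample keyword are made up for the example.

```python
import re


def extract_text(content: str, properties: dict[str, str]) -> dict[str, str]:
    """For each property, match the regex against the content and store the
    first capturing group under the property's name."""
    attributes = {}
    for name, pattern in properties.items():
        match = re.search(pattern, content)
        if match and match.groups():
            attributes[name] = match.group(1)
    return attributes


# One split as SplitText might produce it: a single keyword plus a newline.
split_content = "apache nifi\n"

attrs = extract_text(split_content, {"myAttribute": "(.+)"})
print(attrs)  # {'myAttribute': 'apache nifi'}
```

Because "." does not match a newline by default in this sketch, the trailing
line break is left out of the captured value, which is convenient when the
attribute is later placed into a URL.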

Thanks
-Mark

> On May 11, 2016, at 10:35 AM, Igor Kravzov <ig...@gmail.com> wrote:
> 
> Hi,
> 
> I am about to build the following workflow:
> 
> 1. Read a file (one search keyword per line) from disk (GetFile PR)
> 2. Split the file (SplitText PR)
> 3. Assign each split line to an attribute (UpdateAttribute, how???)
> 4. Pass the attribute to an HTTP query (InvokeHTTP PR)
> ....
> 
> So, how do I assign a split line to an attribute?
> 
> Thanks in advance.