Posted to users@nifi.apache.org by Sven Davison <sv...@gmail.com> on 2016/08/25 16:10:41 UTC

dynamic getTwitter ?

I have a GetTwitter processor which works wonders. I'm tracking a few
people and a couple of hashtags, but I'm also pulling all hashtags out of the
posts and tracking how many times I saw each one and when I last saw it.

Example tweet: "hello world #earth #usa"

If I'm watching #usa, I'll still get both tags and put them into my
database, using the tag as the id, a count for how many times it's been
seen, and a lastSeen field for when it was last seen.

What I would like to do is dynamically follow new tags upon condition X.
Say... once #earth gets more than 500 posts, and only if the tag was seen in
the last 7 days. I can make a view in MySQL to build the result set, but
how do I get that result set into NiFi, to follow those tags as they
change?
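
For readers following along, here is a minimal sketch of the selection rule
described above (more than 500 sightings, last seen within 7 days). It assumes
rows shaped like the tag/count/lastSeen table from the post; the function
names and the comma-separated output format are illustrative assumptions, not
anything NiFi or the thread prescribes.

```python
from datetime import datetime, timedelta


def tags_to_follow(rows, now, min_count=500, max_age_days=7):
    """Return tags seen more than min_count times and within max_age_days."""
    cutoff = now - timedelta(days=max_age_days)
    return sorted(
        row["tag"]
        for row in rows
        if row["count"] > min_count and row["lastSeen"] >= cutoff
    )


def as_terms_property(tags):
    """Join tags into one comma-separated string (hypothetical target format)."""
    return ",".join("#" + tag for tag in tags)


# Sample data standing in for the MySQL view's result set.
now = datetime(2016, 8, 25, 16, 0, 0)
rows = [
    {"tag": "earth", "count": 612, "lastSeen": datetime(2016, 8, 24, 9, 30)},
    {"tag": "usa", "count": 1200, "lastSeen": datetime(2016, 8, 25, 15, 59)},
    {"tag": "mars", "count": 900, "lastSeen": datetime(2016, 8, 1, 12, 0)},  # too old
    {"tag": "moon", "count": 40, "lastSeen": datetime(2016, 8, 25, 10, 0)},  # too few
]

selected = tags_to_follow(rows, now)
print(as_terms_property(selected))
```

The same filter could live entirely in the MySQL view; the Python version only
makes the two conditions explicit.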

Re: dynamic getTwitter ?

Posted by Sven Davison <sv...@gmail.com>.
Awesome feedback/links. I'll poke around for a demo/tutorial to help me get
started. I'm a visual guy, but once it CLICKS... I'm fine. I'm not familiar
w/ those languages but not against learning. I've learned a lot of other
languages based on need... looks like there's a need to learn Python or
something now.

On Thu, Aug 25, 2016 at 2:50 PM, Andy LoPresto <al...@apache.org> wrote:

> And of course the Developer Guide [1] and Contributor Guide [2] on the
> NiFi site.
>
> [1] https://nifi.apache.org/docs/nifi-docs/html/developer-guide.html
> [2] https://cwiki.apache.org/confluence/display/NIFI/Contributor+Guide
>
>
> Andy LoPresto
> alopresto@apache.org
> *alopresto.apache@gmail.com <al...@gmail.com>*
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>
> On Aug 25, 2016, at 11:48 AM, Andy LoPresto <al...@apache.org> wrote:
>
> Another crazy idea — would it be more computationally efficient to use
> NiFi’s REST API to add a new instance of the GetTwitter processor if a new
> endpoint was needed? Basically track using the state manager which terms
> are currently registered (a map of terms to processor IDs) and if a new
> term needs to be searched, duplicate an existing processor and replace the
> search term? They could all be located in a specific PG to allow for
> isolation from the “meta-flow” that is operating on NiFi itself.
>
> Andy LoPresto
> alopresto@apache.org
> *alopresto.apache@gmail.com <al...@gmail.com>*
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>
> On Aug 25, 2016, at 11:45 AM, Andy LoPresto <al...@apache.org> wrote:
>
> Yeah I had a feeling there was a reason it didn’t support EL in the first
> place but didn’t know enough of the context. Thanks Aldrin.
>
> @Sven,
>
> Writing a custom processor is always a good exercise. If you are familiar
> with Python/Groovy/Ruby I would suggest prototyping with ExecuteScript to
> get a feel for the processor lifecycle and very rapid development feedback
> loop, and then transition to full-scale NAR development.
>
> If you run into any roadblocks or have more in-depth questions, I would
> recommend asking on the developer list as it is a bit more technical and
> some of the experienced NiFi users (even those not on the core development
> team) respond quickly to questions on that list.
>
> Matt Burgess has written a number of articles about this that are very
> helpful [1][2].
>
> [1] https://funnifi.blogspot.com/2016/02/executescript-processor-hello-world.html
> [2] https://funnifi.blogspot.com/2016/02/writing-reusable-scripted-processors-in.html
>
> Andy LoPresto
> alopresto@apache.org
> *alopresto.apache@gmail.com <al...@gmail.com>*
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>
> On Aug 25, 2016, at 10:34 AM, Sven Davison <sv...@gmail.com> wrote:
>
> Good to know. The more I think about this, the more it seems like a
> tech/dev version of the movie called 'Pentagon Wars'. Maybe a custom
> processor would serve a dual purpose... getting it done... and building my
> first custom processor.
>
> On Thu, Aug 25, 2016 at 1:24 PM, Aldrin Piri <al...@gmail.com> wrote:
>
>> One consideration for why it does not support EL is the client that the
>> processor wraps, which registers with a given endpoint.  EL would
>> require a disconnect/reconnection to potentially happen on every
>> FlowFile presented to the processor (some smart caching could certainly
>> lessen the effect). Currently, filtering and such is very much integrated
>> with the lifecycle of the processor.  A more dynamic processor could be
>> achieved, but it will come with a few caveats.
>>
>> On Thu, Aug 25, 2016 at 1:03 PM, Sven Davison <sv...@gmail.com>
>> wrote:
>>
>>> That's close to the same flow I was looking at, really, but it was chucked
>>> out for lack of EL support w/in GetTwitter. The good news is... we're
>>> learning!
>>>
>>> On Thu, Aug 25, 2016 at 12:52 PM, Andy LoPresto <al...@apache.org>
>>> wrote:
>>>
>>>> Hi Sven,
>>>>
>>>> Someone may have a more streamlined solution, but I’d suggest taking a
>>>> look at ExecuteSQL [1] to read from the database, ConvertAvroToJSON [2] to
>>>> convert the output of the SQL query to JSON, and EvaluateJsonPath [3] to
>>>> extract the specific values you are interested in. Then use UpdateAttribute
>>>> [4] to populate those values from the flowfile content to an attribute, and
>>>> finally use GetTwitter [5] to filter on those values.
>>>>
>>>> However, at this time the query fields in GetTwitter do not support
>>>> Expression Language, so you will have to:
>>>>
>>>> * Modify the source of GetTwitter to support EL
>>>> * Raise a Jira requesting this feature
>>>> * Write a small script wrapping GetTwitter using ExecuteScript [6] to
>>>> populate those values
>>>>
>>>> Sorry it’s not a cleaner solution. I would encourage you to raise the
>>>> Jira [7] to have GetTwitter support EL in the query properties. It’s likely
>>>> I am overlooking a potential simpler flow, but without EL support in
>>>> GetTwitter, I don’t see an easy way forward.
>>>>
>>>> [1] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.ExecuteSQL/index.html
>>>> [2] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.avro.ConvertAvroToJSON/index.html
>>>> [3] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.EvaluateJsonPath/index.html
>>>> [4] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.attributes.UpdateAttribute/index.html
>>>> [5] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.twitter.GetTwitter/index.html
>>>> [6] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.script.ExecuteScript/index.html
>>>> [7] https://issues.apache.org/jira/secure/CreateIssue!default.jspa
>>>>
>>>> Andy LoPresto
>>>> alopresto@apache.org
>>>> *alopresto.apache@gmail.com <al...@gmail.com>*
>>>> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>>>>

Re: dynamic getTwitter ?

Posted by Andy LoPresto <al...@apache.org>.
And of course the Developer Guide [1] and Contributor Guide [2] on the NiFi site.

[1] https://nifi.apache.org/docs/nifi-docs/html/developer-guide.html
[2] https://cwiki.apache.org/confluence/display/NIFI/Contributor+Guide


Andy LoPresto
alopresto@apache.org
alopresto.apache@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69



Re: dynamic getTwitter ?

Posted by Andy LoPresto <al...@apache.org>.
Another crazy idea — would it be more computationally efficient to use NiFi’s REST API to add a new instance of the GetTwitter processor if a new endpoint was needed? Basically track using the state manager which terms are currently registered (a map of terms to processor IDs) and if a new term needs to be searched, duplicate an existing processor and replace the search term? They could all be located in a specific PG to allow for isolation from the “meta-flow” that is operating on NiFi itself.

Andy LoPresto
alopresto@apache.org
alopresto.apache@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
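
A rough sketch of the bookkeeping behind this idea: keep a map of terms to
processor IDs and work out which terms still need a processor. Everything
below is an assumption made for illustration: the helper names, the "Terms to
Filter On" and "Twitter Endpoint" property names and values, and the request
body shape (modelled on the NiFi 1.x create-processor call). Check all of them
against the REST API docs for your NiFi version. No HTTP call is made here.

```python
import json

# Processor class name as it appears in the GetTwitter docs URL.
GET_TWITTER_TYPE = "org.apache.nifi.processors.twitter.GetTwitter"


def plan_new_processors(wanted_terms, registered):
    """Return the terms that have no processor yet, in sorted order.

    `registered` is the hypothetical state map of term -> processor ID.
    """
    return sorted(set(wanted_terms) - set(registered))


def processor_payload(term, template_properties):
    """Build a request body for one new GetTwitter instance (assumed shape)."""
    properties = dict(template_properties)  # copy, so the template is untouched
    properties["Terms to Filter On"] = term  # assumed property name
    return {
        "revision": {"version": 0},
        "component": {
            "type": GET_TWITTER_TYPE,
            "name": "GetTwitter " + term,
            "config": {"properties": properties},
        },
    }


registered = {"#usa": "proc-1"}  # term -> processor ID, made-up ID
template = {"Twitter Endpoint": "Filter Endpoint"}  # assumed property/value

missing = plan_new_processors(["#usa", "#earth"], registered)
payloads = [processor_payload(term, template) for term in missing]
print(json.dumps(payloads, indent=2))
```

Each payload would then be POSTed to the process group that holds the
GetTwitter instances, and the returned processor ID stored back in the map.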



Re: dynamic getTwitter ?

Posted by Andy LoPresto <al...@apache.org>.
Yeah I had a feeling there was a reason it didn’t support EL in the first place but didn’t know enough of the context. Thanks Aldrin.

@Sven,

Writing a custom processor is always a good exercise. If you are familiar with Python/Groovy/Ruby I would suggest prototyping with ExecuteScript to get a feel for the processor lifecycle and very rapid development feedback loop, and then transition to full-scale NAR development.

If you run into any roadblocks or have more in-depth questions, I would recommend asking on the developer list as it is a bit more technical and some of the experienced NiFi users (even those not on the core development team) respond quickly to questions on that list.

Matt Burgess has written a number of articles about this that are very helpful [1][2].

[1] https://funnifi.blogspot.com/2016/02/executescript-processor-hello-world.html
[2] https://funnifi.blogspot.com/2016/02/writing-reusable-scripted-processors-in.html

Andy LoPresto
alopresto@apache.org
alopresto.apache@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
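
To give a feel for what an ExecuteScript prototype might look like in Python
(Jython), here is a small sketch. It assumes the script engine binds `session`
and `REL_SUCCESS` as script variables, as in the hello-world article linked
above; the attribute names `tag` and `tag.normalized` and the
`normalize_tag` helper are made up for illustration. Outside NiFi the
script falls back to a plain print so the logic can be tried on its own.

```python
def normalize_tag(raw):
    """Lower-case a hashtag and make sure it starts with '#'."""
    tag = raw.strip().lower()
    return tag if tag.startswith("#") else "#" + tag


# Inside ExecuteScript the engine is assumed to provide `session`;
# when run as a plain script that name does not exist.
try:
    nifi_session = session
except NameError:
    nifi_session = None

if nifi_session is not None:
    flow_file = nifi_session.get()
    if flow_file is not None:
        raw = flow_file.getAttribute("tag")  # hypothetical attribute name
        if raw is not None:
            flow_file = nifi_session.putAttribute(
                flow_file, "tag.normalized", normalize_tag(raw)
            )
        nifi_session.transfer(flow_file, REL_SUCCESS)
else:
    # Stand-alone run: exercise the pure function only.
    print(normalize_tag("  Earth "))
```

Keeping the logic in a pure function like this makes the later move to a
full processor in a NAR mostly a matter of porting one function.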



Re: dynamic getTwitter ?

Posted by Sven Davison <sv...@gmail.com>.
Good to know. The more I think about this, the more it seems like a
tech/dev version of the movie called 'Pentagon Wars'. Maybe a custom
processor would serve a dual purpose... getting it done... and building my
first custom processor.


Re: dynamic getTwitter ?

Posted by Aldrin Piri <al...@gmail.com>.
One reason it does not support EL is the client the processor wraps, which
registers with a given endpoint. Supporting EL would potentially require a
disconnect/reconnect cycle on every FlowFile presented to the processor
(some smart caching could certainly lessen the effect). Currently, filtering
and the like is very much tied to the lifecycle of the processor. A more
dynamic processor could be achieved, but it would come with a few caveats.
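
To illustrate the "smart caching" idea, here is a minimal Python sketch. It
is not NiFi code: TermTracker and the connect callable are hypothetical
names, and connect only stands in for whatever opens the filtered stream.
The point is just that the stream is reopened only when the set of tracked
terms actually changes, not on every FlowFile.

```python
class TermTracker:
    """Reconnects only when the set of tracked terms actually changes."""

    def __init__(self, connect):
        # 'connect' is a hypothetical callable that (re)opens the filtered
        # stream for a list of terms; it is an assumption for illustration.
        self._connect = connect
        self._terms = frozenset()
        self.reconnects = 0

    def update(self, terms):
        """Apply a new term list; return True if a reconnect was needed."""
        # Normalize so that "#USA " and "#usa" count as the same term.
        new_terms = frozenset(t.strip().lower() for t in terms if t.strip())
        if new_terms == self._terms:
            return False  # nothing changed, keep the existing connection
        self._terms = new_terms
        self._connect(sorted(new_terms))
        self.reconnects += 1
        return True
```

Even with a cache like this, every real change to the term list still costs
one reconnect, which is one of the caveats mentioned above.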


Re: dynamic getTwitter ?

Posted by Sven Davison <sv...@gmail.com>.
That's close to the same flow I was looking at, really, but it was chucked
out for lack of EL support w/in GetTwitter. The good news is... we're learning!


Re: dynamic getTwitter ?

Posted by Andy LoPresto <al...@apache.org>.
Hi Sven,

Someone may have a more streamlined solution, but I’d suggest taking a look at ExecuteSQL [1] to read from the database, ConvertAvroToJSON [2] to convert the output of the SQL query to JSON, and EvaluateJsonPath [3] to extract the specific values you are interested in. Then use UpdateAttribute [4] to populate those values from the flowfile content to an attribute, and finally use GetTwitter [5] to filter on those values.

However, at this time the query fields in GetTwitter do not support Expression Language, so you will have to do one of the following:

* Modify the source of GetTwitter to support EL
* Raise a Jira requesting this feature
* Write a small script wrapping GetTwitter using ExecuteScript [6] to populate those values

Sorry it’s not a cleaner solution. I would encourage you to raise the Jira [7] to have GetTwitter support EL in the query properties. It’s likely I am overlooking a potentially simpler flow, but without EL support in GetTwitter, I don’t see an easy way forward.
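
As a rough sketch of what a small script could do with the query output, here is some plain Python (not tied to the ExecuteScript API). It assumes the JSON is an array of records with the fields tag, count and lastSeen, and that lastSeen is formatted as "YYYY-MM-DD HH:MM:SS"; both the record shape and the function name terms_to_follow are assumptions for illustration. It applies the "more than 500 posts, seen in the last 7 days" condition and returns a comma-separated string, which as I understand it is the format the GetTwitter terms property expects.

```python
import json
from datetime import datetime, timedelta


def terms_to_follow(rows_json, now, min_count=500, max_age_days=7):
    """Return a comma-separated list of tags that meet the follow condition.

    rows_json: JSON array of {"tag", "count", "lastSeen"} records
               (assumed shape of the converted SQL result).
    now:       reference time, passed in so the function is easy to test.
    """
    rows = json.loads(rows_json)
    cutoff = now - timedelta(days=max_age_days)
    tags = [
        row["tag"]
        for row in rows
        # "more than 500 posts" and "seen in the last 7 days"
        if row["count"] > min_count
        and datetime.strptime(row["lastSeen"], "%Y-%m-%d %H:%M:%S") >= cutoff
    ]
    # De-duplicate and sort so the output is stable between runs.
    return ",".join(sorted(set(tags)))
```

The same filter could of course live in the MySQL view instead, in which case the script only needs to join the tag column.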

[1] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.ExecuteSQL/index.html
[2] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.avro.ConvertAvroToJSON/index.html
[3] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.EvaluateJsonPath/index.html
[4] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.attributes.UpdateAttribute/index.html
[5] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.twitter.GetTwitter/index.html
[6] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.script.ExecuteScript/index.html
[7] https://issues.apache.org/jira/secure/CreateIssue!default.jspa

Andy LoPresto
alopresto@apache.org
alopresto.apache@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

> On Aug 25, 2016, at 9:10 AM, Sven Davison <sv...@gmail.com> wrote:
> 
> i have a GetTwitter processor which works wonders. I'm tracking a few people and a couple hash tags but i'm also pulling all hashtags out of the posts and tracking how many times i saw it and when the last time was that i saw it.
> 
> example tweet: "hello world #earth #usa"
> 
> if i'm watching #usa, i'll still get both tags and put them into my database. using the tag as the id, a count for how many times it's been seen and a lastSeen field for when it was last seen.
> 
> what i would like to do, is dynamically follow new tags upon condition X. Say... once #earth gets more than 500 posts and only if the tag was seen in the last 7 days. I can make a view in MySQL to build the result set, but how do i get that result set into nifi, to follow those tags that will change.
> 
>