Posted to users@nifi.apache.org by Mika Borner <ni...@my2ndhead.com> on 2017/04/16 06:50:37 UTC

LookupAttribute Processor how to config

Hi

I'm struggling with the LookupAttribute Processor [1].

My flowfile has an attribute "foo" with the value "foovalue1". I want to 
create a new attribute "bar" whose value is looked up from foo's value 
(=> barvalue1). To do that, I have defined a CSV lookup table like this:

     foo,bar
     foovalue1,barvalue1
     foovalue2,barvalue2

I then created a new dynamic property "bar" with the value "${foo}". 
Unfortunately this does not work. Any hints?

I think this processor is very useful. Will it be integrated into an 
official NiFi release anytime soon?

Thanks

Mika


[1] https://github.com/jfrazee/nifi-lookup-service/tree/file-based-lookup-service


Re: LookupAttribute Processor how to config

Posted by Mika Borner <ni...@my2ndhead.com>.
Another question:

I tried to use the InMemoryLookupTableService, but was not able to find 
out how to fill it with key/value pairs. Where and how can I add entries?

Thanks

Mika




Re: LookupAttribute Processor how to config

Posted by Mika Borner <ni...@my2ndhead.com>.
Thanks for your answer.

After playing around with your example, I think I'm starting to 
understand how it works.

In the example, I've defined a property "foo" with a value of 
"foovalue1" within the GenerateFlowFile processor. Inside the 
LookupAttribute processor, I've changed the value of the property "bar" 
to "${foo}.1". This correctly creates an attribute "bar" with the 
lookup value "barvalue1".

- What I'm still struggling to understand is where the "dot one" comes 
  from. Is this the column index needed for multi-column tables?
- How often will the CSV controller service refresh when a file is 
  changed? I think it would be handy if this could be done based on a 
  schedule.

You asked for more feedback regarding the CSV controller service. Coming 
from another platform (Splunk) where lookup tables are used extensively 
for enriching and filtering, the following features would be useful :-)

- Being able to select multiple columns (input attributes) at the same 
  time as the criteria for the lookup
- Being able to select one or more columns as output attributes
- Wildcard/regex matching of values
- Case-insensitive matching of values
- CIDR matching for IP networks
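
As an illustration of two of the requested matching modes, here is a
minimal Python sketch. It is not part of the lookup service; the function
names lookup_case_insensitive and lookup_cidr are hypothetical, and
picking the most specific (longest-prefix) network when several match is
an assumption about the desired behavior.

```python
import ipaddress

def lookup_case_insensitive(table, key):
    """Match keys regardless of case (hypothetical helper)."""
    folded = {k.casefold(): v for k, v in table.items()}
    return folded.get(key.casefold())

def lookup_cidr(table, address):
    """Return the value of the most specific network containing address.

    Assumption: when several networks match, the longest prefix wins.
    """
    ip = ipaddress.ip_address(address)
    best = None
    for cidr, value in table.items():
        network = ipaddress.ip_network(cidr)
        if ip.version == network.version and ip in network:
            if best is None or network.prefixlen > best[0].prefixlen:
                best = (network, value)
    return best[1] if best else None

networks = {"10.0.0.0/8": "internal", "10.1.2.0/24": "lab"}
print(lookup_cidr(networks, "10.1.2.7"))
print(lookup_case_insensitive({"FooValue1": "barvalue1"}, "foovalue1"))
```

A linear scan like this is fine for small tables; a large table would
probably need an index.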

Larger CSV files should probably be indexed and cached in memory for 
reasonable performance.

Thanks
Mika





Re: LookupAttribute Processor how to config

Posted by Joey Frazee <jo...@icloud.com>.
Mika,

The values of the dynamic properties are the lookup keys themselves. On top of that, for the CSV lookup table the keys are indexed to support multi-column tables, so for your example you want to add a dynamic property bar with the value foo.1 (see [1] for a template). The reason ${foo} doesn’t work is twofold: (1) there’s no foo attribute yet, so ${foo} evaluates to an empty string and there’s no key, and (2) it’s not indexed. I suppose this indicates that the documentation isn’t sufficient, or that the CSV controller service is just confusing and should work differently.
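
To make the indexed keys concrete, here is a small Python sketch of how a CSV table could be flattened into "<first column value>.<column index>" entries. This is not the controller service's actual code: the function name build_indexed_lookup is hypothetical, and the key layout, including treating the header row as ordinary data, is an assumption inferred from the examples in this thread.

```python
import csv
import io

def build_indexed_lookup(csv_text):
    """Flatten CSV rows into {"<first column>.<column index>": value}.

    Assumption: every row, including the first, is treated as data.
    """
    table = {}
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        key = row[0].strip()
        for index, value in enumerate(row[1:], start=1):
            table[f"{key}.{index}"] = value.strip()
    return table

CSV_TEXT = """foo,bar
foovalue1,barvalue1
foovalue2,barvalue2
"""

lookup = build_indexed_lookup(CSV_TEXT)
print(lookup.get("foovalue1.1"))  # indexed key: row "foovalue1", column 1
print(lookup.get(""))             # the key an empty ${foo} would produce
```

Under these assumptions the first lookup finds barvalue1, while the empty key finds nothing, which matches the two reasons given above for ${foo} failing.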

You might be wondering why the property values are the lookup keys. The answer is that it’s very useful for use cases where you’re using FlowFile attributes or contents to specify what the keys are. For example, you could imagine a flow that is:

[GetHTTP to fetch some user profile JSON from a WS] -> [EvaluateJsonPath to extract a zipcode attribute, let’s say 10453] -> [LookupAttribute to enrich the user profile by doing a lookup for the key ${zipcode}, which is 10453, and then stuffing the result, NYC, in the attribute location]

For your example, the key isn’t being generated from an existing attribute so you don’t need an Expression Language expression.
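
The enrichment flow described above can be sketched in Python roughly as follows. This is only a model of the idea, not NiFi code: evaluate handles nothing but bare ${name} references (the real Expression Language is far richer), and the names evaluate, lookup_attribute, and the plain dictionary standing in for a lookup service are all hypothetical.

```python
import re

def evaluate(expression, attributes):
    """Replace each ${name} with the attribute value, or "" if missing."""
    return re.sub(r"\$\{(\w+)\}",
                  lambda m: attributes.get(m.group(1), ""),
                  expression)

def lookup_attribute(attributes, dynamic_properties, table):
    """Return a copy of attributes with one lookup per dynamic property.

    The property name is the attribute to set; the property value is the
    expression that yields the lookup key.
    """
    enriched = dict(attributes)
    for target, key_expression in dynamic_properties.items():
        key = evaluate(key_expression, attributes)
        if key in table:
            enriched[target] = table[key]
    return enriched

table = {"10453": "NYC"}         # stand-in for a simple key/value service
flowfile = {"zipcode": "10453"}  # as if set earlier by EvaluateJsonPath
result = lookup_attribute(flowfile, {"location": "${zipcode}"}, table)
print(result)
```

If the referenced attribute is missing, the expression evaluates to an empty key, no entry matches, and no new attribute is added.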

As for it being official, I’ve been working on NIFI-3404 [2, 3] which I’ll probably PR this week and then we’ll see what people think. I actually wasn’t sure I was going to include the CSV controller service so your email couldn’t have come at a better time.

If you have any feedback on the behavior of the CSV controller service, I’m all ears.

1. https://gist.github.com/jfrazee/a3b5558882b45228f768ef8dabb9ef54
2. https://issues.apache.org/jira/browse/NIFI-3404
3. https://github.com/jfrazee/nifi/tree/NIFI-3404
