Posted to users@nifi.apache.org by Nick Lange <ni...@gmail.com> on 2022/03/07 04:46:00 UTC

Records - Best Approach to Enrich Record From Cache

Hi all -
I have a record set of objects that each need enrichment with about 10-20
fields of data from a Redis cache. In a perfect world, I'd hit the cache
once and get back a JSON blob for further extraction, ideally in a single
hop. I don't see an easy way to do this with the record language, but
perhaps I've missed something.

Lacking a better approach, I'm currently brute-forcing this with 10-20
hits to the cache per record, one for each field. I'm hoping that the
mailing list has better suggestions.

Thank you
Nick

Re: Records - Best Approach to Enrich Record From Cache

Posted by Mark Payne <ma...@hotmail.com>.
Mike, Steven, Nick,

The DistributedMapCache Client already has a bulk get method. It is called subMap. But I don’t think it performs an MGET at present; it just loops over the keys, calling GET for each one.
That said, I think the more appropriate approach here would be to provide a RedisRecordLookupService so that it could be used with LookupRecord.

However, in version 1.16 you may be able to do this much more efficiently without a new lookup service.
In 1.16 we introduce a pair of processors: ForkEnrichment and JoinEnrichment.
Together they allow you to take your incoming data, transform a copy of it into whatever form is needed to look up the enrichment data, fetch the enrichment data, and then join the enrichment data back together with the original data.

So you may be able to do something like:

ForkEnrichment — (original) —> JoinEnrichment
                          — (enrichment) —> JoltTransform/QueryRecord/etc. —> InvokeHTTP —> JoinEnrichment

This would mean that you’d need to transform your data into a single REST-friendly web request and use InvokeHTTP to gather all of the data from Redis in a single call. You could then use JoinEnrichment with a SQL JOIN to join your original data with the enrichment data. This wouldn’t be nearly as simple or straightforward as a single LookupRecord processor, but it should provide very good performance and should still be simpler than chaining many enrichment processors.

Hope this helps!
-Mark
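
A minimal sketch of the SQL JOIN step described above, written in Python with the standard sqlite3 module standing in for the SQL engine. This is not NiFi code: the table names `original` and `enrichment`, the column names, and the helper `join_enrichment` are all assumptions made for illustration. It only shows how a LEFT OUTER JOIN keeps every original record and fills in enrichment fields where a match exists.

```python
import sqlite3


def join_enrichment(original_rows, enrichment_rows):
    """Join original records with enrichment records on a shared id column.

    Table and column names here are hypothetical, chosen for this sketch.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE original (id TEXT, name TEXT)")
    conn.execute("CREATE TABLE enrichment (id TEXT, city TEXT, tier TEXT)")
    conn.executemany("INSERT INTO original VALUES (?, ?)", original_rows)
    conn.executemany("INSERT INTO enrichment VALUES (?, ?, ?)", enrichment_rows)
    # LEFT OUTER JOIN keeps originals that have no enrichment match;
    # their enrichment columns come back as NULL (None in Python).
    rows = conn.execute(
        "SELECT o.id, o.name, e.city, e.tier "
        "FROM original o LEFT OUTER JOIN enrichment e ON o.id = e.id "
        "ORDER BY o.id"
    ).fetchall()
    conn.close()
    return rows


joined = join_enrichment(
    [("1", "ann"), ("2", "bob")],
    [("1", "Leeds", "gold")],
)
for row in joined:
    print(row)
```

An inner JOIN would instead drop the unmatched record, so the choice of join type decides what happens to records with no cache entry.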


> On Mar 7, 2022, at 9:36 AM, Mike Thomsen <mi...@gmail.com> wrote:
> 
> I skimmed over the code in the Redis DMC client and did not see any
> place where we could do an MGET there. Not sure if that's relevant to
> Nick's use case, but it would be relevant to that general pattern
> going forward. It wouldn't be hard to add a bulk get method to the DMC
> interface and provide a default implementation that just loops, does
> multiple get operations, and stacks the results together. Then the
> Redis version could do a single MGET instead.
> 
> That said, AFAIK we'd need to create a new enrichment processor or
> extend something like ScriptedTransformRecord to integrate with a DMC.
> 
> I have the time to work on this, but would like to hear from
> committers and users before I start banging out the code to make sure
> I'm not missing something.
> 
> On Mon, Mar 7, 2022 at 7:18 AM <st...@bt.com> wrote:
>> 
>> Redis does allow multiple gets in one hit with MGET. If you request all of the keys at once, the response is a list of values in the same order as the keys, with null wherever a key has no match.
>> 
>> 
>> 
>> Steve Hindmarch


Re: Records - Best Approach to Enrich Record From Cache

Posted by Mike Thomsen <mi...@gmail.com>.
I skimmed over the code in the Redis DMC client and did not see any
place where we could do an MGET there. Not sure if that's relevant to
Nick's use case, but it would be relevant to that general pattern
going forward. It wouldn't be hard to add a bulk get method to the DMC
interface and provide a default implementation that just loops, does
multiple get operations, and stacks the results together. Then the
Redis version could do a single MGET instead.

That said, AFAIK we'd need to create a new enrichment processor or
extend something like ScriptedTransformRecord to integrate with a DMC.

I have the time to work on this, but would like to hear from
committers and users before I start banging out the code to make sure
I'm not missing something.
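
A rough sketch of the bulk-get idea described above, in Python rather than NiFi's Java. The class and method names (`MapCacheClient`, `get_all`, `LoopingClient`, `BulkClient`) are hypothetical and do not correspond to the actual DMC interface; the in-memory dict stands in for the cache. The base class supplies a looping default, and a backend with an MGET-style command overrides it to make a single round trip.

```python
from typing import Dict, Iterable, List, Optional


class MapCacheClient:
    """Hypothetical stand-in for a map cache client interface."""

    def get(self, key: str) -> Optional[str]:
        raise NotImplementedError

    def get_all(self, keys: Iterable[str]) -> Dict[str, Optional[str]]:
        # Default behaviour: loop and issue one get per key.
        return {key: self.get(key) for key in keys}


class LoopingClient(MapCacheClient):
    """Backend without a bulk command; inherits the looping default."""

    def __init__(self, data: Dict[str, str]) -> None:
        self._data = dict(data)
        self.round_trips = 0  # counts simulated calls to the cache

    def get(self, key: str) -> Optional[str]:
        self.round_trips += 1
        return self._data.get(key)


class BulkClient(LoopingClient):
    """Backend with an MGET-style bulk command; overrides the default."""

    def _mget(self, keys: List[str]) -> List[Optional[str]]:
        # One simulated round trip; values are returned in key order.
        self.round_trips += 1
        return [self._data.get(k) for k in keys]

    def get_all(self, keys: Iterable[str]) -> Dict[str, Optional[str]]:
        key_list = list(keys)
        return dict(zip(key_list, self._mget(key_list)))


data = {"a": "1", "b": "2"}
looping, bulk = LoopingClient(data), BulkClient(data)
looping_result = looping.get_all(["a", "b", "c"])
bulk_result = bulk.get_all(["a", "b", "c"])
print(looping.round_trips, bulk.round_trips)
```

Both clients return the same mapping; only the number of simulated round trips differs, which is the point of letting a backend override the default.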

On Mon, Mar 7, 2022 at 7:18 AM <st...@bt.com> wrote:
>
> Redis does allow multiple gets in one hit with MGET. If you request all of the keys at once, the response is a list of values in the same order as the keys, with null wherever a key has no match.
>
>
>
> Steve Hindmarch

RE: Records - Best Approach to Enrich Record From Cache

Posted by st...@bt.com.
Redis does allow multiple gets in one hit with MGET. If you request all of the keys at once, the response is a list of values in the same order as the keys, with null wherever a key has no match.

Steve Hindmarch
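
A small sketch of how the MGET response shape described above could be used to enrich one record with a single bulk read. It is an illustration only: `enrich_record`, the key layout `prefix:field`, and the in-memory `fake_mget` stand-in are assumptions, not anything from NiFi or from the thread. The stand-in mimics the behaviour described: values come back in key order, with None where a key has no match.

```python
from typing import Callable, Dict, List, Optional


def enrich_record(
    record: Dict[str, object],
    key_prefix: str,
    fields: List[str],
    mget: Callable[[List[str]], List[Optional[str]]],
) -> Dict[str, object]:
    """Return a copy of record with cached fields merged in via one bulk read."""
    keys = [f"{key_prefix}:{field}" for field in fields]
    values = mget(keys)  # one call; values are in the same order as keys
    enriched = dict(record)
    for field, value in zip(fields, values):
        if value is not None:  # None marks a key with no match
            enriched[field] = value
    return enriched


# In-memory stand-in for the cache so the sketch runs without a server.
FAKE_CACHE = {"cust:42:city": "Leeds", "cust:42:tier": "gold"}


def fake_mget(keys: List[str]) -> List[Optional[str]]:
    return [FAKE_CACHE.get(k) for k in keys]


result = enrich_record({"id": 42}, "cust:42", ["city", "tier", "region"], fake_mget)
print(result)
```

With a real client library, a bulk-read method that takes a list of keys and returns values in key order could presumably be passed in place of `fake_mget`; that pairing is an assumption and was not tested here.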
