Posted to issues@nifi.apache.org by "Chris Norton (Jira)" <ji...@apache.org> on 2022/08/22 00:53:00 UTC

[jira] [Commented] (NIFI-10192) LookupRecord attempts first lookup multiple times

    [ https://issues.apache.org/jira/browse/NIFI-10192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17582674#comment-17582674 ] 

Chris Norton commented on NIFI-10192:
-------------------------------------

A PR has been created that adds simple caching as a workaround: https://github.com/apache/nifi/pull/6243

> LookupRecord attempts first lookup multiple times
> -------------------------------------------------
>
>                 Key: NIFI-10192
>                 URL: https://issues.apache.org/jira/browse/NIFI-10192
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 1.16.1, 1.16.2, 1.16.3
>            Reporter: Chris Norton
>            Priority: Minor
>
> A change implemented in [NIFI-9903|https://issues.apache.org/jira/browse/NIFI-9903] results in *lookupService.lookup()* being called twice per record ([at least until the first match|https://github.com/apache/nifi/blob/rel/nifi-1.16.1/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/LookupRecord.java#L350]). For idempotent lookup services (CSVRecordLookupService, DistributedMapCacheLookupService, PropertiesFileLookupService), performing a lookup twice does not affect the result or cause undesired side effects. However, the RestLookupService can make arbitrary HTTP requests using the standard HTTP methods (GET, POST, PUT, DELETE), and there is no guarantee that these requests will be idempotent. POST requests in particular are not expected to be idempotent and may cause undesirable behaviour if invoked multiple times (as in our case).
> As the name suggests, LookupRecord might be expected to perform only lookups that are idempotent and free of side effects. However, [Matt Burgess wrote an article|http://funnifi.blogspot.com/2018/08/database-sequence-lookup-with.html] in which the expected behaviour appears to be that *lookupService.lookup()* is called only once. Now that it is called twice, that approach would cause IDs to be skipped.
> Mark Payne suggested in a Slack discussion that lookup results could be cached up until the first match, which may alleviate the issues we are seeing.
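
The caching idea described above can be illustrated with a minimal Java sketch. It is not the code from the linked PR and does not use the NiFi API. It assumes a simplified two-pass model of the processor: a first pass that performs lookups until the first match, followed by a second pass over all records. The names CachedLookupSketch, CountingService and enrich are hypothetical, and the counting service only stands in for a non-idempotent backend such as a database sequence or a POST endpoint.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

public class CachedLookupSketch {

    /** Hypothetical non-idempotent service: every matching call consumes a new ID. */
    static class CountingService implements Function<String, Optional<String>> {
        int calls = 0;

        @Override
        public Optional<String> apply(String key) {
            calls++;
            // Keys starting with "miss" find nothing; all others return an ID
            // derived from the call counter, so repeated calls skip IDs.
            return key.startsWith("miss") ? Optional.empty() : Optional.of("id-" + calls);
        }
    }

    /** Assumed two-pass model: pass 1 runs until the first match, pass 2 handles every record. */
    static List<String> enrich(List<String> keys,
                               Function<String, Optional<String>> service,
                               boolean useCache) {
        // Cache keyed by record index, so records with equal keys are still looked up separately.
        Map<Integer, Optional<String>> cache = new HashMap<>();

        // Pass 1: look up records in order and stop at the first match.
        for (int i = 0; i < keys.size(); i++) {
            Optional<String> result = service.apply(keys.get(i));
            if (useCache) {
                cache.put(i, result);
            }
            if (result.isPresent()) {
                break;
            }
        }

        // Pass 2: reuse any result cached in pass 1, otherwise call the service.
        List<String> out = new ArrayList<>();
        for (int i = 0; i < keys.size(); i++) {
            Optional<String> cached = cache.get(i);
            Optional<String> result = (cached != null) ? cached : service.apply(keys.get(i));
            out.add(keys.get(i) + "=" + result.orElse("<none>"));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> keys = List.of("miss-1", "a", "b");

        CountingService uncached = new CountingService();
        enrich(keys, uncached, false);
        System.out.println("calls without cache: " + uncached.calls);

        CountingService cached = new CountingService();
        List<String> out = enrich(keys, cached, true);
        System.out.println("calls with cache: " + cached.calls);
        System.out.println(out);
    }
}
```

In this model the uncached run calls the service five times for three records, while the cached run calls it once per record. The cache is keyed by record position rather than by lookup key so that two records with the same key still trigger separate lookups, which matters for sequence-style services.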
