Posted to users@nifi.apache.org by Dirk Arends <di...@fontis.com.au> on 2022/06/06 02:02:35 UTC

LookupRecord executing lookup for first record twice

Hi,

A change implemented in NIFI-9903[1] results in
`lookupService.lookup()` being called twice per record (at least until the
first match[2]). For lookup services that are idempotent
(CSVRecordLookupService, DistributedMapCacheLookupService,
PropertiesFileLookupService), performing each lookup twice won’t affect the
result or have undesired side effects. However, the RestLookupService can
make arbitrary HTTP requests using the standard HTTP methods (GET, POST, PUT,
DELETE), and there is no guarantee that these requests will be idempotent.
POST requests in particular are not expected to be idempotent.

As the name suggests, LookupRecord could be expected to be used only for
lookups which are idempotent and have no side effects. However, Matt
Burgess wrote this article[3] a little while back, in which the expected
behaviour seems to be that `lookupService.lookup()` is called only once
per record. With the change in behaviour, the lookup is called twice, which
would cause IDs to be skipped in that use case, and that isn’t desirable.
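
To illustrate the skipped-ID effect, here is a minimal, self-contained sketch.
It does not use the NiFi API: `SequenceLookup`, `assignIds` and the
`probeFirstRecord` flag are hypothetical names I made up, and the sketch
assumes the extra call happens once, on the first record, with its result
discarded.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicLong;

public class DoubleLookupDemo {

    /** Hypothetical stand-in for a non-idempotent lookup (e.g. a sequence or a POST). */
    static class SequenceLookup {
        private final AtomicLong next = new AtomicLong(1);

        Optional<Long> lookup(String key) {
            // Every call has a side effect: it consumes one ID.
            return Optional.of(next.getAndIncrement());
        }
    }

    /**
     * Assigns one ID per record. When probeFirstRecord is true, an extra
     * lookup is made for the first record and its result is thrown away
     * (an assumption used here to mimic the double call).
     */
    static List<Long> assignIds(List<String> records, boolean probeFirstRecord) {
        SequenceLookup service = new SequenceLookup();
        List<Long> ids = new ArrayList<>();
        for (int i = 0; i < records.size(); i++) {
            if (probeFirstRecord && i == 0) {
                service.lookup(records.get(i)); // result discarded, side effect remains
            }
            ids.add(service.lookup(records.get(i)).orElseThrow());
        }
        return ids;
    }

    public static void main(String[] args) {
        List<String> records = List.of("a", "b", "c");
        System.out.println(assignIds(records, false)); // prints [1, 2, 3]
        System.out.println(assignIds(records, true));  // prints [2, 3, 4]
    }
}
```

With a single call per record the IDs start at 1; with the extra call on the
first record, ID 1 is consumed and never assigned to any record.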

Unfortunately, I currently make heavy use of LookupRecord with a
RestLookupService (relying on the behaviour before NIFI-9903), and a few of
these are POST requests that are non-idempotent by nature. As the rest
of my flow is record-based, I need a record-based solution rather
than splitting and using something like InvokeHTTP. A new
InvokeHTTPRecord[4] processor was proposed in 2020, but there has not been
much further discussion.

Could NiFi community members comment on the expected usage of the
LookupRecord processor and RestLookupService? Am I using them in a way
that was never intended? Can anyone suggest another approach I could use to
make non-idempotent HTTP requests per record, or is splitting records into
flowfiles and using InvokeHTTP really the only other option?

Regards,

Dirk Arends

[1] https://issues.apache.org/jira/browse/NIFI-9903

[2] https://github.com/apache/nifi/blob/rel/nifi-1.16.1/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/LookupRecord.java

Lines:

552: lookupValueOption = lookupService.lookup(lookupCoordinates,
flowFile.getAttributes());

636: final Optional<?> lookupResult =
lookupService.lookup(lookupCoordinates, flowFileAttributes);

[3] http://funnifi.blogspot.com/2018/08/database-sequence-lookup-with.html

[4] https://issues.apache.org/jira/browse/NIFI-7505