Posted to commits@nifi.apache.org by "Joseph Witt (JIRA)" <ji...@apache.org> on 2015/01/03 08:16:34 UTC

[jira] [Commented] (NIFI-190) HoldFile processor

    [ https://issues.apache.org/jira/browse/NIFI-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263452#comment-14263452 ] 

Joseph Witt commented on NIFI-190:
----------------------------------

Hello Joe

Mark and I discussed this in detail the other day, so I wanted to recap it here.  Having a processor operate in this manner might make for an awkward user experience, because it forces the flow to converge visually on a single point.  That seems undesirable, since the whole point of this processor is to allow a wait/notify pattern in which one part of the flow sits waiting for a notification triggered by something that happened elsewhere.

So an alternative that could be elegant is to have a 'Wait' processor and a 'Notify' processor.  Each would be configured to talk to the same controller service, 'WaitNotifyControllerService'.  That service would just hold a basic map-type construct that the two processors could each interrogate as appropriate.  This approach is annoying in that it requires a controller service.  However, that service could be reused by other processors that also want a wait/notify pattern, and we intend to make instantiation and configuration of controller services a runtime operation, as it is for processors.
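
To make the idea concrete, here is a minimal sketch in plain Java of how the shared map might behave.  It does not use the NiFi API; the class and method names (WaitNotifySketch, SignalStore, notifySignal, pollSignal, routeWaiting) are hypothetical stand-ins, and the choice to have a signal consumed by the first waiter that sees it is an assumption, not something we have settled.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch only; none of these names exist in NiFi.
final class WaitNotifySketch {

    /** Stand-in for the proposed 'WaitNotifyControllerService': a shared map of signals. */
    static final class SignalStore {
        // signal id -> attributes carried by the notifying flow file
        private final Map<String, Map<String, String>> signals = new ConcurrentHashMap<>();

        /** Called by the 'Notify' side: record that the signal for this id has arrived. */
        void notifySignal(String id, Map<String, String> attributes) {
            signals.put(id, Collections.unmodifiableMap(new HashMap<>(attributes)));
        }

        /** Called by the 'Wait' side: atomically consume the signal, or return null if absent. */
        Map<String, String> pollSignal(String id) {
            return signals.remove(id);
        }
    }

    /** Stand-in for the 'Wait' processor's routing decision for one flow file. */
    static String routeWaiting(SignalStore store, String id) {
        // Assumption: a signal releases exactly one waiter and is then gone.
        return store.pollSignal(id) != null ? "released" : "wait";
    }
}
```

With something like this, the 'Wait' and 'Notify' processors never need to be connected on the graph; they only need to reference the same service instance.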

Another approach is to make it possible for processors to have multiple input ports (as process groups do).  We've resisted this, though, because of concerns about how much burden it places on the user/UX.  This is actually one seemingly big departure for us from classical FBP.

But there is a catch that plagues both approaches.  The framework will swap flow files out if the backlog on the queue reaches a certain size.  If it were to swap out something that was awaiting notification, or the signal itself, things would get funky.  So we'll need to come up with a way to allow a processor to scan through all flow files in its queue, even those that are swapped out.  For the likely use cases of this sort of wait/notify, swapping presumably shouldn't kick in anyway, but it is important to keep in mind.

What do you think?

> HoldFile processor
> ------------------
>
>                 Key: NIFI-190
>                 URL: https://issues.apache.org/jira/browse/NIFI-190
>             Project: Apache NiFi
>          Issue Type: New Feature
>          Components: Core Framework, Extensions
>            Reporter: Joseph Gresock
>            Priority: Minor
>         Attachments: HoldFile_example.xml
>
>
> Our team has developed a processor for the following use case:
> * Format A needs to be sent to Endpoint A
> * Format B needs to be sent to Endpoint B, but should not proceed until A has reached Endpoint A.  We most commonly have this restriction when Endpoint B requires some output of Endpoint A.
> The proposed HoldFile processor takes 2 types of flow files as input:
> * Files to be held
> * Signal files that can release corresponding held files, based on the value of a configurable "release" attribute
> Signal files are distinguished from held files by the presence of the "flow.file.release.value" attribute.  The processor is configured with a "Release Signal Attribute".  Held files with this attribute whose value matches a received signal value will be released.
> An example:
> HoldFile is configured with Release Signal Attribute = "myId".  Its 'Hold' relationship routes back onto itself.
> 1. flowFile 1 { myId : "123" } enters HoldFile.  It is routed to the 'Hold' relationship.
> 2. flowFile 2 { flow.file.release.value : "123" } enters HoldFile.  flowFile 1 is then routed to 'Release', and flowFile 2 is removed from the session.
> Signal flow files will also copy their attributes to matching held files, unless otherwise indicated.  This is what allows the output of Endpoint A to pass to Endpoint B, above.
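
For reference while discussing, the hold/release matching described in the issue can be sketched in plain Java roughly as follows.  This is not the attached processor's code and does not use the NiFi API: flow files are modeled as attribute maps, the 'Hold' loop is replaced by an in-memory list, and the names HoldFileSketch and onFlowFile are hypothetical.  Two details are assumptions on my part: the signal's attributes overwrite same-named attributes on the held file, and the "flow.file.release.value" attribute itself is not copied.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only; flow files are modeled as plain attribute maps.
final class HoldFileSketch {
    static final String RELEASE_VALUE_ATTR = "flow.file.release.value";

    private final String releaseSignalAttribute;                    // e.g. "myId"
    private final List<Map<String, String>> held = new ArrayList<>();

    HoldFileSketch(String releaseSignalAttribute) {
        this.releaseSignalAttribute = releaseSignalAttribute;
    }

    /** Processes one incoming flow file and returns the held files it releases (possibly none). */
    List<Map<String, String>> onFlowFile(Map<String, String> attributes) {
        List<Map<String, String>> released = new ArrayList<>();
        String signalValue = attributes.get(RELEASE_VALUE_ATTR);
        if (signalValue == null) {
            // No release value: this is a file to be held.
            held.add(new HashMap<>(attributes));
            return released;
        }
        // Signal file: release every held file whose configured attribute matches.
        Iterator<Map<String, String>> it = held.iterator();
        while (it.hasNext()) {
            Map<String, String> heldFile = it.next();
            if (signalValue.equals(heldFile.get(releaseSignalAttribute))) {
                it.remove();
                for (Map.Entry<String, String> e : attributes.entrySet()) {
                    // Assumption: skip the release value so the held file is not mistaken for a signal.
                    if (!RELEASE_VALUE_ATTR.equals(e.getKey())) {
                        heldFile.put(e.getKey(), e.getValue());
                    }
                }
                released.add(heldFile);
            }
        }
        // The signal file itself is dropped, mirroring "removed from the session".
        return released;
    }
}
```

Note that this sketch keeps held files in memory, which sidesteps the swapping concern raised above rather than solving it.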



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)