Posted to users@nifi.apache.org by Ryan Hendrickson <ry...@gmail.com> on 2021/05/11 21:16:33 UTC

Input/Output Ports backed by S3 Buckets for data overflows

Hi NiFi Users,
   I've got a scenario where we've set back pressure across a set of
critical-path processor relationships, which can "lock up" the server and
stop it from ingesting additional data.

   In the mini diagram below, Server 2 has back pressure fully applied
(marked "bp").  Once Server 2's back-pressure thresholds are reached all
the way up to its Input Port, Server 1 stops sending data.  This happens
to us when data is surging/peaking.

Server 1 [Input Port ---> Processor A ---> Processor B ---> Processor C
---> Output Port]
Server 2 [Input Port --bp--> Processor A --bp--> Processor B --bp-->
Processor C --bp--> Output Port]

   It would be really nice if an Output Port, when it is unable to send
data to its destination Input Port, could send data to an S3 Bucket
instead.  The Input Port would then read data from the S3 Bucket until it
caught up, and then go back to receiving data directly from the Output Port.
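
To make the idea concrete, here is a rough Python sketch of the behavior I
have in mind.  It is only a toy model, not NiFi code: the OverflowLink class
and its names are made up for illustration, and an in-memory deque stands in
for the S3 Bucket.  One assumption I've baked in is that the sender keeps
spilling while anything is still sitting in the overflow, so ordering is
preserved until the receiver has caught up.

```python
from collections import deque


class OverflowLink:
    """Toy model of an Output Port -> Input Port link with a spill store.

    `overflow` is a hypothetical stand-in for the S3 Bucket; nothing here
    talks to NiFi or AWS.
    """

    def __init__(self, queue_limit):
        self.queue_limit = queue_limit  # back-pressure threshold on the receiver
        self.queue = deque()            # receiver's inbound connection queue
        self.overflow = deque()         # stand-in for the S3 Bucket

    def send(self, item):
        """Sender side: deliver directly, or spill when back pressure is on."""
        # Keep spilling while the overflow is non-empty so order is preserved.
        if self.overflow or len(self.queue) >= self.queue_limit:
            self.overflow.append(item)
            return "spilled"
        self.queue.append(item)
        return "direct"

    def receive(self):
        """Receiver side: take one item, then refill from the overflow."""
        item = self.queue.popleft() if self.queue else None
        # Catch up from the overflow store before direct sends can resume.
        while self.overflow and len(self.queue) < self.queue_limit:
            self.queue.append(self.overflow.popleft())
        return item


link = OverflowLink(queue_limit=2)
results = [link.send(n) for n in range(5)]    # a surge of five items
drained = [link.receive() for _ in range(5)]  # receiver works through them
after = link.send(99)                         # caught up, so direct again
```

In this model the first two items go direct, the next three spill, the
receiver still sees everything in order, and once the overflow is empty the
next send goes direct again.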

Are there existing best practices that help with the mechanics of setting
something like this up?  For example, pairing the S3 Buckets with
PutS3Object/FetchS3Object, etc.?

Is there a better way to do this (without just adding more servers)?

Thanks,
Ryan