Posted to issues@nifi.apache.org by "Oleksandr Lobunets (JIRA)" <ji...@apache.org> on 2018/03/07 20:07:00 UTC

[jira] [Created] (NIFI-4943) Batch Duration capability from ExecuteProcess added to ExecuteStreamCommand

Oleksandr Lobunets created NIFI-4943:
----------------------------------------

             Summary: Batch Duration capability from ExecuteProcess added to ExecuteStreamCommand
                 Key: NIFI-4943
                 URL: https://issues.apache.org/jira/browse/NIFI-4943
             Project: Apache NiFi
          Issue Type: Improvement
    Affects Versions: 1.5.0
            Reporter: Oleksandr Lobunets


It would be great if the ExecuteStreamCommand processor could emit one FlowFile per chunk of stdout, split on a given separator (common case: one FlowFile for each line of stdout).

I have a case of running a third-party CLI (Linux) with the following behaviour:
- Should be executed upon a FlowFile whose attributes/content contain the parameters for the CLI
- Accepts params via flags or environment variables
- Writes output to stdout as a stream of JSON objects
- The output might be huge (millions and millions of objects), which means buffering all of stdout is not an option - each line/object should be sent as a separate FlowFile
- The errors/log output is written to stderr (might be very chatty)

Using ExecuteProcess is not an option (it cannot be triggered by an incoming FlowFile), but the way it treats stdout is what is desired.
Using ExecuteStreamCommand is not an option either, as it buffers the output until the binary exits with status code 0.
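
To illustrate the requested behaviour outside of NiFi, here is a minimal Python sketch. It is not NiFi code and does not use any NiFi API; the names stream_lines, extra_env and stderr_tail are made up for this example. It starts a command with parameters passed as environment variables, hands each stdout line to the caller as soon as it arrives (the point where a processor would emit a FlowFile), and drains a chatty stderr in the background so the child never blocks on a full pipe.

```python
import os
import subprocess
import sys
import threading
from collections import deque


def stream_lines(command, extra_env=None, stderr_tail=100):
    """Yield each stdout line of `command` as soon as it is produced.

    Hypothetical helper for illustration only. `extra_env` holds
    parameters passed to the CLI as environment variables, and
    `stderr_tail` is how many stderr lines are kept for error reporting.
    """
    env = dict(os.environ, **(extra_env or {}))
    proc = subprocess.Popen(
        command,
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
        bufsize=1,  # line-buffered reads on our side of the pipe
    )

    # Keep only the last few stderr lines; the rest is discarded so that
    # a very chatty child cannot exhaust memory or stall on a full pipe.
    tail = deque(maxlen=stderr_tail)

    def drain_stderr():
        for line in proc.stderr:
            tail.append(line)

    drainer = threading.Thread(target=drain_stderr, daemon=True)
    drainer.start()

    try:
        for line in proc.stdout:
            # One item per line: this is where one FlowFile would be sent.
            yield line.rstrip("\n")
    finally:
        proc.stdout.close()
        code = proc.wait()
        drainer.join()
        proc.stderr.close()

    if code != 0:
        raise subprocess.CalledProcessError(code, command, stderr="".join(tail))


if __name__ == "__main__":
    # Stand-in for the third-party CLI: prints three JSON objects.
    demo = "for i in range(3): print('{\"n\": %d}' % i)"
    for obj in stream_lines([sys.executable, "-c", demo]):
        print("would emit FlowFile:", obj)
```

Because the function is a generator, memory use stays bounded by the size of a single line rather than the size of the whole output, which is the property ExecuteProcess's Batch Duration handling offers and ExecuteStreamCommand currently lacks.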

 


