Posted to dev@nifi.apache.org by Rick Braddy <rb...@softnas.com> on 2015/09/02 07:03:40 UTC

ExecuteProcess and stdout flush?

Hi,

I have a slightly modified version of ExecuteProcess that's been customized to run various "find targetdir -print" style commands, whose standard output becomes a FlowFile.  Large outputs (long directory listings of 100 lines or more) work perfectly; however, brief outputs from find to stdout (e.g., 8 to 10 lines) do not appear to be picked up at all.  In the debugger I can see that the find command is properly formed, but after it runs the FlowFile is zero bytes in length.

I suspect this is some kind of issue with stdout not being flushed sufficiently by "find" to be picked up as input, but I'm not sure yet.  The process flow is a bit of a mystery to me, so I'm wondering if anyone might shed some light on how to troubleshoot or resolve this (it most likely happens with the standard ExecuteProcess processor too, but I haven't tried to reproduce it there yet).

Rick

Re: ExecuteProcess and stdout flush?

Posted by Joe Witt <jo...@gmail.com>.
Got ya.  We've talked about switching from 'GetFile' to the pattern
of 'ListFile' and 'FetchFile'.  With such an approach perhaps we can
offer you a cleaner path for implementation.

On Wed, Sep 2, 2015 at 9:49 PM, Rick Braddy <rb...@softnas.com> wrote:
> Joe,
>
> Appreciate the guidance.  Yeah, the standard processor is close but not exactly what we needed, as we need it to find all files the first time around, then only modified files from that point forward, along with some other flexibilities (and preferred to avoid scripting, so the processor is self-contained).
>
> What's odd is that I haven't changed the I/O aspect of the original ExecuteProcess Java code, just some properties and how commands get structured.  I suspect it has something to do with the timing associated with that "Batch" property, which seems a bit touchy from a timing standpoint... will track it down.
>
> Thanks
> Rick

RE: ExecuteProcess and stdout flush?

Posted by Rick Braddy <rb...@softnas.com>.
Joe,

Appreciate the guidance.  Yeah, the standard processor is close but not exactly what we needed, as we need it to find all files the first time around, then only modified files from that point forward, along with some other flexibilities (and preferred to avoid scripting, so the processor is self-contained).
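The "all files the first time, then only modified files" behavior can be sketched roughly as below. This is a minimal illustration only, not the customized processor's code: the class name IncrementalLister and the method listNewOrModified are hypothetical, the state is kept in memory (a real processor would presumably need to persist it across restarts), and it walks the directory from Java rather than calling find.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class IncrementalLister {
    // Newest modification time seen so far; Long.MIN_VALUE means nothing has been listed yet,
    // so the first call returns every regular file under the root.
    private long lastSeenMillis = Long.MIN_VALUE;

    /** Returns regular files under root modified after the newest file seen on earlier calls. */
    public List<Path> listNewOrModified(Path root) throws IOException {
        List<Path> candidates;
        try (Stream<Path> paths = Files.walk(root)) {
            candidates = paths.filter(Files::isRegularFile).collect(Collectors.toList());
        }

        List<Path> result = new ArrayList<>();
        long newest = lastSeenMillis;
        for (Path p : candidates) {
            long modified = Files.getLastModifiedTime(p).toMillis();
            if (modified > lastSeenMillis) {
                result.add(p);
                newest = Math.max(newest, modified);
            }
        }
        // Advance the watermark only after the whole pass has completed.
        lastSeenMillis = newest;
        return result;
    }

    public static void main(String[] args) throws IOException {
        IncrementalLister lister = new IncrementalLister();
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println("First pass:  " + lister.listNewOrModified(root).size() + " file(s)");
        System.out.println("Second pass: " + lister.listNewOrModified(root).size() + " file(s)");
    }
}
```

One known limitation of this sketch: a file modified within the same timestamp tick as the current watermark is skipped on the next pass, so the comparison may need adjusting on filesystems with coarse timestamps.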

What's odd is that I haven't changed the I/O aspect of the original ExecuteProcess Java code, just some properties and how commands get structured.  I suspect it has something to do with the timing associated with that "Batch" property, which seems a bit touchy from a timing standpoint... will track it down.

Thanks
Rick

-----Original Message-----
From: Joe Witt [mailto:joe.witt@gmail.com] 
Sent: Wednesday, September 02, 2015 7:24 PM
To: dev@nifi.apache.org
Subject: Re: ExecuteProcess and stdout flush?

Rick,

That function, calling a process from Java and reliably interacting with the streams, was a surprisingly tricky thing to get right.  One of the key things is to make sure you're always fully consuming the stream and such.  Pay close attention to the trickery in the standard processor.

Now instead of modifying the processor in this manner you may consider simply executing a script instead and invoking that.  It may give you more control and more natural control.  Not trying to discourage you from building a NiFi processor by any means but in this case you're sort of on the 'edge of java and system specific commands'.  Tricky road there.

Thanks
Joe


Re: ExecuteProcess and stdout flush?

Posted by Joe Witt <jo...@gmail.com>.
Rick,

That function, calling a process from Java and reliably interacting
with the streams, was a surprisingly tricky thing to get right.  One
of the key things is to make sure you're always fully consuming the
stream and such.  Pay close attention to the trickery in the standard
processor.
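As a rough illustration of "fully consuming the stream", the sketch below reads the child's stdout until end-of-stream before checking the exit code. It is not the code from the standard ExecuteProcess processor; the class name ProcessOutputCapture and the method runAndCapture are hypothetical, and it uses only the plain java.lang.ProcessBuilder API.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

public class ProcessOutputCapture {

    /** Runs a command and returns everything it wrote to stdout (with stderr merged in). */
    public static String runAndCapture(List<String> command)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        // Merge stderr into stdout so there is no second pipe left unread.
        builder.redirectErrorStream(true);
        Process process = builder.start();

        ByteArrayOutputStream captured = new ByteArrayOutputStream();
        try (InputStream stdout = process.getInputStream()) {
            byte[] buffer = new byte[4096];
            int n;
            // read() returns -1 only once the child has closed its stdout, so a short
            // output written just before exit is still collected in full.
            while ((n = stdout.read(buffer)) != -1) {
                captured.write(buffer, 0, n);
            }
        }

        // Wait for exit only after draining the pipe; waiting first can block
        // if the child fills the pipe buffer.
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("Command exited with code " + exitCode);
        }
        return captured.toString(StandardCharsets.UTF_8.name());
    }

    public static void main(String[] args) throws Exception {
        System.out.print(runAndCapture(Arrays.asList("find", ".", "-print")));
    }
}
```

The point of the loop is that it does not depend on how much output there is or when the child flushes: the read ends at end-of-stream, not after a time window.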

Now instead of modifying the processor in this manner you may consider
simply executing a script instead and invoking that.  It may give you
more control and more natural control.  Not trying to discourage you
from building a NiFi processor by any means but in this case you're
sort of on the 'edge of java and system specific commands'.  Tricky
road there.

Thanks
Joe

On Wed, Sep 2, 2015 at 6:59 PM, Rick Braddy <rb...@softnas.com> wrote:
> Further troubleshooting today... used same "find" command via ExecuteProcess standard processor.  It works fine, so there's something wrong with my customized processor... will debug it to resolve.

RE: ExecuteProcess and stdout flush?

Posted by Rick Braddy <rb...@softnas.com>.
Further troubleshooting today... used same "find" command via ExecuteProcess standard processor.  It works fine, so there's something wrong with my customized processor... will debug it to resolve.
