Posted to users@nifi.apache.org by Nick Carenza <ni...@thecontrolgroup.com> on 2020/02/25 22:54:04 UTC

Trouble with ReplaceText processor

NiFi v1.9.2

Hey guys, I have been having issues for months with maybe the simplest
processor and I just can't figure out why.

All I want to do is append a newline. For some reason this processor keeps
getting backed up. It appears to just stop processing periodically. This is
the only processor amongst a thousand that has this issue.

I have tried it both as an Append and a Regexp Replace.

[image: image.png]

[image: image.png]

I have tried it with added concurrency and that hasn't solved the problem.

This processor will process a bunch of files, then stop for a time, then
start again. No logs or messages. If I stop it, it hangs on the current
task and I end up having to terminate it, or it will just hang indefinitely.
Then when I start it again it will process a bunch of files and then hang
again. So on and so forth.

I don't even know where to go from here.

Thanks,
Nick

Re: Trouble with ReplaceText processor

Posted by Nick Carenza <ni...@thecontrolgroup.com>.
Thanks Joe. Out of curiosity and concern, how can I figure out what's going
on with ReplaceText? The behavior is unlike anything I've seen in NiFi
before. I have moved the processor to different places in the flow, created
new instances of it, copied it, replaced it, and used both the Regex and
the Append versions of it, and it always has the same issue.

On Wed, Feb 26, 2020 at 10:37 AM Joe Witt <jo...@gmail.com> wrote:

> Nick, yeah, you definitely can if your downstream readers of Kinesis are
> happy to get messages that way. That will certainly perform better.
>
> But again, if you're splitting ahead of the ReplaceText/ControlRate, there
> is a much better way: Record processors.

Re: Trouble with ReplaceText processor

Posted by Joe Witt <jo...@gmail.com>.
Nick, yeah, you definitely can if your downstream readers of Kinesis are
happy to get messages that way. That will certainly perform better.

But again, if you're splitting ahead of the ReplaceText/ControlRate, there is
a much better way: Record processors.

On Wed, Feb 26, 2020 at 10:29 AM Nick Carenza <
nick.carenza@thecontrolgroup.com> wrote:

> Hey Shawn, ControlRate is intentionally backed up to enforce the API
> Limits of Kinesis Firehose. There are 800k flowfiles in the queue above the
> ReplaceText processor with plenty of room in the downstream queue.
>
> Hey Joe, maybe I can use MergeContent and then configure
> PutKinesisFirehose to just push 1 file.

Re: Trouble with ReplaceText processor

Posted by Nick Carenza <ni...@thecontrolgroup.com>.
Hey Shawn, ControlRate is intentionally backed up to enforce the API Limits
of Kinesis Firehose. There are 800k flowfiles in the queue above the
ReplaceText processor with plenty of room in the downstream queue.

Hey Joe, maybe I can use MergeContent and then configure PutKinesisFirehose
to just push 1 file.



On Tue, Feb 25, 2020 at 3:42 PM Shawn Weeks <sw...@weeksconsulting.us>
wrote:

> In your screenshot, ReplaceText is not where the flow is backed up; it’s
> backed up at your ControlRate processor. What’s your max timer driven
> thread count under Controller Settings?
>
> Thanks
>
> Shawn

Re: Trouble with ReplaceText processor

Posted by Shawn Weeks <sw...@weeksconsulting.us>.
In your screenshot, ReplaceText is not where the flow is backed up; it’s backed up at your ControlRate processor. What’s your max timer driven thread count under Controller Settings?

Thanks
Shawn

From: Nick Carenza <ni...@thecontrolgroup.com>
Reply-To: "users@nifi.apache.org" <us...@nifi.apache.org>
Date: Tuesday, February 25, 2020 at 5:35 PM
To: "users@nifi.apache.org" <us...@nifi.apache.org>
Subject: Re: Trouble with ReplaceText processor

Hey Joe,

Each file contains a single line.
Because this processor has caused so many backups of the flow, I have a large queue in front of it to prevent the upstream ListenHTTP processor from rejecting new messages.
Downstream from this processor I emit them individually to AWS Kinesis Firehose.
I suppose I could merge them and write 1 large file to Kinesis. That might solve my other problem of the PutKinesisFirehose processor not obeying the size constraints.


Re: Trouble with ReplaceText processor

Posted by Joe Witt <jo...@gmail.com>.
Nick,

What is ahead of ReplaceText?

And...then seeing that PutKinesisFirehose operates on a single flowfile at
a time - oye.  What we need is PutKinesisFirehose to take a demarcator so
it could take a big batch of lines and send each line as its own message.
Or make a PutKinesisFirehoseRecord...

That will be dramatically faster.  Your flow would be dramatically faster
with the record processors and the above improvements.
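
To make the demarcator idea concrete, here is a minimal Python sketch. It is
not NiFi code and not an existing processor. It splits one merged payload on
a demarcator and packs the resulting records into batches that stay under
size limits. The names split_records and pack_batches are hypothetical, and
the default limits (500 records and 4 MiB per batch, 1,000 KiB per record)
are assumptions modeled on the published Firehose PutRecordBatch quotas;
check them against current AWS documentation before relying on them.

```python
from typing import Iterator, List

# Assumed limits, modeled on the published PutRecordBatch quotas.
# Verify against current AWS documentation before relying on them.
MAX_RECORDS_PER_BATCH = 500
MAX_BATCH_BYTES = 4 * 1024 * 1024
MAX_RECORD_BYTES = 1000 * 1024


def split_records(payload: bytes, demarcator: bytes = b"\n") -> List[bytes]:
    """Split a merged payload into non-empty records, each ending in the demarcator."""
    return [part + demarcator for part in payload.split(demarcator) if part]


def pack_batches(records: List[bytes],
                 max_records: int = MAX_RECORDS_PER_BATCH,
                 max_bytes: int = MAX_BATCH_BYTES,
                 max_record_bytes: int = MAX_RECORD_BYTES) -> Iterator[List[bytes]]:
    """Group records into batches that respect the count and size limits."""
    batch: List[bytes] = []
    size = 0
    for record in records:
        # A record that cannot fit in any batch is an error, not something to truncate.
        if len(record) > min(max_record_bytes, max_bytes):
            raise ValueError("record exceeds the size limit")
        # Start a new batch if adding this record would break either limit.
        if batch and (len(batch) >= max_records or size + len(record) > max_bytes):
            yield batch
            batch, size = [], 0
        batch.append(record)
        size += len(record)
    if batch:
        # Emit the final, possibly smaller, batch.
        yield batch
```

Each yielded batch would then correspond to one request to the delivery
stream, instead of one request per flowfile.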

Thanks

On Tue, Feb 25, 2020 at 3:35 PM Nick Carenza <
nick.carenza@thecontrolgroup.com> wrote:

> Hey Joe,
>
> Each file contains a single line.
> Because this processor has caused so many backups of the flow, I have a
> large queue in front of it to prevent the upstream ListenHTTP processor
> from rejecting new messages.
> Downstream from this processor I emit them individually to AWS Kinesis
> Firehose.
> I suppose I could merge them and write 1 large file to Kinesis. That might
> solve my other problem of the PutKinesisFirehose processor not obeying the
> size constraints.
>
> [image: image.png]

Re: Trouble with ReplaceText processor

Posted by Nick Carenza <ni...@thecontrolgroup.com>.
Hey Joe,

Each file contains a single line.
Because this processor has caused so many backups of the flow, I have a
large queue in front of it to prevent the upstream ListenHTTP processor
from rejecting new messages.
Downstream from this processor I emit them individually to AWS Kinesis
Firehose.
I suppose I could merge them and write 1 large file to Kinesis. That might
solve my other problem of the PutKinesisFirehose processor not obeying the
size constraints.

[image: image.png]




On Tue, Feb 25, 2020 at 3:02 PM Joe Witt <jo...@gmail.com> wrote:

> Nick,
>
> If I read this right you get flowfiles that each contain a single line.
> You want to add a newline to the end of these.  Is this right?  Do you want
> the objects to stay on their own or would you be ok with say 1000 of these
> lines being smashed together into a single resulting flowfile with newlines?
>
> What happens before and after this?  Can you show a big picture view of
> the flow with real measures and backlog revealed?
>
> Thanks

Re: Trouble with ReplaceText processor

Posted by Joe Witt <jo...@gmail.com>.
Nick,

If I read this right you get flowfiles that each contain a single line.
You want to add a newline to the end of these.  Is this right?  Do you want
the objects to stay on their own or would you be ok with say 1000 of these
lines being smashed together into a single resulting flowfile with newlines?

What happens before and after this?  Can you show a big picture view of the
flow with real measures and backlog revealed?
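
As a rough illustration of the merge described above (plain Python, not NiFi
code): the sketch below joins single-line payloads into newline-terminated
batches of up to 1000 lines. The function name merge_lines and its max_lines
parameter are hypothetical and chosen only for this example; in a NiFi flow,
a merge step configured with a newline demarcator would play this role.

```python
from typing import Iterable, Iterator, List


def merge_lines(lines: Iterable[bytes], max_lines: int = 1000) -> Iterator[bytes]:
    """Yield payloads that each hold up to max_lines newline-terminated lines."""
    if max_lines < 1:
        raise ValueError("max_lines must be at least 1")
    batch: List[bytes] = []
    for line in lines:
        # Strip any existing line ending so each line ends with exactly one "\n".
        batch.append(line.rstrip(b"\r\n") + b"\n")
        if len(batch) == max_lines:
            yield b"".join(batch)
            batch = []
    if batch:
        # Emit the final, possibly smaller, batch.
        yield b"".join(batch)
```

Merging this way replaces many per-line newline appends with one operation
per batch, which is why it removes the need for a separate append step.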

Thanks

On Tue, Feb 25, 2020 at 2:54 PM Nick Carenza <
nick.carenza@thecontrolgroup.com> wrote:

> NiFi v1.9.2
>
> Hey guys, I have been having issues for months with maybe the simplest
> processor and I just can't figure out why.
>
> All I want to do is append a newline. For some reason this processor keeps
> getting backed up. It appears to just stop processing periodically. This is
> the only processor amongst a thousand that has this issue.
>
> I have tried it both as an Append and a Regexp Replace.
>
> [image: image.png]
>
> [image: image.png]
>
> I have tried it with added concurrency and that hasn't solved the problem.
>
> This processor will process a bunch of files, then stop for a time, then
> start again. No logs or messages. If I stop it, it hangs on the current
> task and I end up having to terminate it, or it will just hang
> indefinitely. Then when I start it again it will process a bunch of files
> and then hang again. So on and so forth.
>
> I don't even know where to go from here.
>
> Thanks,
> Nick