Posted to users@nifi.apache.org by "Robert R. Bruno" <rb...@gmail.com> on 2020/04/24 20:38:09 UTC

MergeRecord performance

Has anyone else experienced performance issues with MergeRecord on the
newest version of nifi?  We have been running on nifi 1.9.2 for a while
now, and recently upgraded to nifi 1.11.4.  Once upgraded, our identical
flows were no longer able to keep up with our data, mainly at the
MergeRecord processors.

We ended up downgrading back to nifi 1.9.2.  Once we downgraded, all was
keeping up again.  There were no errors to speak of when we were running
the flow with 1.11.4.  We did see higher load on the OS, but this may have
been caused by the fact there was such a tremendous backlog built up in the
flow.

As a side note, one UpdateRecord processor produced errors when I tested
a small flow on nifi 1.11.4.  I was able to fix this by changing some
parameters in my RecordWriter.  So perhaps some underlying change in how
records are handled since 1.9.2 caused the performance issue we saw?

Any insight anyone has would be greatly appreciated, as we very much would
like to upgrade to nifi 1.11.4.  One thought was switching the MergeRecord
processors to MergeContent since I've been told MergeContent seems to
perform better, but not sure if this is actually true.  We are using the
pattern of chaining a few MergeRecord processors together to help with
performance.

Thanks in advance!

Re: MergeRecord performance

Posted by "Robert R. Bruno" <rb...@gmail.com>.

I have the back pressure object threshold set to 100000 on that queue, and
my swap threshold is 200000.  I don't think the number of flow files in the
queue in question was very high when I had the issue, though, since by then
the bottleneck was at updaterecord, after a mergecontent that had greatly
reduced the number of flow files.
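
A rough reading of these two numbers can be sketched as follows. It assumes a deliberately simplified rule, that a queue only becomes a candidate for swapping once its depth exceeds the swap threshold; the function name is hypothetical, and the NIFI-7011 changes discussed in the quoted message below may shift when swapping actually happens, so this is not a description of NiFi's internals.

```python
# Values reported in this message.
BACK_PRESSURE_OBJECT_THRESHOLD = 100_000  # set on the connection
SWAP_THRESHOLD = 200_000                  # swap threshold as configured here


def may_swap(queue_depth: int, swap_threshold: int = SWAP_THRESHOLD) -> bool:
    """Simplified assumption: a queue is only a swap candidate once its
    depth exceeds the swap threshold (hypothetical helper, not a NiFi API)."""
    return queue_depth > swap_threshold


# A queue held at its back pressure limit stays below the swap threshold
# in this simplified model.
print(may_swap(BACK_PRESSURE_OBJECT_THRESHOLD))
```

Under that assumption, heavy swapping would only be expected if the queue grew well past its back pressure limit, which is why swap-related log entries are the more reliable thing to check.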

On Mon, Jun 1, 2020, 16:02 Mark Payne <ma...@hotmail.com> wrote:

> Hey Robert,
>
> How big are the FlowFile queues that you have in front of your
> MergeContent/MergeRecord processors? Or, more specifically, what do you
> have configured for the back pressure threshold? I ask because there was a
> fix in 1.11.0 [1] that had to do with ordering when swapping and ensuring
> that data remains in the same order after being swapped out and swapped
> back in when using the FIFO prioritizer.
>
> Some of the changes there can actually change the thresholds when we
> perform swapping. So I’m curious if you’re seeing a lot of swapping of
> FlowFiles to/from disk when running in 1.11.4 that you didn’t have in
> 1.9.2. Are you seeing logs about swapping occurring? And of note, when I
> talk about swapping, I’m talking about NiFi-level FlowFile swapping, not
> OS-level swapping.
>
> Thanks
> -Mark
>
> [1] https://issues.apache.org/jira/browse/NIFI-7011
>
>
> On May 22, 2020, at 10:35 AM, Robert R. Bruno <rb...@gmail.com> wrote:
>
> Sorry one other thing I thought of that may help.  I noticed on 1.11.4
> when I would stop the updaterecord processor it would take a long period of
> time for the processor to stop (threads were hanging), but when I went back
> to 1.9.2 the processor would stop in a very timely manner.  Not sure if
> that helps, but just another data point.
>
> On Fri, May 22, 2020 at 9:22 AM Robert R. Bruno <rb...@gmail.com> wrote:
>
>> I had more updates on this.
>>
>> Yesterday I again attempted to upgrade one of our 1.9.2 clusters that is
>> now using mergecontent vs mergerecord.  The flow had been running on 1.9.2
>> for about a week with no issue.  I did the upgrade to 1.11.4, and saw about
>> 3 of 10 nodes not being able to keep up.  The load on these 3 nodes became
>> very high.  For perspective, a load of 80 is about as high as we like to
>> see these boxes, and some were getting as high as 120.  I saw one
>> bottleneck forming at an updaterecord.  I tried giving that processor a few
>> more threads to see if it would help work off the backlog.  No matter what
>> I tried (lowering threads, changing mergecontent sizes, etc.) the load
>> wouldn't go down on those 3 boxes, and they either had a slowly growing
>> backlog or maintained the backlog they had.
>>
>> I then decided to downgrade nifi back to 1.9.2 without rebooting the
>> boxes.  I kept all flow files and content as they were.  Upon downgrading
>> no loads were above 50 and this was only on the boxes that had the backlog
>> that formed when we did the upgrade.  The backlog on the 3 boxes worked off
>> with no issue at all, and without me having to make changes to the flow.
>> Once backlogs were worked off then our loads all sat around 20.
>>
>> This is similar to the behavior we saw before, just in another
>> part of the flow.  Has anyone else seen anything like this on 1.11.4?
>> Unfortunately for now we can't upgrade due to this problem.  Any thoughts
>> from anyone would be greatly appreciated.
>>
>> Thanks,
>> Robert
>>
>> On Fri, May 8, 2020 at 4:47 PM Robert R. Bruno <rb...@gmail.com> wrote:
>>
>>> Sorry for the delayed answer, but was doing some testing this week and
>>> found a few more things out.
>>>
>>> First to answer some of your questions.
>>>
>>> I would say with no actual raw numbers, it was worse than a 10%
>>> degradation.  I say this since the flow was badly backing up, and a 10%
>>> decrease in performance should not have caused this since normally we can
>>> work off a backlog of data with no issues.  I looked at my mergerecord
>>> settings, and I am largely using size as the limiting factor.  I have a max
>>> size of 4MB and a max bin age of 1 minute followed by a second mergerecord
>>> with a max size of 32MB and a max bin age of 5 minutes.
>>>
>>> I changed our flow a bit on a test system that was running 1.11.4, and
>>> discovered the following:
>>>
>>> I changed mergerecords to mergecontents.  I used pretty much all of the
>>> same settings in the mergecontent but had the mergecontent deal with the
>>> avro natively.  In this flow, it currently seems like I don't need to chain
>>> multiple mergecontents together like I did with mergerecords.
>>>
>>> I then fed the merged avro from the mergecontent to a convertrecord to
>>> convert the data to parquet.  The convertrecord was tremendously slower
>>> than the mergecontent and became a bottleneck.  I then switched the
>>> convertrecord to the convertavrotoparquet processor.  Convertavrotoparquet
>>> can easily handle the output speed of the mergecontent and then some.
>>>
>>> My hope is to make these changes to our actual flow soon, and then
>>> upgrade to 1.11.4 again.  I'll let you know how that goes.
>>>
>>> Thanks,
>>> Robert
>>>
>>> On Mon, Apr 27, 2020 at 9:26 AM Mark Payne <ma...@hotmail.com> wrote:
>>>
>>>> Robert,
>>>>
>>>> What kind of performance degradation were you seeing here? I put
>>>> together some simple flows to see if I could reproduce using 1.9.2 and
>>>> current master.
>>>> My flow consisted of GenerateFlowFile (generating 2 CSV rows per
>>>> FlowFile) -> ConvertRecord (to Avro) -> MergeRecord (read Avro, write Avro)
>>>> -> UpdateAttribute to try to mimic what you’ve got, given the details that
>>>> I have.
>>>>
>>>> I did see a performance degradation on the order of about 10%. So on my
>>>> laptop I went from processing 2.49 MM FlowFiles in 1.9.2 in 5 mins to 2.25
>>>> MM on the master branch. Interestingly, I saw no real change when I enabled
>>>> Snappy compression.
>>>>
>>>> For a point of reference, I also tried removing MergeRecord and just
>>>> Generate -> Convert -> UpdateAttribute. I saw the same roughly 10%
>>>> performance degradation.
>>>>
>>>> I’m curious if you’re seeing more than that. If so, I think a template
>>>> would be helpful to understand what’s different.
>>>>
>>>> Thanks
>>>> -Mark
>>>>
>>>>
>>>> On Apr 24, 2020, at 4:50 PM, Robert R. Bruno <rb...@gmail.com> wrote:
>>>>
>>>> Joe,
>>>>
>>>> In that part of the flow, we are using avro readers and writers.  We
>>>> are using snappy compression (which could be part of the problem).  Since
>>>> we are using avro at that point the embedded schema is being used by the
>>>> reader and the writer is using the schema name property along with an
>>>> internal schema registry in nifi.
>>>>
>>>> I can see what could potentially be shared.
>>>>
>>>> Thanks
>>>>
>>>> On Fri, Apr 24, 2020 at 4:41 PM Joe Witt <jo...@gmail.com> wrote:
>>>>
>>>>> Robert,
>>>>>
>>>>> Can you please detail the record readers and writers involved and how
>>>>> schemas are accessed?  There can be very important performance related
>>>>> changes in the parsers/serializers of the given formats.  And we've added a
>>>>> lot to make schema caching really capable but you have to opt into it.  It
>>>>> is of course possible MergeRecord itself is the culprit for performance
>>>>> reduction but let's get a fuller picture here.
>>>>>
>>>>> Are you able to share a template and sample data which we can use to
>>>>> replicate?
>>>>>
>>>>> Thanks
>>>>>
>>>>> On Fri, Apr 24, 2020 at 4:38 PM Robert R. Bruno <rb...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> I wanted to see if anyone else has experienced performance issues
>>>>>> with the newest version of nifi and MergeRecord?  We have been running on
>>>>>> nifi 1.9.2 for a while now, and recently upgraded to nifi 1.11.4.  Once
>>>>>> upgraded, our identical flows were no longer able to keep up with our data
>>>>>> mainly at MergeRecord processors.
>>>>>>
>>>>>> We ended up downgrading back to nifi 1.9.2.  Once we downgraded, all
>>>>>> was keeping up again.  There were no errors to speak of when we were
>>>>>> running the flow with 1.11.4.  We did see higher load on the OS, but this
>>>>>> may have been caused by the fact there was such a tremendous backlog built
>>>>>> up in the flow.
>>>>>>
>>>>>> Another side note, we saw one UpdateRecord processor producing errors
>>>>>> when I tested the flow with nifi 1.11.4 with a small test flow.  I was able
>>>>>> to fix this issue by changing some parameters in my RecordWriter.  So
>>>>>> perhaps some underlying ways records are being handled since 1.9.2 caused
>>>>>> the performance issue we saw?
>>>>>>
>>>>>> Any insight anyone has would be greatly appreciated, as we very much
>>>>>> would like to upgrade to nifi 1.11.4.  One thought was switching the
>>>>>> MergeRecord processors to MergeContent since I've been told MergeContent
>>>>>> seems to perform better, but not sure if this is actually true.  We are
>>>>>> using the pattern of chaining a few MergeRecord processors together to help
>>>>>> with performance.
>>>>>>
>>>>>> Thanks in advance!
>>>>>>
>>>>>
>>>>
>

Re: MergeRecord performance

Posted by Mark Payne <ma...@hotmail.com>.

Hey Robert,

How big are the FlowFile queues that you have in front of your MergeContent/MergeRecord processors? Or, more specifically, what do you have configured for the back pressure threshold? I ask because there was a fix in 1.11.0 [1] that had to do with ordering when swapping and ensuring that data remains in the same order after being swapped out and swapped back in when using the FIFO prioritizer.

Some of the changes there can actually change the thresholds when we perform swapping. So I’m curious if you’re seeing a lot of swapping of FlowFiles to/from disk when running in 1.11.4 that you didn’t have in 1.9.2. Are you seeing logs about swapping occurring? And of note, when I talk about swapping, I’m talking about NiFi-level FlowFile swapping, not OS-level swapping.

Thanks
-Mark

[1] https://issues.apache.org/jira/browse/NIFI-7011

On May 22, 2020, at 10:35 AM, Robert R. Bruno <rb...@gmail.com> wrote:

Sorry one other thing I thought of that may help. I noticed on 1.11.4 when I would stop the updaterecord processor it would take a long period of time for the processor to stop (threads were hanging), but when I went back to 1.9.2 the processor would stop in a very timely manner. Not sure if that helps, but just another data point.

On Fri, May 22, 2020 at 9:22 AM Robert R. Bruno <rb...@gmail.com> wrote:
I had more updates on this.

Yesterday I again attempted to upgrade one of our 1.9.2 clusters that is now using mergecontent vs mergerecord. The flow had been running on 1.9.2 for about a week with no issue. I did the upgrade to 1.11.4, and saw about 3 of 10 nodes not being able to keep up. The load on these 3 nodes became very high. For perspective, a load of 80 is about as high as we like to see on these boxes, and some were getting as high as 120. I saw one bottleneck forming at an updaterecord. I tried giving that processor a few more threads to see if it would help work off the backlog. No matter what I tried (lowering threads, changing mergecontent sizes, etc.) the load wouldn't go down on those 3 boxes, and they either had a slowly growing backlog or maintained the backlog they had.

I then decided to downgrade nifi back to 1.9.2 without rebooting the boxes. I kept all flow files and content as they were. Upon downgrading, no loads were above 50, and this was only on the boxes that had the backlog that formed when we did the upgrade. The backlog on the 3 boxes worked off with no issue at all, and without me having to make changes to the flow. Once the backlogs were worked off, our loads all sat around 20.

This is similar to the behavior we saw before, just in another part of the flow. Has anyone else seen anything like this on 1.11.4? Unfortunately, for now we can't upgrade due to this problem. Any thoughts from anyone would be greatly appreciated.

Thanks,
Robert

On Fri, May 8, 2020 at 4:47 PM Robert R. Bruno <rb...@gmail.com> wrote:
Sorry for the delayed answer, but was doing some testing this week and found a few more things out.

First to answer some of your questions.

I would say with no actual raw numbers, it was worse than a 10% degradation. I say this since the flow was badly backing up, and a 10% decrease in performance should not have caused this since normally we can work off a backlog of data with no issues. I looked at my mergerecord settings, and I am largely using size as the limiting factor. I have a max size of 4MB and a max bin age of 1 minute followed by a second mergerecord with a max size of 32MB and a max bin age of 5 minutes.

I changed our flow a bit on a test system that was running 1.11.4, and discovered the following:

I changed mergerecords to mergecontents. I used pretty much all of the same settings in the mergecontent but had the mergecontent deal with the avro natively. In this flow, it currently seems like I don't need to chain multiple mergecontents together like I did with mergerecords.

I then fed the merged avro from the mergecontent to a convertrecord to convert the data to parquet. The convertrecord was tremendously slower than the mergecontent and became a bottleneck. I then switched the convertrecord to the convertavrotoparquet processor. Convertavrotoparquet can easily handle the output speed of the mergecontent and then some.

My hope is to make these changes to our actual flow soon, and then upgrade to 1.11.4 again. I'll let you know how that goes.

Thanks,
Robert

On Mon, Apr 27, 2020 at 9:26 AM Mark Payne <ma...@hotmail.com> wrote:
Robert,

What kind of performance degradation were you seeing here? I put together some simple flows to see if I could reproduce using 1.9.2 and current master.
My flow consisted of GenerateFlowFile (generating 2 CSV rows per FlowFile) -> ConvertRecord (to Avro) -> MergeRecord (read Avro, write Avro) -> UpdateAttribute to try to mimic what you’ve got, given the details that I have.

I did see a performance degradation on the order of about 10%. So on my laptop I went from processing 2.49 MM FlowFiles in 1.9.2 in 5 mins to 2.25 MM on the master branch. Interestingly, I saw no real change when I enabled Snappy compression.
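
For reference, the FlowFile counts quoted above work out as follows. This is a small arithmetic sketch only; the function and variable names are illustrative and are not part of NiFi.

```python
WINDOW_SECONDS = 5 * 60  # both runs lasted 5 minutes


def percent_drop(before: float, after: float) -> float:
    """Relative throughput loss, as a percentage of the baseline."""
    return (before - after) / before * 100.0


flowfiles_1_9_2 = 2.49e6   # FlowFiles processed on 1.9.2
flowfiles_master = 2.25e6  # FlowFiles processed on the master branch

drop = percent_drop(flowfiles_1_9_2, flowfiles_master)
rate_1_9_2 = flowfiles_1_9_2 / WINDOW_SECONDS
rate_master = flowfiles_master / WINDOW_SECONDS

print(f"drop: {drop:.1f}%")
print(f"1.9.2:  {rate_1_9_2:.0f} FlowFiles/s")
print(f"master: {rate_master:.0f} FlowFiles/s")
```

That is a drop of roughly 9.6%, or about 8,300 versus 7,500 FlowFiles per second on this laptop test.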

For a point of reference, I also tried removing MergeRecord and just Generate -> Convert -> UpdateAttribute. I saw the same roughly 10% performance degradation.

I’m curious if you’re seeing more than that. If so, I think a template would be helpful to understand what’s different.

Thanks
-Mark

On Apr 24, 2020, at 4:50 PM, Robert R. Bruno <rb...@gmail.com> wrote:

Joe,

In that part of the flow, we are using avro readers and writers. We are using snappy compression (which could be part of the problem). Since we are using avro at that point the embedded schema is being used by the reader and the writer is using the schema name property along with an internal schema registry in nifi.

I can see what could potentially be shared.

Thanks

On Fri, Apr 24, 2020 at 4:41 PM Joe Witt <jo...@gmail.com> wrote:
Robert,

Can you please detail the record readers and writers involved and how schemas are accessed? There can be very important performance-related changes in the parsers/serializers of the given formats. And we've added a lot to make schema caching really capable but you have to opt into it. It is of course possible MergeRecord itself is the culprit for performance reduction but let's get a fuller picture here.

Are you able to share a template and sample data which we can use to replicate?

Thanks

On Fri, Apr 24, 2020 at 4:38 PM Robert R. Bruno <rb...@gmail.com> wrote:
I wanted to see if anyone else has experienced performance issues with the newest version of nifi and MergeRecord? We have been running on nifi 1.9.2 for a while now, and recently upgraded to nifi 1.11.4. Once upgraded, our identical flows were no longer able to keep up with our data, mainly at MergeRecord processors.

We ended up downgrading back to nifi 1.9.2. Once we downgraded, all was keeping up again. There were no errors to speak of when we were running the flow with 1.11.4. We did see higher load on the OS, but this may have been caused by the fact there was such a tremendous backlog built up in the flow.

Another side note, we saw one UpdateRecord processor producing errors when I tested the flow with nifi 1.11.4 with a small test flow. I was able to fix this issue by changing some parameters in my RecordWriter. So perhaps some underlying ways records are being handled since 1.9.2 caused the performance issue we saw?

Any insight anyone has would be greatly appreciated, as we very much would like to upgrade to nifi 1.11.4. One thought was switching the MergeRecord processors to MergeContent since I've been told MergeContent seems to perform better, but not sure if this is actually true. We are using the pattern of chaining a few MergeRecord processors together to help with performance.

Thanks in advance!

Re: MergeRecord performance

Posted by "Robert R. Bruno" <rb...@gmail.com>.

Sorry one other thing I thought of that may help.  I noticed on 1.11.4 when
I would stop the updaterecord processor it would take a long period of time
for the processor to stop (threads were hanging), but when I went back to
1.9.2 the processor would stop in a very timely manner.  Not sure if that
helps, but just another data point.

On Fri, May 22, 2020 at 9:22 AM Robert R. Bruno <rb...@gmail.com> wrote:

> I had more updates on this.
>
> Yesterday I again attempted to upgrade one of our 1.9.2 clusters that is
> now using mergecontent vs mergerecord.  The flow had been running on 1.9.2
> for about a week with no issue.  I did the upgrade to 1.11.4, and saw about
> 3 of 10 nodes not being able to keep up.  The load on these 3 nodes became
> very high.  For perspective, a load of 80 is about as high as we like to
> see these boxes, and some were getting as high as 120.  I saw one
> bottleneck forming at an updaterecord.  I tried giving that processor a few
> more threads to see if it would help work off the backlog.  No matter what
> I tried (lowering threads, changing mergecontent sizes, etc.) the load
> wouldn't go down on those 3 boxes, and they either had a slowly growing
> backlog or maintained the backlog they had.
>
> I then decided to downgrade nifi back to 1.9.2 without rebooting the
> boxes.  I kept all flow files and content as they were.  Upon downgrading
> no loads were above 50 and this was only on the boxes that had the backlog
> that formed when we did the upgrade.  The backlog on the 3 boxes worked off
> with no issue at all, and without me having to make changes to the flow.
> Once backlogs were worked off then our loads all sat around 20.
>
> This is similar to the behavior we saw before, just in another
> part of the flow.  Has anyone else seen anything like this on 1.11.4?
> Unfortunately for now we can't upgrade due to this problem.  Any thoughts
> from anyone would be greatly appreciated.
>
> Thanks,
> Robert
>
> On Fri, May 8, 2020 at 4:47 PM Robert R. Bruno <rb...@gmail.com> wrote:
>
>> Sorry for the delayed answer, but was doing some testing this week and
>> found a few more things out.
>>
>> First to answer some of your questions.
>>
>> I would say with no actual raw numbers, it was worse than a 10%
>> degradation.  I say this since the flow was badly backing up, and a 10%
>> decrease in performance should not have caused this since normally we can
>> work off a backlog of data with no issues.  I looked at my mergerecord
>> settings, and I am largely using size as the limiting factor.  I have a max
>> size of 4MB and a max bin age of 1 minute followed by a second mergerecord
>> with a max size of 32MB and a max bin age of 5 minutes.
>>
>> I changed our flow a bit on a test system that was running 1.11.4, and
>> discovered the following:
>>
>> I changed mergerecords to mergecontents.  I used pretty much all of the
>> same settings in the mergecontent but had the mergecontent deal with the
>> avro natively.  In this flow, it currently seems like I don't need to chain
>> multiple mergecontents together like I did with mergerecords.
>>
>> I then fed the merged avro from the mergecontent to a convertrecord to
>> convert the data to parquet.  The convertrecord was tremendously slower
>> than the mergecontent and became a bottleneck.  I then switched the
>> convertrecord to the convertavrotoparquet processor.  Convertavrotoparquet
>> can easily handle the output speed of the mergecontent and then some.
>>
>> My hope is to make these changes to our actual flow soon, and then
>> upgrade to 1.11.4 again.  I'll let you know how that goes.
>>
>> Thanks,
>> Robert
>>
>> On Mon, Apr 27, 2020 at 9:26 AM Mark Payne <ma...@hotmail.com> wrote:
>>
>>> Robert,
>>>
>>> What kind of performance degradation were you seeing here? I put
>>> together some simple flows to see if I could reproduce using 1.9.2 and
>>> current master.
>>> My flow consisted of GenerateFlowFile (generating 2 CSV rows per
>>> FlowFile) -> ConvertRecord (to Avro) -> MergeRecord (read Avro, write Avro)
>>> -> UpdateAttribute to try to mimic what you’ve got, given the details that
>>> I have.
>>>
>>> I did see a performance degradation on the order of about 10%. So on my
>>> laptop I went from processing 2.49 MM FlowFiles in 1.9.2 in 5 mins to 2.25
>>> MM on the master branch. Interestingly, I saw no real change when I enabled
>>> Snappy compression.
>>>
>>> For a point of reference, I also tried removing MergeRecord and just
>>> Generate -> Convert -> UpdateAttribute. I saw the same roughly 10%
>>> performance degradation.
>>>
>>> I’m curious if you’re seeing more than that. If so, I think a template
>>> would be helpful to understand what’s different.
>>>
>>> Thanks
>>> -Mark
>>>
>>>
>>> On Apr 24, 2020, at 4:50 PM, Robert R. Bruno <rb...@gmail.com> wrote:
>>>
>>> Joe,
>>>
>>> In that part of the flow, we are using avro readers and writers.  We are
>>> using snappy compression (which could be part of the problem).  Since we
>>> are using avro at that point the embedded schema is being used by the
>>> reader and the writer is using the schema name property along with an
>>> internal schema registry in nifi.
>>>
>>> I can see what could potentially be shared.
>>>
>>> Thanks
>>>
>>> On Fri, Apr 24, 2020 at 4:41 PM Joe Witt <jo...@gmail.com> wrote:
>>>
>>>> Robert,
>>>>
>>>> Can you please detail the record readers and writers involved and how
>>>> schemas are accessed?  There can be very important performance related
>>>> changes in the parsers/serializers of the given formats.  And we've added a
>>>> lot to make schema caching really capable but you have to opt into it.  It
>>>> is of course possible MergeRecord itself is the culprit for performance
>>>> reduction but let's get a fuller picture here.
>>>>
>>>> Are you able to share a template and sample data which we can use to
>>>> replicate?
>>>>
>>>> Thanks
>>>>
>>>> On Fri, Apr 24, 2020 at 4:38 PM Robert R. Bruno <rb...@gmail.com>
>>>> wrote:
>>>>
>>>>> I wanted to see if anyone else has experienced performance issues with
>>>>> the newest version of nifi and MergeRecord?  We have been running on nifi
>>>>> 1.9.2 for a while now, and recently upgraded to nifi 1.11.4.  Once upgraded,
>>>>> our identical flows were no longer able to keep up with our data mainly at
>>>>> MergeRecord processors.
>>>>>
>>>>> We ended up downgrading back to nifi 1.9.2.  Once we downgraded, all
>>>>> was keeping up again.  There were no errors to speak of when we were
>>>>> running the flow with 1.11.4.  We did see higher load on the OS, but this
>>>>> may have been caused by the fact there was such a tremendous backlog built
>>>>> up in the flow.
>>>>>
>>>>> Another side note, we saw one UpdateRecord processor producing errors
>>>>> when I tested the flow with nifi 1.11.4 with a small test flow.  I was able
>>>>> to fix this issue by changing some parameters in my RecordWriter.  So
>>>>> perhaps some underlying ways records are being handled since 1.9.2 caused
>>>>> the performance issue we saw?
>>>>>
>>>>> Any insight anyone has would be greatly appreciated, as we very much
>>>>> would like to upgrade to nifi 1.11.4.  One thought was switching the
>>>>> MergeRecord processors to MergeContent since I've been told MergeContent
>>>>> seems to perform better, but not sure if this is actually true.  We are
>>>>> using the pattern of chaining a few MergeRecord processors together to help
>>>>> with performance.
>>>>>
>>>>> Thanks in advance!
>>>>>
>>>>
>>>

Re: MergeRecord performance

Posted by "Robert R. Bruno" <rb...@gmail.com>.

I had more updates on this.

Yesterday I again attempted to upgrade one of our 1.9.2 clusters that is
now using mergecontent vs mergerecord.  The flow had been running on 1.9.2
for about a week with no issue.  I did the upgrade to 1.11.4, and saw about
3 of 10 nodes not being able to keep up.  The load on these 3 nodes became
very high.  For perspective, a load of 80 is about as high as we like to
see these boxes, and some were getting as high as 120.  I saw one
bottleneck forming at an updaterecord.  I tried giving that processor a few
more threads to see if it would help work off the backlog.  No matter what
I tried (lowering threads, changing mergecontent sizes, etc.) the load
wouldn't go down on those 3 boxes, and they either had a slowly growing
backlog or maintained the backlog they had.

I then decided to downgrade nifi back to 1.9.2 without rebooting the
boxes.  I kept all flow files and content as they were.  Upon downgrading
no loads were above 50 and this was only on the boxes that had the backlog
that formed when we did the upgrade.  The backlog on the 3 boxes worked off
with no issue at all, and without me having to make changes to the flow.
Once backlogs were worked off then our loads all sat around 20.

This is similar to the behavior we saw before, just in another
part of the flow.  Has anyone else seen anything like this on 1.11.4?
Unfortunately for now we can't upgrade due to this problem.  Any thoughts
from anyone would be greatly appreciated.

Thanks,
Robert

On Fri, May 8, 2020 at 4:47 PM Robert R. Bruno <rb...@gmail.com> wrote:

> Sorry for the delayed answer, but was doing some testing this week and
> found a few more things out.
>
> First to answer some of your questions.
>
> I would say with no actual raw numbers, it was worse than a 10%
> degradation.  I say this since the flow was badly backing up, and a 10%
> decrease in performance should not have caused this since normally we can
> work off a backlog of data with no issues.  I looked at my mergerecord
> settings, and I am largely using size as the limiting factor.  I have a max
> size of 4MB and a max bin age of 1 minute followed by a second mergerecord
> with a max size of 32MB and a max bin age of 5 minutes.
>
> I changed our flow a bit on a test system that was running 1.11.4, and
> discovered the following:
>
> I changed mergerecords to mergecontents.  I used pretty much all of the
> same settings in the mergecontent but had the mergecontent deal with the
> avro natively.  In this flow, it currently seems like I don't need to chain
> multiple mergecontents together like I did with mergerecords.
>
> I then fed the merged avro from the mergecontent to a convertrecord to
> convert the data to parquet.  The convertrecord was tremendously slower
> than the mergecontent and became a bottleneck.  I then switched the
> convertrecord to the convertavrotoparquet processor.  Convertavrotoparquet
> can easily handle the output speed of the mergecontent and then some.
>
> My hope is to make these changes to our actual flow soon, and then upgrade
> to 1.11.4 again.  I'll let you know how that goes.
>
> Thanks,
> Robert
>
> On Mon, Apr 27, 2020 at 9:26 AM Mark Payne <ma...@hotmail.com> wrote:
>
>> Robert,
>>
>> What kind of performance degradation were you seeing here? I put together
>> some simple flows to see if I could reproduce using 1.9.2 and current
>> master.
>> My flow consisted of GenerateFlowFile (generating 2 CSV rows per
>> FlowFile) -> ConvertRecord (to Avro) -> MergeRecord (read Avro, write Avro)
>> -> UpdateAttribute to try to mimic what you’ve got, given the details that
>> I have.
>>
>> I did see a performance degradation on the order of about 10%. So on my
>> laptop I went from processing 2.49 MM FlowFiles in 1.9.2 in 5 mins to 2.25
>> MM on the master branch. Interestingly, I saw no real change when I enabled
>> Snappy compression.
>>
>> For a point of reference, I also tried removing MergeRecord and just
>> Generate -> Convert -> UpdateAttribute. I saw the same roughly 10%
>> performance degradation.
>>
>> I’m curious if you’re seeing more than that. If so, I think a template
>> would be helpful to understand what’s different.
>>
>> Thanks
>> -Mark
>>
>>
>> On Apr 24, 2020, at 4:50 PM, Robert R. Bruno <rb...@gmail.com> wrote:
>>
>> Joe,
>>
>> In that part of the flow, we are using avro readers and writers.  We are
>> using snappy compression (which could be part of the problem).  Since we
>> are using avro at that point the embedded schema is being used by the
>> reader and the writer is using the schema name property along with an
>> internal schema registry in nifi.
>>
>> I can see what could potentially be shared.
>>
>> Thanks
>>
>> On Fri, Apr 24, 2020 at 4:41 PM Joe Witt <jo...@gmail.com> wrote:
>>
>>> Robert,
>>>
>>> Can you please detail the record readers and writers involved and how
>>> schemas are accessed?  There can be very important performance related
>>> changes in the parsers/serializers of the given formats.  And we've added a
>>> lot to make schema caching really capable but you have to opt into it.  It
>>> is of course possible MergeRecord itself is the culprit for performance
>>> reduction but let's get a fuller picture here.
>>>
>>> Are you able to share a template and sample data which we can use to
>>> replicate?
>>>
>>> Thanks
>>>
>>> On Fri, Apr 24, 2020 at 4:38 PM Robert R. Bruno <rb...@gmail.com>
>>> wrote:
>>>
>>>> I wanted to see if anyone else has experienced performance issues with
>>>> the newest version of nifi and MergeRecord?  We have been running on nifi
>>>> 1.9.2 for a while now, and recently upgraded to nifi 1.11.4.  Once upgraded,
>>>> our identical flows were no longer able to keep up with our data mainly at
>>>> MergeRecord processors.
>>>>
>>>> We ended up downgrading back to nifi 1.9.2.  Once we downgraded, all
>>>> was keeping up again.  There were no errors to speak of when we were
>>>> running the flow with 1.11.4.  We did see higher load on the OS, but this
>>>> may have been caused by the fact there was such a tremendous backlog built
>>>> up in the flow.
>>>>
>>>> Another side note, we saw one UpdateRecord processor producing errors
>>>> when I tested the flow with nifi 1.11.4 with a small test flow.  I was able
>>>> to fix this issue by changing some parameters in my RecordWriter.  So
>>>> perhaps some underlying ways records are being handled since 1.9.2 caused
>>>> the performance issue we saw?
>>>>
>>>> Any insight anyone has would be greatly appreciated, as we very much
>>>> would like to upgrade to nifi 1.11.4.  One thought was switching the
>>>> MergeRecord processors to MergeContent since I've been told MergeContent
>>>> seems to perform better, but not sure if this is actually true.  We are
>>>> using the pattern of chaining a few MergeRecord processors together to help
>>>> with performance.
>>>>
>>>> Thanks in advance!
>>>>
>>>
>>

Re: MergeRecord performance

Posted by "Robert R. Bruno" <rb...@gmail.com>.

Sorry for the delayed answer, but I was doing some testing this week and
found a few more things out.

First, to answer some of your questions.

I don't have raw numbers, but I would say it was worse than a 10%
degradation.  I say this because the flow was backing up badly, and a 10%
decrease in performance should not have caused that, since normally we can
work off a backlog of data with no issues.  I looked at my MergeRecord
settings, and I am largely using size as the limiting factor.  The first
MergeRecord has a max size of 4MB and a max bin age of 1 minute, and it is
followed by a second MergeRecord with a max size of 32MB and a max bin age
of 5 minutes.
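
To make the two-stage setup concrete, here is a minimal sketch in Python of
size- and age-limited binning.  It is a simplified model for illustration
only, not how NiFi's MergeRecord is implemented: the function merge_bins,
the 64 KiB input size, and the arrival rate are all assumptions, and only
the 4MB / 1 minute and 32MB / 5 minute limits come from the settings
described in this message.

```python
MIB = 1024 * 1024


def merge_bins(items, max_size, max_age):
    """Toy model of a size/age-limited merge (hypothetical helper).

    items: list of (arrival_time_seconds, size_bytes), sorted by time.
    A bin is emitted when its total size reaches max_size, or when it has
    been open for max_age seconds.  Returns (emit_time, total_size) tuples.
    """
    merged = []
    opened_at = None  # time the current bin was opened, None if no bin
    size = 0
    for t, item_size in items:
        # An open bin that aged out before this arrival is emitted first.
        if opened_at is not None and t - opened_at >= max_age:
            merged.append((opened_at + max_age, size))
            opened_at, size = None, 0
        if opened_at is None:
            opened_at = t
        size += item_size
        # Size is the limiting factor: emit as soon as the bin is full.
        if size >= max_size:
            merged.append((t, size))
            opened_at, size = None, 0
    # Whatever is left over is eventually flushed by the age limit.
    if opened_at is not None:
        merged.append((opened_at + max_age, size))
    return merged


# Assumed input: 10,000 FlowFiles of 64 KiB each, one every 10 ms.
inputs = [(i * 0.01, 64 * 1024) for i in range(10_000)]

stage1 = merge_bins(inputs, max_size=4 * MIB, max_age=60)    # 4MB, 1 minute
stage2 = merge_bins(stage1, max_size=32 * MIB, max_age=300)  # 32MB, 5 minutes

print(len(inputs), "->", len(stage1), "->", len(stage2))
```

In this model each stage only ever holds a modest number of inputs per bin
(64 in the first stage, 8 in the second), which is the usual motivation for
chaining merges rather than doing one large merge.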

I changed our flow a bit on a test system that was running 1.11.4, and
discovered the following:

I changed the MergeRecords to MergeContents.  I used pretty much all of the
same settings in the MergeContent but had the MergeContent deal with the
Avro natively.  In this flow, it currently seems like I don't need to chain
multiple MergeContents together like I did with the MergeRecords.

I then fed the merged Avro from the MergeContent to a ConvertRecord to
convert the data to Parquet.  The ConvertRecord was tremendously slower
than the MergeContent and became a bottleneck.  I then switched the
ConvertRecord to the ConvertAvroToParquet processor.  ConvertAvroToParquet
can easily handle the output speed of the MergeContent and then some.

My hope is to make these changes to our actual flow soon, and then upgrade
to 1.11.4 again.  I'll let you know how that goes.

Thanks,
Robert

On Mon, Apr 27, 2020 at 9:26 AM Mark Payne <ma...@hotmail.com> wrote:

> Robert,
>
> What kind of performance degradation were you seeing here? I put together
> some simple flows to see if I could reproduce using 1.9.2 and current
> master.
> My flow consisted of GenerateFlowFile (generating 2 CSV rows per FlowFile)
> -> ConvertRecord (to Avro) -> MergeRecord (read Avro, write Avro) ->
> UpdateAttribute to try to mimic what you’ve got, given the details that I
> have.
>
> I did see a performance degradation on the order of about 10%. So on my
> laptop I went from processing 2.49 MM FlowFiles in 1.9.2 in 5 mins to 2.25
> MM on the master branch. Interestingly, I saw no real change when I enabled
> Snappy compression.
>
> For a point of reference, I also tried removing MergeRecord and just
> Generate -> Convert -> UpdateAttribute. I saw the same roughly 10%
> performance degradation.
>
> I’m curious if you’re seeing more than that. If so, I think a template
> would be helpful to understand what’s different.
>
> Thanks
> -Mark
>
>
> On Apr 24, 2020, at 4:50 PM, Robert R. Bruno <rb...@gmail.com> wrote:
>
> Joe,
>
> In that part of the flow, we are using avro readers and writers.  We are
> using snappy compression (which could be part of the problem).  Since we
> are using avro at that point the embedded schema is being used by the
> reader and the writer is using the schema name property along with an
> internal schema registry in nifi.
>
> I can see what could potentially be shared.
>
> Thanks
>
> On Fri, Apr 24, 2020 at 4:41 PM Joe Witt <jo...@gmail.com> wrote:
>
>> Robert,
>>
>> Can you please detail the record readers and writers involved and how
>> schemas are accessed?  There can be very important performance related
>> changes in the parsers/serializers of the given formats.  And we've added a
>> lot to make schema caching really capable but you have to opt into it.  It
>> is of course possible MergeRecord itself is the culprit for performance
>> reduction but lets get a more full picture here.
>>
>> Are you able to share a template and sample data which we can use to
>> replicate?
>>
>> Thanks
>>
>> On Fri, Apr 24, 2020 at 4:38 PM Robert R. Bruno <rb...@gmail.com>
>> wrote:
>>
>>> I wanted to see if anyone else has experienced performance issues with
>>> the newest version of nifi and MergeRecord?  We have been running on nifi
>>> 1.9.2 for awhile now, and recently upgraded to nifi 1.11.4.  Once upgraded,
>>> our identical flows were no longer able to keep up with our data mainly at
>>> MergeRecord processors.
>>>
>>> We ended up downgrading back to nifi 1.9.2.  Once we downgraded, all was
>>> keeping up again.  There were no errors to speak of when we were running
>>> the flow with 1.11.4.  We did see higher load on the OS, but this may have
>>> been caused by the fact there was such a tremendous backlog built up in the
>>> flow.
>>>
>>> Another side note, we saw one UpdateRecord processor producing errors
>>> when I tested the flow with nifi 1.11.4 with a small test flow.  I was able
>>> to fix this issue by changing some parameters in my RecordWriter.  So
>>> perhaps some underlying ways records are being handled since 1.9.2 caused
>>> the performance issue we saw?
>>>
>>> Any insight anyone has would be greatly appreciated, as we very much
>>> would like to upgrade to nifi 1.11.4.  One thought was switching the
>>> MergeRecord processors to MergeContent since I've been told MergeContent
>>> seems to perform better, but not sure if this is actually true.  We are
>>> using the pattern of chaining a few MergeRecord processors together to help
>>> with performance.
>>>
>>> Thanks in advance!
>>>
>>
>

Re: MergeRecord performance

Posted by Mark Payne <ma...@hotmail.com>.

Robert,

For a point of reference, I also tried removing MergeRecord and just Generate -> Convert -> UpdateAttribute. I saw the same roughly 10% performance degradation.
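
The "roughly 10%" figure can be checked against the numbers quoted elsewhere in this thread (2.49 MM FlowFiles in 5 minutes on 1.9.2 versus 2.25 MM on master). A quick sketch of the arithmetic in Python; the variable names are only for illustration:

```python
# Figures quoted in this thread: FlowFiles processed in a 5-minute window.
WINDOW_SECONDS = 5 * 60
flowfiles_192 = 2_490_000      # NiFi 1.9.2
flowfiles_master = 2_250_000   # master branch

# Average throughput in FlowFiles per second for each build.
rate_192 = flowfiles_192 / WINDOW_SECONDS
rate_master = flowfiles_master / WINDOW_SECONDS

# Relative drop in throughput, as a percentage of the 1.9.2 figure.
degradation_pct = (flowfiles_192 - flowfiles_master) / flowfiles_192 * 100

print(f"1.9.2:  {rate_192:.0f} FlowFiles/s")
print(f"master: {rate_master:.0f} FlowFiles/s")
print(f"degradation: {degradation_pct:.1f}%")
```

That works out to about 8,300 versus 7,500 FlowFiles per second, a drop of a little under 10%.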

I’m curious if you’re seeing more than that. If so, I think a template would be helpful to understand what’s different.

Thanks
-Mark

On Apr 24, 2020, at 4:50 PM, Robert R. Bruno <rb...@gmail.com> wrote:

Joe,

I can see what could potentially be shared.

Thanks

On Fri, Apr 24, 2020 at 4:41 PM Joe Witt <jo...@gmail.com> wrote:
Robert,

Are you able to share a template and sample data which we can use to replicate?

Thanks


Re: MergeRecord performance

Posted by "Robert R. Bruno" <rb...@gmail.com>.

Joe,

In that part of the flow, we are using Avro readers and writers.  We are
using Snappy compression (which could be part of the problem).  Since we
are using Avro at that point, the reader uses the embedded schema, and the
writer uses the schema name property along with an internal schema
registry in NiFi.

I can see what could potentially be shared.

Thanks

On Fri, Apr 24, 2020 at 4:41 PM Joe Witt <jo...@gmail.com> wrote:

> Robert,
>
> Can you please detail the record readers and writers involved and how
> schemas are accessed?  There can be very important performance related
> changes in the parsers/serializers of the given formats.  And we've added a
> lot to make schema caching really capable but you have to opt into it.  It
> is of course possible MergeRecord itself is the culprit for performance
> reduction but lets get a more full picture here.
>
> Are you able to share a template and sample data which we can use to
> replicate?
>
> Thanks
>
> On Fri, Apr 24, 2020 at 4:38 PM Robert R. Bruno <rb...@gmail.com> wrote:
>
>> I wanted to see if anyone else has experienced performance issues with
>> the newest version of nifi and MergeRecord?  We have been running on nifi
>> 1.9.2 for awhile now, and recently upgraded to nifi 1.11.4.  Once upgraded,
>> our identical flows were no longer able to keep up with our data mainly at
>> MergeRecord processors.
>>
>> We ended up downgrading back to nifi 1.9.2.  Once we downgraded, all was
>> keeping up again.  There were no errors to speak of when we were running
>> the flow with 1.11.4.  We did see higher load on the OS, but this may have
>> been caused by the fact there was such a tremendous backlog built up in the
>> flow.
>>
>> Another side note, we saw one UpdateRecord processor producing errors
>> when I tested the flow with nifi 1.11.4 with a small test flow.  I was able
>> to fix this issue by changing some parameters in my RecordWriter.  So
>> perhaps some underlying ways records are being handled since 1.9.2 caused
>> the performance issue we saw?
>>
>> Any insight anyone has would be greatly appreciated, as we very much
>> would like to upgrade to nifi 1.11.4.  One thought was switching the
>> MergeRecord processors to MergeContent since I've been told MergeContent
>> seems to perform better, but not sure if this is actually true.  We are
>> using the pattern of chaining a few MergeRecord processors together to help
>> with performance.
>>
>> Thanks in advance!
>>
>

Re: MergeRecord performance

Posted by Joe Witt <jo...@gmail.com>.

Robert,

Can you please detail the record readers and writers involved and how
schemas are accessed?  There can be very important performance-related
changes in the parsers/serializers of the given formats.  And we've added a
lot to make schema caching really capable, but you have to opt into it.  It
is of course possible MergeRecord itself is the culprit for the performance
reduction, but let's get a fuller picture here.

Are you able to share a template and sample data which we can use to
replicate?

Thanks

On Fri, Apr 24, 2020 at 4:38 PM Robert R. Bruno <rb...@gmail.com> wrote:

> I wanted to see if anyone else has experienced performance issues with the
> newest version of nifi and MergeRecord?  We have been running on nifi 1.9.2
> for awhile now, and recently upgraded to nifi 1.11.4.  Once upgraded, our
> identical flows were no longer able to keep up with our data mainly at
> MergeRecord processors.
>
> We ended up downgrading back to nifi 1.9.2.  Once we downgraded, all was
> keeping up again.  There were no errors to speak of when we were running
> the flow with 1.11.4.  We did see higher load on the OS, but this may have
> been caused by the fact there was such a tremendous backlog built up in the
> flow.
>
> Another side note, we saw one UpdateRecord processor producing errors when
> I tested the flow with nifi 1.11.4 with a small test flow.  I was able to
> fix this issue by changing some parameters in my RecordWriter.  So perhaps
> some underlying ways records are being handled since 1.9.2 caused the
> performance issue we saw?
>
> Any insight anyone has would be greatly appreciated, as we very much would
> like to upgrade to nifi 1.11.4.  One thought was switching the MergeRecord
> processors to MergeContent since I've been told MergeContent seems to
> perform better, but not sure if this is actually true.  We are using the
> pattern of chaining a few MergeRecord processors together to help with
> performance.
>
> Thanks in advance!
>