Posted to users@nifi.apache.org by Matthieu Ré <re...@gmail.com> on 2022/03/29 16:20:32 UTC

VolatileContentRepository removal

Hi everyone,

We wanted to discuss this ticket,
https://issues.apache.org/jira/browse/NIFI-8760, and the
VolatileContentRepository. I understand that few of us still use this
repository, but in our use case, a very constrained cloud environment with
strict IOPS limits, it fit perfectly and we processed several TB of data
per day efficiently.

We tested other repositories, including the file-system content repository
on a RAM-based disk, but that did not work for us: we experienced numerous
OOMs with the same amount of RAM mounted.

I provided a patch to fix it, which should be applied after 1.13.0 and the
refactoring of claim handling, and I was waiting for a discussion about it.
Now I read that the repository is to be removed in 1.17.0 :(

Is this due to a technical limitation that blocks further features? Or is
it too costly to maintain?

Thanks! Regards,
Matthieu

Re: VolatileContentRepository removal

Posted by Mark Payne <ma...@hotmail.com>.
Hey Matthieu,

If using a RAM disk, I would recommend trying 1.16 and also setting “nifi.content.claim.max.appendable.size” in nifi.properties to “1 byte”. This will help to ensure that you’re eliminating data from the repository as quickly as possible. Additionally, I would recommend that you configure “nifi.flowfile.repository.checkpoint.interval” to a very small value, such as “5 secs” or “10 secs” as this will also help in eliminating data from the Content Repo as quickly as possible. And you’ll want to ensure that you have “nifi.content.repository.archive.enabled” set to “false”.
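
As a rough sketch, the three settings above could be written in nifi.properties as follows. The property names are the ones quoted above. The values are one reading of the suggestions: "1 B" is assumed here as the data-size spelling of "1 byte", and "5 secs" is one of the two intervals mentioned, so check both against the Admin Guide for the release in use.

```properties
# Stop appending to a content claim almost immediately, so that each claim
# can be removed as soon as its content is no longer referenced
# ("1 B" is an assumed spelling of the suggested "1 byte")
nifi.content.claim.max.appendable.size=1 B

# Checkpoint the FlowFile repository frequently
# ("5 secs" or "10 secs" were the suggested values)
nifi.flowfile.repository.checkpoint.interval=5 secs

# Do not keep archived content on the RAM disk
nifi.content.repository.archive.enabled=false
```

These changes only take effect after NiFi is restarted with the updated file.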

Thanks
-Mark


Re: VolatileContentRepository removal

Posted by Matthieu Ré <re...@gmail.com> on 2022/03/31.
Hi Mike, David,

Thanks for your answers and your clarity!

To sum up our use case, our team has set up two cloud-based NiFi
("stateful") clusters. The second one handles a large volume of
record-based data, moving it from sources (Kafka, files and databases) to
other systems (OpenSearch and internal tools) while performing many costly
transformations (for instance, over the last 5 minutes my prod instance
reports 300G read and 170G written across 400 processors). This cluster
handles data that we can recover through a process, so in this use case we
are data-loss tolerant. The 12 NiFi nodes of the cluster (as well as a
3-node ZK cluster and 1 NiFi Registry) run on private-cloud VM instances.
Their disk operations are not as fast as they could be on physical
machines, and they are capped at a number of IOPS that we can easily reach
with disk-based repositories; that is the main reason why we tried the
VolatileContentRepository. Since 1.13.1, we have used a version of NiFi
that we build with our custom bundles and the fix linked in NIFI-8760, and
so far it is working fine for us. We had some memory leaks in custom
processors, but after fixing them and limiting queue sizes we never
encountered an OOM with the volatile repository again.
Don't hesitate to ask for more information or details on the use case, and
any advice is welcome!

Regarding the OOMs with the RAM-disk approach, the last time we tried it
was on 1.14.0. In the next few weeks we will try to migrate to 1.16.0; I'll
be glad to check whether we still experience OOMs and report back if we do.
If we don't, it could be a great replacement for the
VolatileContentRepository. I should add that I just discovered the NiFi
Stateless engine, and especially the ExecuteStateless processor and the
repository you mentioned. With some refactoring we could reduce the number
of queues between ExecuteStateless processors, since those queues cause
FlowFiles to be written to disk. We will investigate whether this is enough
to keep our nodes below the authorized IOPS threshold per volume.

So thanks to you I understand that the volatile repository has some
weaknesses and that there may be good alternatives. If all these
alternatives fail, I would be glad to investigate, for 1.17, the
possibility of promoting the ByteArrayContentRepository to the main
framework.

Thank you!
Matthieu

Re: VolatileContentRepository removal

Posted by David Handermann <ex...@apache.org> on 2022/03/30.
Hi Matthieu,

Thanks for raising this question for discussion. Other maintainers may be
able to provide additional background, but part of the reason for removing
the VolatileContentRepository implementation was that there were some more
fundamental problems with the implementation. Although various framework
updates included patching the implementation along the way, the repository
was not maintained on a regular basis, which resulted in it being broken
for several releases.

As Mike said, it would be helpful to share more details about your use
case, and also to hear more about whether you still experience memory
issues with the file system repository in current releases.

On a related note, NiFi Stateless includes a new in-memory content
repository named ByteArrayContentRepository [1]. It is currently packaged
in the NiFi Stateless bundle, but it might be possible to consider
promoting it to the framework level, if there is value in a non-persistent
content repository going forward.

Regards,
David Handermann

[1]
https://github.com/apache/nifi/blob/main/nifi-stateless/nifi-stateless-bundle/nifi-stateless-engine/src/main/java/org/apache/nifi/stateless/repository/ByteArrayContentRepository.java

Re: VolatileContentRepository removal

Posted by Mike Thomsen <mi...@gmail.com> on 2022/03/30.
We've been moving away from supporting it for a while, and I think it
comes down to a lot of both factors when you consider the time
involved in getting good patches and reviewing them. That said, until
1.17 is released, I think there's room for community members like you
and your team to work with us on fixing the gaps that made a strong
case for removing it.

I think I saw in your ticket that you provided patches through Jira.
My recommendation would be to create a feature branch that reverts the
removal and applies your patches, then submit it as a PR on GitHub and
request a review. Obviously, there are no guarantees there because it
depends on folks' time and energy to do a review, but that would be the
right process at least to move your request forward.

In the long run, I think it would be a lot better for you to share
your use case with us and to see if there's a better route ahead for
your team and NiFi. Sounds like an interesting use case, so it would
be good to get those requirements on the table since most users aren't
operating with those constraints.

Thanks,

Mike
