Posted to users@nifi.apache.org by "Greene (US), Geoffrey N" <ge...@boeing.com> on 2022/11/02 15:37:58 UTC

DistributedMapCacheServer

I make heavy use of DistributedMapCacheServer in my NiFi flows (one node; not clustered).

I seem to remember reading that DistributedMapCacheServer is a reference implementation only and is not recommended for production.

Unfortunately, I can no longer find the reference saying that DistributedMapCacheServer is not trustworthy for prod.
I don't have an HDFS implementation anywhere, but I do need the caching part.

Can someone explain? Can I use DistributedMapCacheServer in my production flows?




Re: DistributedMapCacheServer

Posted by Bryan Bende <bb...@gmail.com>.
ZK could probably work for some use cases. We do have state manager
implementations for both ZK and Redis, so this would be a similar
case: if you have high-volume state updates or cache interactions, you
use the Redis implementation, and for low volume you can use the ZK
implementation. I'm not sure anyone is going to invest in a ZK DMC
implementation at this point, though. We have quite a few DMC clients
for external caches. In my opinion, if you are willing to stand up
something external, then Redis is your best bet.


Re: DistributedMapCacheServer

Posted by Mike Thomsen <mi...@gmail.com>.
Perhaps I'm mistaken, but ZK is designed for managing configuration
data and not the sort of large-scale key/value lookup that is implied
by DistributedMapCache implementations.


Re: DistributedMapCacheServer

Posted by "ta.fiat.belastingdienst.nl via users" <us...@nifi.apache.org>.

Hello,

I'm investigating Redis; it seems easy to work with.

I think it is a bit strange that ZooKeeper is not in the list of providers.
ZooKeeper is already used as the cluster manager for NiFi, so in my opinion
it would be easy to add.

Regards,

Tiemen.


  
------------------------------------------------------------------------
The Dutch Tax Administration does not accept filings, requests, appeals,
complaints, notices of default or similar formal notices, sent by email.  
This message is solely intended for the addressee. It may contain information
that is confidential and legally privileged. If you are not the intended
recipient please delete this message and notify the sender.


Re: DistributedMapCacheServer

Posted by Mike Thomsen <mi...@gmail.com>.
You can also use the Cassandra DMC, which is something we are starting
to use a lot where I work.

Admittedly, the documentation is non-existent at the moment, but we
also open-sourced an experimental delegating DMC client that can be
used to chain multiple DMCs together, so you can use Redis for hot
caching and something like Cassandra for broader cold caches:

https://github.com/Domestic-Resilience-FOSS/nifi-delegating-distributedmapcache-bundle
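
The idea of chaining a hot cache in front of a cold one can be sketched
roughly as follows. This is a toy Python model, not code from the bundle
linked above: DictCache and DelegatingCache are hypothetical names,
in-memory dicts stand in for Redis and Cassandra, and the policy shown
(write to every tier, copy a cold hit into the hot tier) is an assumption
about how such a chain might behave, not a description of the bundle.

```python
class DictCache:
    """In-memory stand-in for a real cache client (hypothetical)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def put(self, key, value):
        self._data[key] = value


class DelegatingCache:
    """Looks keys up tier by tier, fastest tier first (assumed policy)."""

    def __init__(self, tiers):
        self._tiers = list(tiers)

    def put(self, key, value):
        # Assumption: writes go to every tier.
        for tier in self._tiers:
            tier.put(key, value)

    def get(self, key):
        for index, tier in enumerate(self._tiers):
            value = tier.get(key)
            if value is not None:
                # Copy the hit into the faster tiers that missed.
                for faster in self._tiers[:index]:
                    faster.put(key, value)
                return value
        return None


hot = DictCache()    # stands in for Redis
cold = DictCache()   # stands in for Cassandra
cache = DelegatingCache([hot, cold])

cold.put("order-42", "shipped")   # entry exists only in the cold tier
first = cache.get("order-42")     # served from cold, then promoted
promoted = hot.get("order-42")    # now present in the hot tier as well
missing = cache.get("nope")       # absent from every tier
```

With this layout the hot tier only has to hold recently used keys, while
the cold tier keeps the broader data set.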


Re: DistributedMapCacheServer

Posted by Peter Turcsanyi <tu...@apache.org>.
Embedded Hazelcast can also be an option. In that case, there is no
need to set up an external cache; the Hazelcast instances run on the
NiFi nodes (in the same JVM as NiFi).
Please note: no security/authentication is supported in embedded mode.

https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-hazelcast-services-nar/1.18.0/org.apache.nifi.hazelcast.services.cacheclient.HazelcastMapCacheClient/index.html
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-hazelcast-services-nar/1.18.0/org.apache.nifi.hazelcast.services.cachemanager.EmbeddedHazelcastCacheManager/index.html


Re: DistributedMapCacheServer

Posted by Bryan Bende <bb...@gmail.com>.
The DMC Server does not support high availability. If you have a
3-node NiFi cluster, each node will have a DMC client and a DMC
server, but the clients all have to point at only one of the servers.
If the node where that server is running goes down, there is no
replication or failover to another node. So it is really up to you to
decide whether that is acceptable for your use case. If it's not, then
you need to use a different DMC client implementation that can
communicate with an external HA cache, like Redis.
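
The topology described above can be illustrated with a toy Python model.
This is not NiFi code: CacheServer, CacheClient and the host names
node1..node3 are made up for the sketch, and a simple flag stands in for
a node outage.

```python
class CacheServer:
    """Toy cache server living on one node (hypothetical)."""

    def __init__(self, host):
        self.host = host
        self.up = True
        self.data = {}


class CacheClient:
    """Toy client that is configured with exactly one server hostname."""

    def __init__(self, servers, server_host):
        self._servers = servers
        self._server_host = server_host

    def _server(self):
        server = self._servers[self._server_host]
        if not server.up:
            raise ConnectionError(
                f"cache server on {server.host} is unreachable")
        return server

    def put(self, key, value):
        self._server().data[key] = value

    def get(self, key):
        return self._server().data.get(key)


hosts = ["node1", "node2", "node3"]
# Every node runs a server...
servers = {h: CacheServer(h) for h in hosts}
# ...but every node's client points at the server on node1.
clients = {h: CacheClient(servers, "node1") for h in hosts}

clients["node2"].put("k", "v")
seen_by_node3 = clients["node3"].get("k")   # shared via node1's server

servers["node1"].up = False                 # node1 goes down
try:
    clients["node3"].get("k")
    outage_error = None
except ConnectionError as exc:
    outage_error = str(exc)

# The servers on node2 and node3 never received the entry.
replicated = any("k" in servers[h].data for h in ("node2", "node3"))
```

In the model, the entry written through node2's client is visible from
node3 only while node1 is up, and the other two servers stay empty, which
is the "no replication or failover" behavior in miniature.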
