Posted to users@nifi.apache.org by "Vos, Walter" <wa...@ns.nl> on 2019/02/12 12:50:58 UTC

Is the DistributedMapCacheService a single point of failure?

Hi,

I'm on NiFi 1.5 and we're currently having an issue with one of the nodes in our three-node cluster. No biggie, just disconnect it from the cluster and let the other two nodes run things for a while, right? Unfortunately, some of our flows use a DistributedMapCacheClientService that has that particular node, the one we took out, set as its server hostname. For me as an admin, this is worrying :-)

Is there anything I can do in terms of configuration to "clusterize" the DistributedMapCacheClientService? I can already see that the DistributedMapCacheServer doesn't define a hostname, so I guess it runs on all nodes. Can we set multiple hostnames in the DistributedMapCacheClientService then? Or should I just switch it over in case of node failure? Is the cache shared among the cluster, i.e. do all nodes have the same values for each signal identifier/counter name?

Kind regards,

Walter

________________________________

This e-mail, including any attachments, is intended exclusively for (use by) the addressee. The e-mail may contain personal or confidential information. Disclosure, reproduction, distribution and/or provision of (the contents of) this e-mail (and any attachments) to third parties is expressly prohibited. If you are not the intended addressee, you are kindly requested to notify the sender immediately and to destroy the e-mail (and any attachments).

Company information<http://www.ns.nl/emaildisclaimer>

Re: Is the DistributedMapCacheService a single point of failure?

Posted by James Srinivasan <ja...@gmail.com>.
(you can make it slightly less yucky by persisting the cache to shared
storage so you don't lose the contents when another node starts up,
but you do have to manually poke the clients)
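(For reference, the persistence option mentioned here is the server's Persistence Directory property. A sketch of the server configuration, assuming an example path on shared storage:)

```
# DistributedMapCacheServer (controller service, starts on every node)
Port: 4557
Persistence Directory: /mnt/shared/nifi-dmc-cache   # example path on shared storage;
                                                    # if left empty, the cache is in-memory only
```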

On Tue, 12 Feb 2019 at 14:06, Bryan Bende <bb...@gmail.com> wrote:
> <snip>

Re: Is the DistributedMapCacheService a single point of failure?

Posted by Bryan Bende <bb...@gmail.com>.
As James pointed out, there are alternate implementations of the DMC
client that use external services that can be configured for high
availability, such as HBase or Redis.

When using the DMC client service, which is meant to pair with the DMC
server, the server is a single point of failure. In a cluster, the
server runs on all nodes, but it doesn't replicate data between them,
and the client can only point at one of those nodes. If you have to
switch the client to point at a new server, the cache starts over
empty on that server.
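To make the failure mode concrete, here is a sketch of how the two controller services are typically configured (the hostname is an example; property names are as they appear in the NiFi 1.x documentation):

```
# DistributedMapCacheServer -- an instance starts on every node in the cluster
Port: 4557
Persistence Directory: (empty = in-memory only)

# DistributedMapCacheClientService -- used by processors such as DetectDuplicate
# and Wait/Notify. It can name only ONE server hostname:
Server Hostname: nifi-node1.example.com   # if this node goes down, the cache is unreachable
Server Port: 4557
Communications Timeout: 30 secs
```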

On Tue, Feb 12, 2019 at 8:11 AM James Srinivasan
<ja...@gmail.com> wrote:
> <snip>

Re: Is the DistributedMapCacheService a single point of failure?

Posted by James Srinivasan <ja...@gmail.com>.
We switched to HBase_1_1_2_ClientMapCacheService for precisely this
reason. It works great (we already had HBase, which probably helped).
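A sketch of what that swap looks like (service and property names per the NiFi 1.x docs; the config path and table name are examples):

```
# HBase_1_1_2_ClientService -- connects to the (already highly available) HBase cluster
Hadoop Configuration Files: /etc/hbase/conf/hbase-site.xml   # example path

# HBase_1_1_2_ClientMapCacheService -- drop-in replacement for the DMC client;
# processors keep using the same DistributedMapCacheClient interface
HBase Client Service: HBase_1_1_2_ClientService
HBase Cache Table Name: nifi-cache   # example; the table must already exist in HBase
```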

On Tue, 12 Feb 2019 at 12:51, Vos, Walter <wa...@ns.nl> wrote:
> <snip>