Posted to users@nifi.apache.org by sanjeet rath <ra...@gmail.com> on 2020/12/16 15:37:24 UTC

Regarding setting up multiple DistributedMapCacheServer controller services

Hi All,

Hope you are well.
I need a clarification regarding the DistributedMapCacheServer controller
service.
In our setup, two teams work on the same cluster in two different process
groups (PGs).
Both teams use the DetectDuplicate processor, for which they need a
DistributedMapCacheClient.

My question is: should I set up two different DistributedMapCacheServer
instances on two different ports, or should I use one
DistributedMapCacheServer on one port (say the default, 4557) that is
shared by both teams (both PGs)?

I have gone through earlier articles and community discussions, where
it is mentioned that the DistributedMapCacheServer should be set up only once
per cluster, on one port, and that multiple DistributedMapCacheClient
services can access this port.

Please advise whether there is any restriction on setting up multiple
DistributedMapCacheServer instances in a cluster.

Thank you in advance,
Sanjeet

Re: Regarding setting up multiple DistributedMapCacheServer controller services

Posted by James Srinivasan <ja...@gmail.com>.
If you are running on a cluster, you might want to consider an
alternative such as HBase_2_ClientMapCacheService; otherwise the node
running the DistributedMapCacheServer becomes a single point of failure
(SPOF).

(Others: please correct me if I am wrong. This was the case a while
back, when I moved to using the HBase server for our systems.)

Re: Regarding setting up multiple DistributedMapCacheServer controller services

Posted by sanjeet rath <ra...@gmail.com>.
Thanks Mark for clarifying.

Re: Regarding setting up multiple DistributedMapCacheServer controller services

Posted by Mark Payne <ma...@hotmail.com>.
Sanjeet,

You can certainly set up multiple instances of the DistributedMapCacheServer. I think the point that the article was trying to make is probably that adding a second DistributedMapCacheClient does not necessitate adding a second server. Multiple clients can certainly use the same server.

That said, there may be benefits to having multiple servers. Specifically, for DetectDuplicate, there are some things to consider. Because the server is configured with a maximum number of elements to hold, if you have two flows, and Flow A processes 1 million FlowFiles per hour while Flow B processes 100 FlowFiles per hour, you will almost certainly want two different servers. That’s because you could have a FlowFile come into Flow B that is not a duplicate. Then Flow A fills up the cache with 10,000 FlowFiles of its own. Then a duplicate comes into Flow B, but the cache no longer knows about it because Flow A has already filled the cache and pushed Flow B's entry out. So in that case, it would help to have two. The only downside is that you now have to manage two different Controller Services (generally not a problem) and ensure that you have firewalls opened, etc. to access them.

Thanks
-Mark
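
The eviction scenario described above can be sketched in a few lines of Python. This is only a toy model, not NiFi code: BoundedCache, duplicate_detected, and simulate are hypothetical names, the 10,000-entry limit is taken from the example in the message, and oldest-first eviction is an assumption made for simplicity (the real server may be configured with a different eviction strategy).

```python
from collections import OrderedDict


class BoundedCache:
    """Toy stand-in for a size-limited cache (hypothetical, not the NiFi server)."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._entries = OrderedDict()

    def duplicate_detected(self, key):
        """Return True if key is already cached; otherwise store it and return False."""
        if key in self._entries:
            return True
        self._entries[key] = True
        if len(self._entries) > self.max_entries:
            # Assumed policy: evict the oldest entry once the limit is exceeded.
            self._entries.popitem(last=False)
        return False


def simulate(shared):
    """Replay the Flow A / Flow B scenario with one shared cache or two separate ones."""
    cache_a = BoundedCache(max_entries=10_000)
    cache_b = cache_a if shared else BoundedCache(max_entries=10_000)

    # A new (non-duplicate) FlowFile arrives in the low-volume Flow B.
    first_seen = cache_b.duplicate_detected("flow-b:item-1")

    # The high-volume Flow A then adds 10,000 entries of its own.
    for i in range(10_000):
        cache_a.duplicate_detected(f"flow-a:item-{i}")

    # The same Flow B FlowFile arrives again and should be flagged as a duplicate.
    second_seen = cache_b.duplicate_detected("flow-b:item-1")
    return first_seen, second_seen


if __name__ == "__main__":
    print("shared cache:   ", simulate(shared=True))
    print("separate caches:", simulate(shared=False))
```

With the shared cache, Flow B's entry has been evicted by the time its duplicate arrives, so the duplicate goes unnoticed. With separate caches, Flow A's volume has no effect on Flow B's entries.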
