Posted to user@geode.apache.org by Aravind Musigumpula <Ar...@amdocs.com> on 2017/06/20 15:18:06 UTC

Different member-timeout for particular jvm’s

Hi,

Is there any way to configure a different member-timeout for particular JVMs?

According to my understanding, each JVM monitors its neighbor. If a JVM misses heartbeats from its neighbor, it waits for the member-timeout interval and then sends a suspect message. The coordinator then tries to contact the suspected JVM. If it cannot connect, it waits for its own configured member-timeout interval and then removes that member.

Scenario:
Locator: member-timeout=10000
Server1: member-timeout=20000
Server2: member-timeout=30000
Server3: member-timeout=20000
Server4: member-timeout=20000

Suppose server1 is monitoring server2 and I make server2 hang. server1 tries to contact server2, waits 20000 ms, and sends a suspect message. The locator then tries to connect to server2; if it cannot, it waits 10000 ms and removes server2 from the view.

My requirement is that server2 must not be removed until 40000 ms have passed. This can be achieved by setting member-timeout=30000 on the JVM that monitors server2. But how can we make sure that this particular JVM is the one monitoring server2? In my case a different JVM monitors server2 every time.
Please correct me if I am wrong.
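
For illustration, the timing in this scenario can be worked through with a small Java sketch. It assumes the simplified model described in this message (the monitoring member waits its own member-timeout, then the coordinator waits its own member-timeout); the class and method names are hypothetical and are not part of Geode.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class RemovalDelay {
    // Simplified model from the thread: the monitoring member waits its own
    // member-timeout before raising suspicion, then the coordinator waits its
    // own member-timeout during the final check.
    static long worstCaseMillis(long monitorTimeoutMillis, long coordinatorTimeoutMillis) {
        return monitorTimeoutMillis + coordinatorTimeoutMillis;
    }

    public static void main(String[] args) {
        long locatorTimeout = 10_000L; // the locator acts as coordinator in the scenario

        // Any of these servers might end up monitoring server2.
        Map<String, Long> candidateMonitors = new LinkedHashMap<>();
        candidateMonitors.put("server1", 20_000L);
        candidateMonitors.put("server3", 20_000L);
        candidateMonitors.put("server4", 20_000L);

        for (Map.Entry<String, Long> e : candidateMonitors.entrySet()) {
            System.out.println("server2 monitored by " + e.getKey() + ": removed after about "
                + worstCaseMillis(e.getValue(), locatorTimeout) + " ms");
        }
    }
}
```

Under this model the 40000 ms target is only reached when the monitoring member's timeout is 30000, which is why it matters which JVM ends up monitoring server2.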


Thanks,
Aravind Musigumpula

This message and the information contained herein is proprietary and confidential and subject to the Amdocs policy statement,

you may review at https://www.amdocs.com/about/email-disclaimer

Re: Different member-timeout for particular jvm’s

Posted by Bruce Schuchardt <bs...@pivotal.io>.
I think we would have to send the member-timeout setting in the "join" 
message and distribute it in the membership view message.

I think this is a good idea.
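
A rough sketch of what that could look like, purely as an illustration: each member announces its member-timeout when it joins, the view carries those values, and a monitoring member looks up the timeout of the member it is watching. Every name here (ViewWithTimeouts, join, timeoutFor) is hypothetical; this is not how Geode's join or view messages are actually structured.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ViewWithTimeouts {
    // Hypothetical: member name -> member-timeout announced in its join request.
    private final Map<String, Long> timeouts = new LinkedHashMap<>();

    // Record the timeout a member announced when it joined.
    void join(String member, long memberTimeoutMillis) {
        timeouts.put(member, memberTimeoutMillis);
    }

    // Timeout a monitor would use for the given member; falls back to the
    // monitor's own setting if the view has no value for that member.
    long timeoutFor(String monitoredMember, long localTimeoutMillis) {
        return timeouts.getOrDefault(monitoredMember, localTimeoutMillis);
    }

    public static void main(String[] args) {
        ViewWithTimeouts view = new ViewWithTimeouts();
        view.join("s3", 20_000L);
        view.join("s4", 30_000L);

        // s3 monitors s4 and uses s4's announced timeout instead of its own.
        System.out.println("s3 waits " + view.timeoutFor("s4", 20_000L) + " ms for s4");
    }
}
```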




RE: Different member-timeout for particular jvm’s

Posted by Aravind Musigumpula <Ar...@amdocs.com>.
Hi,
Maybe I was not clear in my last mail. My question is: can we monitor a JVM based on its own member-timeout instead of the member-timeout of some other JVM (the one that is monitoring it)?

Suppose my view is [s1(coordinator), s2, s3, s4, s5] and the member-timeout is different for each member. Member s3 will suspect s4 on the basis of s3’s member-timeout, and the final check will then use the coordinator’s member-timeout. Every time the view changes, the order of monitoring also changes, so we cannot determine how long it will take for a particular JVM to be removed from the view.
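
To make the point about the changing monitoring order concrete, here is a small sketch. It assumes each member watches the next member in view order, wrapping around at the end; that ring rule and the names RingNeighbor and monitorOf are assumptions for illustration, not taken from the Geode source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RingNeighbor {
    // Assumed rule: each member monitors the member that follows it in the
    // view, so the monitor of a target is the member just before it.
    static String monitorOf(List<String> view, String target) {
        int i = view.indexOf(target);
        if (i < 0) {
            throw new IllegalArgumentException(target + " is not in the view");
        }
        return view.get((i - 1 + view.size()) % view.size());
    }

    public static void main(String[] args) {
        List<String> view = new ArrayList<>(Arrays.asList("s1", "s2", "s3", "s4", "s5"));
        System.out.println("s4 is monitored by " + monitorOf(view, "s4"));

        // After s3 leaves, a different member (with a different timeout)
        // becomes responsible for s4.
        view.remove("s3");
        System.out.println("s4 is monitored by " + monitorOf(view, "s4"));
    }
}
```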

This can be solved if each member uses the member-timeout of the JVM it is monitoring.

In the above view, suppose s3 is monitoring s4. Currently s3 marks s4 as a suspect on the basis of s3’s member-timeout. If s3 instead obtained s4’s member-timeout and used it to monitor s4, we could determine how long it takes for a member to be removed from the view.

Is there any way to get the member-timeout of one member from another?


Thanks,
Aravind Musigumpula


RE: Different member-timeout for particular jvm’s

Posted by Aravind Musigumpula <Ar...@amdocs.com>.
Hi,

Can the member-timeout of a particular JVM be used by the JVM that monitors it?
Example: jvm1 monitors jvm2, and jvm2 monitors jvm3. The member-timeout is 10 for jvm1, 20 for jvm2, and 30 for jvm3. Suppose jvm4 is the coordinator and its member-timeout is 30. What if we want jvm1 to be monitored by the other JVMs for a deterministic time such as 10, and jvm2 to be monitored for 20?

Right now, as I understand it, a JVM is monitored using the member-timeout of the monitoring JVM and then that of the coordinator. My requirement is that each JVM should be monitored using its own member-timeout, followed by the coordinator’s member-timeout.

In the code, GMSHealthMonitor.java waits on a member-timeout variable:
if (pingResp.getResponseMsg() == null) {
  pingResp.wait(memberTimeout);
}

What if we used the member-timeout of the JVM being monitored instead? We could do this in the setNextNeighbor function in GMSHealthMonitor.java.
But how do we get the member-timeout of another JVM? Is it possible?
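
As a sketch of the idea, the wait above can be written so that the timeout is a parameter, which would let the caller pass in the monitored member's timeout rather than its own. This is a standalone illustration using plain Java wait/notify; PingWait, PingResponse, and awaitResponse are hypothetical stand-ins and do not reproduce Geode's classes.

```java
class PingWait {
    // Minimal stand-in for a response holder that a reply handler fills in.
    static final class PingResponse {
        private Object responseMsg;

        synchronized Object getResponseMsg() {
            return responseMsg;
        }

        synchronized void setResponseMsg(Object msg) {
            responseMsg = msg;
            notifyAll(); // wake up any thread waiting for the response
        }
    }

    // Returns true if a response arrived within timeoutMillis, false otherwise.
    // The caller decides which timeout to pass (its own or the monitored member's).
    static boolean awaitResponse(PingResponse pingResp, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        synchronized (pingResp) {
            while (pingResp.getResponseMsg() == null) {
                long remainingMillis = (deadline - System.nanoTime()) / 1_000_000L;
                if (remainingMillis <= 0) {
                    return false; // timed out without a response
                }
                try {
                    pingResp.wait(remainingMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            return true;
        }
    }

    public static void main(String[] args) {
        PingResponse stuck = new PingResponse();
        System.out.println("stuck member responded: " + awaitResponse(stuck, 50));

        PingResponse healthy = new PingResponse();
        new Thread(() -> healthy.setResponseMsg("pong")).start();
        System.out.println("healthy member responded: " + awaitResponse(healthy, 1000));
    }
}
```

The loop around wait guards against spurious wakeups, which the single if in the quoted snippet would not by itself.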

Thanks,
Aravind Musigumpula


Re: Different member-timeout for particular jvm’s

Posted by Bruce Schuchardt <bs...@pivotal.io>.
It is the membership coordinator that performs the final check on a 
suspect member.  If you have network partition detection enabled or are 
using authentication of peers, the role of membership coordinator will be 
taken by a locator (if one is in the system), so in your scenario it will 
be the Locator that performs this check.  It will use its own 
member-timeout to determine how long to wait for a response to a 
"final check" message sent to the suspected member.

If the Locator is down then the oldest member in the system will take 
over the role.  This might be server1 if the membership view is [ s1, 
s2, s4, s3 ].  If there is a problem with s2 then s1 will use its own 
member-timeout setting to determine how long to wait for a final-check 
response from s2.
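
The selection rule described above can be sketched roughly as follows. This is a simplified reading of the description, assuming the view is ordered oldest member first; CoordinatorChoice, Member, and the preferLocators flag are hypothetical names, and the real Geode logic is likely to have more cases.

```java
import java.util.Arrays;
import java.util.List;

class CoordinatorChoice {
    static final class Member {
        final String name;
        final boolean locator;

        Member(String name, boolean locator) {
            this.name = name;
            this.locator = locator;
        }
    }

    // preferLocators stands for "network partition detection or peer
    // authentication is enabled". The view is assumed to be oldest-first.
    static Member coordinator(List<Member> viewOldestFirst, boolean preferLocators) {
        if (preferLocators) {
            for (Member m : viewOldestFirst) {
                if (m.locator) {
                    return m; // a locator takes the role if one is present
                }
            }
        }
        return viewOldestFirst.get(0); // otherwise the oldest member
    }

    public static void main(String[] args) {
        List<Member> withLocator = Arrays.asList(
            new Member("s1", false), new Member("locator", true), new Member("s2", false));
        System.out.println(coordinator(withLocator, true).name);

        // Locator is down: the oldest member takes over the role.
        List<Member> locatorDown = Arrays.asList(
            new Member("s1", false), new Member("s2", false),
            new Member("s4", false), new Member("s3", false));
        System.out.println(coordinator(locatorDown, true).name);
    }
}
```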


