Posted to dev@geode.apache.org by Mario Ivanac <ma...@est.tech> on 2021/05/05 09:28:19 UTC

Odg: Odg: Geode retry/acknowledge improvement

Hi,

I think we have the problem that Darrel suspected, and that some kind of peer-to-peer notification could be sent to acknowledge that a message has been received on the receiving side.

Regarding the test with iptables, execution gets stuck with conserve-sockets set to either false or true.

BR,
Mario
________________________________
From: Darrel Schneider <da...@vmware.com>
Sent: April 30, 2021, 18:38
To: dev@geode.apache.org <de...@geode.apache.org>
Subject: Re: Odg: Geode retry/acknowledge improvement

In the Geode hang you describe, would the forced tcp-reset using iptables have caused the put send message to fail with an exception while writing it to the socket? If so, I'd expect the Geode Connection class to keep trying to send that message by creating a new connection to the member. It will keep doing this until the send is successful or the member leaves the cluster.
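
As a rough illustration of the retry behaviour described above, here is a minimal Python sketch. It is not Geode's actual Java Connection code; `send_with_retry`, `MemberLeftError`, `FlakyConnection`, and the `connect` / `is_member_alive` callbacks are hypothetical names introduced only for this example.

```python
class MemberLeftError(Exception):
    """Raised when the target member is no longer part of the cluster."""


def send_with_retry(message, connect, is_member_alive):
    """Keep sending until a write succeeds or the member leaves.

    `connect` is a hypothetical factory returning an object whose
    `write(message)` raises OSError when the socket write fails.
    `is_member_alive` is a hypothetical membership check.
    Returns the number of attempts that were needed.
    """
    attempts = 0
    while True:
        if not is_member_alive():
            raise MemberLeftError("member left the cluster; giving up")
        attempts += 1
        try:
            connect().write(message)  # a fresh connection for each attempt
            return attempts
        except OSError:
            continue  # the write failed (e.g. connection reset): try again


class FlakyConnection:
    """Simulated connection whose first two writes fail with a reset."""
    failures_left = 2
    sent = []

    def write(self, message):
        if FlakyConnection.failures_left > 0:
            FlakyConnection.failures_left -= 1
            raise ConnectionResetError("simulated tcp-reset")
        FlakyConnection.sent.append(message)


attempts = send_with_retry("put k=v", FlakyConnection, lambda: True)
print(attempts)
```

Note that this loop only helps when the write itself raises an error. If the write returns normally and the message is lost afterwards, nothing here notices.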

But if the tcp-reset allows the send to complete without actually delivering the request to the other member, then Geode will be in trouble and will wait forever for a reply. Once Geode successfully writes a p2p message on a socket, it expects it to be processed on the other side OR it expects the other side to leave the Geode cluster. If neither of these happens, it will wait forever for a response. I've wondered in the past whether this is a safe expectation. If not, do we need to send some type of msg id and, after waiting too long for a reply, be able to check with the member to see whether it has received the message we think we already sent?
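
The msg id idea above could look roughly like the following Python sketch. It is a simulation under stated assumptions, not a proposal for Geode's actual wire protocol: `Receiver`, `LossyLink`, and `send_and_wait` are hypothetical names, the probe is modelled as a direct method call (in practice it would be a message sent over a working connection), and the timeout values are arbitrary.

```python
import queue


class Receiver:
    """Hypothetical peer that remembers the ids of messages it has seen."""

    def __init__(self, replies):
        self.seen_ids = set()
        self.replies = replies

    def deliver(self, msg_id, payload):
        if msg_id in self.seen_ids:
            return  # a duplicate resend is processed only once
        self.seen_ids.add(msg_id)
        self.replies.put((msg_id, "ack:" + payload))

    def has_received(self, msg_id):
        return msg_id in self.seen_ids


class LossyLink:
    """Simulated transport: write() returns normally even if data is lost."""

    def __init__(self, receiver, drops):
        self.receiver = receiver
        self.drops = drops  # number of initial writes to lose silently

    def write(self, msg_id, payload):
        if self.drops > 0:
            self.drops -= 1
            return  # looks like a successful write to the sender
        self.receiver.deliver(msg_id, payload)


def send_and_wait(link, receiver, replies, msg_id, payload,
                  timeout_s=0.01, max_probes=5):
    """Send, wait for a reply, and probe the peer whenever the wait times out."""
    link.write(msg_id, payload)
    resends = 0
    for _ in range(max_probes):
        try:
            _reply_id, reply = replies.get(timeout=timeout_s)
            return reply, resends
        except queue.Empty:
            # Waited too long: ask the peer whether it ever saw this id.
            if not receiver.has_received(msg_id):
                link.write(msg_id, payload)  # it did not, so resend
                resends += 1
    raise TimeoutError("no reply for message %r" % msg_id)


replies = queue.Queue()
receiver = Receiver(replies)
link = LossyLink(receiver, drops=1)
reply, resends = send_and_wait(link, receiver, replies, 1, "put k=v")
print(reply, resends)
```

The receiver has to deduplicate by msg id, because a resend may race with a message that was only delayed rather than lost.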

You might see different behavior with your iptables test if you use conserve-sockets=false. In that case the socket used to write the p2p message is also used to read the response. But in the default conserve-sockets=true case, the reply comes on a different socket than the one used to send the message. It might be hard to get the thread doing the put for gfsh to use conserve-sockets=false. You could try just setting that on your server and the stuck thread stack should look different from what you are currently seeing.
________________________________
From: Anthony Baker <ba...@vmware.com>
Sent: Friday, April 30, 2021 8:43 AM
To: dev@geode.apache.org <de...@geode.apache.org>
Subject: Re: Odg: Geode retry/acknowledge improvement

Can you explain the scenario further?  Does the sidecar proxy both the sending and receiving socket (geode creates 2 sockets for each p2p member)?  In normal cases, closing these sockets should clear up any unacknowledged messages, freeing up the thread.

Anthony

> On Apr 20, 2021, at 7:31 AM, Mario Ivanac <ma...@est.tech> wrote:
>
> Hi,
>
> after analysis, we assume that the proxy, on receiving the packets, sends an ACK at the TCP level and is restarted right after that.
> This is why we don't see TCP retries.
>
> A similar problem (though not packet loss) can be reproduced on Geode:
> if a TCP reset is received on an existing connection after a request has been sent, the connection
> is closed on reception of the reset, and the thread gets stuck waiting for the reply.
> I will add reproduction steps to the ticket.
>
> ________________________________
> From: Anthony Baker <ba...@vmware.com>
> Sent: April 19, 2021, 22:54
> To: dev@geode.apache.org <de...@geode.apache.org>
> Subject: Re: Geode retry/acknowledge improvement
>
> Do you have a tcpdump that demonstrates the packet loss? How long did you wait for TCP to retry the failed packet delivery (sometimes this can be tweaked with tcp_retries2).  Does this manifest as a failed socket connection in geode?  That ought to trigger some error handling IIRC.
>
> Anthony
>
>
>> On Apr 19, 2021, at 7:16 AM, Mario Ivanac <ma...@est.tech> wrote:
>>
>> Hi all,
>>
>> we have deployed a Geode cluster in a Kubernetes environment, with Istio sidecars injected between cluster members.
>> While running traffic, if any Istio sidecar is restarted, a thread gets stuck indefinitely while waiting for a reply to a sent message.
>> It seems that, due to the proxy restart, messages are lost in some cases, and the sending side waits indefinitely for a reply.
>>
>> https://issues.apache.org/jira/browse/GEODE-9075
>>
>> My question is: what is your estimate of how much effort/work is needed to implement message retry/acknowledge logic in Geode
>> to solve this problem?
>>
>> BR,
>> Mario
>

Odg: Odg: Geode retry/acknowledge improvement

Posted by Mario Ivanac <ma...@est.tech>.

I think that this is enough.
________________________________
From: Alberto Gomez <al...@est.tech>
Sent: May 5, 2021, 11:29
To: dev@geode.apache.org <de...@geode.apache.org>
Subject: Re: Odg: Geode retry/acknowledge improvement

You could reply to their latest e-mail to confirm that what Darrel suspected can happen. Let's see whether, in that case, they are willing to collaborate.

Alberto

Re: Odg: Geode retry/acknowledge improvement

Posted by Alberto Gomez <al...@est.tech>.

Please, disregard my last e-mail.

I was having a parallel conversation by e-mail with Mario on this topic and sent the e-mail to the list by mistake.

BR,

Alberto

Re: Odg: Geode retry/acknowledge improvement

Posted by Alberto Gomez <al...@est.tech>.

You could reply to their latest e-mail to confirm that what Darrel suspected can happen. Let's see whether, in that case, they are willing to collaborate.

Alberto