Posted to dev@geode.apache.org by Alberto Bustamante Reyes <al...@est.tech> on 2019/11/29 11:14:36 UTC

WAN replication issue in cloud native environments

Hi all,

We have a problem with Geode WAN replication when GW receivers are configured with the same hostname-for-senders and port on all servers.

The reason for such a setup is deploying Geode cluster on a Kubernetes cluster where all GW receivers are reachable from the outside world on the same VIP and port.

Other kinds of configuration (different hostname and/or different port for each GW receiver) are not cheap from an OAM and resources perspective in cloud native environments, and they also limit some important use cases (like scaling).

The problem we have experienced is that shutting down one server stops replication to that cluster until the server is up again. We suspect this is because Geode incorrectly assumes there are no more alive servers when just one of them is down (since they share hostname-for-senders and port).

Has anyone experienced a similar problem configuring Geode WAN replication in cloud native environments?

Thinking about possible solutions in Geode code, our proposal would be to expand the internal data in locators with enough information to distinguish servers in the aforementioned use case. The same intervention is likely needed in the client pools and possibly elsewhere in the source code. Any comments about this proposal are welcome.
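
To make the proposal concrete, here is a minimal sketch of the kind of composite identifier we have in mind (fields and accessors are illustrative, not a final implementation):

    import java.util.Objects;

    // Sketch only: distinguishes servers that share hostname-for-senders
    // and port by adding the unique member id to the key.
    public final class ServerLocationAndMemberId {
      private final String hostName; // shared hostname-for-senders
      private final int port;        // shared port
      private final String memberId; // unique per server member

      public ServerLocationAndMemberId(String hostName, int port, String memberId) {
        this.hostName = hostName;
        this.port = port;
        this.memberId = memberId;
      }

      @Override
      public boolean equals(Object o) {
        if (!(o instanceof ServerLocationAndMemberId)) {
          return false;
        }
        ServerLocationAndMemberId other = (ServerLocationAndMemberId) o;
        return port == other.port
            && hostName.equals(other.hostName)
            && memberId.equals(other.memberId);
      }

      @Override
      public int hashCode() {
        return Objects.hash(hostName, port, memberId);
      }
    }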

Thanks in advance!

Alberto B.

Re: WAN replication issue in cloud native environments

Posted by Jacob Barrett <jb...@pivotal.io>.
My initial guess without looking is that the client pool is sending a ping to each ServerLocation on only one of the available Connections. This logic should be changed to send to each unique member, since ServerLocation is not unique anymore.

-Jake
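
A minimal sketch of that change, with hypothetical names (Endpoint standing in for the pool's per-member connection bookkeeping):

    // Sketch only: ping once per unique member instead of once per
    // ServerLocation, which is no longer unique.
    void pingLiveServers(java.util.Map<String, Endpoint> endpointsByMemberId) {
      for (Endpoint endpoint : endpointsByMemberId.values()) {
        sendPingTo(endpoint); // hypothetical helper: uses a connection bound to that member
      }
    }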


> On Jan 27, 2020, at 8:55 AM, Alberto Bustamante Reyes <al...@est.tech> wrote:
> 
> 
> Hi again,
> 
> Status update: the simplification of the maps suggested by Jacob made the newly proposed class containing the ServerLocation and the member id unnecessary. With this refactoring, replication is working in the scenario we have been discussing in this conversation. That's great, and I think the code can be merged into develop if there are no further comments on the PR.
> 
> But this does not mean we can say that Geode is able to work properly when using gw receivers with the same ip + port. We have seen that when working with this configuration, there is a problem with the pings sent from gw senders (which act as clients) to the gw receivers (servers). The pings are reaching just one of the receivers, so the sender-receiver connection is eventually closed by the ClientHealthMonitor.
> 
> Do you have any suggestion about how to handle this issue? My first idea was to identify where the connection is created, to check if the sender could somehow be made aware that there is more than one server to which the ping should be sent, but I'm not sure if that is possible. Or the alternative could be to change the ClientHealthMonitor to be "clever" enough not to close connections in this case. Any comment is welcome 🙂
> 
> Thanks,
> 
> Alberto B.
>   
> From: Jacob Barrett <jb...@pivotal.io>
> Sent: Wednesday, January 22, 2020 19:01
> To: Alberto Bustamante Reyes <al...@est.tech>
> Cc: dev@geode.apache.org <de...@geode.apache.org>; Anilkumar Gingade <ag...@pivotal.io>; Charlie Black <cb...@pivotal.io>
> Subject: Re: WAN replication issue in cloud native environments
>  
> 
> 
>>> On Jan 22, 2020, at 9:51 AM, Alberto Bustamante Reyes <al...@est.tech> wrote:
>>> 
>>> Thanks Naba & Jacob for your comments!
>>> 
>>> 
>>> 
>>> @Naba: I have been implementing a solution as you suggested, and I think it would be convenient if the client knew the memberId of the server it is connected to.
>>> 
>>> (current code is here: https://github.com/apache/geode/pull/4616 )
>>> 
>>> For example, in:
>>> 
>>> LocatorLoadSnapshot::getReplacementServerForConnection(ServerLocation currentServer, String group, Set<ServerLocation> excludedServers)
>>> 
>>> In this method, the client has sent the ServerLocation, but if that object does not contain the memberId, I don't see how to guarantee that the replacement that will be returned is not the same server the client is currently connected to.
>>> Inside that method, this other method is called:
>> 
>> 
>> Given that your setup is masquerading multiple members behind the same host and port (ServerLocation), it doesn't matter. When the pool opens a new socket to the replacement server, it will be to the shared hostname and port, and the Kubernetes service at that host and port will just pick a backend host. In the solution we suggested we preserved that behavior, since the k8s service can't determine which backend member to route the connection to based on the member id.
>> 
>> 
>> LocatorLoadSnapshot::isCurrentServerMostLoaded(currentServer, groupServers)
>> 
>> where groupServers is a "Map<ServerLocationAndMemberId, LoadHolder>" object. If the keys of that map have the same host and port, they differ only in the memberId. But as you don't know it (you just have currentServer, which contains host and port), you cannot get the correct LoadHolder value, so you cannot know if your server is the most loaded.
> 
> Again, given your use case, the behavior of this method is lost when a new connection is established by the pool through the shared hostname anyway.
> 
>> @Jacob: I think the solution finally implies that the client has to know the memberId; I think we could simplify the maps.
> 
> The client isn't keeping these load maps, the locator is, and the locator knows all the member ids. The client end only needs to know the host/port combination. In your example, the WAN replication (a client to the remote cluster) connects to the shared host/port service and gets randomly routed to one of the backend servers in that service.
> 
> All of this locator balancing code is unnecessary in this model where something else is choosing the final destination. The goal of our proposed changes was to recognize that all we need is to make sure the locator keeps the shared ServerLocation alive in its responses to clients, by tracking the associated members and reducing that set to the set of unique ServerLocations. In your case that will always reduce to 1 ServerLocation for N members, as long as 1 member is still up.
> 
> -Jake
> 
> 
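To illustrate the reduction Jake describes, a rough sketch under the thread's assumptions (LoadHolder and ServerLocationAndMemberId are the types named above; getLocation() is a hypothetical accessor). The locator tracks load per member but answers clients with de-duplicated ServerLocations, so the shared location stays alive while at least one member behind it is up:

    // Sketch only: N members behind one shared host/port reduce to a
    // single ServerLocation in the locator's responses.
    java.util.Set<ServerLocation> uniqueLiveLocations(
        java.util.Map<ServerLocationAndMemberId, LoadHolder> groupServers) {
      java.util.Set<ServerLocation> locations = new java.util.HashSet<>();
      for (ServerLocationAndMemberId key : groupServers.keySet()) {
        locations.add(key.getLocation()); // hypothetical accessor
      }
      return locations;
    }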

Re: WAN replication issue in cloud native environments

Posted by Dan Smith <ds...@pivotal.io>.
I'm fine with ip+port+id.

The way we usually test multiple servers in Geode is with dunit tests, where
the servers are in separate processes. But yeah, it makes it harder to do a
test with multiple servers in the same process.

-Dan

On Wed, Apr 1, 2020 at 5:29 AM Alberto Bustamante Reyes
<al...@est.tech> wrote:

> Hi Dan,
>
> I have realized that after this change, if you want to do a quick test on
> your laptop, it will not be possible to run two servers properly. There
> could be different scenarios you could not test. For example, you could not
> test what happens when a server is restarted, as both will be considered
> down.
>
> So I think it would be better to use ip+port+id; it will have less impact.
>
> BR/
>
> Alberto B.

RE: WAN replication issue in cloud native environments

Posted by Alberto Bustamante Reyes <al...@est.tech>.
Hi Dan,

I have realized that after this change, if you want to do a quick test on your laptop, it will not be possible to run two servers properly. There could be different scenarios you could not test. For example, you could not test what happens when a server is restarted, as both will be considered down.

So I think it would be better to use ip+port+id; it will have less impact.

BR/

Alberto B.

Re: WAN replication issue in cloud native environments

Posted by Aaron Lindsey <aa...@apache.org>.
I thought the deadline for comments was extended until today (27th), so I added a new comment on the RFC. I’m confused about the direction we are taking with this proposal.

- Aaron


Re: WAN replication issue in cloud native environments

Posted by Dan Smith <ds...@pivotal.io>.
With this PR, it would be possible to identify servers running with the
> same ip and port, because now they will be identified by member id. But
> Bruce realized that it could be a problem if two servers are running in the
> same JVM, as they will share the same member id. It seems it's very unlikely
> that people are doing it, but it's not explicitly prohibited.
>

What is going to happen if a user does set things up this way? The things I
can think of are:

1. When a connection to one of the cache servers fails, the client will
close all of the connections to both. But this doesn't seem like a bad
outcome, since it's likely the whole server crashed anyway.
2. Pings might not reach the correct server - but it looks like we have a
single ClientHealthMonitor for the server process anyway? So I think the
pings are getting to the right place.

If there aren't any other negative outcomes, I think it's ok to proceed
with the current solution. But I'd also be ok going to ip+port+id.

I also agree that this use case of a single pool connecting to multiple
cache servers in the same process doesn't make much sense.

-Dan
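
For readers following along, the ClientHealthMonitor in question is, schematically, a per-server-process table of last-heard-from timestamps keyed by client. The sketch below uses hypothetical names; it only illustrates why pings that land on a different process than the data connection let the real monitor time the client out:

    // Sketch only: per-process heartbeat bookkeeping, hypothetical names.
    class HealthMonitorSketch {
      private final java.util.concurrent.ConcurrentHashMap<String, Long> lastPing =
          new java.util.concurrent.ConcurrentHashMap<>();

      void receivedPing(String clientId) {
        lastPing.put(clientId, System.currentTimeMillis());
      }

      // Checked periodically; stale clients get their connections closed.
      boolean isStale(String clientId, long timeoutMillis) {
        Long last = lastPing.get(clientId);
        return last == null || System.currentTimeMillis() - last > timeoutMillis;
      }
    }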

RE: WAN replication issue in cloud native environments

Posted by Alberto Bustamante Reyes <al...@est.tech>.
Hi,

We need some advice from the dev list. We have faced a problem with the PR for this RFC (https://github.com/apache/geode/pull/4824 ), and we would like to hear your opinion about it.

With this PR, it would be possible to identify servers running with the same ip and port, because now they will be identified by member id. But Bruce realized that it could be a problem if two servers are running in the same JVM, as they will share the same member id. It seems it's very unlikely that people are doing it, but it's not explicitly prohibited.

Should we treat this setup as something to be allowed, so the code has to be adapted? (1) Or should we take for granted that no one is using it, so we can keep this solution?

Thanks!


(1) We already have a version of the code in which the servers were identified by "ip+port+id" that will cover this case (this was the original solution, but it was changed after comments on a previous PR)
________________________________
From: Alberto Bustamante Reyes <al...@est.tech>
Sent: Thursday, March 26, 2020 20:17
To: dev@geode.apache.org <de...@geode.apache.org>
Subject: RE: WAN replication issue in cloud native environments

Ok, I have moved the RFC then. Thanks again for your time & help!
________________________________
From: Dan Smith <ds...@pivotal.io>
Sent: Thursday, March 26, 2020 18:54
To: dev@geode.apache.org <de...@geode.apache.org>
Subject: Re: WAN replication issue in cloud native environments

+1

After talking through this with Bruce a bit, I think the changes you are
> proposing to LocatorLoadSnapshot and EndPointManager make sense.
For the ping issue, I like the proposed solution to forward the ping to the
correct server. Sounds good!

-Dan

On Thu, Mar 26, 2020 at 10:47 AM Bruce Schuchardt <bs...@pivotal.io>
wrote:

> +1
>
> I think this could move to the "In Development" state
>
>
>
> From: Alberto Bustamante Reyes <al...@est.tech>
> Date: Wednesday, March 25, 2020 at 4:13 PM
> To: Bruce Schuchardt <bs...@pivotal.io>, Dan Smith <
> dsmith@pivotal.io>, "dev@geode.apache.org" <de...@geode.apache.org>
> Cc: Jacob Barrett <jb...@pivotal.io>, Anilkumar Gingade <
> agingade@pivotal.io>, Charlie Black <cb...@pivotal.io>
> Subject: RE: WAN replication issue in cloud native environments
>
>
>
> Hi,
>
>
>
> I have modified the RFC to include the alternative suggested by Bruce. I'm
> also extending the deadline for sending comments to next Friday, 27th March,
> EOB.
>
>
>
> Thanks!
>
>
>
> BR/
>
>
>
> Alberto B.
>
> From: Bruce Schuchardt <bs...@pivotal.io>
> Sent: Monday, March 23, 2020 22:38
> To: Alberto Bustamante Reyes <al...@est.tech>; Dan
> Smith <ds...@pivotal.io>; dev@geode.apache.org <de...@geode.apache.org>
> Cc: Jacob Barrett <jb...@pivotal.io>; Anilkumar Gingade <
> agingade@pivotal.io>; Charlie Black <cb...@pivotal.io>
> Subject: Re: WAN replication issue in cloud native environments
>
>
>
> I think what Dan did was pass in a socket factory that would connect to
> his gateway instead of the requested server.  Doing it like that would
> require a lot less code change than what you’re currently doing and would
> get past the unit test problem.
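
A rough sketch of the socket-factory idea, assuming a pool-level hook that lets the client supply its own sockets (the class below is hypothetical, not the actual Geode API):

    // Sketch only: always connect to the gateway/VIP, ignoring the
    // server address the pool asked for.
    class GatewaySocketFactory {
      private final java.net.InetSocketAddress gateway;

      GatewaySocketFactory(java.net.InetSocketAddress gateway) {
        this.gateway = gateway;
      }

      java.net.Socket createSocket() throws java.io.IOException {
        java.net.Socket socket = new java.net.Socket();
        socket.connect(gateway); // all traffic funnels through the gateway
        return socket;
      }
    }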
>
>
>
> I can point you to where you’d need to make changes for the Ping
> operation:  PingOpImpl would need to send the ServerLocation it’s trying to
> reach.  PingOp.execute() gets that as a parameter and
> PingOpImpl.sendMessage() writes it to the server.  The Ping command class’s
> cmdExecute would need to read that data if
> serverConnection.getClientVersion() is Version.GEODE_1_13_0 or later.  Then
> it would have to compare the server location it read to that server’s
> coordinates and, if not equal, find the server with those coordinates and
> send a new DistributionMessage to it with the client’s identity.  There are
> plenty of DistributionMessage classes around to look at as precedents.  You
> send the message with
> serverConnection.getCache().getDistributionManager().putOutgoing(message).
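
Condensing that recipe into a sketch of the server-side check (cmdExecute, GEODE_1_13_0, and the putOutgoing call are from the description above; the message class and lookup helpers are hypothetical):

    // Sketch only: inside the Ping command's cmdExecute. Clients on
    // GEODE_1_13_0 or later send the ServerLocation they meant to reach.
    void handlePing(Message clientMessage, ServerConnection serverConnection) {
      ServerLocation target = readTargetLocation(clientMessage); // hypothetical reader
      if (target.equals(thisServersLocation())) {
        recordPingLocally(serverConnection); // normal ClientHealthMonitor path
      } else {
        InternalDistributedMember member = memberFor(target); // hypothetical lookup
        PingMessage forward = new PingMessage(serverConnection.getProxyID()); // hypothetical message
        forward.setRecipient(member);
        serverConnection.getCache().getDistributionManager().putOutgoing(forward);
      }
    }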
>
>
>
> You can PM me any time.  Dan could answer questions about his gateway work.
>
>
>
>
>
> From: Alberto Bustamante Reyes <al...@est.tech>
> Date: Monday, March 23, 2020 at 2:18 PM
> To: Bruce Schuchardt <bs...@pivotal.io>, Dan Smith <
> dsmith@pivotal.io>, "dev@geode.apache.org" <de...@geode.apache.org>
> Cc: Jacob Barrett <jb...@pivotal.io>, Anilkumar Gingade <
> agingade@pivotal.io>, Charlie Black <cb...@pivotal.io>
> Subject: RE: WAN replication issue in cloud native environments
>
>
>
> Thanks for your answer and your comment in the wiki, Bruce. I will take a
> closer look at what you mentioned; it is not yet clear to me how to
> implement it.
>
>
>
> BTW, I forgot to set a deadline for the wiki review; I hope that Thursday
> 26th March is enough to receive comments.
>
> From: Bruce Schuchardt <bs...@pivotal.io>
> Sent: Thursday, March 19, 2020 16:30
> To: Alberto Bustamante Reyes <al...@est.tech>; Dan
> Smith <ds...@pivotal.io>; dev@geode.apache.org <de...@geode.apache.org>
> Cc: Jacob Barrett <jb...@pivotal.io>; Anilkumar Gingade <
> agingade@pivotal.io>; Charlie Black <cb...@pivotal.io>
> Subject: Re: WAN replication issue in cloud native environments
>
>
>
> I wonder if an approach similar to the SNI hostname PoolFactory changes
> would work for this non-TLS gateway.  The client needs to differentiate
> between the different servers so that it doesn’t declare all of them dead
> should one of them fail.  If the pool knew about the gateway it could
> direct all traffic there and the servers wouldn’t need to set a
> hostname-for-clients.
>
>
>
> It’s not an ideal solution since the gateway wouldn’t know which server
> the client wanted to contact and there are sure to be other problems like
> creating a backup queue for subscriptions.  But that’s the case with the
> hostname-for-clients approach, too.
>
>
>
>
>
> From: Alberto Bustamante Reyes <al...@est.tech>
> Date: Wednesday, March 18, 2020 at 8:35 AM
> To: Dan Smith <ds...@pivotal.io>, "dev@geode.apache.org" <
> dev@geode.apache.org>
> Cc: Bruce Schuchardt <bs...@pivotal.io>, Jacob Barrett <
> jbarrett@pivotal.io>, Anilkumar Gingade <ag...@pivotal.io>, Charlie
> Black <cb...@pivotal.io>
> Subject: RE: WAN replication issue in cloud native environments
>
>
>
> Hi all,
>
>
>
> As Bruce suggested, I have created a wiki page describing the problem
> we are trying to solve:
> https://cwiki.apache.org/confluence/display/GEODE/Allow+same+host+and+port+for+all+gateway+receivers
>
>
>
> Please let me know if further clarifications are needed.
>
>
>
> Also, I have closed the PR I have been using until now, and created a new
> one with the current status of the solution, with one commit per issue
> described in the wiki: https://github.com/apache/geode/pull/4824
>
>
>
> Thanks in advance!
>
> From: Alberto Bustamante Reyes <al...@est.tech>
> Sent: Monday, March 9, 2020 11:24
> To: Dan Smith <ds...@pivotal.io>
> Cc: dev@geode.apache.org <de...@geode.apache.org>; Bruce Schuchardt <
> bschuchardt@pivotal.io>; Jacob Barrett <jb...@pivotal.io>; Anilkumar
> Gingade <ag...@pivotal.io>; Charlie Black <cb...@pivotal.io>
> Subject: RE: WAN replication issue in cloud native environments
>
>
>
> Thanks for pointing that out, Dan. Sorry for the misunderstanding: as I only
> found that "affinity" (the setServerAffinityLocation method) in the client
> code, I thought you were talking about it.
> Anyway, I did some more tests and it does not solve our problem...
>
> I tried configuring the service affinity on k8s, but it breaks the first
> part of the solution (the changes implemented on LocatorLoadSnapshot that
> solve the problem of the replication) and senders do not connect to other
> receivers when the one they were connected to is down.
>
> The only alternative we have in mind to try to solve the ping problem is
> to keep investigating whether changing the ping task creation could be a
> solution (the changes implemented are clearly breaking something, so the
> solution is not complete yet).
>
>
>
>
>
>
> ________________________________
> From: Dan Smith <ds...@pivotal.io>
> Sent: Thursday, March 5, 2020 21:03
> To: Alberto Bustamante Reyes <al...@est.tech>
> Cc: dev@geode.apache.org <de...@geode.apache.org>; Bruce Schuchardt <
> bschuchardt@pivotal.io>; Jacob Barrett <jb...@pivotal.io>; Anilkumar
> Gingade <ag...@pivotal.io>; Charlie Black <cb...@pivotal.io>
> Subject: Re: WAN replication issue in cloud native environments
>
> I think there is some confusion here.
>
> The client side class ExecutablePool has a method called
> setServerAffinityLocation. It looks like that is used for some internal
> transaction code to make sure transactions go to the same server. I don't
> think it makes any sense for the gateway to be messing with this setting.
>
> What I was talking about was session affinity in your proxy server. For
> example, if you are using k8s, session affinity as defined in this page -
> https://kubernetes.io/docs/concepts/services-networking/service/
>
> "If you want to make sure that connections from a particular client are
> passed to the same Pod each time, you can select the session affinity based
> on the client’s IP addresses by setting service.spec.sessionAffinity to
> “ClientIP” (the default is “None”)"
>
> I think setting session affinity might help your use case, because it
> sounds like you are having issues with the proxy directing pings to a
> different server than the data.
>
> -Dan
>
> On Thu, Mar 5, 2020 at 4:20 AM Alberto Bustamante Reyes
> <al...@est.tech> wrote:
> I think that was what I did when I tried, but I realized I had a failure
> in the code. Now that I have tried again, reverting the change of executing
> the ping by endpoint and applying the server affinity, the connections are
> much more stable! Looks promising 🙂
>
> I suppose that if I want to introduce this change, setting the server
> affinity in the gateway sender should be introduced as a new option in the
> sender configuration, right?
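
What was tried here, as a minimal sketch inside AbstractGatewaySender (setServerAffinityLocation is the ExecutablePool method discussed earlier; the field and pool accessor names are illustrative):

    // Sketch only: pin the sender's pool to the receiver it connected to.
    void setServerLocation(ServerLocation location) {
      this.serverLocation = location;
      if (this.pool != null) {
        this.pool.setServerAffinityLocation(location); // the sender's ExecutablePool
      }
    }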
> ________________________________
> From: Dan Smith <ds...@pivotal.io>
> Sent: Thursday, March 5, 2020 4:41
> To: Alberto Bustamante Reyes <al...@est.tech>
> Cc: dev@geode.apache.org; Bruce Schuchardt <bschuchardt@pivotal.io>;
> Jacob Barrett <jbarrett@pivotal.io>; Anilkumar Gingade <agingade@pivotal.io>;
> Charlie Black <cblack@pivotal.io>
> Subject: Re: WAN replication issue in cloud native environments
>
> Oh, sorry, I meant server affinity with the proxy itself. So that it will
> always route traffic from the same gateway sender to the same gateway
> receiver. Hopefully that would ensure that pings go to the same receiver
> data is sent to.
>
> -Dan
>
> On Wed, Mar 4, 2020, 1:31 AM Alberto Bustamante Reyes
> <al...@est.tech> wrote:
> I have tried setting the server affinity on the gateway sender's pool in
> the AbstractGatewaySender class, when the server location is set, but I
> don't see any difference in the behavior of the connections.
>
> I did not mention that the connections are reset every 5 seconds due to
> "java.io.EOFException: The connection has been reset while reading the
> header". But I dont know yet what is causing it.
>
> ________________________________
> From: Dan Smith <ds...@pivotal.io>
> Sent: Tuesday, March 3, 2020 18:07
> To: dev@geode.apache.org
> Cc: Bruce Schuchardt <bs...@pivotal.io>; Jacob Barrett <jb...@pivotal.io>;
> Anilkumar Gingade <ag...@pivotal.io>; Charlie Black <cb...@pivotal.io>
> Subject: Re: WAN replication issue in cloud native environments
>
> > We are currently working on another issue related to this change: gw
> sender pings are not reaching the gw receivers, so ClientHealthMonitor
> closes the connections. I saw that the ping tasks are created by
> ServerLocation, so I have tried to solve the issue by changing it to be
> done by Endpoint. This change is not finished yet, as in its current status
> it causes the closing of connections from gw senders to gw receivers every
> 5 seconds.
>
> Are you using session affinity? I think you probably will need to, since
> pings can go over different connections than the data connection.
>
> -Dan
>
> On Tue, Mar 3, 2020 at 3:44 AM Alberto Bustamante Reyes
> <al...@est.tech> wrote:
>
> > Hi Bruce,
> >
> > Thanks for your comments, but we are not planning to use TLS, so I'm
> afraid
> > the PR you are working on will not solve this problem.
> >
> > The origin of this issue is that we would like to be able to configure
> all
> > gw receivers with the same "hostname-for-senders" value. The reason is
> that
> > we will run a multisite Geode cluster, having each site on a different
> > cloud environment, so using just one hostname makes configuration much
> > easier.
> >
> > When we tried to configure the cluster in this way, we experienced an
> > issue with the replication. Using the same hostname-for-senders parameter
> > causes different servers to have equal ServerLocation objects, so if one
> > receiver is down, the others are considered down too. With the change
> > suggested by Jacob this problem is solved, and replication works fine.
> >
> > We are currently working on another issue related to this change: gw
> > sender pings are not reaching the gw receivers, so ClientHealthMonitor
> > closes the connections. I saw that the ping tasks are created by
> > ServerLocation, so I have tried to solve the issue by changing it to be
> > done by Endpoint. This change is not finished yet, as in its current
> > status it causes the closing of connections from gw senders to gw
> > receivers every 5 seconds.
> >
> > Why don't you like the idea of using the InternalDistributedMember to
> > distinguish server locations? Are you thinking about another alternative?
> > In this use case, two different gw receivers will have the same
> > ServerLocation, so we need to distinguish them.
> >
> > BR/
> >
> > Alberto B.
> >
> > ________________________________
> > From: Bruce Schuchardt <bschuchardt@pivotal.io>
> > Sent: Monday, March 2, 2020 20:20
> > To: dev@geode.apache.org; Jacob Barrett <jbarrett@pivotal.io>
> > Cc: Anilkumar Gingade <ag...@pivotal.io>; Charlie Black <cblack@pivotal.io>
> > Subject: Re: WAN replication issue in cloud native environments
> >
> > I'm coming to this conversation late and probably am missing a lot of
> > context.  Is the point of this to direct senders to some common
> > gateway that all of the gateway receivers are configured to advertise?
> > I've been working on a PR to support redirection of connections for
> > client/server and gateway communications to a common address and put the
> > destination host name in the SNIHostName TLS parameter.  Then you won't
> > have to tell servers about the common host name - just tell clients what
> > the gateway is and they'll connect to it & tell it what the target host
> > name is via the SNIHostName.  However, that only works if SSL is enabled.
> >
> > PR 4743 is a step toward this approach and changes TcpClient and
> > SocketCreator to take an unresolved host address.  After this is merged
> > another change will allow folks to set a gateway host/port that will be
> > used to form connections and insert the destination hostname into the
> > SNIHostName SSLParameter.
> >
> > I would really like us to avoid including InternalDistributedMembers in
> > equality checks for server-locations.  To-date we've only held these
> > identifiers in Endpoints and other places for debugging purposes and have
> > used ServerLocation to identify servers.
> >

RE: WAN replication issue in cloud native environments

Posted by Alberto Bustamante Reyes <al...@est.tech>.
Ok, I have moved the RFC then. Thanks again for your time & help!

Re: WAN replication issue in cloud native environments

Posted by Dan Smith <ds...@pivotal.io>.
+1

After talking through this with Bruce a bit, I think the changes you are
proposing to LocatorLoadSnapshot and EndPointManager make sense.
For the ping issue, I like the proposed solution to forward the ping to the
correct server. Sounds good!

-Dan


Re: WAN replication issue in cloud native environments

Posted by Bruce Schuchardt <bs...@pivotal.io>.
+1

I think this could move to the "In Development" state

 



RE: WAN replication issue in cloud native environments

Posted by Alberto Bustamante Reyes <al...@est.tech>.
Hi,

I have modified the RFC to include the alternative suggested by Bruce. I'm also extending the deadline for sending comments to next Friday, 27th March, EOB.

Thanks!

BR/

Alberto B.

Re: WAN replication issue in cloud native environments

Posted by Bruce Schuchardt <bs...@pivotal.io>.
I think what Dan did was pass in a socket factory that would connect to his gateway instead of the requested server.  Doing it like that would require a lot less code change than what you’re currently doing and would get past the unit test problem.
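
As a rough sketch of that idea: a factory that ignores the server the pool
asked for and always dials a fixed gateway address. The class below is an
assumption for illustration (the SNI work added a socket-factory hook on
PoolFactory around Geode 1.13, and that is the kind of extension point meant
here):

    import java.io.IOException;
    import java.net.InetSocketAddress;
    import java.net.Socket;

    // Hypothetical sketch of the socket-factory idea: every connection the
    // pool opens ends up at the shared gateway, which picks the real backend.
    class GatewaySocketFactory {
      private final InetSocketAddress gateway;

      GatewaySocketFactory(String gatewayHost, int gatewayPort) {
        this.gateway = new InetSocketAddress(gatewayHost, gatewayPort);
      }

      // Called whenever the pool needs a new connection; the requested
      // server is deliberately ignored in favor of the gateway address.
      Socket createSocket() throws IOException {
        Socket socket = new Socket();
        socket.connect(gateway);
        return socket;
      }
    }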

 

I can point you to where you’d need to make changes for the Ping operation: PingOpImpl would need to send the ServerLocation it’s trying to reach.  PingOp.execute() gets that as a parameter and PingOpImpl.sendMessage() writes it to the server.  The Ping command class’s cmdExecute would need to read that data if serverConnection.getClientVersion() is Version.GEODE_1_13_0 or later.  Then it would have to compare the server location it read to that server’s coordinates and, if not equal, find the server with those coordinates and send a new DistributionMessage to it with the client’s identity.  There are plenty of DistributionMessage classes around to look at as precedents.  You send the message with serverConnection.getCache().getDistributionManager().putOutgoing(message).
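
A self-contained sketch of that forwarding check, with stand-in types for
illustration; the real change would live in the Ping command's cmdExecute and
use a DistributionMessage sent via putOutgoing, as described above:

    import java.util.Map;
    import java.util.Objects;
    import java.util.concurrent.ConcurrentHashMap;

    // Hypothetical stand-ins: Coordinates plays the role of ServerLocation,
    // MemberChannel the role of putOutgoing(new PingForwardMessage(...)).
    class PingForwarder {
      record Coordinates(String host, int port) {}

      interface MemberChannel { void forwardPing(String clientId); }

      private final Coordinates myCoordinates;
      // assumed registry: coordinates of every member -> a way to reach it
      private final Map<Coordinates, MemberChannel> members = new ConcurrentHashMap<>();

      PingForwarder(Coordinates myCoordinates) {
        this.myCoordinates = myCoordinates;
      }

      void registerMember(Coordinates c, MemberChannel channel) {
        members.put(c, channel);
      }

      // Called when a 1.13+ client includes the target ServerLocation in
      // its ping; older clients omit it and are handled locally as before.
      void onPing(Coordinates target, String clientId) {
        if (Objects.equals(target, myCoordinates)) {
          touchClient(clientId); // the ping was meant for this server
        } else {
          MemberChannel owner = members.get(target);
          if (owner != null) {
            owner.forwardPing(clientId); // forward with the client's identity
          }
        }
      }

      private void touchClient(String clientId) {
        // would reset the ClientHealthMonitor timer for this client
      }
    }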

 

You can PM me any time.  Dan could answer questions about his gateway work.


 

From: Alberto Bustamante Reyes <al...@est.tech>
Date: Monday, March 23, 2020 at 2:18 PM
To: Bruce Schuchardt <bs...@pivotal.io>, Dan Smith <ds...@pivotal.io>, "dev@geode.apache.org" <de...@geode.apache.org>
Cc: Jacob Barrett <jb...@pivotal.io>, Anilkumar Gingade <ag...@pivotal.io>, Charlie Black <cb...@pivotal.io>
Subject: RE: WAN replication issue in cloud native environments

 

Thanks for your answer and your comment in the wiki, Bruce. I will take a closer look at what you mentioned; it is not yet clear to me how to implement it.

 

BTW, I forgot to set a deadline for the wiki review; I hope that Thursday, 26th March, is enough time to receive comments.

De: Bruce Schuchardt <bs...@pivotal.io>
Enviado: jueves, 19 de marzo de 2020 16:30
Para: Alberto Bustamante Reyes <al...@est.tech>; Dan Smith <ds...@pivotal.io>; dev@geode.apache.org <de...@geode.apache.org>
Cc: Jacob Barrett <jb...@pivotal.io>; Anilkumar Gingade <ag...@pivotal.io>; Charlie Black <cb...@pivotal.io>
Asunto: Re: WAN replication issue in cloud native environments 

 

I wonder if an approach similar to the SNI hostname PoolFactory changes would work for this non-TLS gateway.  The client needs to differentiate between the different servers so that it doesn’t declare all of them dead should one of them fail.  If the pool knew about the gateway it could direct all traffic there and the servers wouldn’t need to set a hostname-for-clients.

 

It’s not an ideal solution since the gateway wouldn’t know which server the client wanted to contact and there are sure to be other problems like creating a backup queue for subscriptions.  But that’s the case with the hostname-for-clients approach, too.

 

 

From: Alberto Bustamante Reyes <al...@est.tech>
Date: Wednesday, March 18, 2020 at 8:35 AM
To: Dan Smith <ds...@pivotal.io>, "dev@geode.apache.org" <de...@geode.apache.org>
Cc: Bruce Schuchardt <bs...@pivotal.io>, Jacob Barrett <jb...@pivotal.io>, Anilkumar Gingade <ag...@pivotal.io>, Charlie Black <cb...@pivotal.io>
Subject: RE: WAN replication issue in cloud native environments

 

Hi all,

 

As Bruce suggested, I have created a wiki page describing the problem we are trying to solve: https://cwiki.apache.org/confluence/display/GEODE/Allow+same+host+and+port+for+all+gateway+receivers

 

Please let me know if further clarifications are needed.

 

Also, I have closed the PR I have been using until now, and created a new one with the current status of the solution, with one commit per issue described in the wiki: https://github.com/apache/geode/pull/4824

 

Thanks in advance!

From: Alberto Bustamante Reyes <al...@est.tech>
Sent: Monday, March 9, 2020 11:24
To: Dan Smith <ds...@pivotal.io>
Cc: dev@geode.apache.org <de...@geode.apache.org>; Bruce Schuchardt <bs...@pivotal.io>; Jacob Barrett <jb...@pivotal.io>; Anilkumar Gingade <ag...@pivotal.io>; Charlie Black <cb...@pivotal.io>
Subject: RE: WAN replication issue in cloud native environments

 

Thanks for pointing that out, Dan. Sorry for the misunderstanding: as the only "affinity" I found in the client code was the setServerAffinityLocation method, I thought you were talking about that.
Anyway, I did some more tests, and it does not solve our problem...

I tried configuring the session affinity on k8s, but it breaks the first part of the solution (the changes implemented on LocatorLoadSnapshot that solve the replication problem), and senders do not connect to other receivers when the one they were connected to is down.

The only alternative we have in mind for solving the ping problem is to keep investigating whether changing the ping task creation could be a solution (the changes implemented are clearly breaking something, so the solution is not complete yet).






________________________________
From: Dan Smith <ds...@pivotal.io>
Sent: Thursday, March 5, 2020 21:03
To: Alberto Bustamante Reyes <al...@est.tech>
Cc: dev@geode.apache.org <de...@geode.apache.org>; Bruce Schuchardt <bs...@pivotal.io>; Jacob Barrett <jb...@pivotal.io>; Anilkumar Gingade <ag...@pivotal.io>; Charlie Black <cb...@pivotal.io>
Subject: Re: WAN replication issue in cloud native environments

I think there is some confusion here.

The client side class ExecutablePool has a method called setServerAffinityLocation. It looks like that is used for some internal transaction code to make sure transactions go to the same server. I don't think it makes any sense for the gateway to be messing with this setting.

What I was talking about was session affinity in your proxy server. For example, if you are using k8s, session affinity as defined in this page - https://kubernetes.io/docs/concepts/services-networking/service/

"If you want to make sure that connections from a particular client are passed to the same Pod each time, you can select the session affinity based on the client’s IP addresses by setting service.spec.sessionAffinity to “ClientIP” (the default is “None”)"

I think setting session affinity might help your use case, because it sounds like you are having issues with the proxy directing pings to a different server than the data.

-Dan

On Thu, Mar 5, 2020 at 4:20 AM Alberto Bustamante Reyes <al...@est.tech> wrote:
I think that is what I did when I tried, but I realized I had a bug in the code. Now that I have tried again, reverting the change that executed pings by endpoint and applying the server affinity, the connections are much more stable! Looks promising 🙂

I suppose that if I want to introduce this change, setting the server affinity in the gateway sender should be introduced as a new option in the sender configuration, right?
________________________________
From: Dan Smith <ds...@pivotal.io>
Sent: Thursday, March 5, 2020 4:41
To: Alberto Bustamante Reyes <al...@est.tech>
Cc: dev@geode.apache.org <de...@geode.apache.org>; Bruce Schuchardt <bs...@pivotal.io>; Jacob Barrett <jb...@pivotal.io>; Anilkumar Gingade <ag...@pivotal.io>; Charlie Black <cb...@pivotal.io>
Subject: Re: WAN replication issue in cloud native environments

Oh, sorry, I meant server affinity with the proxy itself. So that it will always route traffic from the same gateway sender to the same gateway receiver. Hopefully that would ensure that pings go to the same receiver data is sent to.

-Dan

On Wed, Mar 4, 2020, 1:31 AM Alberto Bustamante Reyes <al...@est.tech> wrote:
I have tried setting the server affinity on the gateway sender's pool in the AbstractGatewaySender class, when the server location is set, but I don't see any difference in the behavior of the connections.

I did not mention before that the connections are reset every 5 seconds due to "java.io.EOFException: The connection has been reset while reading the header", but I don't know yet what is causing it.

________________________________
From: Dan Smith <ds...@pivotal.io>
Sent: Tuesday, March 3, 2020 18:07
To: dev@geode.apache.org <de...@geode.apache.org>
Cc: Bruce Schuchardt <bs...@pivotal.io>; Jacob Barrett <jb...@pivotal.io>; Anilkumar Gingade <ag...@pivotal.io>; Charlie Black <cb...@pivotal.io>
Subject: Re: WAN replication issue in cloud native environments

> We are currently working on another issue related to this change: gw
senders' pings are not reaching the gw receivers, so ClientHealthMonitor
closes the connections. I saw that the ping tasks are created by
ServerLocation, so I have tried to solve the issue by changing this to be
done by Endpoint. This change is not finished yet; in its current state
it causes the closing of connections from gw senders to gw receivers every
5 seconds.

Are you using session affinity? I think you probably will need to since
pings can go over different connections than the data connection.

-Dan

On Tue, Mar 3, 2020 at 3:44 AM Alberto Bustamante Reyes
<al...@est.tech> wrote:

> Hi Bruce,
>
> Thanks for your comments, but we are not planning to use TLS, so I'm afraid
> the PR you are working on will not solve this problem.
>
> The origin of this issue is that we would like to be able to configure all
> gw receivers with the same "hostname-for-senders" value. The reason is that
> we will run a multisite Geode cluster, with each site on a different
> cloud environment, so using just one hostname makes configuration much
> easier.
>
> When we tried to configure the cluster in this way, we experienced an
> issue with the replication. Using the same hostname-for-senders parameter
> causes different servers to have equal ServerLocation objects, so if one
> receiver is down, the others are considered down too. With the change
> suggested by Jacob this problem is solved, and replication works fine.
>
> We are currently working on another issue related to this change: gw senders'
> pings are not reaching the gw receivers, so ClientHealthMonitor closes the
> connections. I saw that the ping tasks are created by ServerLocation, so I
> have tried to solve the issue by changing this to be done by Endpoint. This
> change is not finished yet; in its current state it causes the closing
> of connections from gw senders to gw receivers every 5 seconds.
>
> Why don't you like the idea of using the InternalDistributedMember to
> distinguish server locations? Are you thinking about another alternative? In
> this use case, two different gw receivers will have the same
> ServerLocation, so we need to distinguish them.
>
> BR/
>
> Alberto B.
>
> ________________________________
> From: Bruce Schuchardt <bs...@pivotal.io>
> Sent: Monday, March 2, 2020 20:20
> To: dev@geode.apache.org <de...@geode.apache.org>; Jacob Barrett <jb...@pivotal.io>
> Cc: Anilkumar Gingade <ag...@pivotal.io>; Charlie Black <cb...@pivotal.io>
> Subject: Re: WAN replication issue in cloud native environments
>
> I'm coming to this conversation late and probably am missing a lot of
> context.  Is the point of this to direct senders to some common gateway
> that all of the gateway receivers are configured to advertise?
> I've been working on a PR to support redirection of connections for
> client/server and gateway communications to a common address and put the
> destination host name in the SNIHostName TLS parameter.  Then you won't
> have to tell servers about the common host name - just tell clients what
> the gateway is and they'll connect to it & tell it what the target host
> name is via the SNIHostName.  However, that only works if SSL is enabled.
>
> PR 4743 is a step toward this approach and changes TcpClient and
> SocketCreator to take an unresolved host address.  After this is merged
> another change will allow folks to set a gateway host/port that will be
> used to form connections and insert the destination hostname into the
> SNIHostName SSLParameter.
>
> I would really like us to avoid including InternalDistributedMembers in
> equality checks for server-locations.  To-date we've only held these
> identifiers in Endpoints and other places for debugging purposes and have
> used ServerLocation to identify servers.

RE: WAN replication issue in cloud native environments

Posted by Alberto Bustamante Reyes <al...@est.tech>.
Thanks for pointing that out, Dan. Sorry for the misunderstanding: the only "affinity" I had found was the setServerAffinityLocation method in the client code, so I thought you were referring to that.
Anyway, I did some more tests and it does not solve our problem...

I tried configuring session affinity on the k8s service, but it breaks the first part of the solution (the changes implemented in LocatorLoadSnapshot that solve the replication problem): senders no longer connect to other receivers when the one they were connected to is down.

The only alternative we have in mind for solving the ping problem is to keep investigating whether changing how the ping tasks are created could be a solution (the changes implemented so far are clearly breaking something, so the solution is not complete yet).
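
To make the idea concrete, below is a minimal, self-contained sketch (illustrative names and values only, not the actual Geode classes) of why ping tasks keyed by ServerLocation collapse into one task when all receivers share the same hostname-for-senders and port, while keying them by endpoint (host:port plus member id) keeps one task per receiver:

import java.util.HashMap;
import java.util.Map;

public class PingTaskKeyDemo {
  public static void main(String[] args) {
    // Two receivers behind the same VIP: identical host:port, different member ids.
    String sharedLocation = "receivers.example.com:2324"; // illustrative VIP
    String member1 = "server-1";
    String member2 = "server-2";

    // Keyed by ServerLocation (host:port only): the second put overwrites
    // the first, so only one receiver ends up with a ping task.
    Map<String, String> tasksByLocation = new HashMap<>();
    tasksByLocation.put(sharedLocation, "ping task for " + member1);
    tasksByLocation.put(sharedLocation, "ping task for " + member2);
    System.out.println(tasksByLocation.size()); // prints 1

    // Keyed by endpoint (host:port plus member id): one ping task survives
    // per receiver, even though the ServerLocation is identical.
    Map<String, String> tasksByEndpoint = new HashMap<>();
    tasksByEndpoint.put(sharedLocation + "@" + member1, "ping task for " + member1);
    tasksByEndpoint.put(sharedLocation + "@" + member2, "ping task for " + member2);
    System.out.println(tasksByEndpoint.size()); // prints 2
  }
}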

Re: WAN replication issue in cloud native environments

Posted by Dan Smith <ds...@pivotal.io>.
I think there is some confusion here.

The client-side class ExecutablePool has a method called
setServerAffinityLocation. It looks like that is used for some internal
transaction code to make sure transactions go to the same server. I don't
think it makes any sense for the gateway to be messing with this setting.

What I was talking about was *session* *affinity* in your proxy server. For
example, if you are using k8s, session affinity as defined in this page -
https://kubernetes.io/docs/concepts/services-networking/service/

"If you want to make sure that connections from a particular client are
passed to the same Pod each time, you can select the session affinity based
on the client’s IP addresses by setting service.spec.sessionAffinity to
“ClientIP” (the default is “None”)"

I think setting *session affinity* might help your use case, because it
sounds like you are having issues with the proxy directing pings to a
different server than the data.
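
For reference, a minimal sketch of what that setting looks like on a k8s Service fronting the gateway receivers (name, label and port are illustrative, not taken from any real deployment):

apiVersion: v1
kind: Service
metadata:
  name: geode-gw-receivers        # illustrative name
spec:
  selector:
    app: geode-server             # illustrative label
  ports:
    - port: 2324                  # illustrative receiver port
      targetPort: 2324
  sessionAffinity: ClientIP       # pin each client IP to one backend pod
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800       # k8s default affinity timeout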

-Dan

RE: WAN replication issue in cloud native environments

Posted by Alberto Bustamante Reyes <al...@est.tech>.
I think that is what I did in my first attempt, but I realized I had a bug in the code. Now that I have tried again, reverting the change that executes the ping per endpoint and applying the server affinity, the connections are much more stable! Looks promising 🙂

I suppose that if I want to introduce this change, setting the server affinity in the gateway sender should be exposed as a new option in the sender configuration, right?

Re: WAN replication issue in cloud native environments

Posted by Dan Smith <ds...@pivotal.io>.
Oh, sorry, I meant server affinity with the proxy itself. So that it will
always route traffic from the same gateway sender to the same gateway
receiver. Hopefully that would ensure that pings go to the same receiver
data is sent to.

-Dan

RE: WAN replication issue in cloud native environments

Posted by Alberto Bustamante Reyes <al...@est.tech>.
I have tried setting the server affinity on the gateway sender's pool in the AbstractGatewaySender class, at the point where the server location is set, but I don't see any difference in the behavior of the connections.

I did not mention that the connections are reset every 5 seconds due to "java.io.EOFException: The connection has been reset while reading the header", but I don't know yet what is causing it.
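
For anyone trying to reproduce this, the attempt looks roughly like the sketch below (stand-in types and simplified logic, not the actual AbstractGatewaySender code; only the setServerAffinityLocation call mirrors the real ExecutablePool method mentioned earlier in the thread):

// Sketch only: stand-in types to show where the call is made.
class ServerLocation {
  final String host;
  final int port;
  ServerLocation(String host, int port) { this.host = host; this.port = port; }
}

interface ExecutablePool {
  void setServerAffinityLocation(ServerLocation location);
}

class GatewaySenderSketch {
  private final ExecutablePool pool;
  private volatile ServerLocation serverLocation;

  GatewaySenderSketch(ExecutablePool pool) { this.pool = pool; }

  // "When the server location is set": pin the sender's pool to that
  // receiver, so that pings would follow the data connection.
  void setServerLocation(ServerLocation location) {
    this.serverLocation = location;
    pool.setServerAffinityLocation(location);
  }
}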

Re: WAN replication issue in cloud native environments

Posted by Dan Smith <ds...@pivotal.io>.
> We are currently working on another issue related to this change: gw
senders' pings are not reaching the gw receivers, so ClientHealthMonitor
closes the connections. I saw that the ping tasks are created by
ServerLocation, so I have tried to solve the issue by changing them to be
created by Endpoint. This change is not finished yet; in its current state
it causes the connections from gw senders to gw receivers to be closed every
5 seconds.

Are you using session affinity? I think you will probably need it, since
pings can go over different connections than the data connection.

-Dan


RE: WAN replication issue in cloud native environments

Posted by Alberto Bustamante Reyes <al...@est.tech>.
Hi Bruce,

Thanks for your comments, but we are not planning to use TLS, so I'm afraid the PR you are working on will not solve this problem.

The origin of this issue is that we would like to be able to configure all gw receivers with the same "hostname-for-senders" value. The reason is that we will run a multisite Geode cluster, having each site on a different cloud environment, so using just one hostname makes configuration much easier.

When we tried to configure the cluster in this way, we experienced an issue with the replication. Using the same hostname-for-senders parameter causes different servers to have equal ServerLocation objects, so if one receiver is down, the others are considered down too. With the change suggested by Jacob this problem is solved, and replication works fine.

We are currently working on another issue related to this change: gw senders' pings are not reaching the gw receivers, so ClientHealthMonitor closes the connections. I saw that the ping tasks are created by ServerLocation, so I have tried to solve the issue by changing them to be created by Endpoint. This change is not finished yet; in its current state it causes the connections from gw senders to gw receivers to be closed every 5 seconds.
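
To make the idea concrete, here is a minimal, self-contained sketch of pinging per endpoint rather than per ServerLocation. EndpointKey and PingScheduler are hypothetical names for illustration, not Geode's actual internals:

import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: key ping tasks by (host, port, memberId) instead of
// host/port alone, so receivers behind one VIP each keep their own ping task.
final class EndpointKey {
  final String host;
  final int port;
  final String memberId;

  EndpointKey(String host, int port, String memberId) {
    this.host = host;
    this.port = port;
    this.memberId = memberId;
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof EndpointKey)) {
      return false;
    }
    EndpointKey other = (EndpointKey) o;
    return port == other.port && host.equals(other.host)
        && memberId.equals(other.memberId);
  }

  @Override
  public int hashCode() {
    return Objects.hash(host, port, memberId);
  }
}

final class PingScheduler {
  private final ScheduledExecutorService executor =
      Executors.newSingleThreadScheduledExecutor();
  private final Map<EndpointKey, ScheduledFuture<?>> pings = new ConcurrentHashMap<>();

  // One ping task per endpoint: a second member behind the same VIP gets its
  // own entry because the member id differs, so neither member times out.
  void ensurePing(EndpointKey key, Runnable sendPing) {
    pings.computeIfAbsent(key,
        k -> executor.scheduleAtFixedRate(sendPing, 0, 5, TimeUnit.SECONDS));
  }

  void stopPing(EndpointKey key) {
    ScheduledFuture<?> task = pings.remove(key);
    if (task != null) {
      task.cancel(false);
    }
  }
}

With host, port and member id all in the key, two receivers behind one VIP keep separate ping tasks instead of collapsing into one.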

Why don't you like the idea of using the InternalDistributedMember to distinguish server locations? Are you thinking about another alternative? In this use case, two different gw receivers will have the same ServerLocation, so we need a way to distinguish them.

BR/

Alberto B.


Re: WAN replication issue in cloud native environments

Posted by Bruce Schuchardt <bs...@pivotal.io>.
I'm coming to this conversation late and probably am missing a lot of context.  Is the point of this to direct senders to some common gateway that all of the gateway receivers are configured to advertise?  I've been working on a PR to support redirection of connections for client/server and gateway communications to a common address and put the destination host name in the SNIHostName TLS parameter.  Then you won't have to tell servers about the common host name - just tell clients what the gateway is and they'll connect to it & tell it what the target host name is via the SNIHostName.  However, that only works if SSL is enabled.

PR 4743 is a step toward this approach and changes TcpClient and SocketCreator to take an unresolved host address.  After this is merged another change will allow folks to set a gateway host/port that will be used to form connections and insert the destination hostname into the SNIHostName SSLParameter.
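
For reference, this is roughly what pinning the intended target host into the TLS handshake looks like with the standard JSSE API; this is a sketch only, with made-up hostnames, not the PR's code:

import java.util.List;
import javax.net.ssl.SNIHostName;
import javax.net.ssl.SSLParameters;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

public class SniClientSketch {
  public static void main(String[] args) throws Exception {
    // Open a TLS connection to the shared gateway address...
    SSLSocketFactory factory = (SSLSocketFactory) SSLSocketFactory.getDefault();
    SSLSocket socket = (SSLSocket) factory.createSocket("gateway.example.com", 443);

    // ...and carry the real target host in the SNI extension so an SNI-aware
    // proxy can route the connection to the right backend member.
    SSLParameters params = socket.getSSLParameters();
    params.setServerNames(List.of(new SNIHostName("receiver-0.geode.internal")));
    socket.setSSLParameters(params);

    socket.startHandshake();
  }
}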

I would really like us to avoid including InternalDistributedMembers in equality checks for server-locations.  To date we've only held these identifiers in Endpoints and other places for debugging purposes and have used ServerLocation to identify servers.


RE: WAN replication issue in cloud native environments

Posted by Alberto Bustamante Reyes <al...@est.tech>.
Hi again,

Status update: the simplification of the maps suggested by Jacob made the proposed new class containing the ServerLocation and the member id unnecessary. With this refactoring, replication is working in the scenario we have been discussing in this conversation. That's great, and I think the code can be merged into develop if there are no extra comments in the PR.

But this does not mean we can say that Geode is able to work properly when using gw receivers with the same IP + port. We have seen that when working with this configuration, there is a problem with the pings sent from gw senders (which act as clients) to the gw receivers (servers). The pings reach just one of the receivers, so the sender-receiver connections are eventually closed by the ClientHealthMonitor.

Do you have any suggestions about how to handle this issue? My first idea was to identify where the connection is created, to check whether the sender could somehow be made aware that there is more than one server to which the ping should be sent, but I'm not sure if that is possible. Alternatively, the ClientHealthMonitor could be made "clever" enough to not close connections in this case. Any comment is welcome 🙂

Thanks,

Alberto B.


Re: WAN replication issue in cloud native environments

Posted by Jacob Barrett <jb...@pivotal.io>.

> On Jan 22, 2020, at 9:51 AM, Alberto Bustamante Reyes <al...@est.tech> wrote:
> 
> Thanks Naba & Jacob for your comments!
> 
> 
> 
> @Naba: I have been implementing a solution as you suggested, and I think it would be convenient if the client knows the memberId of the server it is connected to.
> 
> (current code is here: https://github.com/apache/geode/pull/4616 )
> 
> For example, in:
> 
> LocatorLoadSnapshot::getReplacementServerForConnection(ServerLocation currentServer, String group, Set<ServerLocation> excludedServers)
> 
> In this method, the client has sent the ServerLocation, but if that object does not contain the memberId, I don't see how to guarantee that the replacement that will be returned is not the same server the client is currently connected to.
> Inside that method, this other method is called:


Given that your setup is masquerading multiple members behind the same host and port (ServerLocation) it doesn’t matter. When the pool opens a new socket to the replacement server it will be to the shared hostname and port and the Kubernetes service at that host and port will just pick a backend host. In the solution we suggested we preserved that behavior since the k8s service can’t determine which backend member to route the connection to based on the member id.


> LocatorLoadSnapshot::isCurrentServerMostLoaded(currentServer, groupServers)
> 
> where groupServers is a "Map<ServerLocationAndMemberId, LoadHolder>" object. If the keys of that map have the same host and port, they differ only in the memberId. But as you don't know it (you just have currentServer, which contains host and port), you cannot get the correct LoadHolder value, so you cannot know if your server is the most loaded.

Again, given your use case the behavior of this method is lost when a new connection is established by the pool through the shared hostname anyway.

> @Jacob: I think the solution finally implies that client have to know the memberId, I think we could simplify the maps.

The client isn’t keeping these load maps, the locator is, and the locator knows all the member ids. The client end only needs to know the host/port combination. In your example, the wan replication (a client to the remote cluster) connects to the shared host/port service and gets randomly routed to one of the backend servers in that service.

All of this locator balancing code is unnecessary in this model where something else is choosing the final destination. The goal of our proposed changes was to recognize that all we need is to make sure the locator keeps the shared ServerLocation alive in its responses to clients by tracking the members associated and reducing that set to the set of unit ServerLocations. In your case that will always reduce to 1 ServerLocation for N number of members, as long as 1 member is still up.
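
A minimal sketch of that reduction (illustrative names only, not the actual patch): the locator tracks the members behind each advertised location and keeps a ServerLocation in its responses while at least one member remains:

import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative only: track which members stand behind each advertised
// ServerLocation, and report a location as live while any member remains.
public class LocationReductionSketch {
  record ServerLocation(String host, int port) {}

  // advertised location -> ids of the members currently alive behind it
  private final Map<ServerLocation, Set<String>> membersByLocation =
      new ConcurrentHashMap<>();

  void memberUp(ServerLocation location, String memberId) {
    membersByLocation
        .computeIfAbsent(location, l -> ConcurrentHashMap.newKeySet())
        .add(memberId);
  }

  void memberDown(ServerLocation location, String memberId) {
    membersByLocation.computeIfPresent(location, (l, members) -> {
      members.remove(memberId);
      return members.isEmpty() ? null : members; // drop only when no member is left
    });
  }

  // N members behind one VIP reduce to a single ServerLocation in the
  // locator's responses, kept alive until the last member goes down.
  Set<ServerLocation> liveLocations() {
    return Set.copyOf(membersByLocation.keySet());
  }
}

memberDown drops a location only when its member set becomes empty, which is exactly the "alive as long as 1 member is up" behavior described above.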

-Jake



RE: WAN replication issue in cloud native environments

Posted by Alberto Bustamante Reyes <al...@est.tech>.
Thanks Naba & Jacob for your comments!



@Naba: I have been implementing a solution as you suggested, and I think it would be convenient if the client knows the memberId of the server it is connected to.

(current code is here: https://github.com/apache/geode/pull/4616 )

For example, in:

LocatorLoadSnapshot::getReplacementServerForConnection(ServerLocation currentServer, String group, Set<ServerLocation> excludedServers)

In this method, the client has sent the ServerLocation, but if that object does not contain the memberId, I don't see how to guarantee that the replacement that will be returned is not the same server the client is currently connected to.
Inside that method, this other method is called:

LocatorLoadSnapshot::isCurrentServerMostLoaded(currentServer, groupServers)

where groupServers is a "Map<ServerLocationAndMemberId, LoadHolder>" object. If the keys of that map have the same host and port, they differ only in the memberId. But as you don't know it (you just have currentServer, which contains host and port), you cannot get the correct LoadHolder value, so you cannot know if your server is the most loaded.

So I think the client needs to know the memberId of the server.
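
To make the collision concrete, here is a small self-contained demo; Location is a stand-in class whose equality, like ServerLocation's, is based on host and port only:

import java.util.HashMap;
import java.util.Map;

public class SharedLocationDemo {
  // Stand-in for ServerLocation: record equality covers host and port only.
  record Location(String host, int port) {}

  public static void main(String[] args) {
    Map<Location, String> loadByServer = new HashMap<>();

    // Two distinct receivers advertised behind the same VIP and port...
    loadByServer.put(new Location("geode-vip", 5000), "load of member A");
    loadByServer.put(new Location("geode-vip", 5000), "load of member B");

    // ...collapse into a single entry: member A's data was overwritten.
    System.out.println(loadByServer.size()); // prints 1
  }
}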

@Jacob: I think the solution ultimately implies that clients have to know the memberId; I think we could simplify the maps.


BR/

Alberto B.



Re: WAN replication issue in cloud native environments

Posted by Jacob Barrett <jb...@pivotal.io>.
> On Jan 21, 2020, at 1:24 PM, Nabarun Nag <nn...@apache.org> wrote:
> 
> Suggestion:
> - Instead, can we create a new class that contains the memberID and
> ServerLocation and that new class object is added as a key in the
> connectionMap.

I poked around a bit in this code and the ServerLocation is also in the LoadHolder class, so we can simplify this even more by just using the member ID as the key in all these maps. When we need the ServerLocation we can get that from the LoadHolder.

The addServer call comes from a caller that has the CacheServerProfile, which has the member ID. The updateLoad caller is a DistributedMessage which has a sender member that is the member ID. Lastly, the removeServer caller has a CacheServerProfile as well, so we can again get the member ID.
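
A minimal sketch of that simplification (illustrative types, not Geode's actual code): the maps key on the member ID alone, and the ServerLocation is read back out of the LoadHolder when needed:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch only: key the locator's load bookkeeping by member ID
// and keep the ServerLocation inside the LoadHolder value.
public class LoadSnapshotSketch {
  record ServerLocation(String host, int port) {}

  static final class LoadHolder {
    final ServerLocation location;
    float load;

    LoadHolder(ServerLocation location, float load) {
      this.location = location;
      this.load = load;
    }
  }

  // Member ID (unique per server process) is the key, so two members behind
  // the same host/port no longer collide.
  private final Map<String, LoadHolder> loadByMember = new ConcurrentHashMap<>();

  void addServer(String memberId, ServerLocation location, float load) {
    loadByMember.put(memberId, new LoadHolder(location, load));
  }

  void updateLoad(String memberId, float load) {
    LoadHolder holder = loadByMember.get(memberId);
    if (holder != null) {
      holder.load = load;
    }
  }

  void removeServer(String memberId) {
    loadByMember.remove(memberId);
  }

  // When responding to clients we only need the host/port.
  ServerLocation locationOf(String memberId) {
    LoadHolder holder = loadByMember.get(memberId);
    return holder == null ? null : holder.location;
  }
}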

-Jake



Re: WAN replication issue in cloud native environments

Posted by Nabarun Nag <nn...@apache.org>.
Hi Alberto,

Thank you for the contributions to the Apache Geode project

Here is some feedback and a few pointers that we came up with:
1. Right now, looking at your solution, we can see that you are modifying
the class "ServerLocation", which is stored as a key in the
connectionMap in LocatorLoadSnapshot.
2. ServerLocation was modified to include the memberID to differentiate
each server with the same hostname-for-senders and same pair of start and
end ports.
3. As ServerLocation is transmitted over the wire, this required a lot of
changes in terms of serialization etc., and also modifications in the ops code.

Suggestion:
- Instead, can we create a new class that contains the memberID and the
ServerLocation, and add an object of that new class as the key in the
connectionMap (a minimal sketch of such a key class follows below).
- When member leaves, only that entry is removed from the connectionMap.
- When the remote locator requests the receiver information, we
continue sending the ServerLocation, which we extract from the newly
created class.

Advantage:
- No changes required in terms of serialization as we are still sending the
ServerLocation like before.
- No changes to the ops.
- No extra bits sent over the wire
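
For illustration, a minimal sketch of such a key class, assuming Geode's existing ServerLocation class and a String member id (the thread elsewhere calls this class ServerLocationAndMemberId):

import java.util.Objects;
import org.apache.geode.distributed.internal.ServerLocation;

// Illustrative sketch of the suggested locator-side key: it never goes over
// the wire, so ServerLocation's serialization stays untouched.
public final class ServerLocationAndMemberId {
  private final ServerLocation location; // what remote locators still receive
  private final String memberId;         // what distinguishes co-located receivers

  public ServerLocationAndMemberId(ServerLocation location, String memberId) {
    this.location = location;
    this.memberId = memberId;
  }

  public ServerLocation getLocation() {
    return location;
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof ServerLocationAndMemberId)) {
      return false;
    }
    ServerLocationAndMemberId other = (ServerLocationAndMemberId) o;
    return location.equals(other.location) && memberId.equals(other.memberId);
  }

  @Override
  public int hashCode() {
    return Objects.hash(location, memberId);
  }
}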


Please do let us know what you think about this solution.

Regards
Naba








RE: WAN replication issue in cloud native environments

Posted by Alberto Bustamante Reyes <al...@est.tech>.
Hi,

I have been implementing a possible solution for this issue, and although I have not finished yet, I would like to kindly ask for comments.

I created some Helm charts to explain and reproduce the problem; if you are interested, they are here: https://github.com/alb3rtobr/geode-cloudnative-wan-replication

The solution consists of adding to ServerLocation the id of the member hosting the server, to make it possible to differentiate two or more gateway receivers that share the same IP but live on different members. I verified that this change fixes the problem.

After that, I have been working on fixing issues with the existing tests. In the meantime, it would be useful to get some feedback about the solution, especially on impacts I have not considered yet (maybe they are the reason for the failing tests I'm currently working on).

The code can be found on this PR: https://github.com/apache/geode/pull/4489

Thanks in advance!

Alberto B.


________________________________
De: Anilkumar Gingade <ag...@pivotal.io>
Enviado: viernes, 6 de diciembre de 2019 18:56
Para: geode <de...@geode.apache.org>
Cc: Charlie Black <cb...@pivotal.io>
Asunto: Re: WAN replication issue in cloud native environments

Alberto,

Can you please file a JIRA ticket for this. This could come up often as
more and more deployments move to K8s.

-Anil.


On Fri, Dec 6, 2019 at 8:33 AM Sai Boorlagadda <sa...@gmail.com>
wrote:

> > if one gw receiver stops, the locator will publish to any remote locator
> that there are no receivers up.
>
> I am not sure that locators proactively update remote locators about
> changes in the receivers list; rather, I think the senders figure this out
> on connection issues.
> But I see the problem that local-site locators have only one member in the
> list of receivers that they maintain as all receivers register with a
> single <hostname:port> address.
>
> One idea I had earlier is to statically set receivers list to locators
> (just like remote-locators property) which are exchanged with gw-senders.
> This way we can introduce a boolean flag to turn off wan discovery and use
> the statically configured addresses. This can also be useful for
> remote-locators if they are behind a service.
>
> Sai
>
> On Thu, Dec 5, 2019 at 2:33 AM Alberto Bustamante Reyes
> <al...@est.tech> wrote:
>
> > Thanks Charlie, but the issue is not about connectivity. Summarizing the
> > issue: if you have two or more gw receivers started with the same
> > "hostname-for-senders", "start-port" and "end-port" values (with
> > "start-port" and "end-port" being equal), then when one gw receiver
> > stops, the locator will publish to any remote locator that there are no
> > receivers up.
> >
> > And this use case is likely to happen on cloud-native environments, as
> > described.
> >
> > BR/
> >
> > Alberto B.
> > ________________________________
> > De: Charlie Black <cb...@pivotal.io>
> > Enviado: miércoles, 4 de diciembre de 2019 18:11
> > Para: dev@geode.apache.org <de...@geode.apache.org>
> > Asunto: Re: WAN replication issue in cloud native environments
> >
> > Alberto,
> >
> > Something else to think about: SNI-based routing.   I believe Mario might
> > be working on adding SNI to Geode - he at least had a proposal that he
> > e-mailed out.
> >
> > Basics are the destination host is in the SNI field and the proxy can
> > inspect and route the request to the right service instance.     Plus we
> > have the option to not terminate the SSL at the proxy.
> >
> > Full disclosure - I haven't tried out SNI based routing myself and it is
> > something that I thought could work as I was reading about it.   From the
> > whiteboard I have done I think this will do ingress and egress just fine.
> > Potentially easier than port mapping and `hostname for clients` playing
> > around.
> >
> > Just something to think about.
> >
> > Charlie
> >
> >
> > On Wed, Dec 4, 2019 at 3:19 AM Alberto Bustamante Reyes
> > <al...@est.tech> wrote:
> >
> > > Hi Jacob,
> > >
> > > Yes, we are using the LoadBalancer service type. But note the problem is
> > > not in the transport layer but in Geode, as GW senders are complaining
> > > “sender-2-parallel : Could not connect due to: There are no active
> > > servers.” when one of the servers in the receiving cluster is killed.
> > >
> > > So, there is still one server alive in the receiving cluster, but the GW
> > > sender does not know it and the locator is not able to inform it of its
> > > existence. Looking at the code, it seems the internal data structures
> > > (maps) holding the profiles use objects whose equality check relies only
> > > on hostname and port. This makes it impossible to differentiate servers
> > > when the same “hostname-for-senders” and port are used. When the killed
> > > server comes back up, the locator profiles are updated (the internal map
> > > is back to size()=1 although 2+ servers are there) and GW senders happily
> > > reconnect.
> > >
> > > The solution with Geode as-is would be to expose each GW receiver on a
> > > different port outside the k8s cluster; this includes creating N
> > > Kubernetes services for N GW receivers, in addition to updating the
> > > service mesh configuration (if one is used), firewalls, etc. The
> > > declarative nature of Kubernetes means we must know the ports in
> > > advance, hence start-port and end-port when creating each GW receiver
> > > must be equal, and we should have some well-known algorithm when
> > > creating GW receivers across servers. For example: server-0 port 5000,
> > > server-1 port 5001, server-2 port 5002, etc. So, all GW receivers must
> > > be wired individually and we must turn off Geode’s random port
> > > allocation.
> > >
> > > But we are exploring the possibility for Geode to handle this
> > > cloud-native configuration a bit better. Locators should be capable of
> > > holding GW receiver information even though the receivers are hidden
> > > behind the same hostname and port. This is a code change in Geode and we
> > > would like to have the community's opinion on it.
> > >
> > > One obvious impact on the legacy behavior: when the locator picks a
> > > server on behalf of the client (the GW sender in this case), it does so
> > > based on the server load. When the sender connects, and considering all
> > > servers are using the same VIP:PORT, it is the load balancer that
> > > decides where the connection will end up, likely not on the server
> > > selected by the locator. So here we ignore the locator's instructions.
> > > Since GW senders normally do not create a huge number of connections,
> > > this probably will not unbalance the cluster too much. But it is an
> > > impact worth considering. Custom load metrics would also be ignored by
> > > GW senders. Opinions?
> > >
> > > An additional impact that comes to mind is the GW sender load-balance
> > > command and how its execution would be affected.
> > >
> > > Thanks!
> > >
> > > Alberto B.
> > >
> > > ________________________________
> > > De: Jacob Barrett <jb...@pivotal.io>
> > > Enviado: viernes, 29 de noviembre de 2019 13:06
> > > Para: dev@geode.apache.org <de...@geode.apache.org>
> > > Asunto: Re: WAN replication issue in cloud native environments
> > >
> > >
> > >
> > > > On Nov 29, 2019, at 3:14 AM, Alberto Bustamante Reyes
> > > <al...@est.tech> wrote:
> > > >
> > > > The reason for such a setup is deploying Geode cluster on a
> Kubernetes
> > > cluster where all GW receivers are reachable from the outside world on
> > the
> > > same VIP and port.
> > >
> > > Are you using LoadBalancer Service type?
> > >
> > > > Other kinds of configuration (different hostname and/or different
> port
> > > for each GW receiver) are not cheap from OAM and resources perspective
> in
> > > cloud native environments and also limit some important use-cases (like
> > > scaling).
> > >
> > > If you could somehow configure host and port for the sender (code
> > > modification required), would exposing each port through the
> > > LoadBalancer be too expensive?
> > >
> > > > The problem experienced is that shutting down one server is stopping
> > > replication to this cluster until the server is up again. We suspect
> this
> > > is because Geode incorrectly assumes there are no more alive servers
> when
> > > just one of them is down (since they share hostname-for-senders and
> > port).
> > >
> > > Seems like in the worst case, when it tries to reconnect, the LB should
> > > give it a live server and it will think the single server is back up.
> > >
> > > -Jake
> > >
> > >
> >
> > --
> > Charlie Black | cblack@pivotal.io
> >
>

Re: WAN replication issue in cloud native environments

Posted by Anilkumar Gingade <ag...@pivotal.io>.
Alberto,

Can you please file a JIRA ticket for this. This could come up often as
more and more deployments move to K8s.

-Anil.


On Fri, Dec 6, 2019 at 8:33 AM Sai Boorlagadda <sa...@gmail.com>
wrote:

> > if one gw receiver stops, the locator will publish to any remote locator
> that there are no receivers up.
>
> I am not sure if locators proactively update remote locators about change
> in receivers list rather I think the senders figures this out on connection
> issues.
> But I see the problem that local-site locators have only one member in the
> list of receivers that they maintain as all receivers register with a
> single <hostname:port> address.
>
> One idea I had earlier is to statically set receivers list to locators
> (just like remote-locators property) which are exchanged with gw-senders.
> This way we can introduce a boolean flag to turn off wan discovery and use
> the statically configured addresses. This can be also useful for
> remote-locators if they are behind a service.
>
> Sai
>
> On Thu, Dec 5, 2019 at 2:33 AM Alberto Bustamante Reyes
> <al...@est.tech> wrote:
>
> > Thanks Charlie, but the issue is not about connectivity. Summarizing the
> > issue: if you have two or more gw receivers started with the same values
> > of the "hostname-for-senders", "start-port" and "end-port" parameters
> > ("start-port" and "end-port" being equal), then when one gw receiver
> > stops, the locator will publish to any remote locator that there are no
> > receivers up.
> >
> > And this use case is likely to happen in cloud-native environments, as
> > described.
> >
> > BR/
> >
> > Alberto B.
> > ________________________________
> > De: Charlie Black <cb...@pivotal.io>
> > Enviado: miércoles, 4 de diciembre de 2019 18:11
> > Para: dev@geode.apache.org <de...@geode.apache.org>
> > Asunto: Re: WAN replication issue in cloud native environments
> >
> > Alberto,
> >
> > Something else to think about: SNI-based routing. I believe Mario might
> > be working on adding SNI to Geode - he at least had a proposal that he
> > e-mailed out.
> >
> > The basics: the destination host is in the SNI field and the proxy can
> > inspect it and route the request to the right service instance. Plus we
> > have the option to not terminate the SSL at the proxy.
> >
> > Full disclosure - I haven't tried out SNI-based routing myself and it is
> > something that I thought could work as I was reading about it. From the
> > whiteboard I have done I think this will do ingress and egress just fine.
> > Potentially easier than playing around with port mapping and
> > `hostname for clients`.
> >
> > Just something to think about.
> >
> > Charlie
> >
> >
> > On Wed, Dec 4, 2019 at 3:19 AM Alberto Bustamante Reyes
> > <al...@est.tech> wrote:
> >
> > > Hi Jacob,
> > >
> > > Yes, we are using the LoadBalancer service type. But note the problem
> > > is not in the transport layer but in Geode, as GW senders complain
> > > “sender-2-parallel : Could not connect due to: There are no active
> > > servers.” when one of the servers in the receiving cluster is killed.
> > >
> > > So, there is still one server alive in the receiving cluster, but the
> > > GW sender does not know it and the locator is not able to inform it of
> > > its existence. Looking at the code, it seems the internal data
> > > structures (maps) holding the profiles use objects whose equality check
> > > relies only on hostname and port. This makes it impossible to
> > > differentiate servers when the same “hostname-for-senders” and port are
> > > used. When the killed server comes back up, the locator profiles are
> > > updated (internal map back to size()=1 although 2+ servers are there)
> > > and GW senders happily reconnect.
> > >
> > > The solution with Geode as-is would be to expose each GW receiver on a
> > > different port outside of the k8s cluster; this includes creating N
> > > Kubernetes services for N GW receivers, in addition to updating the
> > > service mesh configuration (if it is used), firewalls, etc. The
> > > declarative nature of Kubernetes means we must know the ports in
> > > advance, hence start-port and end-port when creating each GW receiver
> > > must be equal, and we should have some well-known algorithm when
> > > creating GW receivers across servers. For example: server-0 port 5000,
> > > server-1 port 5001, server-2 port 5002, etc. So, all GW receivers must
> > > be wired individually and we must turn off Geode’s random port
> > > allocation.
> > >
> > > But we are exploring the possibility for Geode to handle this
> > > cloud-native configuration a bit better. Locators should be capable of
> > > holding GW receiver information even though the receivers are hidden
> > > behind the same hostname and port. This is a code change in Geode and
> > > we would like to have the community's opinion on it.
> > >
> > > One obvious impact on the legacy behavior: when the locator picks a
> > > server on behalf of the client (the GW sender in this case), it does so
> > > based on the server load. When the sender connects, and considering all
> > > servers are using the same VIP:PORT, it is the load balancer that
> > > decides where the connection ends up, likely not on the one selected by
> > > the locator. So here we ignore the locator's instructions. Since GW
> > > senders normally do not create a huge number of connections, this
> > > probably will not unbalance the cluster too much. But this is an impact
> > > worth considering. Custom load metrics would also be ignored by GW
> > > senders. Opinions?
> > >
> > > Additional impact that comes to mind is the GW sender load-balance
> > > command and how its execution would be affected.
> > >
> > > Thanks!
> > >
> > > Alberto B.
> > >
> > > ________________________________
> > > De: Jacob Barrett <jb...@pivotal.io>
> > > Enviado: viernes, 29 de noviembre de 2019 13:06
> > > Para: dev@geode.apache.org <de...@geode.apache.org>
> > > Asunto: Re: WAN replication issue in cloud native environments
> > >
> > >
> > >
> > > > On Nov 29, 2019, at 3:14 AM, Alberto Bustamante Reyes
> > > <al...@est.tech> wrote:
> > > >
> > > > The reason for such a setup is deploying Geode cluster on a
> Kubernetes
> > > cluster where all GW receivers are reachable from the outside world on
> > the
> > > same VIP and port.
> > >
> > > Are you using LoadBalancer Service type?
> > >
> > > > Other kinds of configuration (different hostname and/or different
> port
> > > for each GW receiver) are not cheap from OAM and resources perspective
> in
> > > cloud native environments and also limit some important use-cases (like
> > > scaling).
> > >
> > > If you could somehow configure host and port for the sender (code
> > > modification required), would exposing each port through the
> > > LoadBalancer be too expensive as well?
> > >
> > > > The problem experienced is that shutting down one server is stopping
> > > replication to this cluster until the server is up again. We suspect
> this
> > > is because Geode incorrectly assumes there are no more alive servers
> when
> > > just one of them is down (since they share hostname-for-senders and
> > port).
> > >
> > > Seems like, at worst, when it tries to reconnect the LB should give
> > > it a live server and it will think the single server is back up.
> > >
> > > -Jake
> > >
> > >
> >
> > --
> > Charlie Black | cblack@pivotal.io
> >
>

Re: WAN replication issue in cloud native environments

Posted by Sai Boorlagadda <sa...@gmail.com>.
> if one gw receiver stops, the locator will publish to any remote locator
> that there are no receivers up.

I am not sure that locators proactively update remote locators about
changes in the receivers list; rather, I think the senders figure this out
on connection issues.
But I see the problem that local-site locators have only one member in the
list of receivers that they maintain, since all receivers register with a
single <hostname:port> address.

One idea I had earlier is to statically set the receivers list on locators
(just like the remote-locators property), to be exchanged with gw-senders.
This way we can introduce a boolean flag to turn off WAN discovery and use
the statically configured addresses. This can also be useful for
remote-locators if they are behind a service.
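
To make the idea concrete, a hypothetical gemfire.properties sketch; only
remote-locators exists in Geode today, the other two property names are
invented purely to illustrate the proposal:

    # exists today: locators of the remote site, used for WAN discovery
    remote-locators=203.0.113.10[10334]
    # hypothetical: turn off dynamic receiver discovery...
    wan-receiver-discovery=false
    # ...and hand gw-senders a statically configured receiver address instead
    static-gateway-receivers=geode-receivers.example.com[5000]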

Sai

On Thu, Dec 5, 2019 at 2:33 AM Alberto Bustamante Reyes
<al...@est.tech> wrote:

> Thanks Charlie, but the issue is not about connectivity. Summarizing the
> issue: if you have two or more gw receivers started with the same values
> of the "hostname-for-senders", "start-port" and "end-port" parameters
> ("start-port" and "end-port" being equal), then when one gw receiver
> stops, the locator will publish to any remote locator that there are no
> receivers up.
>
> And this use case is likely to happen in cloud-native environments, as
> described.
>
> BR/
>
> Alberto B.
> ________________________________
> De: Charlie Black <cb...@pivotal.io>
> Enviado: miércoles, 4 de diciembre de 2019 18:11
> Para: dev@geode.apache.org <de...@geode.apache.org>
> Asunto: Re: WAN replication issue in cloud native environments
>
> Alberto,
>
> Something else to think about: SNI-based routing. I believe Mario might be
> working on adding SNI to Geode - he at least had a proposal that he
> e-mailed out.
>
> The basics: the destination host is in the SNI field and the proxy can
> inspect it and route the request to the right service instance. Plus we
> have the option to not terminate the SSL at the proxy.
>
> Full disclosure - I haven't tried out SNI-based routing myself and it is
> something that I thought could work as I was reading about it. From the
> whiteboard I have done I think this will do ingress and egress just fine.
> Potentially easier than playing around with port mapping and
> `hostname for clients`.
>
> Just something to think about.
>
> Charlie
>
>
> On Wed, Dec 4, 2019 at 3:19 AM Alberto Bustamante Reyes
> <al...@est.tech> wrote:
>
> > Hi Jacob,
> >
> > Yes, we are using the LoadBalancer service type. But note the problem is
> > not in the transport layer but in Geode, as GW senders complain
> > “sender-2-parallel : Could not connect due to: There are no active
> > servers.” when one of the servers in the receiving cluster is killed.
> >
> > So, there is still one server alive in the receiving cluster, but the GW
> > sender does not know it and the locator is not able to inform it of its
> > existence. Looking at the code, it seems the internal data structures
> > (maps) holding the profiles use objects whose equality check relies only
> > on hostname and port. This makes it impossible to differentiate servers
> > when the same “hostname-for-senders” and port are used. When the killed
> > server comes back up, the locator profiles are updated (internal map back
> > to size()=1 although 2+ servers are there) and GW senders happily
> > reconnect.
> >
> > The solution with Geode as-is would be to expose each GW receiver on a
> > different port outside of the k8s cluster; this includes creating N
> > Kubernetes services for N GW receivers, in addition to updating the
> > service mesh configuration (if it is used), firewalls, etc. The
> > declarative nature of Kubernetes means we must know the ports in advance,
> > hence start-port and end-port when creating each GW receiver must be
> > equal, and we should have some well-known algorithm when creating GW
> > receivers across servers. For example: server-0 port 5000, server-1 port
> > 5001, server-2 port 5002, etc. So, all GW receivers must be wired
> > individually and we must turn off Geode’s random port allocation.
> >
> > But we are exploring the possibility for Geode to handle this
> > cloud-native configuration a bit better. Locators should be capable of
> > holding GW receiver information even though the receivers are hidden
> > behind the same hostname and port. This is a code change in Geode and we
> > would like to have the community's opinion on it.
> >
> > One obvious impact on the legacy behavior: when the locator picks a
> > server on behalf of the client (the GW sender in this case), it does so
> > based on the server load. When the sender connects, and considering all
> > servers are using the same VIP:PORT, it is the load balancer that decides
> > where the connection ends up, likely not on the one selected by the
> > locator. So here we ignore the locator's instructions. Since GW senders
> > normally do not create a huge number of connections, this probably will
> > not unbalance the cluster too much. But this is an impact worth
> > considering. Custom load metrics would also be ignored by GW senders.
> > Opinions?
> >
> > Additional impact that comes to mind is the GW sender load-balance
> > command and how its execution would be affected.
> >
> > Thanks!
> >
> > Alberto B.
> >
> > ________________________________
> > De: Jacob Barrett <jb...@pivotal.io>
> > Enviado: viernes, 29 de noviembre de 2019 13:06
> > Para: dev@geode.apache.org <de...@geode.apache.org>
> > Asunto: Re: WAN replication issue in cloud native environments
> >
> >
> >
> > > On Nov 29, 2019, at 3:14 AM, Alberto Bustamante Reyes
> > <al...@est.tech> wrote:
> > >
> > > The reason for such a setup is deploying Geode cluster on a Kubernetes
> > cluster where all GW receivers are reachable from the outside world on
> the
> > same VIP and port.
> >
> > Are you using LoadBalancer Service type?
> >
> > > Other kinds of configuration (different hostname and/or different port
> > for each GW receiver) are not cheap from OAM and resources perspective in
> > cloud native environments and also limit some important use-cases (like
> > scaling).
> >
> > If you could somehow configure host and port for the sender (code
> > modification required), would exposing each port through the
> > LoadBalancer be too expensive as well?
> >
> > > The problem experienced is that shutting down one server is stopping
> > replication to this cluster until the server is up again. We suspect this
> > is because Geode incorrectly assumes there are no more alive servers when
> > just one of them is down (since they share hostname-for-senders and
> port).
> >
> > Seems like, at worst, when it tries to reconnect the LB should give it
> > a live server and it will think the single server is back up.
> >
> > -Jake
> >
> >
>
> --
> Charlie Black | cblack@pivotal.io
>

RE: WAN replication issue in cloud native environments

Posted by Alberto Bustamante Reyes <al...@est.tech>.
Thanks Charlie, but the issue is not about connectivity. Summarizing the issue: if you have two or more gw receivers started with the same values of the "hostname-for-senders", "start-port" and "end-port" parameters ("start-port" and "end-port" being equal), then when one gw receiver stops, the locator will publish to any remote locator that there are no receivers up.
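
For illustration, this is the kind of setup that triggers it (a gfsh sketch; the hostname, member names and port are placeholders):

    # every receiver gets the identical hostname-for-senders and a fixed, identical port
    create gateway-receiver --member=server-0 --hostname-for-senders=geode-receivers.example.com --start-port=5000 --end-port=5000
    create gateway-receiver --member=server-1 --hostname-for-senders=geode-receivers.example.com --start-port=5000 --end-port=5000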

And this use case is likely to happen in cloud-native environments, as described.

BR/

Alberto B.
________________________________
De: Charlie Black <cb...@pivotal.io>
Enviado: miércoles, 4 de diciembre de 2019 18:11
Para: dev@geode.apache.org <de...@geode.apache.org>
Asunto: Re: WAN replication issue in cloud native environments

Alberto,

Something else to think about: SNI-based routing. I believe Mario might be
working on adding SNI to Geode - he at least had a proposal that he
e-mailed out.

The basics: the destination host is in the SNI field and the proxy can
inspect it and route the request to the right service instance. Plus we
have the option to not terminate the SSL at the proxy.

Full disclosure - I haven't tried out SNI-based routing myself and it is
something that I thought could work as I was reading about it. From the
whiteboard I have done I think this will do ingress and egress just fine.
Potentially easier than playing around with port mapping and
`hostname for clients`.

Just something to think about.

Charlie


On Wed, Dec 4, 2019 at 3:19 AM Alberto Bustamante Reyes
<al...@est.tech> wrote:

> Hi Jacob,
>
> Yes, we are using the LoadBalancer service type. But note the problem is
> not in the transport layer but in Geode, as GW senders complain
> “sender-2-parallel : Could not connect due to: There are no active
> servers.” when one of the servers in the receiving cluster is killed.
>
> So, there is still one server alive in the receiving cluster, but the GW
> sender does not know it and the locator is not able to inform it of its
> existence. Looking at the code, it seems the internal data structures
> (maps) holding the profiles use objects whose equality check relies only on
> hostname and port. This makes it impossible to differentiate servers when
> the same “hostname-for-senders” and port are used. When the killed server
> comes back up, the locator profiles are updated (internal map back to
> size()=1 although 2+ servers are there) and GW senders happily reconnect.
>
> The solution with Geode as-is would be to expose each GW receiver on a
> different port outside of the k8s cluster; this includes creating N
> Kubernetes services for N GW receivers, in addition to updating the service
> mesh configuration (if it is used), firewalls, etc. The declarative nature
> of Kubernetes means we must know the ports in advance, hence start-port and
> end-port when creating each GW receiver must be equal, and we should have
> some well-known algorithm when creating GW receivers across servers. For
> example: server-0 port 5000, server-1 port 5001, server-2 port 5002, etc.
> So, all GW receivers must be wired individually and we must turn off
> Geode’s random port allocation.
>
> But we are exploring the possibility for Geode to handle this cloud-native
> configuration a bit better. Locators should be capable of holding GW
> receiver information even though the receivers are hidden behind the same
> hostname and port. This is a code change in Geode and we would like to have
> the community's opinion on it.
>
> One obvious impact on the legacy behavior: when the locator picks a server
> on behalf of the client (the GW sender in this case), it does so based on
> the server load. When the sender connects, and considering all servers are
> using the same VIP:PORT, it is the load balancer that decides where the
> connection ends up, likely not on the one selected by the locator. So here
> we ignore the locator's instructions. Since GW senders normally do not
> create a huge number of connections, this probably will not unbalance the
> cluster too much. But this is an impact worth considering. Custom load
> metrics would also be ignored by GW senders. Opinions?
>
> Additional impact that comes to mind is the GW sender load-balance command
> and how its execution would be affected.
>
> Thanks!
>
> Alberto B.
>
> ________________________________
> De: Jacob Barrett <jb...@pivotal.io>
> Enviado: viernes, 29 de noviembre de 2019 13:06
> Para: dev@geode.apache.org <de...@geode.apache.org>
> Asunto: Re: WAN replication issue in cloud native environments
>
>
>
> > On Nov 29, 2019, at 3:14 AM, Alberto Bustamante Reyes
> <al...@est.tech> wrote:
> >
> > The reason for such a setup is deploying Geode cluster on a Kubernetes
> cluster where all GW receivers are reachable from the outside world on the
> same VIP and port.
>
> Are you using LoadBalancer Service type?
>
> > Other kinds of configuration (different hostname and/or different port
> for each GW receiver) are not cheap from OAM and resources perspective in
> cloud native environments and also limit some important use-cases (like
> scaling).
>
> If you could somehow configure host and port for the sender (code
> modification required), would exposing each port through the LoadBalancer
> be too expensive as well?
>
> > The problem experienced is that shutting down one server is stopping
> replication to this cluster until the server is up again. We suspect this
> is because Geode incorrectly assumes there are no more alive servers when
> just one of them is down (since they share hostname-for-senders and port).
>
> Seems like, at worst, when it tries to reconnect the LB should give it a
> live server and it will think the single server is back up.
>
> -Jake
>
>

--
Charlie Black | cblack@pivotal.io

Re: WAN replication issue in cloud native environments

Posted by Charlie Black <cb...@pivotal.io>.
Alberto,

Something else to think about: SNI-based routing. I believe Mario might be
working on adding SNI to Geode - he at least had a proposal that he
e-mailed out.

The basics: the destination host is in the SNI field and the proxy can
inspect it and route the request to the right service instance. Plus we
have the option to not terminate the SSL at the proxy.
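
Purely as an illustration of that idea (I haven't validated this with
Geode), here's roughly what SNI passthrough could look like in HAProxy,
routing on the TLS client hello without terminating SSL; all hostnames and
addresses are placeholders:

    frontend geode_wan
        bind *:443
        mode tcp
        # wait for the TLS client hello so the SNI field can be inspected
        tcp-request inspect-delay 5s
        tcp-request content accept if { req_ssl_hello_type 1 }
        use_backend recv0 if { req_ssl_sni -i server-0.geode.example.com }
        use_backend recv1 if { req_ssl_sni -i server-1.geode.example.com }

    backend recv0
        mode tcp
        server s0 10.0.0.10:5000

    backend recv1
        mode tcp
        server s1 10.0.0.11:5000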

Full disclosure - I haven't tried out SNI-based routing myself and it is
something that I thought could work as I was reading about it. From the
whiteboard I have done I think this will do ingress and egress just fine.
Potentially easier than playing around with port mapping and
`hostname for clients`.

Just something to think about.

Charlie


On Wed, Dec 4, 2019 at 3:19 AM Alberto Bustamante Reyes
<al...@est.tech> wrote:

> Hi Jacob,
>
> Yes, we are using the LoadBalancer service type. But note the problem is
> not in the transport layer but in Geode, as GW senders complain
> “sender-2-parallel : Could not connect due to: There are no active
> servers.” when one of the servers in the receiving cluster is killed.
>
> So, there is still one server alive in the receiving cluster, but the GW
> sender does not know it and the locator is not able to inform it of its
> existence. Looking at the code, it seems the internal data structures
> (maps) holding the profiles use objects whose equality check relies only on
> hostname and port. This makes it impossible to differentiate servers when
> the same “hostname-for-senders” and port are used. When the killed server
> comes back up, the locator profiles are updated (internal map back to
> size()=1 although 2+ servers are there) and GW senders happily reconnect.
>
> The solution with Geode as-is would be to expose each GW receiver on a
> different port outside of the k8s cluster; this includes creating N
> Kubernetes services for N GW receivers, in addition to updating the service
> mesh configuration (if it is used), firewalls, etc. The declarative nature
> of Kubernetes means we must know the ports in advance, hence start-port and
> end-port when creating each GW receiver must be equal, and we should have
> some well-known algorithm when creating GW receivers across servers. For
> example: server-0 port 5000, server-1 port 5001, server-2 port 5002, etc.
> So, all GW receivers must be wired individually and we must turn off
> Geode’s random port allocation.
>
> But we are exploring the possibility for Geode to handle this cloud-native
> configuration a bit better. Locators should be capable of holding GW
> receiver information even though the receivers are hidden behind the same
> hostname and port. This is a code change in Geode and we would like to have
> the community's opinion on it.
>
> One obvious impact on the legacy behavior: when the locator picks a server
> on behalf of the client (the GW sender in this case), it does so based on
> the server load. When the sender connects, and considering all servers are
> using the same VIP:PORT, it is the load balancer that decides where the
> connection ends up, likely not on the one selected by the locator. So here
> we ignore the locator's instructions. Since GW senders normally do not
> create a huge number of connections, this probably will not unbalance the
> cluster too much. But this is an impact worth considering. Custom load
> metrics would also be ignored by GW senders. Opinions?
>
> Additional impact that comes to mind is the GW sender load-balance command
> and how its execution would be affected.
>
> Thanks!
>
> Alberto B.
>
> ________________________________
> De: Jacob Barrett <jb...@pivotal.io>
> Enviado: viernes, 29 de noviembre de 2019 13:06
> Para: dev@geode.apache.org <de...@geode.apache.org>
> Asunto: Re: WAN replication issue in cloud native environments
>
>
>
> > On Nov 29, 2019, at 3:14 AM, Alberto Bustamante Reyes
> <al...@est.tech> wrote:
> >
> > The reason for such a setup is deploying Geode cluster on a Kubernetes
> cluster where all GW receivers are reachable from the outside world on the
> same VIP and port.
>
> Are you using LoadBalancer Service type?
>
> > Other kinds of configuration (different hostname and/or different port
> for each GW receiver) are not cheap from OAM and resources perspective in
> cloud native environments and also limit some important use-cases (like
> scaling).
>
> If you could somehow configure host and port for the sender (code
> modification required), would exposing each port through the LoadBalancer
> be too expensive as well?
>
> > The problem experienced is that shutting down one server is stopping
> replication to this cluster until the server is up again. We suspect this
> is because Geode incorrectly assumes there are no more alive servers when
> just one of them is down (since they share hostname-for-senders and port).
>
> Seems like, at worst, when it tries to reconnect the LB should give it a
> live server and it will think the single server is back up.
>
> -Jake
>
>

-- 
Charlie Black | cblack@pivotal.io

RE: WAN replication issue in cloud native environments

Posted by Alberto Bustamante Reyes <al...@est.tech>.
Hi Jacob,

Yes, we are using the LoadBalancer service type. But note the problem is not in the transport layer but in Geode, as GW senders complain “sender-2-parallel : Could not connect due to: There are no active servers.” when one of the servers in the receiving cluster is killed.

So, there is still one server alive in the receiving cluster, but the GW sender does not know it and the locator is not able to inform it of its existence. Looking at the code, it seems the internal data structures (maps) holding the profiles use objects whose equality check relies only on hostname and port. This makes it impossible to differentiate servers when the same “hostname-for-senders” and port are used. When the killed server comes back up, the locator profiles are updated (internal map back to size()=1 although 2+ servers are there) and GW senders happily reconnect.
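
To illustrate the effect, here is a minimal self-contained sketch (these are not Geode's real classes, the names are made up) of how a map keyed only on hostname and port collapses two distinct receivers into one entry:

    import java.util.HashMap;
    import java.util.Map;
    import java.util.Objects;

    // Stand-in for a location whose equals()/hashCode() use only host and
    // port, similar in spirit to what we observe in the profile maps.
    final class Location {
      final String host;
      final int port;
      Location(String host, int port) { this.host = host; this.port = port; }
      @Override public boolean equals(Object o) {
        if (!(o instanceof Location)) return false;
        Location l = (Location) o;
        return port == l.port && host.equals(l.host);
      }
      @Override public int hashCode() { return Objects.hash(host, port); }
    }

    public class ProfileCollapse {
      public static void main(String[] args) {
        Map<Location, String> profiles = new HashMap<>();
        // Two different receivers, both registered behind the same VIP and port:
        profiles.put(new Location("receivers.example.com", 5000), "server-0");
        profiles.put(new Location("receivers.example.com", 5000), "server-1");
        // The second put overwrites the first: the servers are indistinguishable.
        System.out.println(profiles.size()); // prints 1
      }
    }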

The solution with Geode as-is would be to expose each GW receiver on a different port outside of the k8s cluster; this includes creating N Kubernetes services for N GW receivers, in addition to updating the service mesh configuration (if it is used), firewalls, etc. The declarative nature of Kubernetes means we must know the ports in advance, hence start-port and end-port when creating each GW receiver must be equal, and we should have some well-known algorithm when creating GW receivers across servers. For example: server-0 port 5000, server-1 port 5001, server-2 port 5002, etc. So, all GW receivers must be wired individually and we must turn off Geode’s random port allocation.
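
For instance, a trivial sketch of such a well-known algorithm, assuming StatefulSet-style pod names (the class name and base port are illustrative, not part of any Geode API):

    // Derive a deterministic receiver port from a StatefulSet-style
    // hostname, e.g. "server-2" -> 5002. BASE_PORT is an example value;
    // the receiver would then be created with start-port == end-port.
    public class ReceiverPort {
      static final int BASE_PORT = 5000;

      static int portFor(String podName) {
        int ordinal =
            Integer.parseInt(podName.substring(podName.lastIndexOf('-') + 1));
        return BASE_PORT + ordinal;
      }

      public static void main(String[] args) {
        System.out.println(portFor("server-2")); // 5002
      }
    }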

But we are exploring the possibility for Geode to handle this cloud-native configuration a bit better. Locators should be capable of holding GW receiver information even though the receivers are hidden behind the same hostname and port.
This is a code change in Geode and we would like to have the community's opinion on it.
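
As a rough, illustrative sketch of the direction (not the actual change), the idea would be to extend the key with the member's unique id so that equality no longer rests on hostname and port alone:

    import java.util.Objects;

    // Hypothetical composite key: same hostname/port but different member
    // ids now compare as different servers.
    final class ReceiverKey {
      final String host;
      final int port;
      final String memberId; // e.g. the distributed member's unique id

      ReceiverKey(String host, int port, String memberId) {
        this.host = host;
        this.port = port;
        this.memberId = memberId;
      }

      @Override public boolean equals(Object o) {
        if (!(o instanceof ReceiverKey)) return false;
        ReceiverKey k = (ReceiverKey) o;
        return port == k.port && host.equals(k.host)
            && memberId.equals(k.memberId);
      }

      @Override public int hashCode() {
        return Objects.hash(host, port, memberId);
      }
    }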

One obvious impact on the legacy behavior: when the locator picks a server on behalf of the client (the GW sender in this case), it does so based on the server load. When the sender connects, and considering all servers are using the same VIP:PORT, it is the load balancer that decides where the connection ends up, likely not on the one selected by the locator. So here we ignore the locator's instructions. Since GW senders normally do not create a huge number of connections, this probably will not unbalance the cluster too much. But this is an impact worth considering. Custom load metrics would also be ignored by GW senders. Opinions?

Additional impact that comes to mind is the GW sender load-balance command and how its execution would be affected.

Thanks!

Alberto B.

________________________________
De: Jacob Barrett <jb...@pivotal.io>
Enviado: viernes, 29 de noviembre de 2019 13:06
Para: dev@geode.apache.org <de...@geode.apache.org>
Asunto: Re: WAN replication issue in cloud native environments



> On Nov 29, 2019, at 3:14 AM, Alberto Bustamante Reyes <al...@est.tech> wrote:
>
> The reason for such a setup is deploying Geode cluster on a Kubernetes cluster where all GW receivers are reachable from the outside world on the same VIP and port.

Are you using LoadBalancer Service type?

> Other kinds of configuration (different hostname and/or different port for each GW receiver) are not cheap from OAM and resources perspective in cloud native environments and also limit some important use-cases (like scaling).

If you could somehow configure host and port for the sender (code modification required), would exposing each port through the LoadBalancer be too expensive as well?

> The problem experienced is that shutting down one server is stopping replication to this cluster until the server is up again. We suspect this is because Geode incorrectly assumes there are no more alive servers when just one of them is down (since they share hostname-for-senders and port).

Seems like, at worst, when it tries to reconnect the LB should give it a live server and it will think the single server is back up.

-Jake


Re: WAN replication issue in cloud native environments

Posted by Jacob Barrett <jb...@pivotal.io>.

> On Nov 29, 2019, at 3:14 AM, Alberto Bustamante Reyes <al...@est.tech> wrote:
> 
> The reason for such a setup is deploying Geode cluster on a Kubernetes cluster where all GW receivers are reachable from the outside world on the same VIP and port.

Are you using LoadBalancer Service type?

> Other kinds of configuration (different hostname and/or different port for each GW receiver) are not cheap from OAM and resources perspective in cloud native environments and also limit some important use-cases (like scaling).

If you could somehow configure host and port for the sender (code modification required), would exposing each port through the LoadBalancer be too expensive as well?

> The problem experienced is that shutting down one server is stopping replication to this cluster until the server is up again. We suspect this is because Geode incorrectly assumes there are no more alive servers when just one of them is down (since they share hostname-for-senders and port).

Seems like, at worst, when it tries to reconnect the LB should give it a live server and it will think the single server is back up.

-Jake