Posted to dev@geode.apache.org by Alberto Gomez <al...@est.tech> on 2020/07/06 15:24:05 UTC

[DISCUSS] RFC - Avoid the queueing of dropped events by the primary gateway sender when the gateway sender is stopped

Hi,

I have published a new RFC in the Apache Geode wiki with the following title: "Avoid the queueing of dropped events by the primary gateway sender when the gateway sender is stopped".

https://cwiki.apache.org/confluence/display/GEODE/Avoid+the+queuing+of+dropped+events+by+the+primary+gateway+sender+when+the+gateway+sender+is+stopped

Could you please give comments by Thursday, July 9th, 2020?

Thanks in advance,

Alberto G.

Re: [DISCUSS] RFC - Avoid the queueing of dropped events by the primary gateway sender when the gateway sender is stopped

Posted by Alberto Gomez <al...@est.tech>.
Hi Alexander,

Yes, sure. I am extending the deadline for comments to next Thursday, July the 16th.

Cheers,

Alberto G.
________________________________
From: Alexander Murmann <am...@apache.org>
Sent: Thursday, July 9, 2020 1:42 AM
To: dev@geode.apache.org <de...@geode.apache.org>
Subject: Re: [DISCUSS] RFC - Avoid the queueing of dropped events by the primary gateway sender when the gateway sender is stopped

Hi Alberto,

The timing on this RFC feels really tight. Would you be open to extending
this to next week?

Re: [DISCUSS] RFC - Avoid the queueing of dropped events by the primary gateway sender when the gateway sender is stopped

Posted by Alexander Murmann <am...@apache.org>.
Hi Alberto,

The timing on this RFC feels really tight. Would you be open to extending
this to next week?

Re: [DISCUSS] RFC - Avoid the queueing of dropped events by the primary gateway sender when the gateway sender is stopped

Posted by Alberto Gomez <al...@est.tech>.
Hi Eric,

I agree that the only case in which the memory issue may occur is when all gateway sender instances are stopped. That is the case the solution proposed in the RFC targets, and it is why the RFC proposes updating the stop gateway sender command to fix the issue.

Note that while all the gateway sender instances are being stopped, there may be events stored in the secondary senders that will be dropped by the primary sender. Those dropped events need to be queued while the secondaries are still up so that, when the sender is started again, the secondaries' queues are drained accordingly.
If we go for the option of setting a limit on the dropped events and the limit is too small, some dropped events that should have been queued would not be, because the limit had been reached, and they would never be sent to the secondaries to drain their queues completely. (This is the case I meant when I said a notification must be sent to the operator of the system, so that they know a possible issue is present: queues with events that would stay there forever.) On the other hand, if the limit is too high, the memory consumed by the queued dropped events could cause memory exhaustion.

I think the right balance is to stop queueing dropped events when all the gateway sender instances are stopped.
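
As a rough illustration of the behavior described above, the following sketch records dropped events only while a sender is starting or stopping and ignores them once it is fully stopped. It is not Geode code: the class name, the SenderState enum and the onEventDropped method are hypothetical names chosen for this example, and only the name tmpDroppedEvents comes from the discussion.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch, not the Geode implementation.
class DroppedEventTracker {
  // Assumed lifecycle states for a gateway sender instance.
  enum SenderState { STARTING, RUNNING, STOPPING, STOPPED }

  private volatile SenderState state = SenderState.STOPPED;
  // IDs of dropped events that secondaries may still hold in their queues.
  private final Queue<String> tmpDroppedEvents = new ConcurrentLinkedQueue<>();

  void setState(SenderState newState) {
    state = newState;
  }

  // Returns true if the dropped event was recorded, false if it was ignored.
  boolean onEventDropped(String eventId) {
    SenderState current = state;
    // Record only during the transitions, when secondaries may hold the event.
    if (current == SenderState.STARTING || current == SenderState.STOPPING) {
      tmpDroppedEvents.add(eventId);
      return true;
    }
    // Fully stopped: do not consume memory for events that will never be sent.
    return false;
  }

  int queuedCount() {
    return tmpDroppedEvents.size();
  }

  public static void main(String[] args) {
    DroppedEventTracker tracker = new DroppedEventTracker();
    tracker.setState(SenderState.STOPPING);
    tracker.onEventDropped("event-1"); // recorded while stopping
    tracker.setState(SenderState.STOPPED);
    tracker.onEventDropped("event-2"); // ignored once stopped
    System.out.println(tracker.queuedCount());
  }
}
```

In this sketch, events recorded while stopping are kept after the sender reaches the stopped state, so they would still be available to drain the secondaries' queues when the sender is started again.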

BR,

Alberto

________________________________
From: Eric Shu <es...@vmware.com>
Sent: Wednesday, July 8, 2020 9:25 PM
To: dev@geode.apache.org <de...@geode.apache.org>
Subject: Re: [DISCUSS] RFC - Avoid the queueing of dropped events by the primary gateway sender when the gateway sender is stopped

I think the only case in which the memory issue occurs is when all gateway senders are stopped in the WAN site. Otherwise another member would take over as the primary queue, and no more events would be enqueued in tmpDroppedEvents on the member with the original primary queue. (For a parallel WAN queue, I do not think stopping a single gateway queue is a valid case to support.)

For the case where all gateway senders are stopped, there is no need to notify any other members in the WAN site if the limit is reached. tmpDroppedEvents is only used to remove events from the secondary queue. If no events are enqueued in the secondary queue, there is no need to add anything to tmpDroppedEvents at all. To me, it should only be used for a limited number of queued events.

Regards,
Eric

Re: [DISCUSS] RFC - Avoid the queueing of dropped events by the primary gateway sender when the gateway sender is stopped

Posted by Alberto Gomez <al...@est.tech>.
Hi Xiaojian,

No problem, I had already extended the deadline for comments to next Thursday (July the 16th). If more time is needed to get all the relevant comments, we can extend it further.

Thanks,

Alberto
________________________________
From: Xiaojian Zhou <zh...@vmware.com>
Sent: Friday, July 10, 2020 6:32 PM
To: dev@geode.apache.org <de...@geode.apache.org>
Subject: Re: [DISCUSS] RFC - Avoid the queueing of dropped events by the primary gateway sender when the gateway sender is stopped

Hi, Alberto:

I was the original author who introduced tmpDroppedEvents. Due to other work, I only got a chance to read the issue on Thursday, which is your deadline. Can you hold on a little longer, until Monday?

I have been thinking about the history of the code changes and the issues you encountered. I will try to find a lightweight solution with minimal impact on the current code.

Regards
Xiaojian Zhou

Re: [DISCUSS] RFC - Avoid the queueing of dropped events by the primary gateway sender when the gateway sender is stopped

Posted by Xiaojian Zhou <zh...@vmware.com>.
Hi, Alberto:

I was the original author who introduced tmpDroppedEvents. Due to other work, I only got a chance to read the issue on Thursday, which is your deadline. Can you hold on a little longer, until Monday?

I have been thinking about the history of the code changes and the issues you encountered. I will try to find a lightweight solution with minimal impact on the current code.

Regards
Xiaojian Zhou

Re: [DISCUSS] RFC - Avoid the queueing of dropped events by the primary gateway sender when the gateway sender is stopped

Posted by Eric Shu <es...@vmware.com>.
I think the only case in which the memory issue occurs is when all gateway senders are stopped in the WAN site. Otherwise another member would take over as the primary queue, and no more events would be enqueued in tmpDroppedEvents on the member with the original primary queue. (For a parallel WAN queue, I do not think stopping a single gateway queue is a valid case to support.)

For the case where all gateway senders are stopped, there is no need to notify any other members in the WAN site if the limit is reached. tmpDroppedEvents is only used to remove events from the secondary queue. If no events are enqueued in the secondary queue, there is no need to add anything to tmpDroppedEvents at all. To me, it should only be used for a limited number of queued events.
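
A minimal sketch of the limit check discussed here, assuming a simple count-based cap: the size is checked before every add, and events beyond the cap are counted rather than stored. The class BoundedDroppedEvents, its methods and the rejected counter are hypothetical names for illustration and are not part of Geode.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of a count-bounded dropped-events queue.
class BoundedDroppedEvents {
  private final int maxEvents;
  private final Deque<String> tmpDroppedEvents = new ArrayDeque<>();
  private long rejected; // events not stored because the cap was reached

  BoundedDroppedEvents(int maxEvents) {
    if (maxEvents < 0) {
      throw new IllegalArgumentException("maxEvents must be >= 0");
    }
    this.maxEvents = maxEvents;
  }

  // Checks the limit before adding; returns false if the event was not stored.
  synchronized boolean offer(String eventId) {
    if (tmpDroppedEvents.size() >= maxEvents) {
      rejected++;
      return false;
    }
    tmpDroppedEvents.addLast(eventId);
    return true;
  }

  synchronized int size() {
    return tmpDroppedEvents.size();
  }

  synchronized long rejectedCount() {
    return rejected;
  }

  public static void main(String[] args) {
    BoundedDroppedEvents queue = new BoundedDroppedEvents(2);
    queue.offer("event-1");
    queue.offer("event-2");
    queue.offer("event-3"); // over the cap, counted as rejected
    System.out.println(queue.size() + " stored, " + queue.rejectedCount() + " rejected");
  }
}
```

The rejected counter is one possible way to make it visible that the cap was hit; what to do with that information is left open here.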

Regards,
Eric
________________________________
From: Alberto Gomez <al...@est.tech>
Sent: Wednesday, July 8, 2020 12:02 PM
To: dev@geode.apache.org <de...@geode.apache.org>
Subject: Re: [DISCUSS] RFC - Avoid the queueing of dropped events by the primary gateway sender when the gateway sender is stopped

Thanks for your comments, Eric.

Limiting the size of the queue would be a simple solution, but I think it would pose several problems for whoever configures and operates Geode:

  *   How big should the queue be? It is probably not easy to dimension. Should the limit be on the memory occupied by the elements or on the number of elements in the queue (in which case, depending on the size of the elements, the memory used could vary a lot)?
  *   What should be done when the limit has been reached? How do we notify that it was reached, what do we do afterwards, and how would we know which dropped events did not make it to the queue but should have been removed from the secondary's queue?

I think the solution proposed in the RFC is simple enough and also addresses a possible confusion with the semantics of the gateway sender stop command.
Stopping a gateway sender currently causes all events received while the sender is stopped to be dropped; but at the same time, unlimited memory may be consumed by the dropped events. We could put a limit on the amount of memory used by the queued dropped events, but what would be the point of storing them in the first place if those events will not be sent to the remote site anyway?
I would expect that after stopping a gateway sender no resources (or at least a minimal part) would be consumed by it. Otherwise we may as well not stop it or use the pause command depending on what we want to achieve.

From what I have seen, queuing dropped events has its place while the gateway sender is starting and while it is stopping but if it is done in a sender to be started manually or in a manually stopped server it could provoke an unexpected memory exhaustion.

I really think the solution proposed makes the behavior of the gateway sender command more logical.

Best regards,

Alberto
________________________________
From: Eric Shu <es...@vmware.com>
Sent: Wednesday, July 8, 2020 7:32 PM
To: dev@geode.apache.org <de...@geode.apache.org>
Subject: Re: [DISCUSS] RFC - Avoid the queueing of dropped events by the primary gateway sender when the gateway sender is stopped

It seems that I am not able to comment on the RFC in the wiki yet.

I am just trying to find out whether there is a simple solution for the issue you raised -- can we have an upper limit for the tmpDroppedEvents queue in question?

Could we always check the limit before adding to the queue -- so that the tmp queue is not unbounded?

Regards,
Eric

Re: [DISCUSS] RFC - Avoid the queueing of dropped events by the primary gateway sender when the gateway sender is stopped

Posted by Alberto Gomez <al...@est.tech>.
Thanks for your comments, Eric.

Limiting the size of the queue would be a simple solution, but I think it would pose several problems for whoever configures and operates Geode:

  *   How big should the queue be? That is probably not easy to size. Should the limit be on the memory occupied by the elements or on the number of elements in the queue (in which case, depending on the size of the elements, the memory used could vary a lot)?
  *   What should be done when the limit is reached? How do we notify that it was reached, what do we do afterwards, and how would we know which dropped events did not make it into the queue but should have been removed from the secondary's queue?

I think the solution proposed in the RFC is simple enough and also addresses a possible confusion with the semantics of the gateway sender stop command.
Stopping a gateway sender currently causes all events received while the sender is stopped to be dropped; at the same time, an unlimited amount of memory may be consumed by those dropped events. We could put a limit on the amount of memory used by the queued dropped events, but what would be the point of storing them in the first place if they will not be sent to the remote site anyway?
I would expect that, after stopping a gateway sender, it would consume no resources (or at most a minimal amount). Otherwise we may as well not stop it, or use the pause command, depending on what we want to achieve.

From what I have seen, queuing dropped events has its place while the gateway sender is starting and while it is stopping, but if it is done in a sender that is to be started manually or in a manually stopped server, it could cause unexpected memory exhaustion.
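
To illustrate the distinction above, here is a minimal sketch, not Geode code and not necessarily what the RFC implements. It assumes a hypothetical SenderState enum and a hypothetical DroppedEventTracker class, and keeps dropped events only while the sender is in a transitional state (starting or stopping). Event IDs are plain strings purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: none of these names come from Geode.
final class DroppedEventTracker {

  // Assumed lifecycle states; NOT_STARTED models a sender awaiting a manual start.
  enum SenderState { NOT_STARTED, STARTING, RUNNING, STOPPING, STOPPED }

  private final List<String> tmpDroppedEvents = new ArrayList<>();
  private SenderState state = SenderState.NOT_STARTED;

  synchronized void setState(SenderState newState) {
    state = newState;
  }

  // Remember a dropped event only during the transitional states.
  // In NOT_STARTED and STOPPED nothing is stored, so memory use stays flat
  // no matter how long the sender remains in that state.
  synchronized void onDroppedEvent(String eventId) {
    if (state == SenderState.STARTING || state == SenderState.STOPPING) {
      tmpDroppedEvents.add(eventId);
    }
  }

  synchronized int queuedCount() {
    return tmpDroppedEvents.size();
  }
}
```

With a policy along these lines, the time a sender spends manually stopped has no effect on memory use, because nothing is retained in that state.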

I really think the solution proposed makes the behavior of the gateway sender command more logical.

Best regards,

Alberto
________________________________
From: Eric Shu <es...@vmware.com>
Sent: Wednesday, July 8, 2020 7:32 PM
To: dev@geode.apache.org <de...@geode.apache.org>
Subject: Re: [DISCUSS] RFC - Avoid the queueing of dropped events by the primary gateway sender when the gateway sender is stopped

It seems that I was not able to comment on the RFC in the wiki yet.

Just trying to find out whether there is a simple solution for the issue you raised -- can we have an upper limit for the tmpDroppedEvents queue in question?

Always check the limit before adding to the queue -- so that the tmp queue is not unbounded?

Regards,
Eric

Re: [DISCUSS] RFC - Avoid the queueing of dropped events by the primary gateway sender when the gateway sender is stopped

Posted by Eric Shu <es...@vmware.com>.
It seems that I was not able to comment on the RFC in the wiki yet.

Just trying to find out whether there is a simple solution for the issue you raised -- can we have an upper limit for the tmpDroppedEvents queue in question?

Always check the limit before adding to the queue -- so that the tmp queue is not unbounded?
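
As a rough sketch of what such a check could look like (an illustration only, not Geode code): the class name BoundedDroppedEvents, the element-count limit, and the rejected counter are all assumptions. It relies on java.util.concurrent.LinkedBlockingQueue, whose offer() returns false instead of blocking when the queue is at capacity.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a size-limited tmpDroppedEvents queue.
final class BoundedDroppedEvents<E> {

  private final LinkedBlockingQueue<E> events;
  // Counts events that were refused because the limit had been reached.
  private final AtomicLong rejected = new AtomicLong();

  // The limit is a number of elements, not bytes, and must be greater than zero.
  BoundedDroppedEvents(int limit) {
    this.events = new LinkedBlockingQueue<>(limit);
  }

  // Returns true if the event was stored, false if the queue was already full.
  boolean offer(E event) {
    boolean stored = events.offer(event);
    if (!stored) {
      rejected.incrementAndGet();
    }
    return stored;
  }

  int size() {
    return events.size();
  }

  long rejectedCount() {
    return rejected.get();
  }
}
```

The sketch only counts rejected events; what to do about them once the limit is hit is left open.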

Regards,
Eric