Posted to dev@geode.apache.org by Bruce Schuchardt <bs...@pivotal.io> on 2019/03/20 15:48:31 UTC

[DISCUSS] TTL setting on WAN

We've seen situations where the receiving side of a WAN gateway is slow 
to accept data or is not accepting any data.  This can cause queues to 
fill up on the sending side.  If disk-overflow is being used this can 
even lead to an outage.  Some users are concerned more with the latest 
data and don't really care if old data is thrown away in this 
situation.  They may have set a TTL on their Regions and would like to 
be able to do the same thing with their GatewaySenders.

With that in mind I'd like to add this method to GatewaySenderFactory:

/**
 * Sets the timeToLive expiration attribute for queue entries for the next
 * {@code GatewaySender} created.
 *
 * @param timeToLive the timeToLive ExpirationAttributes for entries in this region
 * @return a reference to this GatewaySenderFactory object
 * @throws IllegalArgumentException if timeToLive is null
 * @see RegionFactory#setEntryTimeToLive
 */
public GatewaySenderFactory setEntryTimeToLive(ExpirationAttributes timeToLive);

The exact implementation may not be the same as for Regions since we 
probably want to expire the oldest entries first and make sure we do so 
in their order in the queue.

Re: [DISCUSS] TTL setting on WAN

Posted by Bruce Schuchardt <bs...@pivotal.io>.

Dan, I agree - there shouldn't be a choice of action here.  We can 
change it to just take an int and specify in the docs that it is in 
seconds and will cause timed-out events to not be transmitted to the 
other site.

I think we want to guarantee that time-out happens in queue order.
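
To make the queue-order guarantee concrete, here is a minimal sketch, assuming a TTL given as an int in seconds and entries stamped with their enqueue time. Nothing in it is Geode code: TtlQueueSketch, QueuedEvent and expireHead are hypothetical names used only for illustration. Because the queue is FIFO, the oldest entry is always at the head, so expiry only ever needs to look at the head and can stop at the first entry that is still fresh.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch of queue-order TTL expiry; not Geode code. */
final class TtlQueueSketch {

  /** A queued event stamped with the time it entered the queue. */
  static final class QueuedEvent {
    final String key;
    final long enqueuedAtMillis;

    QueuedEvent(String key, long enqueuedAtMillis) {
      this.key = key;
      this.enqueuedAtMillis = enqueuedAtMillis;
    }
  }

  private final Deque<QueuedEvent> queue = new ArrayDeque<>();
  private final long ttlMillis;

  /** Assumption: the TTL is a plain int, in seconds. */
  TtlQueueSketch(int ttlSeconds) {
    if (ttlSeconds <= 0) {
      throw new IllegalArgumentException("ttlSeconds must be positive");
    }
    this.ttlMillis = ttlSeconds * 1000L;
  }

  void enqueue(String key, long nowMillis) {
    queue.addLast(new QueuedEvent(key, nowMillis));
  }

  /**
   * Drops expired entries from the head of the queue, oldest first, and
   * stops at the first entry that has not yet timed out.
   * Returns the keys that were dropped, in queue order.
   */
  List<String> expireHead(long nowMillis) {
    List<String> dropped = new ArrayList<>();
    while (!queue.isEmpty()
        && nowMillis - queue.peekFirst().enqueuedAtMillis >= ttlMillis) {
      dropped.add(queue.pollFirst().key);
    }
    return dropped;
  }

  int size() {
    return queue.size();
  }

  public static void main(String[] args) {
    TtlQueueSketch q = new TtlQueueSketch(10);
    q.enqueue("a", 0L);
    q.enqueue("b", 4_000L);
    q.enqueue("c", 9_000L);
    // At t=12s only "a" is 10s old or more; "b" and "c" stay queued.
    System.out.println(q.expireHead(12_000L)); // prints [a]
    System.out.println(q.size());              // prints 2
  }
}
```

A dispatcher thread could call something like expireHead just before building each batch, so that timed-out events are never transmitted to the other site.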

On 3/20/19 10:32 AM, Dan Smith wrote:
> Sounds like a good feature. I'm interested to see what ordering guarantees
> we want to implement - if we can expire things in the order they were added
> to the queue that seems like a good way to go.
>
> As Anil pointed out - do you really want to let the user pass in an
> ExpirationAttributes object? That allows the user to control the
> ExpirationAction (destroy, invalidate, etc.). I don't think the
> ExpirationAction makes sense for a GatewaySender?
>
> -Dan
>
> On Wed, Mar 20, 2019 at 10:26 AM Anilkumar Gingade <ag...@pivotal.io>
> wrote:
>
>> +1. Will the expiration (destroy) be applied on local queues, or will the
>> expiration be replicated (for both serial and parallel)?
>>
>> -Anil.

Re: [DISCUSS] TTL setting on WAN

Posted by Dan Smith <ds...@pivotal.io>.

Sounds like a good feature. I'm interested to see what ordering guarantees
we want to implement - if we can expire things in the order they were added
to the queue that seems like a good way to go.

As Anil pointed out - do you really want to let the user pass in an
ExpirationAttributes object? That allows the user to control the
ExpirationAction (destroy, invalidate, etc.). I don't think the
ExpirationAction makes sense for a GatewaySender?

-Dan

On Wed, Mar 20, 2019 at 10:26 AM Anilkumar Gingade <ag...@pivotal.io>
wrote:

> +1. Will the expiration (destroy) be applied on local queues, or will the
> expiration be replicated (for both serial and parallel)?
>
> -Anil.

Re: [DISCUSS] TTL setting on WAN

Posted by Anthony Baker <ab...@pivotal.io>.

An important use case for this is session caching.  Obviously it’s pointless to replicate an expired session—the user has already gone away.  Copying the bits to the remote cluster is just creating unnecessary work.

> On Mar 20, 2019, at 11:22 AM, Bruce Schuchardt <bs...@pivotal.io> wrote:
> 
> I don't know why the users didn't use conflation but I suspect they're generating events with new keys.  Conflation only applies if you're operating on the same keys all the time.  If you're generating new keys it doesn't help at all.
> 
> On 3/20/19 11:10 AM, Udo Kohlmeyer wrote:
>> If all that the customer is concerned about is that the receiving side gets the "latest" data, conflation is definitely the best bet. How do we classify old? The only classification that I have of old (in this context) is that there is a newer version of a data entry. This classification is not time based, as a TTL approach would require, but state based.
>> 
>> What is the use of WAN replication if we are proposing to only replicate *some* of the data? How would the clusters ever know if they are in sync? I believe we are just opening a door that we will never be able to close again and will cause us endless issues.
>> 
>> --Udo
>> 
>> On 3/20/19 10:56, Bruce Schuchardt wrote:
>>> Udo, in the cases I've looked at the user is okay with inconsistency because they don't really care about the old data. They're most interested in getting the newest data and keeping the sending site from going down.  I guess the docs for TTL should make it very clear that it will cause inconsistencies.
>>> 
>>> Conflation does seem like an appropriate thing to try if the same keys are being updated - I'll do some investigation and see why it wasn't appropriate.
>>> 
>>> 
>>> On 3/20/19 10:51 AM, Udo Kohlmeyer wrote:
>>>> -1, I don't believe this is a feature that we should support. IF a client is experiencing slow WAN replication and users only care about the "latest" data, then maybe the user should use "conflation".
>>>> 
>>>> With a TTL model, we are messing with our consistency tenet. I am NOT in support of a setting that can cause inconsistency.
>>>> 
>>>> Dead-letter queues are another area that WILL cause data/site inconsistency. I think we really have to take a step back, think about WHAT tenets are important to GEODE and then act accordingly.
>>>> 
>>>> --Udo
>>>> 
>>>> On 3/20/19 10:46, Bruce Schuchardt wrote:
>>>>> IDK Anil, we'll figure that out in the implementation.  I was thinking it would be in the dispatch threads, so if distribution is needed that will happen as it does now.  I'm hopeful that this won't perturb the code too much.
>>>>> 
>>>>> One thing that was brought up to me in person was the Dead Letter Queue <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=80452478> initiative that seems to have stalled.  That seems like a very similar idea though it's reacting to errors coming from the receiving side and not a local condition.  I like the callback, stats, gfsh and cluster config integration in that write-up & think they might be useful here.  There is also relevant discussion in that page about things like PDX registrations.  Is that initiative going to move forward at some point or is it off the boards?
>>>>> 
>>>>> On 3/20/19 10:25 AM, Anilkumar Gingade wrote:
>>>>>> +1. Will the expiration (destroy) be applied on local queues, or will the
>>>>>> expiration be replicated (for both serial and parallel)?
>>>>>> 
>>>>>> -Anil.

Re: [DISCUSS] TTL setting on WAN

Posted by Bruce Schuchardt <bs...@pivotal.io>.

I don't know why the users didn't use conflation but I suspect they're 
generating events with new keys.  Conflation only applies if you're 
operating on the same keys all the time.  If you're generating new keys 
it doesn't help at all.
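
A small sketch of why that is, assuming conflation simply keeps the latest queued value per key. This is an illustration only and not the gateway sender's actual conflation code; ConflationSketch and conflate are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of key-based conflation; not Geode code. */
final class ConflationSketch {

  /**
   * Collapses a sequence of {key, value} events so that only the most
   * recent value for each key survives.
   */
  static Map<String, String> conflate(String[][] events) {
    Map<String, String> latest = new LinkedHashMap<>();
    for (String[] event : events) {
      // Remove first so the surviving event takes the position of the newest update.
      latest.remove(event[0]);
      latest.put(event[0], event[1]);
    }
    return latest;
  }

  public static void main(String[] args) {
    // Repeated updates to one key collapse to a single queued event.
    String[][] sameKey = {{"k1", "v1"}, {"k1", "v2"}, {"k1", "v3"}};
    System.out.println(conflate(sameKey).size()); // prints 1

    // Events that each use a new key (for example, new sessions) do not collapse.
    String[][] newKeys = {{"s1", "v1"}, {"s2", "v2"}, {"s3", "v3"}};
    System.out.println(conflate(newKeys).size()); // prints 3
  }
}
```

With a steady stream of new keys the conflated queue is as long as the original one, so conflation alone cannot bound queue growth.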

On 3/20/19 11:10 AM, Udo Kohlmeyer wrote:
> If all that the customer is concerned about is that the receiving side 
> gets the "latest" data, conflation is definitely the best bet. How do 
> we classify old? The only classification that I have of old (in this 
> context) is that there is a newer version of a data entry. This 
> classification is not time based, as a TTL approach would require, but 
> state based.
>
> What is the use of WAN replication if we are proposing to only 
> replicate *some* of the data? How would the clusters ever know if they 
> are in-sync? I believe we are just opening a door that we will never 
> be able to close again and will cause us endless issues.
>
> --Udo
>
> On 3/20/19 10:56, Bruce Schuchardt wrote:
>> Udo, in the cases I've looked at the user is okay with inconsistency 
>> because they don't really care about the old data. They're most 
>> interested in getting the newest data and keeping the sending site 
>> from going down.  I guess the docs for TTL should make it very clear 
>> that it will cause inconsistencies.
>>
>> Conflation does seem like an appropriate thing to try if the same 
>> keys are being updated - I'll do some investigation and see why it 
>> wasn't appropriate.
>>
>>
>> On 3/20/19 10:51 AM, Udo Kohlmeyer wrote:
>>> -1, I don't believe this is a feature that we should support. IF a 
>>> client is experiencing slow WAN replication and users only care 
>>> about the "latest" data, then maybe the user should use "conflation".
>>>
>>> With a TTL model, we are messing with our consistency tenet. I am 
>>> NOT in support of a setting that can cause inconsistency.
>>>
>>> Dead-letter queues are another area that WILL cause data/site 
>>> inconsistency. I think we really have to take a step back, think 
>>> about WHAT tenets are important to GEODE and then act accordingly.
>>>
>>> --Udo
>>>
>>> On 3/20/19 10:46, Bruce Schuchardt wrote:
>>>> IDK Anil, we'll figure that out in the implementation.  I was 
>>>> thinking it would be in the dispatch threads, so if distribution is 
>>>> needed that will happen as it does now.  I'm hopeful that this won't 
>>>> perturb the code too much.
>>>>
>>>> One thing that was brought up to me in person was the Dead Letter 
>>>> Queue 
>>>> <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=80452478> 
>>>> initiative that seems to have stalled.  That seems like a very 
>>>> similar idea though it's reacting to errors coming from the 
>>>> receiving side and not a local condition.  I like the callback, 
>>>> stats, gfsh and cluster config integration in that write-up & think 
>>>> they might be useful here.  There is also relevant discussion in 
>>>> that page about things like PDX registrations.  Is that initiative 
>>>> going to move forward at some point or is it off the boards?
>>>>
>>>> On 3/20/19 10:25 AM, Anilkumar Gingade wrote:
>>>>> +1. Will the expiration (destroy) be applied on local queues, or will the
>>>>> expiration be replicated (for both serial and parallel)?
>>>>>
>>>>> -Anil.

Re: [DISCUSS] TTL setting on WAN

Posted by Udo Kohlmeyer <ud...@apache.org>.

If all that the customer is concerned about is that the receiving side 
gets the "latest" data, conflation is definitely the best bet. How do we 
classify old? The only classification that I have of old (in this 
context) is that there is a newer version of a data entry. This 
classification is not time based, as a TTL approach would require, but 
state based.

What is the use of WAN replication if we are proposing to only replicate 
*some* of the data? How would the clusters ever know if they are 
in-sync? I believe we are just opening a door that we will never be able 
to close again and will cause us endless issues.

--Udo

On 3/20/19 10:56, Bruce Schuchardt wrote:
> Udo, in the cases I've looked at the user is okay with inconsistency 
> because they don't really care about the old data. They're most 
> interested in getting the newest data and keeping the sending site 
> from going down.  I guess the docs for TTL should make it very clear 
> that it will cause inconsistencies.
>
> Conflation does seem like an appropriate thing to try if the same keys 
> are being updated - I'll do some investigation and see why it wasn't 
> appropriate.
>
>
> On 3/20/19 10:51 AM, Udo Kohlmeyer wrote:
>> -1, I don't believe this is a feature that we should support. IF a 
>> client is experiencing slow WAN replication and users only care about 
>> the "latest" data, then maybe the user should use "conflation".
>>
>> With a TTL model, we are messing with our consistency tenet. I am 
>> NOT in support of a setting that can cause inconsistency.
>>
>> Dead-letter queues are another area that WILL cause data/site 
>> inconsistency. I think we really have to take a step back, think 
>> about WHAT tenets are important to GEODE and then act accordingly.
>>
>> --Udo
>>
>> On 3/20/19 10:46, Bruce Schuchardt wrote:
>>> IDK Anil, we'll figure that out in the implementation.  I was 
>>> thinking it would be in the dispatch threads, so if distribution is 
>>> needed that will happen as it does now.  I'm hopeful that this won't 
>>> perturb the code too much.
>>>
>>> One thing that was brought up to me in person was the Dead Letter 
>>> Queue 
>>> <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=80452478> 
>>> initiative that seems to have stalled.  That seems like a very 
>>> similar idea though it's reacting to errors coming from the 
>>> receiving side and not a local condition.  I like the callback, 
>>> stats, gfsh and cluster config integration in that write-up & think 
>>> they might be useful here.  There is also relevant discussion in 
>>> that page about things like PDX registrations.  Is that initiative 
>>> going to move forward at some point or is it off the boards?
>>>
>>> On 3/20/19 10:25 AM, Anilkumar Gingade wrote:
>>>> +1. Will the expiration (destroy) be applied on local queues, or will the
>>>> expiration be replicated (for both serial and parallel)?
>>>>
>>>> -Anil.

Re: [DISCUSS] TTL setting on WAN

Posted by Bruce Schuchardt <bs...@pivotal.io>.

Udo, in the cases I've looked at the user is okay with inconsistency 
because they don't really care about the old data. They're most 
interested in getting the newest data and keeping the sending site from 
going down.  I guess the docs for TTL should make it very clear that it 
will cause inconsistencies.

Conflation does seem like an appropriate thing to try if the same keys 
are being updated - I'll do some investigation and see why it wasn't 
appropriate.


On 3/20/19 10:51 AM, Udo Kohlmeyer wrote:
> -1, I don't believe this is a feature that we should support. IF a 
> client is experiencing slow WAN replication and users only care about 
> the "latest" data, then maybe the user should use "conflation".
>
> With a TTL model, we are messing with our consistency tenet. I am 
> NOT in support of a setting that can cause inconsistency.
>
> Dead-letter queues are another area that WILL cause data/site 
> inconsistency. I think we really have to take a step back, think about 
> WHAT tenets are important to GEODE and then act accordingly.
>
> --Udo
>
> On 3/20/19 10:46, Bruce Schuchardt wrote:
>> IDK Anil, we'll figure that out in the implementation.  I was 
>> thinking it would be in the dispatch threads, so if distribution is 
>> needed that will happen as it does now.  I'm hopeful that this won't 
>> perturb the code too much.
>>
>> One thing that was brought up to me in person was the Dead Letter 
>> Queue 
>> <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=80452478> 
>> initiative that seems to have stalled.  That seems like a very 
>> similar idea though it's reacting to errors coming from the receiving 
>> side and not a local condition.  I like the callback, stats, gfsh and 
>> cluster config integration in that write-up & think they might be 
>> useful here.  There is also relevant discussion in that page about 
>> things like PDX registrations.  Is that initiative going to move 
>> forward at some point or is it off the boards?
>>
>> On 3/20/19 10:25 AM, Anilkumar Gingade wrote:
>>> +1. Will the expiration (destroy) be applied on local queues, or will the
>>> expiration be replicated (for both serial and parallel)?
>>>
>>> -Anil.

Re: [DISCUSS] TTL setting on WAN

Posted by Udo Kohlmeyer <ud...@apache.org>.

-1, I don't believe this is a feature that we should support. IF a 
client is experiencing slow WAN replication and users only care about 
the "latest" data, then maybe the user should use "conflation".

With a TTL model, we are messing with our consistency tenet. I am NOT 
in support of a setting that can cause inconsistency.

Dead-letter queues are another area that WILL cause data/site 
inconsistency. I think we really have to take a step back, think about 
WHAT tenets are important to GEODE and then act accordingly.

--Udo

On 3/20/19 10:46, Bruce Schuchardt wrote:
> IDK Anil, we'll figure that out in the implementation.  I was thinking 
> it would be in the dispatch threads, so if distribution is needed that 
> will happen as it does now.  I'm hopeful that this won't perturb the 
> code too much.
>
> One thing that was brought up to me in person was the Dead Letter 
> Queue 
> <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=80452478> 
> initiative that seems to have stalled.  That seems like a very similar 
> idea though it's reacting to errors coming from the receiving side and 
> not a local condition.  I like the callback, stats, gfsh and cluster 
> config integration in that write-up & think they might be useful 
> here.  There is also relevant discussion in that page about things 
> like PDX registrations.  Is that initiative going to move forward at 
> some point or is it off the boards?
>
> On 3/20/19 10:25 AM, Anilkumar Gingade wrote:
>> +1. Will the expiration (destroy) be applied on local queues, or will the
>> expiration be replicated (for both serial and parallel)?
>>
>> -Anil.

Re: [DISCUSS] TTL setting on WAN

Posted by Bruce Schuchardt <bs...@pivotal.io>.

IDK Anil, we'll figure that out in the implementation.  I was thinking 
it would be in the dispatch threads, so if distribution is needed that 
will happen as it does now.  I'm hopeful that this won't perturb the 
code too much.

One thing that was brought up to me in person was the Dead Letter Queue 
<https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=80452478> 
initiative that seems to have stalled.  That seems like a very similar 
idea though it's reacting to errors coming from the receiving side and 
not a local condition.  I like the callback, stats, gfsh and cluster 
config integration in that write-up & think they might be useful here.  
There is also relevant discussion in that page about things like PDX 
registrations.  Is that initiative going to move forward at some point 
or is it off the boards?

On 3/20/19 10:25 AM, Anilkumar Gingade wrote:
> +1. Will the expiration (destroy) be applied on local queues, or will the
> expiration be replicated (for both serial and parallel)?
>
> -Anil.

Re: [DISCUSS] TTL setting on WAN

Posted by Anilkumar Gingade <ag...@pivotal.io>.

Yes. From our experiment that looked like a possibility.

-Anil.


On Tue, Mar 26, 2019 at 9:59 AM Dan Smith <ds...@pivotal.io> wrote:

> Following up on the conflation thing - Anil and I did an experiment.
> Conflation definitely *does* happen on everything in the queue, not just
> the last batch. But we didn't see destroys get conflated with updates.
>
> So one thing that might make this use case work is to conflate the destroys
> with the updates. Then the disk space would be freed up when the expiration
> events are conflated in the queue.
>
> -Dan
>
> On Tue, Mar 26, 2019 at 8:19 AM Bruce Schuchardt <bs...@pivotal.io>
> wrote:
>
> > I've been thinking along those lines as well Suranjan.  Since conflation
> > and expiry-forwarding don't solve the problem of running out of disk
> > space the solution needs to involve the dispatch thread.
> >
> > For the session-state caching scenario that raised this whole issue I
> > think what you've described will work.  Looking at it with a wider lens
> > I'm a little concerned about a TTL on the queue because multiple regions
> > can feed into the same queue and you might not have the same TTL
> > settings on all of those regions.
> >
> > On 3/25/19 4:53 PM, Suranjan Kumar wrote:
> > > Hi,
> > >   I think the one approach for a user would be to 'filter' the events while
> > > dispatching. If I remember correctly, we can attach a filter at dispatch
> > > time and filter the events based on creationTime of the GatewayEvent. We
> > > can provide a pre created filter and use it based on some so that user
> > > doesn't have to write his/her own.
> > >
> > > Something like,
> > > /**
> > > All the events which spend timeToLive or more time in queue will be deleted
> > > from the queue
> > > and will not be sent to remote site.
> > > Possible consequence is that two sites can be inconsistent in case
> > > */
> > > public GatewaySenderFactory setEntryTimeToLive(long timeToLive);
> > >
> > > As queues will be read in LRU way this would be faster too. Only drawback
> > > is that there will be only one thread (not sure if we have concurrent
> > > dispatcher yet) clearing the queue.
> > >
> > > As Udo/Dan mentioned above, user needs to be aware of the consequences.
> > >
> > >
> > > On Tue, Mar 26, 2019 at 3:09 AM Bruce Schuchardt <bschuchardt@pivotal.io>
> > > wrote:
> > >
> > >> I've walked through the code to forward expiration actions to async
> > >> event listeners & don't see how to apply it to removal of queue entries
> > >> for WAN.  The current implementation just queues the expiration
> > >> actions.  If we wanted to remove any queued events associated with the
> > >> expired region entry we'd have to scan the whole queue, which would be
> > >> too slow if we're overflowing the queue to disk.
> > >>
> > >> I've also walked through the conflation code.  It applies only to the
> > >> current batch being processed by the gateway sender.  The data structure
> > >> used to perform conflation is just a Map that is created in the sender's
> > >> batch processing method and then thrown away.
> > >>
> > >> On 3/20/19 11:15 AM, Dan Smith wrote:
> > >>>> 2) The developer wants to replicate _state_.  This means that
> implicit
> > >>>> state changes (expiration or eviction w/ destroy) could allow us to
> > >>>> optimize the queue size.  This is very similar to conflation, just a
> > >>>> different kind of optimization.
> > >>>>
> > >>>> For this second case, does it make sense to allow the user to
> specify
> > a
> > >>>> different TTL than the underlying region?  It seems like what the
> user
> > >>>> wants is to not replicate stale data and having an extra TTL
> attribute
> > >>>> would just be another value to mis-configure.  What do you think
> about
> > >> just
> > >>>> providing a boolean flag?
> > >>>>
> > >>>>
> > >>> This kinda jogged my memory. AsyncEventQueues actually *do* have a
> > >> boolean
> > >>> flag to allow you to forward expiration events to the queue. I have
> no
> > >> idea
> > >>> how this interacts with conflation though -
> > >>>
> > >>
> >
> https://geode.apache.org/releases/latest/javadoc/org/apache/geode/cache/asyncqueue/AsyncEventQueueFactory.html#setForwardExpirationDestroy-boolean-
> >
>

Re: [DISCUSS] TTL setting on WAN

Posted by Dan Smith <ds...@pivotal.io>.

Following up on the conflation thing - Anil and I did an experiment.
Conflation definitely *does* happen on everything in the queue, not just
the last batch. But we didn't see destroys get conflated with updates.

So one thing that might make this use case work is to conflate the destroys
with the updates. Then the disk space would be freed up when the expiration
events are conflated in the queue.
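
A minimal sketch of the idea, assuming a queue that keeps at most one pending event per key. The names ConflatingQueueSketch, Op and Event are hypothetical and are not Geode APIs; the sketch only shows how a destroy could replace a queued update so that its slot is freed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not Geode code: a queue that keeps only the latest
// pending event per key, so a DESTROY replaces any queued UPDATE for that key.
class ConflatingQueueSketch {
  enum Op { UPDATE, DESTROY }

  static final class Event {
    final String key;
    final Op op;
    final String value; // null for DESTROY

    Event(String key, Op op, String value) {
      this.key = key;
      this.op = op;
      this.value = value;
    }
  }

  // Insertion-ordered map with one slot per key.
  private final Map<String, Event> pending = new LinkedHashMap<>();

  void enqueue(Event e) {
    // Remove first so the conflated event moves to the tail of the queue.
    pending.remove(e.key);
    pending.put(e.key, e);
  }

  int size() {
    return pending.size();
  }

  // Returns the pending events in queue order and empties the queue.
  List<Event> drain() {
    List<Event> batch = new ArrayList<>(pending.values());
    pending.clear();
    return batch;
  }

  public static void main(String[] args) {
    ConflatingQueueSketch queue = new ConflatingQueueSketch();
    queue.enqueue(new Event("session-1", Op.UPDATE, "v1"));
    queue.enqueue(new Event("session-1", Op.UPDATE, "v2"));
    queue.enqueue(new Event("session-1", Op.DESTROY, null));
    for (Event e : queue.drain()) {
      System.out.println(e.key + " " + e.op);
    }
  }
}
```

With this shape, an expiration destroy leaves a single small event per key in the queue instead of the whole history of updates.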

-Dan

Re: [DISCUSS] TTL setting on WAN

Posted by Bruce Schuchardt <bs...@pivotal.io>.

I've been thinking along those lines as well, Suranjan.  Since conflation
and expiry-forwarding don't solve the problem of running out of disk
space, the solution needs to involve the dispatch thread.

For the session-state caching scenario that raised this whole issue I 
think what you've described will work.  Looking at it with a wider lens 
I'm a little concerned about a TTL on the queue because multiple regions 
can feed into the same queue and you might not have the same TTL 
settings on all of those regions.
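
To make the concern concrete, here is a small worked example. It is only an illustration under assumed numbers: the region names, the TTL values and the class SharedQueueTtlSketch are hypothetical, and the time an event has waited in the queue is treated as the age of the region entry for simplicity.

```java
// Hypothetical illustration, not Geode code: one queue-wide TTL applied to
// events coming from regions that have different entry TTLs.
class SharedQueueTtlSketch {
  // True when the queue would drop an event even though the region entry that
  // produced it has not yet expired under the region's own TTL.
  static boolean droppedWhileStillLive(long waitedSeconds, long queueTtlSeconds,
      long regionTtlSeconds) {
    boolean droppedByQueue = waitedSeconds >= queueTtlSeconds;
    boolean liveInRegion = waitedSeconds < regionTtlSeconds;
    return droppedByQueue && liveInRegion;
  }

  public static void main(String[] args) {
    long queueTtl = 1800; // assumed queue-wide TTL: 30 minutes
    long waited = 3600; // the event has been queued for one hour
    System.out.println("/sessions (TTL 1800s): "
        + droppedWhileStillLive(waited, queueTtl, 1800));
    System.out.println("/orders (TTL 86400s): "
        + droppedWhileStillLive(waited, queueTtl, 86400));
  }
}
```

In this example the session event is stale in both places, but the event from the region with the longer TTL is dropped from the queue while its entry is still live locally, so the remote site would never see it.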

Re: [DISCUSS] TTL setting on WAN

Posted by Suranjan Kumar <su...@gmail.com>.

Hi,
 I think one approach for a user would be to 'filter' the events while
dispatching. If I remember correctly, we can attach a filter at dispatch
time and filter the events based on the creationTime of the GatewayEvent. We
can provide a pre-created filter and use it based on some setting so that
the user doesn't have to write his/her own.

Something like,
/**
 * All the events which spend timeToLive or more time in the queue will be
 * deleted from the queue and will not be sent to the remote site.
 * Possible consequence is that two sites can be inconsistent in case
*/
public GatewaySenderFactory setEntryTimeToLive(long timeToLive);

As queues will be read in an LRU way, this would be faster too. The only
drawback is that there will be only one thread (not sure if we have a
concurrent dispatcher yet) clearing the queue.
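
A minimal sketch of dispatch-time expiry, assuming the queue is read oldest-first and each event records its creation time. QueueTtlSketch and QueuedEvent are hypothetical names, not Geode classes, and the use of milliseconds for timeToLive is an assumption made for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch, not Geode code: the dispatcher reads the queue from the
// head and silently drops events that have waited timeToLive or longer.
class QueueTtlSketch {
  static final class QueuedEvent {
    final String key;
    final long creationTimeMillis;

    QueuedEvent(String key, long creationTimeMillis) {
      this.key = key;
      this.creationTimeMillis = creationTimeMillis;
    }
  }

  private final Deque<QueuedEvent> queue = new ArrayDeque<>();
  private final long timeToLiveMillis;

  QueueTtlSketch(long timeToLiveMillis) {
    this.timeToLiveMillis = timeToLiveMillis;
  }

  void enqueue(QueuedEvent e) {
    queue.addLast(e);
  }

  int size() {
    return queue.size();
  }

  // Collects up to batchSize events that are still within their TTL.
  // Expired events are removed from the queue but never added to the batch.
  List<QueuedEvent> nextBatch(int batchSize, long nowMillis) {
    List<QueuedEvent> batch = new ArrayList<>();
    while (!queue.isEmpty() && batch.size() < batchSize) {
      QueuedEvent head = queue.pollFirst();
      if (nowMillis - head.creationTimeMillis >= timeToLiveMillis) {
        continue; // spent timeToLive or more in the queue: drop, do not send
      }
      batch.add(head);
    }
    return batch;
  }

  public static void main(String[] args) {
    QueueTtlSketch sender = new QueueTtlSketch(1000);
    sender.enqueue(new QueuedEvent("a", 0));
    sender.enqueue(new QueuedEvent("b", 500));
    sender.enqueue(new QueuedEvent("c", 900));
    for (QueuedEvent e : sender.nextBatch(10, 1200)) {
      System.out.println("dispatching " + e.key);
    }
  }
}
```

Because only the single reading thread touches the head of the queue, expiry happens in queue order, which matches the single-thread drawback noted above.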

As Udo/Dan mentioned above, the user needs to be aware of the consequences.

Re: [DISCUSS] TTL setting on WAN

Posted by Bruce Schuchardt <bs...@pivotal.io>.

I've walked through the code to forward expiration actions to async
event listeners and don't see how to apply it to removal of queue entries
for WAN.  The current implementation just queues the expiration
actions.  If we wanted to remove any queued events associated with the
expired region entry, we'd have to scan the whole queue, which would be
too slow if we're overflowing the queue to disk.

I've also walked through the conflation code.  It applies only to the 
current batch being processed by the gateway sender.  The data structure 
used to perform conflation is just a Map that is created in the sender's 
batch processing method and then thrown away.
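
A rough sketch of per-batch conflation as described above, where the Map lives only for the duration of one batch. BatchConflationSketch and Event are hypothetical names; this is not the actual sender code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not Geode code: conflation scoped to a single batch.
class BatchConflationSketch {
  static final class Event {
    final String key;
    final String value;

    Event(String key, String value) {
      this.key = key;
      this.value = value;
    }
  }

  // Keeps only the last event per key within this batch. The map is local to
  // the call, so events in other batches are never considered.
  static List<Event> conflate(List<Event> batch) {
    Map<String, Event> latestByKey = new LinkedHashMap<>();
    for (Event e : batch) {
      latestByKey.remove(e.key); // re-insert so the newest event is ordered last
      latestByKey.put(e.key, e);
    }
    return new ArrayList<>(latestByKey.values());
  }

  public static void main(String[] args) {
    List<Event> batch = new ArrayList<>();
    batch.add(new Event("a", "1"));
    batch.add(new Event("b", "1"));
    batch.add(new Event("a", "2"));
    for (Event e : conflate(batch)) {
      System.out.println(e.key + "=" + e.value);
    }
  }
}
```

Under this model, two updates to the same key that land in different batches are both sent, which is why batch-scoped conflation would not shrink a queue that has overflowed to disk.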

Re: [DISCUSS] TTL setting on WAN

Posted by Dan Smith <ds...@pivotal.io>.

> 2) The developer wants to replicate _state_.  This means that implicit
> state changes (expiration or eviction w/ destroy) could allow us to
> optimize the queue size.  This is very similar to conflation, just a
> different kind of optimization.
>
> For this second case, does it make sense to allow the user to specify a
> different TTL than the underlying region?  It seems like what the user
> wants is to not replicate stale data and having an extra TTL attribute
> would just be another value to mis-configure.  What do you think about just
> providing a boolean flag?
>
>
This kinda jogged my memory. AsyncEventQueues actually *do* have a boolean
flag to allow you to forward expiration events to the queue. I have no idea
how this interacts with conflation though -
https://geode.apache.org/releases/latest/javadoc/org/apache/geode/cache/asyncqueue/AsyncEventQueueFactory.html#setForwardExpirationDestroy-boolean-

Re: [DISCUSS] TTL setting on WAN

Posted by Anthony Baker <ab...@pivotal.io>.

I think there are two modes:

1) The developer wants to replicate _events_.  This means all changes need to be sent to the remote site regardless of the state in the local cluster.  Most likely in order :-)

2) The developer wants to replicate _state_.  This means that implicit state changes (expiration or eviction w/ destroy) could allow us to optimize the queue size.  This is very similar to conflation, just a different kind of optimization.

For this second case, does it make sense to allow the user to specify a different TTL than the underlying region?  It seems like what the user wants is to not replicate stale data and having an extra TTL attribute would just be another value to mis-configure.  What do you think about just providing a boolean flag?

Anthony

Re: [DISCUSS] TTL setting on WAN

Posted by Anilkumar Gingade <ag...@pivotal.io>.

+1. Will the expiration (destroy) be applied on local queues, or will the
expiration be replicated (for both serial and parallel)?

-Anil.

