Posted to dev@geode.apache.org by Alberto Gomez <al...@est.tech> on 2020/03/25 15:04:47 UTC

RFC - Gateway sender to deliver transaction events atomically to receivers

Hi,

Could you please review the RFC for "Gateway sender to deliver transaction events atomically to receivers"?

https://cwiki.apache.org/confluence/display/GEODE/Gw+sender+to+deliver+transaction+events+atomically+to+receivers

Deadline for comments is Wednesday, April 1st, 2020.

Thanks,

Alberto G.

Re: RFC - Gateway sender to deliver transaction events atomically to receivers

Posted by Dan Smith <ds...@pivotal.io>.
> btw what are we voting on?  Just curious as I wasn't sure if we were voting
> for the current proposal or whether we should continue this discussion?

Sorry, just +1'd because I like the idea, not to imply we're voting on
anything. I thought that's a general apache convention during a discussion.

> In essence you are proposing a distributed transaction over WAN

I wouldn't equate this with distributed transactions at all. This is just
trying to group transaction events together in a batch. I don't think we
need to solve the whole distributed transaction problem with this proposal.

I remember someone trying to accomplish this same thing on top of Geode
with a TransactionListener that dumped into a separate region or something
like that. Barry might remember more details. Having a
--group-transaction-events option seems much more user friendly.

-Dan

On Wed, Mar 25, 2020 at 4:19 PM Udo Kohlmeyer <ud...@apache.com> wrote:

> My vote was to implement said solution.
>
> But it is a HUGE +1 to continue the discussion to resolve the issue
> identified!
>
> --Udo
>
> On 3/25/20 4:14 PM, Jason Huynh wrote:
> > I put some comments on the proposal on the wiki.
> >
> > btw what are we voting on?  Just curious as I wasn't sure if we were
> > voting for the current proposal or whether we should continue this
> > discussion?
> >
> > I like the idea of having transactional ops be sent together in a batch
> > if possible and it would be an iterative improvement, whether that is a
> > complete solution to a larger problem, I think might be beyond what
> > Alberto was proposing?
> >
> > Again I am not exactly sure if this was intended to be a vote but I
> > would +1 the attempt and continuation of the discussion/proposal and
> > probably -0 the current proposal as there are some ideas/things to iron
> > out.
> >
> >
> >
> >
> > On Wed, Mar 25, 2020 at 3:49 PM Udo Kohlmeyer <uk...@pivotal.io> wrote:
> >
> >> Hi there Alberto,
> >>
> >> It's a "-1" from me.
> >>
> >> I have raised my concerns in the RFC comments. To summarize, whilst I
> >> like the idea (I had never thought of that problem you are trying to
> >> solve), I don't know how this will behave at scale. Just looking at some
> >> of the comments, I think it is safe to say that many have similar
> >> feelings.
> >>
> >> I like the notion of this proposal, but I'm not convinced that the
> >> solution is actually going to solve the problem. I think it might solve
> >> only a very small part of the problem.
> >>
> >> In essence you are proposing a distributed transaction over WAN and I
> >> don't see enough in the proposal to convince me that we have a solution
> >> that will solve this problem.
> >>
> >> --Udo
> >>
> >> On 3/25/20 8:04 AM, Alberto Gomez wrote:
> >>> Hi,
> >>>
> >>> Could you please review the RFC for "Gateway sender to deliver
> >>> transaction events atomically to receivers"?
> >>>
> >>> https://cwiki.apache.org/confluence/display/GEODE/Gw+sender+to+deliver+transaction+events+atomically+to+receivers
> >>>
> >>> Deadline for comments is Wednesday, April 1st, 2020,
> >>>
> >>> Thanks,
> >>>
> >>> Alberto G.
> >>>
>

Re: RFC - Gateway sender to deliver transaction events atomically to receivers

Posted by Udo Kohlmeyer <ud...@apache.com>.
My vote was to implement said solution.

But it is a HUGE +1 to continue the discussion to resolve the issue 
identified!

--Udo

On 3/25/20 4:14 PM, Jason Huynh wrote:
> I put some comments on the proposal on the wiki.
>
> btw what are we voting on?  Just curious as I wasn't sure if we were voting
> for the current proposal or whether we should continue this discussion?
>
> I like the idea of having transactional ops be sent together in a batch if
> possible and it would be an iterative improvement, whether that is a
> complete solution to a larger problem, I think might be beyond what Alberto
> was proposing?
>
> Again I am not exactly sure if this was intended to be a vote but I
> would +1 the attempt and continuation of the discussion/proposal and
> probably -0 the current proposal as there are some ideas/things to iron
> out.
>
>
>
>
> On Wed, Mar 25, 2020 at 3:49 PM Udo Kohlmeyer <uk...@pivotal.io> wrote:
>
>> Hi there Alberto,
>>
>> It's a "-1" from me.
>>
>> I have raised my concerns in the RFC comments. To summarize, whilst I
>> like the idea (I had never thought of that problem you are trying to
>> solve), I don't know how this will behave at scale. Just looking at some
>> of the comments, I think it is safe to say that many have similar feelings.
>>
>> I like the notion of this proposal, but I'm not convinced that the
>> solution is actually going to solve the problem. I think it might solve
>> only a very small part of the problem.
>>
>> In essence you are proposing a distributed transaction over WAN and I
>> don't see enough in the proposal to convince me that we have a solution
>> that will solve this problem.
>>
>> --Udo
>>
>> On 3/25/20 8:04 AM, Alberto Gomez wrote:
>>> Hi,
>>>
>>> Could you please review the RFC for "Gateway sender to deliver
>>> transaction events atomically to receivers"?
>>>
>>> https://cwiki.apache.org/confluence/display/GEODE/Gw+sender+to+deliver+transaction+events+atomically+to+receivers
>>>
>>> Deadline for comments is Wednesday, April 1st, 2020,
>>>
>>> Thanks,
>>>
>>> Alberto G.
>>>

Re: RFC - Gateway sender to deliver transaction events atomically to receivers

Posted by Jason Huynh <jh...@pivotal.io>.
I put some comments on the proposal on the wiki.

btw what are we voting on?  Just curious as I wasn't sure if we were voting
for the current proposal or whether we should continue this discussion?

I like the idea of having transactional ops be sent together in a batch if
possible and it would be an iterative improvement, whether that is a
complete solution to a larger problem, I think might be beyond what Alberto
was proposing?

Again I am not exactly sure if this was intended to be a vote but I
would +1 the attempt and continuation of the discussion/proposal and
probably -0 the current proposal as there are some ideas/things to iron
out.




On Wed, Mar 25, 2020 at 3:49 PM Udo Kohlmeyer <uk...@pivotal.io> wrote:

> Hi there Alberto,
>
> It's a "-1" from me.
>
> I have raised my concerns in the RFC comments. To summarize, whilst I
> like the idea (I had never thought of that problem you are trying to
> solve), I don't know how this will behave at scale. Just looking at some
> of the comments, I think it is safe to say that many have similar feelings.
>
> I like the notion of this proposal, but I'm not convinced that the
> solution is actually going to solve the problem. I think it might solve
> only a very small part of the problem.
>
> In essence you are proposing a distributed transaction over WAN and I
> don't see enough in the proposal to convince me that we have a solution
> that will solve this problem.
>
> --Udo
>
> On 3/25/20 8:04 AM, Alberto Gomez wrote:
> > Hi,
> >
> > Could you please review the RFC for "Gateway sender to deliver
> > transaction events atomically to receivers"?
> >
> > https://cwiki.apache.org/confluence/display/GEODE/Gw+sender+to+deliver+transaction+events+atomically+to+receivers
> >
> > Deadline for comments is Wednesday, April 1st, 2020,
> >
> > Thanks,
> >
> > Alberto G.
> >
>

Re: RFC - Gateway sender to deliver transaction events atomically to receivers

Posted by Udo Kohlmeyer <uk...@pivotal.io>.
Hi there Alberto,

It's a "-1" from me.

I have raised my concerns in the RFC comments. To summarize, whilst I 
like the idea (I had never thought of that problem you are trying to 
solve), I don't know how this will behave at scale. Just looking at some 
of the comments, I think it is safe to say that many have similar feelings.

I like the notion of this proposal, but I'm not convinced that the
solution is actually going to solve the problem. I think it might solve
only a very small part of the problem.

In essence you are proposing a distributed transaction over WAN and I 
don't see enough in the proposal to convince me that we have a solution 
that will solve this problem.

--Udo

On 3/25/20 8:04 AM, Alberto Gomez wrote:
> Hi,
>
> Could you please review the RFC for "Gateway sender to deliver transaction events atomically to receivers"?
>
> https://cwiki.apache.org/confluence/display/GEODE/Gw+sender+to+deliver+transaction+events+atomically+to+receivers
>
> Deadline for comments is Wednesday, April 1st, 2020,
>
> Thanks,
>
> Alberto G.
>

Re: RFC - Gateway sender to deliver transaction events atomically to receivers

Posted by Alberto Gomez <al...@est.tech>.
Hi,

Yesterday was the end date for comments for this RFC.

I tried to answer the questions that were sent and also address the concerns about the proposal.

The main concern was the reordering of events that could happen in the gateway sender when events of the same transaction are grouped into the same batch. My conclusion was that, even if some reordering could happen, it would not be incorrect: it would only affect events that are very close in time, and events can already be reordered between the time they are generated and the time they reach the sender's queue.

There was also a concern about adding a new field to each EntryEvent, which would increase the size of the over-the-wire format for everyone. The new attribute in EntryEvent is no longer needed; the new version of the proposal only adds isLastTransactionEvent to the GatewaySenderEvent class.
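
To make the two points above concrete, here is a minimal sketch of how a sender could complete a batch using a last-event marker. It is only an illustration under assumptions, not the design in the RFC: the Event class, the nextBatch method and the txId field are hypothetical names, and lastInTx merely models the isLastTransactionEvent flag mentioned above. The queue is a plain in-memory Deque rather than a Geode gateway sender queue.

```java
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

class TxBatchSketch {
  /** Hypothetical stand-in for a queued gateway sender event. */
  static final class Event {
    final String key;
    final String txId;       // null for events outside a transaction
    final boolean lastInTx;  // models the isLastTransactionEvent flag

    Event(String key, String txId, boolean lastInTx) {
      this.key = key;
      this.txId = txId;
      this.lastInTx = lastInTx;
    }
  }

  /**
   * Takes up to batchSize events in queue order, then pulls forward the
   * remaining events of any transaction that the size limit cut in two.
   * The queue's iterator must support remove() (ArrayDeque does).
   */
  static List<Event> nextBatch(Deque<Event> queue, int batchSize) {
    List<Event> batch = new ArrayList<>();
    Set<String> openTxIds = new HashSet<>();

    // Normal batching: events in queue order, up to the size limit.
    while (batch.size() < batchSize && !queue.isEmpty()) {
      Event e = queue.pollFirst();
      batch.add(e);
      track(e, openTxIds);
    }

    // Completion: this is where reordering can happen, because events of
    // an open transaction overtake unrelated events still in the queue.
    Iterator<Event> it = queue.iterator();
    while (!openTxIds.isEmpty() && it.hasNext()) {
      Event e = it.next();
      if (e.txId != null && openTxIds.contains(e.txId)) {
        it.remove();
        batch.add(e);
        track(e, openTxIds);
      }
    }
    return batch;
  }

  /** A transaction stays open until its last event has been seen. */
  private static void track(Event e, Set<String> openTxIds) {
    if (e.txId == null) {
      return;
    }
    if (e.lastInTx) {
      openTxIds.remove(e.txId);
    } else {
      openTxIds.add(e.txId);
    }
  }
}
```

In this sketch a batch may exceed batchSize, and an event that belongs to an open transaction can be sent before an unrelated event that was queued earlier. That is the kind of reordering discussed above.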

Udo also raised some other, more general concerns; I do not know whether they have been resolved.

I would appreciate some more feedback so that I can either go ahead with the pull request, if the feedback is positive, or keep the discussion alive.

Thanks in advance,

Alberto G.


________________________________
From: Barry Oglesby <bo...@pivotal.io>
Sent: Thursday, March 26, 2020 7:34 PM
To: dev@geode.apache.org <de...@geode.apache.org>
Subject: Re: RFC - Gateway sender to deliver transaction events atomically to receivers

I added some comments to the proposal. There are a few concerns, but I like the
idea in general.

Dan said: I remember someone trying to accomplish this same thing on top of
geode with TransactionListener that dumped into a separate region or
something like that.

I think both Charlie and I have implemented this idea a few times.

Here is the basic idea:

The data region defines a TransactionListener with an afterCommit that:

- creates a UnitOfWork object
- creates an Event for each CacheEvent in the TransactionEvent event that
contains:
  - regionName
  - operation
  - key
  - value
  - potentially other things like EventID, VersionTag, TXId
- puts the UnitOfWork into a transaction region that has a gateway sender
attached to it. It also has a CacheWriter attached to it.

On the remote site, the CacheWriter attached to the transaction region:

- begins a transaction
- iterates the UnitOfWork's Events and executes each one
- commits the transaction

There are definitely some caveats to this:

- There is a race condition between the commit in the data region and the
TransactionListener afterCommit invocation doing the put into the
transaction region. If the server crashes after the put into the data
region but before the afterCommit callback, there will be data loss. In
that case, the transaction in question will not have been stored in the
transaction region and not be sent to the remote site.
- Ideally, the data and transaction regions should be colocated, but that
is tricky.
- What happens if a transaction fails in the remote site?
- The transaction region has to be cleared periodically.
- Knowing when to process the transaction in the CacheWriter is a bit
tricky. It only needs to happen for transactions that originated remotely.
Adding distributed system id to the UnitOfWork is one way to address this.

Thanks,
Barry Oglesby



On Thu, Mar 26, 2020 at 7:34 AM Jacob Barrett <jb...@pivotal.io> wrote:

> Great idea. I called out some similar areas of concerns and spit balled
> some solutions to get the conversations flowing.
>
> -Jake
>
>
> > On Mar 25, 2020, at 8:04 AM, Alberto Gomez <al...@est.tech> wrote:
> >
> > Hi,
> >
> > Could you please review the RFC for "Gateway sender to deliver
> > transaction events atomically to receivers"?
> >
> > https://cwiki.apache.org/confluence/display/GEODE/Gw+sender+to+deliver+transaction+events+atomically+to+receivers
> >
> > Deadline for comments is Wednesday, April 1st, 2020,
> >
> > Thanks,
> >
> > Alberto G.
>
>

Re: RFC - Gateway sender to deliver transaction events atomically to receivers

Posted by Barry Oglesby <bo...@pivotal.io>.
I added some comments to the proposal. There are a few concerns, but I like the
idea in general.

Dan said: I remember someone trying to accomplish this same thing on top of
geode with TransactionListener that dumped into a separate region or
something like that.

I think both Charlie and I have implemented this idea a few times.

Here is the basic idea:

The data region defines a TransactionListener with an afterCommit that:

- creates a UnitOfWork object
- creates an Event for each CacheEvent in the TransactionEvent event that
contains:
  - regionName
  - operation
  - key
  - value
  - potentially other things like EventID, VersionTag, TXId
- puts the UnitOfWork into a transaction region that has a gateway sender
attached to it. It also has a CacheWriter attached to it.

On the remote site, the CacheWriter attached to the transaction region:

- begins a transaction
- iterates the UnitOfWork's Events and executes each one
- commits the transaction
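
A rough sketch of the steps above, in plain Java so it runs without a cluster. Everything here is an assumption made for illustration: UnitOfWork and Event are hypothetical classes, regions are modeled as in-memory maps, and the begin/commit on the remote site is imitated by staging changes on copies. In a real deployment the capture step would sit in a TransactionListener afterCommit callback and the replay step in the CacheWriter on the remote site.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class UnitOfWorkSketch {
  /** One operation captured from a committed transaction (hypothetical). */
  static final class Event {
    final String regionName;
    final String operation;  // "PUT" or "DESTROY" in this sketch
    final String key;
    final String value;

    Event(String regionName, String operation, String key, String value) {
      this.regionName = regionName;
      this.operation = operation;
      this.key = key;
      this.value = value;
    }
  }

  /** All events of one transaction, shipped to the remote site as one value. */
  static final class UnitOfWork {
    final String txId;
    final List<Event> events;

    UnitOfWork(String txId, List<Event> events) {
      this.txId = txId;
      this.events = new ArrayList<>(events);
    }
  }

  /**
   * Remote side: applies every event of the unit of work, or none of them.
   * Changes are staged on copies and only published if all events succeed.
   */
  static void replay(UnitOfWork uow, Map<String, Map<String, String>> regions) {
    Map<String, Map<String, String>> staged = new HashMap<>();
    for (Event e : uow.events) {
      Map<String, String> current = regions.get(e.regionName);
      if (current == null) {
        throw new IllegalStateException("Unknown region: " + e.regionName);
      }
      // Copy the region the first time this unit of work touches it.
      Map<String, String> copy =
          staged.computeIfAbsent(e.regionName, name -> new HashMap<>(current));
      if ("PUT".equals(e.operation)) {
        copy.put(e.key, e.value);
      } else if ("DESTROY".equals(e.operation)) {
        copy.remove(e.key);
      } else {
        throw new IllegalArgumentException("Unsupported operation: " + e.operation);
      }
    }
    // "Commit": publish all staged regions together.
    regions.putAll(staged);
  }
}
```

Because the whole UnitOfWork travels as a single entry of the transaction region, the gateway sender cannot split it across batches. The sketch leaves out conflict detection, retries and the extra metadata (EventID, VersionTag) listed above.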

There are definitely some caveats to this:

- There is a race condition between the commit in the data region and the
TransactionListener afterCommit invocation doing the put into the
transaction region. If the server crashes after the put into the data
region but before the afterCommit callback, there will be data loss. In
that case, the transaction in question will not have been stored in the
transaction region and not be sent to the remote site.
- Ideally, the data and transaction regions should be colocated, but that
is tricky.
- What happens if a transaction fails in the remote site?
- The transaction region has to be cleared periodically.
- Knowing when to process the transaction in the CacheWriter is a bit
tricky. It only needs to happen for transactions that originated remotely.
Adding distributed system id to the UnitOfWork is one way to address this.

Thanks,
Barry Oglesby



On Thu, Mar 26, 2020 at 7:34 AM Jacob Barrett <jb...@pivotal.io> wrote:

> Great idea. I called out some similar areas of concerns and spit balled
> some solutions to get the conversations flowing.
>
> -Jake
>
>
> > On Mar 25, 2020, at 8:04 AM, Alberto Gomez <al...@est.tech> wrote:
> >
> > Hi,
> >
> > Could you please review the RFC for "Gateway sender to deliver
> > transaction events atomically to receivers"?
> >
> > https://cwiki.apache.org/confluence/display/GEODE/Gw+sender+to+deliver+transaction+events+atomically+to+receivers
> >
> > Deadline for comments is Wednesday, April 1st, 2020,
> >
> > Thanks,
> >
> > Alberto G.
>
>

Re: RFC - Gateway sender to deliver transaction events atomically to receivers

Posted by Jacob Barrett <jb...@pivotal.io>.
Great idea. I called out some similar areas of concern and spitballed some solutions to get the conversation flowing.

-Jake


> On Mar 25, 2020, at 8:04 AM, Alberto Gomez <al...@est.tech> wrote:
> 
> Hi,
> 
> Could you please review the RFC for "Gateway sender to deliver transaction events atomically to receivers"?
> 
> https://cwiki.apache.org/confluence/display/GEODE/Gw+sender+to+deliver+transaction+events+atomically+to+receivers
> 
> Deadline for comments is Wednesday, April 1st, 2020,
> 
> Thanks,
> 
> Alberto G.


Re: RFC - Gateway sender to deliver transaction events atomically to receivers

Posted by Dan Smith <ds...@pivotal.io>.
+1

I think this is a good improvement to the way transactions behave with WAN! I
had a couple of more detailed comments I put on the doc.

Thanks,
-Dan

On Wed, Mar 25, 2020 at 8:05 AM Alberto Gomez <al...@est.tech> wrote:

> Hi,
>
> Could you please review the RFC for "Gateway sender to deliver transaction
> events atomically to receivers"?
>
>
> https://cwiki.apache.org/confluence/display/GEODE/Gw+sender+to+deliver+transaction+events+atomically+to+receivers
>
> Deadline for comments is Wednesday, April 1st, 2020,
>
> Thanks,
>
> Alberto G.
>