Posted to dev@geode.apache.org by Alberto Gomez <al...@est.tech> on 2020/04/02 17:10:46 UTC

Re: RFC - Gateway sender to deliver transaction events atomically to receivers

Hi,

Yesterday was the end date for comments for this RFC.

I tried to answer the questions that were sent and also address the concerns about the proposal.

The main concern was the reordering of events that could happen in the gateway sender when events of the same transaction are grouped into the same batch. My conclusion was that some reordering would not make the result incorrect: it would only affect events that are very close in time, and events can already be reordered between the time they are generated and the time they reach the sender's queue.

There was also a concern that adding a new field to every EntryEvent would increase the size of the over-the-wire format for everyone. The new attribute in EntryEvent is no longer needed; in the new version of the proposal, only the isLastTransactionEvent field has to be added to the GatewaySenderEvent class.
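
To make the reordering discussion concrete, here is a minimal sketch of how a batch could be completed using isLastTransactionEvent. It is plain Java with no Geode dependency: QueuedEvent, TxBatchSketch and nextBatch are hypothetical names for illustration only, and the real sender queue is more involved than a Deque.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a queued gateway sender event. Only the
// isLastTransactionEvent flag is taken from the proposal.
final class QueuedEvent {
    final String key;
    final String txId; // null when the event is not part of a transaction
    final boolean isLastTransactionEvent;

    QueuedEvent(String key, String txId, boolean isLastTransactionEvent) {
        this.key = key;
        this.txId = txId;
        this.isLastTransactionEvent = isLastTransactionEvent;
    }
}

final class TxBatchSketch {
    // Takes up to batchSize events in queue order, then pulls forward the
    // remaining events of any transaction that would otherwise be split.
    static List<QueuedEvent> nextBatch(Deque<QueuedEvent> queue, int batchSize) {
        List<QueuedEvent> batch = new ArrayList<>();
        Set<String> openTransactions = new HashSet<>();
        while (batch.size() < batchSize && !queue.isEmpty()) {
            add(queue.pollFirst(), batch, openTransactions);
        }
        // Reordering step: later events of an open transaction overtake
        // unrelated events that stay in the queue for the next batch.
        Iterator<QueuedEvent> rest = queue.iterator();
        while (!openTransactions.isEmpty() && rest.hasNext()) {
            QueuedEvent event = rest.next();
            if (event.txId != null && openTransactions.contains(event.txId)) {
                rest.remove();
                add(event, batch, openTransactions);
            }
        }
        return batch;
    }

    private static void add(QueuedEvent event, List<QueuedEvent> batch, Set<String> open) {
        batch.add(event);
        if (event.txId == null) {
            return;
        }
        if (event.isLastTransactionEvent) {
            open.remove(event.txId); // transaction is now complete in this batch
        } else {
            open.add(event.txId);    // transaction still has events in the queue
        }
    }

    public static void main(String[] args) {
        Deque<QueuedEvent> queue = new ArrayDeque<>();
        queue.add(new QueuedEvent("k1", "tx1", false));
        queue.add(new QueuedEvent("k2", null, false));
        queue.add(new QueuedEvent("k3", null, false));
        queue.add(new QueuedEvent("k4", "tx1", true));
        // Prints k1, k2, k4: k4 overtakes k3 so that tx1 is not split.
        for (QueuedEvent e : nextBatch(queue, 2)) {
            System.out.println(e.key);
        }
    }
}
```

In this sketch the batch may grow beyond batchSize, and the only reordering is that the tail of an open transaction moves ahead of unrelated events that were queued shortly before it.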

Udo also raised some other, more general concerns, and I do not know whether they have been resolved.

I would appreciate some more feedback so that I can either go ahead with the pull request, if the feedback is positive, or keep the discussion alive.

Thanks in advance,

Alberto G.


________________________________
From: Barry Oglesby <bo...@pivotal.io>
Sent: Thursday, March 26, 2020 7:34 PM
To: dev@geode.apache.org <de...@geode.apache.org>
Subject: Re: RFC - Gateway sender to deliver transaction events atomically to receivers

I added some comments to the proposal. There are a few concerns, but I like the
idea in general.

Dan said: I remember someone trying to accomplish this same thing on top of
geode with TransactionListener that dumped into a separate region or
something like that.

I think both Charlie and I have implemented this idea a few times.

Here is the basic idea:

The data region defines a TransactionListener with an afterCommit that:

- creates a UnitOfWork object
- creates an Event for each CacheEvent in the TransactionEvent; each Event
contains:
  - regionName
  - operation
  - key
  - value
  - potentially other things like EventID, VersionTag, TXId
- puts the UnitOfWork into a transaction region that has a gateway sender
attached to it. It also has a CacheWriter attached to it.
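
A minimal sketch of the sending side described in the list above, in plain Java with no Geode dependency. CommittedEvent, UnitOfWork and UnitOfWorkListenerSketch are hypothetical names, and the Map stands in for the transaction region; in Geode the method would be the afterCommit callback of a TransactionListener.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for one CacheEvent of a committed transaction.
final class CommittedEvent {
    final String regionName;
    final String operation; // for example "CREATE", "UPDATE" or "DESTROY"
    final Object key;
    final Object value;

    CommittedEvent(String regionName, String operation, Object key, Object value) {
        this.regionName = regionName;
        this.operation = operation;
        this.key = key;
        this.value = value;
    }
}

// Hypothetical container that carries every event of one transaction.
final class UnitOfWork {
    final String txId;
    final List<CommittedEvent> events;

    UnitOfWork(String txId, List<CommittedEvent> events) {
        this.txId = txId;
        // Defensive copy so that later changes by the caller do not leak in.
        this.events = Collections.unmodifiableList(new ArrayList<>(events));
    }
}

final class UnitOfWorkListenerSketch {
    // Assumption: this map plays the role of the transaction region that
    // has the gateway sender attached to it.
    final Map<String, UnitOfWork> transactionRegion = new LinkedHashMap<>();

    // Plays the role of afterCommit: one put for the whole transaction.
    void afterCommit(String txId, List<CommittedEvent> committedEvents) {
        transactionRegion.put(txId, new UnitOfWork(txId, committedEvents));
    }

    public static void main(String[] args) {
        UnitOfWorkListenerSketch listener = new UnitOfWorkListenerSketch();
        List<CommittedEvent> events = new ArrayList<>();
        events.add(new CommittedEvent("orders", "CREATE", "o1", "pending"));
        events.add(new CommittedEvent("stock", "UPDATE", "item7", 41));
        listener.afterCommit("tx-1", events);
        System.out.println(listener.transactionRegion.get("tx-1").events.size());
    }
}
```

Because the whole transaction travels as a single entry, the gateway sender has nothing to split across batches.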

On the remote site, the CacheWriter attached to the transaction region:

- begins a transaction
- iterates the UnitOfWork's Events and executes each one
- commits the transaction
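
A minimal sketch of the replay on the remote site, again in plain Java with no Geode dependency. ReplayEvent and RemoteApplySketch are hypothetical names, the nested maps stand in for the data regions, and the copy-then-swap only imitates begin/commit; a real CacheWriter would use the cache's transaction manager instead.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for one Event carried by a UnitOfWork.
final class ReplayEvent {
    final String regionName;
    final String operation; // "CREATE", "UPDATE" or "DESTROY" in this sketch
    final Object key;
    final Object value;

    ReplayEvent(String regionName, String operation, Object key, Object value) {
        this.regionName = regionName;
        this.operation = operation;
        this.key = key;
        this.value = value;
    }
}

final class RemoteApplySketch {
    // Assumption: region name -> entries, standing in for the data regions.
    final Map<String, Map<Object, Object>> regions = new HashMap<>();

    // Applies all events or none of them.
    void apply(List<ReplayEvent> events) {
        // "begin": work on a private copy of the current state
        Map<String, Map<Object, Object>> working = new HashMap<>();
        for (Map.Entry<String, Map<Object, Object>> r : regions.entrySet()) {
            working.put(r.getKey(), new HashMap<>(r.getValue()));
        }
        // execute each event against the private copy
        for (ReplayEvent e : events) {
            Map<Object, Object> region =
                working.computeIfAbsent(e.regionName, name -> new HashMap<>());
            switch (e.operation) {
                case "CREATE":
                case "UPDATE":
                    region.put(e.key, e.value);
                    break;
                case "DESTROY":
                    region.remove(e.key);
                    break;
                default:
                    // Failing here leaves the visible state untouched.
                    throw new IllegalArgumentException("unsupported operation: " + e.operation);
            }
        }
        // "commit": publish the result only after every event succeeded
        regions.clear();
        regions.putAll(working);
    }

    public static void main(String[] args) {
        RemoteApplySketch remote = new RemoteApplySketch();
        List<ReplayEvent> events = new ArrayList<>();
        events.add(new ReplayEvent("orders", "CREATE", "o1", "pending"));
        events.add(new ReplayEvent("stock", "UPDATE", "item7", 41));
        remote.apply(events);
        System.out.println(remote.regions);
    }
}
```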

There are definitely some caveats to this:

- There is a race condition between the commit in the data region and the
TransactionListener afterCommit invocation doing the put into the
transaction region. If the server crashes after the put into the data
region but before the afterCommit callback, there will be data loss. In
that case, the transaction in question will not have been stored in the
transaction region and will not be sent to the remote site.
- Ideally, the data and transaction regions should be colocated, but that
is tricky.
- What happens if a transaction fails in the remote site?
- The transaction region has to be cleared periodically.
- Knowing when to process the transaction in the CacheWriter is a bit
tricky. It only needs to happen for transactions that originated remotely.
Adding distributed system id to the UnitOfWork is one way to address this.
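
A minimal sketch of the last caveat, assuming the UnitOfWork carries the distributed system id of the site where the transaction was committed. OriginCheckSketch and its members are hypothetical names, and the list only records which transactions would be replayed.

```java
import java.util.ArrayList;
import java.util.List;

final class OriginCheckSketch {
    final int localDistributedSystemId;
    // Transaction ids that this site decided to replay, in arrival order.
    final List<String> replayedTxIds = new ArrayList<>();

    OriginCheckSketch(int localDistributedSystemId) {
        this.localDistributedSystemId = localDistributedSystemId;
    }

    // Plays the role of the CacheWriter callback on the transaction region.
    // Returns true when the unit of work was replayed on this site.
    boolean onUnitOfWork(String txId, int originDistributedSystemId) {
        if (originDistributedSystemId == localDistributedSystemId) {
            // Committed on this site: the data region already has the changes.
            return false;
        }
        // Originated on another site: this is where the replay would happen.
        replayedTxIds.add(txId);
        return true;
    }

    public static void main(String[] args) {
        OriginCheckSketch siteTwo = new OriginCheckSketch(2);
        System.out.println(siteTwo.onUnitOfWork("tx-local", 2));  // prints false
        System.out.println(siteTwo.onUnitOfWork("tx-remote", 1)); // prints true
    }
}
```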

Thanks,
Barry Oglesby



On Thu, Mar 26, 2020 at 7:34 AM Jacob Barrett <jb...@pivotal.io> wrote:

> Great idea. I called out some similar areas of concern and spitballed
> some solutions to get the conversations flowing.
>
> -Jake
>
>
> > On Mar 25, 2020, at 8:04 AM, Alberto Gomez <al...@est.tech>
> wrote:
> >
> > Hi,
> >
> > Could you please review the RFC for "Gateway sender to deliver
> transaction events atomically to receivers"?
> >
> >
> https://cwiki.apache.org/confluence/display/GEODE/Gw+sender+to+deliver+transaction+events+atomically+to+receivers
> >
> > Deadline for comments is Wednesday, April 1st, 2020.
> >
> > Thanks,
> >
> > Alberto G.
>
>