Posted to dev@river.apache.org by Patricia Shanahan <pa...@acm.org> on 2011/03/04 14:47:31 UTC

Re: [jira] Created: (RIVER-393) Distributed Java Space

On 3/4/2011 3:47 AM, Tom Hobbs (JIRA) wrote:

>   - Block the write method call until the RS is happy the entry is persisted safely

Could you define what you mean by "persisted safely"? Do you count 
getting it to non-volatile storage, or does it need to be stored on 
multiple servers? If the transaction is in non-volatile storage but that 
storage is attached to a dead server, the entry exists only in a very 
theoretical sense, and attempts to read or take it would fail.

I feel that distributed transactions are sufficiently closely related 
that they should be discussed in the same Jira.

Patricia

Re: [jira] Created: (RIVER-393) Distributed Java Space

Posted by Patricia Shanahan <pa...@acm.org>.

On 3/5/2011 7:14 AM, Niclas Hedhman wrote:
> On Sat, Mar 5, 2011 at 5:55 PM, Dan Creswell<da...@gmail.com>  wrote:
>> Transactions don't actually matter in any of the above as they are just
>> another form of operation boundary. Transactions are "durable":
>>
>> "The results of a transaction should be as persistent as the entity on which
>> the transaction commits."
>>
>> But that's ultimately defined by (3) above, rather than transactions
>> themselves.
>
> After recently spending a lot of time evaluating 'persisted spaces', I can
> only say two things:
>
>     - It is hard to get right with the SLA that matters;
>
>     - The SLA that I, the enterprise developer of a resilient HA system,
> care about is that once the operation completes (call it transaction
> commit), the state transition is preserved even if N arbitrary
> failures occur, AND that the system has a computable time T to restore
> itself from 'reduced HA' to 'full HA', within which preservation of the
> state transition is not guaranteed if failures beyond N occur. The
> larger the T, the more N I need, which increases my cost profile.

This is a very helpful indication of the real world requirements.

> The area of the Jini Transaction Specification is really interesting,
> since one needs to figure out how to make the transaction manager
> distributed and resilient as well, and recovery from a transaction
> manager failure, half way through second phase... I don't even want to
> imagine it. You might find that a new, less featureful spec is the
> best recourse.

I have a vague memory of seeing a research paper on distributed 
transaction management in the last year or so. I intend to do a library 
search to make sure we have access to the latest ideas from the academic 
world. If this is an area of active research we may even be able to 
co-opt a researcher who would like to see their ideas turned into a real 
world implementation.

I'm actually inclined to tackle distributed, resilient transaction 
management first, for two reasons:

1. I'm dubious about whether a distributed JavaSpace would really be 
useful without it.

2. I think it may be useful in maintaining consistency among duplicated 
copies of the same entry.

Patricia

Re: [jira] Created: (RIVER-393) Distributed Java Space

Posted by Dan Creswell <da...@gmail.com>.

Cool, so.....

On 5 March 2011 15:14, Niclas Hedhman <ni...@hedhman.org> wrote:

> On Sat, Mar 5, 2011 at 5:55 PM, Dan Creswell <da...@gmail.com> wrote:
> > Transactions don't actually matter in any of the above as they are just
> > another form of operation boundary. Transactions are "durable":
> >
> > "The results of a transaction should be as persistent as the entity on which
> > the transaction commits."
> >
> > But that's ultimately defined by (3) above, rather than transactions
> > themselves.
>
> After recently spending a lot of time evaluating 'persisted spaces', I can
> only say two things:
>
>   - It is hard to get right with the SLA that matters;
>
>   - The SLA that I, the enterprise developer of a resilient HA system,
> care about is that once the operation completes (call it transaction
> commit), the state transition is preserved even if N arbitrary
> failures occur, AND that the system has a computable time T to restore
> itself from 'reduced HA' to 'full HA', within which preservation of the
> state transition is not guaranteed if failures beyond N occur. The
> larger the T, the more N I need, which increases my cost profile.
>

That's not a bad way to think about things....


>
> The area of the Jini Transaction Specification is really interesting,
> since one needs to figure out how to make the transaction manager
> distributed and resilient as well, and recovery from a transaction
> manager failure, half way through second phase... I don't even want to
> imagine it. You might find that a new, less featureful spec is the
> best recourse.
>
>
Again, there are a couple of orthogonal issues within this discussion:

(1) Availability of service - that is, what can make progress and when. For
example, you might have a setup that allows transactions created since some
failure event to succeed whilst those created before it don't.

(2) Ultimate resolution - two phase commit has some holes that can leave a
transaction unsettled forever. In essence, at least one valid copy of the
transaction log must survive and ultimately be available at some point
coinciding with a stable network state so things can sort themselves out. If
you lose all copies of the log, you're toast. A transaction will hang around
unresolved in the participants forever unless one introduces an additional
bounding condition, e.g. one based on time.
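
To make the time-based bounding condition in (2) concrete, here is a minimal
Java sketch of a participant that settles a prepared transaction by itself
once a bound has passed. Everything in it is an assumption made for
illustration: the class BoundedParticipantSketch, its methods and the choice
to abort on expiry are hypothetical and are not taken from the Jini
Transaction Specification or from any River code. Time is passed in
explicitly so the behaviour is easy to follow.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical participant that bounds how long a transaction may stay prepared. */
final class BoundedParticipantSketch {

    enum State { PREPARED, COMMITTED, ABORTED }

    private final Map<Long, State> states = new HashMap<>();
    private final Map<Long, Long> deadlines = new HashMap<>();

    /** Phase one: vote yes and remember when the bound for this transaction runs out. */
    void prepare(long txnId, long nowMillis, long boundMillis) {
        states.put(txnId, State.PREPARED);
        deadlines.put(txnId, nowMillis + boundMillis);
    }

    /** Phase two: apply the coordinator's decision, unless the transaction is already settled. */
    void decide(long txnId, boolean commit) {
        if (states.get(txnId) == State.PREPARED) {
            states.put(txnId, commit ? State.COMMITTED : State.ABORTED);
        }
    }

    /** Bounding condition: abort every prepared transaction whose deadline has passed. */
    void expire(long nowMillis) {
        for (Map.Entry<Long, State> e : states.entrySet()) {
            if (e.getValue() == State.PREPARED && nowMillis >= deadlines.get(e.getKey())) {
                e.setValue(State.ABORTED);
            }
        }
    }

    State stateOf(long txnId) {
        return states.get(txnId);
    }

    public static void main(String[] args) {
        BoundedParticipantSketch participant = new BoundedParticipantSketch();
        participant.prepare(1L, 0L, 1000L);   // prepared at t=0 with a 1000 ms bound
        participant.expire(500L);             // bound not reached yet
        System.out.println(participant.stateOf(1L));
        participant.expire(1000L);            // coordinator never answered in time
        System.out.println(participant.stateOf(1L));
    }
}
```

The trade-off is that a participant which aborts on its own after voting yes
may disagree with a commit decision that simply had not reached it yet, so a
bound like this gives up strict atomicity in exchange for eventual resolution.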

Re: [jira] Created: (RIVER-393) Distributed Java Space

Posted by Niclas Hedhman <ni...@hedhman.org>.

On Sat, Mar 5, 2011 at 5:55 PM, Dan Creswell <da...@gmail.com> wrote:
> Transactions don't actually matter in any of the above as they are just
> another form of operation boundary. Transactions are "durable":
>
> "The results of a transaction should be as persistent as the entity on which
> the transaction commits."
>
> But that's ultimately defined by (3) above, rather than transactions
> themselves.

After recently spending a lot of time evaluating 'persisted spaces', I can
only say two things:

   - It is hard to get right with the SLA that matters;

   - The SLA that I, the enterprise developer of a resilient HA system,
care about is that once the operation completes (call it transaction
commit), the state transition is preserved even if N arbitrary
failures occur, AND that the system has a computable time T to restore
itself from 'reduced HA' to 'full HA', within which preservation of the
state transition is not guaranteed if failures beyond N occur. The
larger the T, the more N I need, which increases my cost profile.
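
The relationship between T and N can be illustrated with a small calculation.
The Java sketch below is only an illustration under a loudly stated
assumption: failures are modelled as a Poisson process with a fixed rate,
which is not something the SLA above specifies. The class
FailureBudgetSketch, the failure rate and the target risk are all
hypothetical. It finds the smallest N for which the chance of more than N
failures inside the restore window T stays under a chosen risk.

```java
/** Hypothetical helper relating the restore time T to the number of tolerated failures N. */
final class FailureBudgetSketch {

    /** P(more than n failures in the window), assuming a Poisson count with the given mean. */
    static double tailProbability(double mean, int n) {
        double term = Math.exp(-mean);   // P(exactly 0 failures)
        double cumulative = term;
        for (int k = 1; k <= n; k++) {
            term *= mean / k;            // P(exactly k) derived from P(exactly k - 1)
            cumulative += term;
        }
        return 1.0 - cumulative;
    }

    /** Smallest N such that P(more than N failures within T) is at most targetRisk. */
    static int minimalN(double failuresPerHour, double restoreHours, double targetRisk) {
        if (targetRisk < 1e-12) {
            // Below this the floating-point tail is not reliable enough to compare against.
            throw new IllegalArgumentException("targetRisk must be at least 1e-12");
        }
        double mean = failuresPerHour * restoreHours;
        int n = 0;
        while (tailProbability(mean, n) > targetRisk) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        double rate = 0.1;    // assumed: 0.1 failures per hour across the system
        double risk = 1e-3;   // assumed: acceptable chance of exceeding N within T
        System.out.println("T = 1h  needs N = " + minimalN(rate, 1.0, risk));
        System.out.println("T = 10h needs N = " + minimalN(rate, 10.0, risk));
    }
}
```

With the same failure rate and risk, the longer restore window needs a larger
N, which is the cost effect described above.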

The area of the Jini Transaction Specification is really interesting,
since one needs to figure out how to make the transaction manager
distributed and resilient as well, and recovery from a transaction
manager failure, half way through second phase... I don't even want to
imagine it. You might find that a new, less featureful spec is the
best recourse.

I am really curious about what will come out of this discussion.


Cheers
-- 
Niclas Hedhman, Software Developer
http://www.qi4j.org - New Energy for Java

I live here; http://tinyurl.com/3xugrbk
I work here; http://tinyurl.com/24svnvk
I relax here; http://tinyurl.com/2cgsug

Re: [jira] Created: (RIVER-393) Distributed Java Space

Posted by Patricia Shanahan <pa...@acm.org>.

Whether or not we need to change the meaning of the ordering 
specification, we certainly need to make it more formal.

One key issue is whether the specification implies a global total order 
of all operations, such as one would naturally get if they all operate 
on a single data structure. Sun historically had a tendency to build in 
that assumption.

Patricia



On 3/5/2011 1:55 AM, Dan Creswell wrote:
> We've talked a lot about transactions and "persisted safely" but as yet I've
> seen no discussion of ordering (from the spec):
>
> "Operations on a space are unordered. The only view of operation order can
> be a thread's view of the order of the operations it performs. A view of
> inter-thread order can be imposed only by cooperating threads that use an
> application-specific protocol to prevent two or more operations being in
> progress at a single time on a single JavaSpaces service. Such means are
> outside the purview of this specification.
> For example, given two threads *T* and *U*, if *T* performs a write operation
> and *U* performs a read with a template that would match the written entry,
> the read may not find the written entry even if the write returns before
> the read. Only if *T* and *U* cooperate to ensure that the write returns
> before the read commences would the read be ensured the opportunity to find
> the entry written by *T* (although it still might not do so because of an
> intervening take from a third entity)."
>
> In particular note that last sentence and the implications it has for
> completion of operations.
>
> Note also that the persistence property amounts to this: when an operation
> (or indeed a transaction) completes successfully, the operation(s) are now
> "remembered", i.e. persisted with whatever guarantee a space provides.
> Traditionally that's either transient (temporary, in-memory) or persistent
> (recovered across space restarts; the spec is silent on the quality of
> storage and its redundancy).
>
> If you ask me then, what needs considering is:
>
> (1) If we wish to relax the operation ordering constraints further.
> (2) Which if any of the operations we do not wish to support (*IfExists can
> be particularly tough to implement efficiently).
> (3) What our persistence guarantees will be.
>
> Transactions don't actually matter in any of the above as they are just
> another form of operation boundary. Transactions are "durable":
>
> "The results of a transaction should be as persistent as the entity on which
> the transaction commits."
>
> But that's ultimately defined by (3) above, rather than transactions
> themselves.

Re: [jira] Created: (RIVER-393) Distributed Java Space

Posted by Dan Creswell <da...@gmail.com>.

We've talked a lot about transactions and "persisted safely" but as yet I've
seen no discussion of ordering (from the spec):

"Operations on a space are unordered. The only view of operation order can
be a thread's view of the order of the operations it performs. A view of
inter-thread order can be imposed only by cooperating threads that use an
application-specific protocol to prevent two or more operations being in
progress at a single time on a single JavaSpaces service. Such means are
outside the purview of this specification.
For example, given two threads *T* and *U*, if *T* performs a write operation
and *U* performs a read with a template that would match the written entry,
the read may not find the written entry even if the write returns before
the read. Only if *T* and *U* cooperate to ensure that the write returns
before the read commences would the read be ensured the opportunity to find
the entry written by *T* (although it still might not do so because of an
intervening take from a third entity)."

In particular note that last sentence and the implications it has for
completion of operations.
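
The cooperation the spec asks for can be sketched in a few lines of Java.
This is a minimal illustration, not the JavaSpaces API: the "space" here is
just an in-memory set and the class OrderedReadSketch is hypothetical. The
point is the application-level protocol (a latch) by which T tells U that the
write has returned before U commences its read.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

/** Hypothetical demo of two threads cooperating so that a read starts after a write returns. */
final class OrderedReadSketch {

    static boolean writeThenRead(String entry) throws InterruptedException {
        // Stand-in for a space; a real space matches entries against templates.
        Set<String> space = ConcurrentHashMap.newKeySet();
        CountDownLatch writeReturned = new CountDownLatch(1);
        boolean[] found = new boolean[1];

        Thread t = new Thread(() -> {
            space.add(entry);            // T: the write
            writeReturned.countDown();   // T signals that the write has returned
        });
        Thread u = new Thread(() -> {
            try {
                writeReturned.await();               // U waits for T's signal
                found[0] = space.contains(entry);    // U: the read, started only after the write returned
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        u.start();   // starting U first shows that the latch, not start order, imposes the ordering
        t.start();
        t.join();
        u.join();    // join makes U's result visible to the calling thread
        return found[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(writeThenRead("entry-1"));
    }
}
```

Without the latch, nothing in the spec text above obliges the read to see the
entry. With it, the read is given the opportunity, and in this sketch nothing
else can take the entry in between.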

Note also that the persistence property amounts to this: when an operation
(or indeed a transaction) completes successfully, the operation(s) are now
"remembered", i.e. persisted with whatever guarantee a space provides.
Traditionally that's either transient (temporary, in-memory) or persistent
(recovered across space restarts; the spec is silent on the quality of
storage and its redundancy).

If you ask me then, what needs considering is:

(1) If we wish to relax the operation ordering constraints further.
(2) Which if any of the operations we do not wish to support (*IfExists can
be particularly tough to implement efficiently).
(3) What our persistence guarantees will be.

Transactions don't actually matter in any of the above as they are just
another form of operation boundary. Transactions are "durable":

"The results of a transaction should be as persistent as the entity on which
the transaction commits."

But that's ultimately defined by (3) above, rather than transactions
themselves.

Re: [jira] Created: (RIVER-393) Distributed Java Space

Posted by Tom Hobbs <tv...@googlemail.com>.

I have no preconceived idea about what "persisted safely" means.
Assuming some implementation like an RS orchestration layer on top of
a series of normal Java Spaces, part of the configuration might be
"safely persisted means that the entry exists in at least x available
underlying spaces".
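
As a rough illustration of that configuration, the Java sketch below blocks a
write until at least x replicas have acknowledged the entry. It is only a
sketch under assumptions: Replica is a hypothetical stand-in for an
underlying space (it is not the JavaSpace interface), and the class
QuorumWriteSketch, the timeout and the use of plain strings as entries are
all invented for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical "RS" write that returns only once x replicas hold the entry. */
final class QuorumWriteSketch {

    /** Stand-in for one underlying space; not the JavaSpace API. */
    interface Replica {
        void store(String entry) throws Exception;
    }

    /**
     * Sends the entry to every replica and blocks until at least x of them
     * have acknowledged it or the timeout expires. Returns true only if
     * x acknowledgements arrived in time.
     */
    static boolean writeBlocking(List<Replica> replicas, String entry, int x,
                                 long timeoutMillis) throws InterruptedException {
        if (x < 1 || x > replicas.size()) {
            throw new IllegalArgumentException("x must be between 1 and replicas.size()");
        }
        CountDownLatch acks = new CountDownLatch(x);
        ExecutorService pool = Executors.newFixedThreadPool(replicas.size());
        try {
            for (Replica replica : replicas) {
                pool.execute(() -> {
                    try {
                        replica.store(entry);
                        acks.countDown();   // count only successful stores
                    } catch (Exception e) {
                        // A failed replica simply never acknowledges.
                    }
                });
            }
            return acks.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } finally {
            pool.shutdownNow();   // stop waiting on stragglers
        }
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> spaceA = new CopyOnWriteArrayList<>();
        List<String> spaceB = new CopyOnWriteArrayList<>();
        Replica a = entry -> spaceA.add(entry);
        Replica b = entry -> spaceB.add(entry);
        Replica dead = entry -> { throw new IllegalStateException("server down"); };
        List<Replica> replicas = Arrays.asList(a, b, dead);

        System.out.println(writeBlocking(replicas, "entry-1", 2, 500));   // two live replicas can satisfy x = 2
        System.out.println(writeBlocking(replicas, "entry-2", 3, 500));   // the dead replica never acknowledges
    }
}
```

Note that when the call gives up, some replicas may already hold the entry.
What to do about such partial writes (retry, clean up, or wrap the whole
thing in a transaction) is exactly the kind of question raised in this
thread, and the sketch does not answer it.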

You're right that transactions are closely related and should be
considered along with everything else.  I'm (maybe more) keen to talk
about other kinds of features, like:

- should those transactions be controlled by the client using the RS
or should the RS use them transparently
- the "optimistic writing" etc that I mentioned in the space

Without understanding more about how the thing will/might be used I'm
less inclined to start thinking about how to build it.

But like I said in the Jira, these things are what I naively think are
important - that doesn't mean that I didn't miss some other big stuff.

Cheers,

Tom
